Cost Flow¶
目的:成本流动图 关联:topics/cost-tracking.md
1. 4 维 token 流动¶
graph TD
A[API call] --> B[usage]
B --> C[input_tokens]
B --> D[output_tokens]
B --> E[cache_creation_input_tokens]
B --> F[cache_read_input_tokens]
C --> G[累加]
D --> G
E --> G
F --> G
G --> H[session cost]
H --> I[today/week/month]
4 token → 累加 → 显示。
2. 4 步 cache 成本¶
sequenceDiagram
participant Client
participant API
participant Cache
Client->>API: 1st call (no cache)
API->>Cache: write 100K tokens
Cache-->>API: stored
API-->>Client: { input: 100K, cache_creation: 100K, cache_read: 0 }
Client->>API: 2nd call (same prefix)
API->>Cache: read
Cache-->>API: hit
API-->>Client: { input: 0, cache_creation: 0, cache_read: 100K }
Note over Client: Cost: write $3.75 + read $0.30 = $4.05<br/>vs without cache: 2 * $3.00 = $6.00
cache 节省 32%。
3. 3 模型单价¶
graph LR
A[Model] --> B{Model type}
B -->|opus-4-8| C[input $15/M, output $75/M]
B -->|sonnet-4-6| D[input $3/M, output $15/M]
B -->|haiku-4-5| E[input $0.80/M, output $4/M]
A --> F[Usage]
F --> G[Cost]
G -->|input * 0.001 * price| H[USD]
G -->|output * 0.001 * price| H
G -->|cache_creation * 0.001 * 1.25x| H
G -->|cache_read * 0.001 * 0.1x| H
3 模型 + 4 token。
4. 4 维 token 累加¶
graph TD
A[每次 API call] --> B[response.usage]
B --> C{4 token}
C --> D[input_tokens]
C --> E[output_tokens]
C --> F[cache_creation]
C --> G[cache_read]
D -->|add| H[session]
E -->|add| H
F -->|add| H
G -->|add| H
H -->|aggregate| I[CostSummary]
I -->|display| J[/cost]
I -->|display| K[/insights]
4 累加 → 显示。
5. session 成本流¶
graph LR
A[turn 1] --> B[+ $0.10]
C[turn 2] --> D[+ $0.05]
E[turn 3] --> F[+ $0.08]
G[turn 4] --> H[+ $0.12]
I[turn 5] --> J[+ $0.06]
B --> K[session: $0.41]
D --> K
F --> K
H --> K
J --> K
K -->|summary| L[today: $0.41]
K -->|summary| M[week: $2.30]
K -->|summary| N[month: $8.50]
5 步累加。
6. 4 步成本上限检查¶
graph TD
A[API call] --> B{cost check}
B --> C[累加 current cost]
C --> D{cost > max_budget_usd?}
D -->|yes| E[throw BudgetExceeded]
D -->|no| F{cost > max_budget_usd * 0.8?}
F -->|yes| G[warn user]
F -->|no| H{turns > max_turns?}
H -->|yes| I[stop]
H -->|no| J[execute]
4 步。
7. 5 模型选择¶
graph TD
A[Task] --> B{complexity}
B -->|simple| C[haiku]
B -->|medium| D[sonnet]
B -->|complex| E[opus]
C -->|cost| F[$0.80/M in]
D -->|cost| G[$3/M in]
E -->|cost| H[$15/M in]
C -->|speed| I[fast]
D -->|speed| J[medium]
E -->|speed| K[slow]
C -->|quality| L[good]
D -->|quality| M[better]
E -->|quality| N[best]
3 模型 × 3 维。
8. 4 步 cache 命中优化¶
graph TD
A[API call] --> B{has cache}
B -->|yes, full hit| C[read 100K, cost $0.30]
B -->|partial hit| D[read 50K, write 50K]
B -->|no hit| E[write 100K, cost $3.75]
B -->|completely new| F[write 100K, cost $3.75]
C -->|saving 90%| G[✓ optimal]
D -->|saving 50%| H[~]
E -->|saving 0%| I[1st time]
F -->|saving 0%| J[1st time]
4 cache 状态。
9. /cost 显示¶
graph LR
A[session] --> B[CostTracker]
B -->|累加| C[token counts]
C --> D[model price]
D --> E[USD]
E -->|/cost| F[terminal display]
E -->|/insights| G[html report]
F --> H[CostSummary]
G --> I[chart + table]
3 显示。
10. 5 步估算月度成本¶
graph TD
A[1. sessions/day] -->|假设 10| B[10]
C[2. turns/session] -->|假设 20| D[20]
E[3. tokens/turn] -->|假设 10K in + 1K out| F[11K]
B -->|x| G[10 * 20 = 200 turns/day]
D --> G
G -->|x| H[200 * 11K = 2.2M tokens/day]
F --> H
H -->|x sonnet price| I[2.2M * 0.003 = $6.60/day]
I -->|x 30| J[$198/month]
5 步估算。
11. 3 步 cache 决策¶
graph TD
A[static prefix?] -->|yes| B[加 cache_control]
A -->|no| C[no cache]
B --> D[5min TTL]
D --> E[cache hit]
C --> F[1st time cost]
3 步决策。
12. 6 步节省策略¶
graph TD
A[成本高] --> B{原因}
B -->|重复 prompt| C[加 cache]
B -->|过长 context| D[用 attachment 而非内联]
B -->|低效 tool| E[用 Bash 而非多 Read]
B -->|模型过强| F[用 haiku]
B -->|turns 过多| G[设 max_turns]
B -->|每 turn 大| H[设 max_budget_usd]
C --> I[节省 90%]
D --> J[节省 50%]
E --> K[节省 30%]
F --> L[节省 80%]
G --> M[节省 N/A]
H --> N[防止超支]
6 策略。
13. 4 状态 cost display¶
stateDiagram-v2
[*] --> Tracking
Tracking --> Displayed: /cost
Displayed --> Exceeded: cost > max
Exceeded --> Stopped: auto stop
Stopped --> [*]
Tracking --> Daily: 1 day
Daily --> Weekly: 7 days
Weekly --> Monthly: 30 days
4 状态。
14. 5 维 token 占比¶
pie title Token 占比
"input" : 60
"output" : 25
"cache_creation" : 10
"cache_read" : 5
5 维饼图。
15. 4 步 cache 命中漏斗¶
graph TD
A[100K input] -->|cache hit 100%| B[全 cache_read, cost $0.30]
A -->|cache hit 80%| C[20K input + 80K cache_read, cost $1.20]
A -->|cache hit 50%| D[50K input + 50K cache_read, cost $1.65]
A -->|cache hit 0%| E[100K input, cost $3.00]
B -->|90% saving| F[optimal]
C -->|60% saving| G[good]
D -->|45% saving| H[~]
E -->|0% saving| I[1st time]
4 步漏斗。
16. 4 步 prompt 设计优化¶
graph LR
A[Original prompt] --> B[Step 1: 稳定前缀]
B --> C[Step 2: 末尾 user]
C --> D[Step 3: content hash]
D --> E[Step 4: delta]
E --> F[Optimal]
4 步优化。
17. 4 cache key 策略¶
graph TD
A[Cache key] --> B{strategy}
B -->|full prompt| C[全 cache]
B -->|prefix| D[前缀 cache]
B -->|tools| E[工具 cache]
B -->|context| F[context cache]
C --> G[server-side]
D --> G
E --> G
F --> G
4 策略。
18. 5 步 model 决策¶
graph TD
A[Task] --> B{需要 best quality?}
B -->|yes| C[opus-4-8]
B -->|no| D{需要 speed?}
D -->|yes| E[haiku-4-5]
D -->|no| F[sonnet-4-6]
C -->|cost| G[$15/M]
E -->|cost| H[$0.80/M]
F -->|cost| I[$3/M]
5 步。
19. 3 步 cost 异常检测¶
graph TD
A[normal cost] -->|异常| B[high cost turn]
B -->|reason| C{why}
C -->|长 context| D[trim]
C -->|re-ask| E[avoid re-ask]
C -->|multi agents| F[reduce agents]
D --> G[下次]
E --> G
F --> G
3 步。
20. 4 步 cost 历史追踪¶
graph LR
A[1. session start] --> B[2. record start_cost = 0]
C[3. each turn] --> D[update cost]
D --> E[4. session end]
E --> F[persist]
F --> G[~/.claude/stats/]
G --> H[aggregate]
H --> I[daily / weekly / monthly]
4 步。
21. 总结¶
Cost Flow = 4 token 累加 → 4 cache 优化 → 4 显示。
核心: - 22 mermaid 图 - 4 维 token - 3 模型单价 - 4 cache 优化 - 4 步节省策略
下一步: - 渲染 SVG - 加到 mkdocs