Cost Tracking for Claude Agent SDK Applications
A practical walkthrough of tracking token usage and cost in the Claude Agent SDK: the result message total_cost_usd estimate, per-step and per-model usage, deduplicating parallel tool calls, accumulating across calls, and cache tokens.
Open the source and read safety notes before installing.
Safety notes
- total_cost_usd / costUSD are client-side estimates from a bundled price table, not authoritative billing; do not bill end users or trigger financial decisions from them.
- Estimates can drift when pricing changes or the SDK version does not recognize a model; use the Usage and Cost API or Console for real billing.
- Both success and error result messages include usage and cost; read cost regardless of subtype so failed runs are still accounted for.
Privacy notes
- Usage data is token counts and cost, not content; it is safe to log, though it can reveal activity volume.
- Per-model and end-user attribution may be sent to an observability backend if you also enable telemetry; govern that data accordingly.
- The SDK uses prompt caching automatically; cache token fields reveal reuse patterns but not content.
Prerequisites
- The Claude Agent SDK installed for Python or TypeScript.
- An async loop over query() results so you can read assistant and result messages.
- For authoritative billing, access to the Usage and Cost API or the Console.
Schema details
- Install type: copy
- Troubleshooting: No
Full copyable content
Use this guide to read token usage and estimated cost from the Claude Agent SDK, broken down per step, per model, and accumulated across calls.
About this resource
Overview
The Claude Agent SDK reports detailed token usage for each interaction. This guide covers reading cost and usage correctly, especially with parallel tool use and multi-step conversations.
Scopes
- query() call: one query() invocation; produces one result message.
- Step: one request/response cycle within a call; produces assistant messages with usage.
- Session: multiple query() calls linked by a session id (resume); each call reports its own cost.
Get the total for a call
The result message includes total_cost_usd (estimated) and cumulative usage:
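import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({ prompt: "Summarize this project" })) {
  // The result message carries the estimated total for this call.
  if (message.type === "result") {
    console.log(`Total cost: $${message.total_cost_usd}`);
  }
}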
This is a client-side estimate from a bundled price table, not authoritative billing. Use the Usage and Cost API or the Console for real charges.
Per-step and per-model usage
Each assistant message carries usage (input/output tokens) and an id. When one
turn makes parallel tool calls, the resulting assistant messages share an id, so
count each id only once to avoid double counting:
const seen = new Set<string>();
let input = 0, output = 0;
for await (const message of query({ prompt: "..." })) {
  // Skip ids already counted; parallel tool calls in one turn repeat the same id.
  if (message.type === "assistant" && !seen.has(message.message.id)) {
    seen.add(message.message.id);
    input += message.message.usage.input_tokens;
    output += message.message.usage.output_tokens;
  }
}
The result message's modelUsage (TS) / model_usage (Python) breaks cost and
tokens down per model, useful when subagents run a cheaper model.
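The per-model breakdown can be turned into a small report. The following is a minimal sketch, not the SDK's own API: summarizeModelUsage and the ModelUsageEntry type are hypothetical, and of the entry fields only costUSD is named in this guide; inputTokens and outputTokens are assumed names, so check them against the types shipped with your SDK version.

```typescript
// Assumed shape of one entry in the result message's modelUsage map.
// Only costUSD is named in this guide; the token field names are assumptions.
interface ModelUsageEntry {
  inputTokens: number;
  outputTokens: number;
  costUSD: number; // client-side estimate, not authoritative billing
}

interface ModelUsageRow {
  model: string;
  tokens: number;
  costUSD: number;
}

// Hypothetical helper: flatten the map into rows, most expensive model first.
function summarizeModelUsage(
  modelUsage: Record<string, ModelUsageEntry>
): ModelUsageRow[] {
  return Object.entries(modelUsage)
    .map(([model, u]) => ({
      model,
      tokens: u.inputTokens + u.outputTokens,
      costUSD: u.costUSD,
    }))
    .sort((a, b) => b.costUSD - a.costUSD);
}
```

Inside the message loop, one option is to call summarizeModelUsage(message.modelUsage) when message.type is "result" and log the rows, which makes it easy to see whether a subagent on a cheaper model is carrying its share of the work.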
Accumulate across calls
The SDK has no session-level total; sum each call's total_cost_usd yourself when
running multiple query() calls.
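One way to keep that running total is a small accumulator fed from every query() loop. This is a sketch under stated assumptions: CostAccumulator and ResultLike are hypothetical names, not SDK exports, and the structural type covers only the fields the sketch reads. It counts cost for every result message, whatever its subtype, so failed runs are included.

```typescript
// Structural type covering only the fields this sketch reads from a result message.
interface ResultLike {
  type: "result";
  subtype: string;         // "success" or an error subtype
  total_cost_usd?: number; // client-side estimate, not authoritative billing
}

// Hypothetical helper (not part of the SDK): running estimated total across calls.
class CostAccumulator {
  private totalUsd = 0;
  private callCount = 0;
  private failedCount = 0;

  // Pass every message from every query() loop; non-result messages are ignored.
  record(message: { type: string }): void {
    if (message.type !== "result") return;
    const result = message as ResultLike;
    this.totalUsd += result.total_cost_usd ?? 0;
    this.callCount += 1;
    if (result.subtype !== "success") this.failedCount += 1;
  }

  summary(): { totalUsd: number; calls: number; failed: number } {
    return {
      totalUsd: this.totalUsd,
      calls: this.callCount,
      failed: this.failedCount,
    };
  }
}
```

Create one CostAccumulator per session, call record(message) inside each for await loop, and read summary() when the session ends. The total remains an estimate; reconcile it against the Usage and Cost API or the Console before relying on it.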
Cache tokens and failures
The SDK uses prompt caching automatically. The usage object includes
cache_creation_input_tokens (written, higher rate) and cache_read_input_tokens
(read, reduced rate); track them to understand caching savings. Both success and
error results include cost, so always read it regardless of subtype. To extend
cache TTL to one hour on API-key/Bedrock/Vertex/Foundry, set
ENABLE_PROMPT_CACHING_1H.
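To see how much of a request's input was served from cache, the three input fields can be combined into a single ratio. The sketch below assumes that input_tokens excludes cached tokens, so total input is the sum of the three fields; cacheReadShare and UsageLike are hypothetical names introduced for illustration.

```typescript
// Structural type for the usage fields named in this guide.
interface UsageLike {
  input_tokens: number;
  cache_creation_input_tokens?: number; // tokens written to the cache
  cache_read_input_tokens?: number;     // tokens read from the cache
}

// Hypothetical helper: fraction of input tokens that were read from cache.
// Assumes input_tokens does not already include the two cache counts.
function cacheReadShare(usage: UsageLike): number {
  const written = usage.cache_creation_input_tokens ?? 0;
  const read = usage.cache_read_input_tokens ?? 0;
  const totalInput = usage.input_tokens + written + read;
  return totalInput === 0 ? 0 : read / totalInput;
}
```

A share that stays near zero across repeated steps suggests the prompt prefix is changing between requests, in which case the cache writes are being paid for without being reused.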
Source
- Track cost and usage: https://code.claude.com/docs/en/agent-sdk/cost-tracking