Reading responses

Complete returns an AssistantMessage; streaming delivers the same value as EventDone.Message. This page covers what to read back from it: content, why generation stopped, token usage and cost, and non-fatal diagnostics.

Content and metadata

Two accessors cover most reads:

response.Text()      // all text blocks joined
response.ToolCalls() // every tool call, in order

The message also carries provider metadata: Provider, Model, the provider's own ResponseModel and ResponseID, and a Timestamp. ErrorMessage holds the provider or runtime error string on a failed response.

fmt.Printf("model=%s response=%s id=%s\n",
    response.Model, response.ResponseModel, response.ResponseID)
if response.ErrorMessage != "" {
    log.Printf("provider error: %s", response.ErrorMessage)
}

To walk individual blocks instead of the joined text, for example to render thinking and text differently, type-switch over response.Content:

for _, block := range response.Content {
    switch b := block.(type) {
    case *llm.TextContent:
        fmt.Println("text:", b.Text)
    case *llm.ThinkingContent:
        fmt.Println("thinking:", b.Thinking)
    case *llm.ToolCall:
        fmt.Printf("tool call: %s(%v)\n", b.Name, b.Arguments)
    }
}

Stop reasons

StopReason explains why generation stopped. Branch on it before using the response, and especially before executing tool calls.

StopReason        | Meaning                      | Typical handling
StopReasonStop    | Normal completion            | Use response.Text()
StopReasonToolUse | The model wants tool results | Run the tool loop
StopReasonLength  | Output hit the MaxTokens cap | Continue the turn or raise the cap
StopReasonError   | Provider or runtime failure  | Inspect ErrorMessage; do not execute tool calls
StopReasonAborted | Request was cancelled        | Stop; the context was cancelled

switch response.StopReason {
case llm.StopReasonStop:
    fmt.Println(response.Text())
case llm.StopReasonToolUse:
    runTools(response.ToolCalls()) // see the tool loop
case llm.StopReasonLength:
    log.Println("truncated: raise MaxTokens or continue the turn")
case llm.StopReasonError, llm.StopReasonAborted:
    log.Printf("stopped early: %s %s", response.StopReason, response.ErrorMessage)
}
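The advice to branch before executing tool calls can be folded into one small helper. The sketch below is illustrative only: StopReason and its constants are local stand-ins that reuse the names from the table (the library's underlying type and values are not shown on this page), and Action and NextAction are hypothetical names. The point it shows is failing closed, so that an unrecognised stop reason never leads to tool execution.

```go
package main

import "fmt"

// StopReason and its constants are local stand-ins that reuse the names from
// the table above. The real library's underlying type and values may differ.
type StopReason string

const (
	StopReasonStop    StopReason = "stop"
	StopReasonToolUse StopReason = "toolUse"
	StopReasonLength  StopReason = "length"
	StopReasonError   StopReason = "error"
	StopReasonAborted StopReason = "aborted"
)

// Action is a hypothetical label for what the caller does next.
type Action int

const (
	UseText Action = iota
	RunTools
	ContinueTurn
	Fail
)

// NextAction maps a stop reason to the caller's next step. Error, aborted,
// and any unknown value all map to Fail, so tools run only on an explicit
// StopReasonToolUse.
func NextAction(r StopReason) Action {
	switch r {
	case StopReasonStop:
		return UseText
	case StopReasonToolUse:
		return RunTools
	case StopReasonLength:
		return ContinueTurn
	default:
		return Fail
	}
}

func main() {
	for _, r := range []StopReason{StopReasonToolUse, StopReasonError, "unexpected"} {
		fmt.Printf("%s -> run tools: %v\n", r, NextAction(r) == RunTools)
	}
}
```

Routing every unlisted value to Fail means a stop reason added by a later library version is treated as a failure until the caller handles it explicitly.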

Token usage and cost

Usage records token consumption for the response. Cached tokens are reported separately so cache hits are visible:

Field       | Meaning
Input       | Prompt tokens billed at the full input rate
Output      | Generated tokens, including reasoning tokens
CacheRead   | Input tokens served from the provider cache
CacheWrite  | Input tokens written to the cache
TotalTokens | Sum reported for the response
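Because cached tokens are reported separately, a cache hit share can be derived from these fields. The sketch below uses a local stand-in Usage struct and a hypothetical CacheHitShare helper. It assumes Input, CacheRead, and CacheWrite are disjoint counts that together make up the prompt, which the table suggests but does not state outright.

```go
package main

import "fmt"

// Usage is a local stand-in carrying only the token fields from the table
// above. The real type has more fields and may use a different integer type.
type Usage struct {
	Input      int
	Output     int
	CacheRead  int
	CacheWrite int
}

// CacheHitShare returns the fraction of prompt tokens served from the cache.
// Assumption: Input, CacheRead, and CacheWrite do not overlap, so their sum
// is the full prompt size. It returns 0 for an empty prompt.
func CacheHitShare(u Usage) float64 {
	prompt := u.Input + u.CacheRead + u.CacheWrite
	if prompt == 0 {
		return 0
	}
	return float64(u.CacheRead) / float64(prompt)
}

func main() {
	u := Usage{Input: 250, Output: 120, CacheRead: 750}
	fmt.Printf("cache hit share: %.2f\n", CacheHitShare(u))
}
```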

Usage.Cost is a UsageCost with the same breakdown in currency units (Input, Output, CacheRead, CacheWrite, and Total), computed from the model's pricing when the response is assembled.

fmt.Printf("tokens=%d (cached %d) cost=$%.6f\n",
    response.Usage.TotalTokens,
    response.Usage.CacheRead,
    response.Usage.Cost.Total,
)

To price a usage record yourself, for example to re-cost stored history against a different model, call CalculateCost:

cost := llm.CalculateCost(model, response.Usage)
fmt.Printf("input=$%.6f output=$%.6f total=$%.6f\n",
    cost.Input, cost.Output, cost.Total)
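For intuition, a pricing function of this kind multiplies each token count by a per-token rate and sums the parts. The sketch below is not the library's implementation: Pricing, Usage, Cost, and Price are hypothetical stand-ins, and the rates are assumed to be quoted in dollars per million tokens, a common convention that this page does not confirm.

```go
package main

import "fmt"

// Usage, Pricing, and Cost are local stand-ins for illustration only.
type Usage struct {
	Input, Output, CacheRead, CacheWrite int
}

// Pricing holds rates in dollars per million tokens (an assumed unit).
type Pricing struct {
	Input, Output, CacheRead, CacheWrite float64
}

// Cost mirrors the breakdown described for UsageCost.
type Cost struct {
	Input, Output, CacheRead, CacheWrite, Total float64
}

// Price converts token counts to currency with per-million-token rates and
// sums the four components into Total.
func Price(p Pricing, u Usage) Cost {
	const perMillion = 1_000_000.0
	c := Cost{
		Input:      float64(u.Input) * p.Input / perMillion,
		Output:     float64(u.Output) * p.Output / perMillion,
		CacheRead:  float64(u.CacheRead) * p.CacheRead / perMillion,
		CacheWrite: float64(u.CacheWrite) * p.CacheWrite / perMillion,
	}
	c.Total = c.Input + c.Output + c.CacheRead + c.CacheWrite
	return c
}

func main() {
	// Made-up rates and counts, chosen only to make the arithmetic easy to follow.
	rates := Pricing{Input: 3, Output: 15, CacheRead: 0.5, CacheWrite: 4}
	usage := Usage{Input: 2_000_000, Output: 1_000_000, CacheRead: 500_000, CacheWrite: 250_000}
	c := Price(rates, usage)
	fmt.Printf("input=$%.2f output=$%.2f total=$%.2f\n", c.Input, c.Output, c.Total)
}
```

Re-costing stored history against another model then amounts to calling the same function with a different set of rates.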

To track spend across a multi-turn conversation, accumulate Cost.Total from each response:

var spent float64
for _, turn := range responses {
    spent += turn.Usage.Cost.Total
}
fmt.Printf("conversation cost: $%.4f\n", spent)

Detect context overflow

IsContextOverflow reports whether a response exceeded the model's context window. It recognises explicit provider errors as well as silent overflows where the provider truncates input instead of failing. Use it to trigger history compaction or summarization before the next turn.

if llm.IsContextOverflow(response, model.ContextWindow) {
    // Drop or summarize old messages, then retry.
}
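One simple compaction strategy is to drop the oldest messages until the history fits a token budget. The sketch below shows that idea with stand-in types: Message, its Tokens estimate, and DropOldest are hypothetical names, not part of the library, and a real implementation would likely also pin the system prompt and keep tool calls paired with their results.

```go
package main

import "fmt"

// Message is a local stand-in. Tokens is a hypothetical pre-computed estimate
// of the message's size, not a field of the real message type.
type Message struct {
	Role   string
	Text   string
	Tokens int
}

// DropOldest removes messages from the front of history until the estimated
// total fits within budget. It always keeps the most recent message, even if
// that message alone exceeds the budget.
func DropOldest(history []Message, budget int) []Message {
	sum := 0
	for _, m := range history {
		sum += m.Tokens
	}
	start := 0
	for sum > budget && start < len(history)-1 {
		sum -= history[start].Tokens
		start++
	}
	return history[start:]
}

func main() {
	history := []Message{
		{Role: "user", Text: "first question", Tokens: 50},
		{Role: "assistant", Text: "first answer", Tokens: 40},
		{Role: "user", Text: "follow-up", Tokens: 30},
		{Role: "assistant", Text: "second answer", Tokens: 20},
	}
	kept := DropOldest(history, 60)
	fmt.Printf("kept %d of %d messages\n", len(kept), len(history))
}
```

Summarizing the dropped messages into a single short message is a common alternative when older context still matters.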

Diagnostics

Diagnostics records non-fatal events that occurred while producing the response, such as tool arguments recovered from malformed JSON. It is nil for a clean response. Each Diagnostic carries a Type, a Timestamp, an optional Message, and structured Details.

for _, d := range response.Diagnostics {
    if d.Type == llm.DiagnosticToolArgumentsRecovered {
        log.Printf("recovered tool arguments: mode=%v call=%v",
            d.Details["mode"], d.Details["toolCallId"])
    }
}

Inspect diagnostics before executing a tool with side effects; see stream diagnostics for the recovery modes.
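One way to act on that advice is to collect the IDs of tool calls whose arguments were recovered and hold those calls back for review. The sketch below uses stand-in types: the Diagnostic struct, the constant's string value, and RecoveredCallIDs are hypothetical, and it assumes the "toolCallId" detail is stored as a string.

```go
package main

import "fmt"

// DiagnosticToolArgumentsRecovered reuses the library constant's name with a
// stand-in value. The real constant's type and value may differ.
const DiagnosticToolArgumentsRecovered = "tool_arguments_recovered"

// Diagnostic is a local stand-in with the fields this sketch needs.
type Diagnostic struct {
	Type    string
	Message string
	Details map[string]any
}

// RecoveredCallIDs returns the set of tool call IDs whose arguments were
// recovered from malformed JSON. Diagnostics without a string "toolCallId"
// detail are skipped.
func RecoveredCallIDs(diags []Diagnostic) map[string]bool {
	ids := make(map[string]bool)
	for _, d := range diags {
		if d.Type != DiagnosticToolArgumentsRecovered {
			continue
		}
		if id, ok := d.Details["toolCallId"].(string); ok {
			ids[id] = true
		}
	}
	return ids
}

func main() {
	diags := []Diagnostic{
		{Type: DiagnosticToolArgumentsRecovered, Details: map[string]any{"toolCallId": "call_1"}},
	}
	recovered := RecoveredCallIDs(diags)
	for _, id := range []string{"call_1", "call_2"} {
		if recovered[id] {
			fmt.Println(id, "needs review before running")
		} else {
			fmt.Println(id, "is safe to run")
		}
	}
}
```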