Reading responses
Complete returns an AssistantMessage; streaming delivers the same value as
EventDone.Message. This page covers what to read back from it: content, why
generation stopped, token usage and cost, and non-fatal diagnostics.
Content and metadata
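Two accessors cover most reads: response.Text() returns the joined text of the response, and response.ToolCalls() returns the tool calls the model requested.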
The message also carries provider metadata: Provider, Model, the provider's
own ResponseModel and ResponseID, and a Timestamp. ErrorMessage holds
the provider or runtime error string on a failed response.
fmt.Printf("model=%s response=%s id=%s\n",
	response.Model, response.ResponseModel, response.ResponseID)
if response.ErrorMessage != "" {
	log.Printf("provider error: %s", response.ErrorMessage)
}
To walk individual blocks instead of the joined text — for example to render
thinking and text differently — type-switch over response.Content:
for _, block := range response.Content {
	switch b := block.(type) {
	case *llm.TextContent:
		fmt.Println("text:", b.Text)
	case *llm.ThinkingContent:
		fmt.Println("thinking:", b.Thinking)
	case *llm.ToolCall:
		fmt.Printf("tool call: %s(%v)\n", b.Name, b.Arguments)
	}
}
Stop reasons
StopReason explains why generation stopped. Branch on it before using the
response — especially before executing tool calls.
| StopReason | Meaning | Typical handling |
|---|---|---|
| StopReasonStop | Normal completion | Use response.Text() |
| StopReasonToolUse | The model wants tool results | Run the tool loop |
| StopReasonLength | Output hit the MaxTokens cap | Continue the turn or raise the cap |
| StopReasonError | Provider or runtime failure | Inspect ErrorMessage; do not execute tool calls |
| StopReasonAborted | Request was cancelled | Stop; the context was cancelled |
switch response.StopReason {
case llm.StopReasonStop:
	fmt.Println(response.Text())
case llm.StopReasonToolUse:
	runTools(response.ToolCalls()) // see the tool loop
case llm.StopReasonLength:
	log.Println("truncated: raise MaxTokens or continue the turn")
case llm.StopReasonError, llm.StopReasonAborted:
	log.Printf("stopped early: %s %s", response.StopReason, response.ErrorMessage)
}
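The rule that tool calls should not run on a failed or cancelled response can be enforced in one place. The following is a minimal sketch using local stand-in types rather than the library's own: the type definitions, the string values of the constants, the Calls field, and the executableCalls helper are all assumptions made for illustration.

```go
package main

import "fmt"

// Stand-in types for illustration only; the real llm package defines its own.
type StopReason string

const (
	StopReasonStop    StopReason = "stop"    // string values are assumed
	StopReasonToolUse StopReason = "toolUse" // string values are assumed
	StopReasonError   StopReason = "error"   // string values are assumed
)

type ToolCall struct {
	ID   string
	Name string
}

type AssistantMessage struct {
	StopReason StopReason
	Calls      []ToolCall // hypothetical field standing in for ToolCalls()
}

// executableCalls (hypothetical helper) returns tool calls only when the
// model stopped specifically to request tool results. Any other stop reason
// yields nil, even if the message happens to contain tool call blocks.
func executableCalls(m AssistantMessage) []ToolCall {
	if m.StopReason != StopReasonToolUse {
		return nil
	}
	return m.Calls
}

func main() {
	failed := AssistantMessage{
		StopReason: StopReasonError,
		Calls:      []ToolCall{{ID: "1", Name: "delete_file"}},
	}
	requested := AssistantMessage{
		StopReason: StopReasonToolUse,
		Calls:      []ToolCall{{ID: "2", Name: "read_file"}},
	}
	fmt.Println("failed response, calls to run:", len(executableCalls(failed)))
	fmt.Println("tool-use response, calls to run:", len(executableCalls(requested)))
}
```

Routing every tool execution through a guard like this keeps a partially generated call from running after an error or cancellation.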
Token usage and cost
Usage records token consumption for the response. Cached tokens are reported
separately so cache hits are visible:
| Field | Meaning |
|---|---|
| Input | Prompt tokens billed at the full input rate |
| Output | Generated tokens, including reasoning tokens |
| CacheRead | Input tokens served from the provider cache |
| CacheWrite | Input tokens written to the cache |
| TotalTokens | Sum reported for the response |
Usage.Cost is a UsageCost with the same breakdown in currency units
(Input, Output, CacheRead, CacheWrite, and Total), computed from the
model's pricing when the response is assembled.
fmt.Printf("tokens=%d (cached %d) cost=$%.6f\n",
	response.Usage.TotalTokens,
	response.Usage.CacheRead,
	response.Usage.Cost.Total,
)
To price a usage record yourself — for example to re-cost stored history against
a different model — call CalculateCost:
cost := llm.CalculateCost(model, response.Usage)
fmt.Printf("input=$%.6f output=$%.6f total=$%.6f\n",
	cost.Input, cost.Output, cost.Total)
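To make the arithmetic behind such a breakdown concrete, here is a sketch of how a cost record could be derived from token counts. It is not the library's implementation: the Pricing type, the priceUsage function, the per-million-token unit, and the rates in main are assumptions, and the stand-in Usage and UsageCost types only mirror the fields listed above.

```go
package main

import "fmt"

// Stand-in types mirroring the fields described above; illustration only.
type Usage struct {
	Input, Output, CacheRead, CacheWrite int
}

type UsageCost struct {
	Input, Output, CacheRead, CacheWrite, Total float64
}

// Pricing is hypothetical: rates in currency units per million tokens.
type Pricing struct {
	Input, Output, CacheRead, CacheWrite float64
}

// priceUsage (hypothetical) prices each token category at its own rate and
// sums the parts into Total.
func priceUsage(p Pricing, u Usage) UsageCost {
	const perMillion = 1e6
	c := UsageCost{
		Input:      float64(u.Input) * p.Input / perMillion,
		Output:     float64(u.Output) * p.Output / perMillion,
		CacheRead:  float64(u.CacheRead) * p.CacheRead / perMillion,
		CacheWrite: float64(u.CacheWrite) * p.CacheWrite / perMillion,
	}
	c.Total = c.Input + c.Output + c.CacheRead + c.CacheWrite
	return c
}

func main() {
	// Made-up rates, not taken from any provider's price list.
	rates := Pricing{Input: 3, Output: 15, CacheRead: 0.3, CacheWrite: 3.75}
	usage := Usage{Input: 1000, Output: 500, CacheRead: 4000}
	c := priceUsage(rates, usage)
	fmt.Printf("input=$%.6f output=$%.6f cacheRead=$%.6f total=$%.6f\n",
		c.Input, c.Output, c.CacheRead, c.Total)
}
```

Pricing cached tokens at their own rate is what makes separate CacheRead and CacheWrite counts useful: a cache hit shows up as a lower total for the same prompt size.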
To track spend across a multi-turn conversation, accumulate Cost.Total from
each response:
var spent float64
for _, turn := range responses {
	spent += turn.Usage.Cost.Total
}
fmt.Printf("conversation cost: $%.4f\n", spent)
Detect context overflow
IsContextOverflow reports whether a response exceeded the model's context
window. It recognises explicit provider errors as well as silent overflows where
the provider truncates input instead of failing. Use it to trigger history
compaction or summarization before the next turn.
if llm.IsContextOverflow(response, model.ContextWindow) {
	// Drop or summarize old messages, then retry.
}
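The "drop old messages" step could look like the sketch below. Everything in it is hypothetical: the Message type, the compact function, and the rough estimate of four characters per token are stand-ins chosen for illustration, and a real implementation would use the library's message types and a proper token count.

```go
package main

import "fmt"

// Message is a stand-in for a conversation entry; illustration only.
type Message struct {
	Role string
	Text string
}

// estimateTokens is a crude heuristic (about four characters per token),
// used here only so the example is self-contained.
func estimateTokens(m Message) int {
	return len(m.Text)/4 + 1
}

// compact drops the oldest non-system messages until the estimated total
// fits the budget. System messages and the most recent message are kept.
func compact(history []Message, budget int) []Message {
	total := 0
	for _, m := range history {
		total += estimateTokens(m)
	}
	kept := make([]Message, 0, len(history))
	for i, m := range history {
		isLast := i == len(history)-1
		if total > budget && m.Role != "system" && !isLast {
			total -= estimateTokens(m) // drop this message
			continue
		}
		kept = append(kept, m)
	}
	return kept
}

func main() {
	history := []Message{
		{Role: "system", Text: "be brief"},
		{Role: "user", Text: "0123456789012345678901234567890123456789"},
		{Role: "assistant", Text: "0123456789012345678901234567890123456789"},
		{Role: "user", Text: "latest question"},
	}
	for _, m := range compact(history, 10) {
		fmt.Printf("%s: %s\n", m.Role, m.Text)
	}
}
```

Dropping from the oldest end keeps the system prompt and the latest turn intact. Summarizing the dropped messages instead of discarding them is the alternative when earlier context still matters.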
Diagnostics
Diagnostics records non-fatal events that occurred while producing the
response, such as tool arguments recovered from malformed JSON. It is nil for
a clean response. Each Diagnostic carries a Type, a Timestamp, an optional
Message, and structured Details.
for _, d := range response.Diagnostics {
	if d.Type == llm.DiagnosticToolArgumentsRecovered {
		log.Printf("recovered tool arguments: mode=%v call=%v",
			d.Details["mode"], d.Details["toolCallId"])
	}
}
Inspect diagnostics before executing a tool with side effects; see stream diagnostics for the recovery modes.
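One way to act on this is to collect the IDs of tool calls whose arguments were recovered and require confirmation before running them. The sketch below uses stand-in types: the Diagnostic struct, the constant's string value, the "repaired" mode value, the recoveredCallIDs helper, and the assumption that Details["toolCallId"] holds a string are all hypothetical.

```go
package main

import "fmt"

// Diagnostic is a stand-in with only the fields this example needs.
type Diagnostic struct {
	Type    string
	Details map[string]interface{}
}

// The string value is assumed; the real constant lives in the llm package.
const DiagnosticToolArgumentsRecovered = "toolArgumentsRecovered"

// recoveredCallIDs (hypothetical helper) returns the set of tool call IDs
// that have a "tool arguments recovered" diagnostic attached.
func recoveredCallIDs(diags []Diagnostic) map[string]bool {
	ids := make(map[string]bool)
	for _, d := range diags {
		if d.Type != DiagnosticToolArgumentsRecovered {
			continue
		}
		// Assumes the ID is stored as a string; other types are skipped.
		if id, ok := d.Details["toolCallId"].(string); ok {
			ids[id] = true
		}
	}
	return ids
}

func main() {
	diags := []Diagnostic{{
		Type: DiagnosticToolArgumentsRecovered,
		Details: map[string]interface{}{
			"mode":       "repaired", // placeholder value
			"toolCallId": "call_1",
		},
	}}
	flagged := recoveredCallIDs(diags)
	for _, id := range []string{"call_1", "call_2"} {
		if flagged[id] {
			fmt.Println(id, "needs confirmation before running")
		} else {
			fmt.Println(id, "can run")
		}
	}
}
```

Because Diagnostics is nil for a clean response, the helper returns an empty set in the common case and the check costs nothing.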