Models and protocols¶
types.go
defines the types needed to describe a model: the protocols, the neutral options
callers set, and the Model that ties an endpoint to its capabilities and price.
Where those models live and how they are registered and retrieved is implemented
in models.go
and catalog.go.
This page covers a single model first — how it is defined, how it decodes by
protocol, and how its capabilities are queried — then how those models are stored
and retrieved together.
Neutral types¶
Several settings are small string types, each a closed set of constants. Keeping them as named types — rather than bare strings — lets the compiler catch typos and keeps the public API self-documenting.
type Protocol string // "openai-completions", "anthropic-messages"
type ModelInput string // "text", "image"
type ModelThinkingLevel string // off, minimal, low, medium, high, xhigh
type ThinkingDisplay string // summarized, omitted
Protocol names a wire protocol and, through it, the adapter that speaks it.
ModelInput names a modality; a model lists the ones it accepts in Model.Input,
and an image sent to a text-only model is downgraded rather than rejected.
ModelThinkingLevel is provider-independent. A model declares how each level maps
to its own dialect through Model.ThinkingLevelMap, which an adapter consults
when building the request. ThinkingDisplay is narrower: it does not change
whether the model reasons or what it is billed, only what comes back —
summarized returns readable thinking, omitted keeps the signature but drops
the text. Only the Anthropic protocol honors it today.
Pricing¶
ModelCost stores prices in US dollars per million tokens, split by how each
token is billed:
type ModelCost struct {
Input float64 // fresh input tokens
Output float64 // generated tokens
CacheRead float64 // tokens served from the prompt cache
CacheWrite float64 // tokens written into the prompt cache
}
The four categories line up with the Usage counters on a response.
CalculateCost works from that: prices are per million tokens, so each category
costs price ÷ 1,000,000 × that category's tokens, and the four sum to the total.
Cache reads and writes are priced apart from fresh input because providers bill
them differently.
The Model¶
Model is grouped into four concerns. The comments in the source mark the
boundaries:
type Model struct {
// Identity
ID, Name, Provider string
// Routing
Protocol Protocol
BaseURL string
Headers map[string]string
// Capabilities
Reasoning bool
ThinkingLevelMap map[ModelThinkingLevel]*string
Input []ModelInput
ContextWindow int64
MaxTokens int64
// Pricing and per-provider quirks
Cost ModelCost
Compatibility ModelCompatibility
}
Protocol is the routing discriminator: Client.Stream uses it to pick an
adapter. BaseURL and Headers are what let a compatible vendor reuse a
protocol — point the base URL at the vendor's endpoint, add any required headers,
and the same adapter serves it. ContextWindow is the total token budget,
MaxTokens the cap on generation; both feed the request and the
overflow check.
ThinkingLevelMap uses a pointer value on purpose. A nil marks a level as
unsupported; a missing key falls back to the provider default. The two cases are
distinct, and a pointer is what lets the map express both — a plain string
could not tell "explicitly off" from "not configured."
Vendor compatibility¶
Vendors that implement the same protocol still differ in small ways, and those
differences live on a per-protocol compatibility struct (Model.Compatibility).
It is an optional set of overrides: left empty, the adapter follows the reference
behavior throughout; a vendor fills in fields only where it actually deviates.
The Anthropic side is short, because most Anthropic-compatible vendors need no overrides at all:
type AnthropicMessagesCompatibility struct {
SupportsTemperature *bool
SupportsCacheControl *bool
SupportsCacheControlTools *bool
ForceAdaptiveThinking *bool
AllowEmptySignature *bool
}
The OpenAI side carries more, because "OpenAI-compatible" covers a wide range of endpoints:
type OpenAICompletionsCompatibility struct {
SupportsStore *bool
SupportsDeveloperRole *bool
SupportsReasoningEffort *bool
MaxTokensField string // "max_tokens" vs "max_completion_tokens"
SupportsStrictMode *bool
RequiresThinkingAsText *bool // send thinking as a leading text block
ThinkingFormat string
// ... and a few more
}
The booleans are pointers for a reason. A plain bool has two states and cannot
tell "the vendor explicitly does not support this" from "unspecified, use the
default." A *bool has three: true, false, and nil — and nil is the
default path. The string fields (MaxTokensField, ThinkingFormat) name a
variant directly, with the empty string meaning "use the reference behavior."
Decoding by protocol¶
Both compatibility structs satisfy one interface, whose single method reports which protocol the configuration describes:
This keeps Model independent of any one protocol. The cost is that the compat
field has an interface type, and JSON carries no tag for which concrete struct it
holds — so decoding has to choose. Model.UnmarshalJSON makes that choice with
Protocol as the discriminator:
- The
modelAliastype drops theUnmarshalJSONmethod, so unmarshalling into it does not recurse back into this function.compatis held back asjson.RawMessageto decode in a second pass. Protocolwas already decoded by the first pass, so it is available to select the concrete compatibility type.
The field that drives routing at request time is the same field that selects the type at decode time. A model serializes to JSON and restores without a separate type tag, because its protocol already carries that information. This is the runtime equivalent of a type that would be conditional on the protocol at compile time in a language with that feature.
Adapting the thinking level¶
Callers use a neutral ModelThinkingLevel, but models differ in which levels they
support. Two functions bring a requested level down to what the target model
actually accepts.
SupportedThinkingLevels gives the set of levels a model accepts: a non-reasoning
model supports only off; a reasoning model enumerates them in off → minimal →
low → medium → high → xhigh order, where a level explicitly mapped to nil in
ThinkingLevelMap counts as unsupported, and xhigh is included only when
explicitly mapped — the top level stays closed by default unless the model
declares it.
ClampThinkingLevel snaps any requested level to the nearest supported one, in
this order: take it directly if supported; otherwise step up through the order to
the first supported level; failing that, step down instead; and finally fall back
to the lowest supported level (or off). A caller can thus request any level and
always get one the target model can handle, without comparing each model's
capability table itself.
The registry¶
ModelRegistry is the storage and lookup hub for models — built-in models are
registered here at startup, and every runtime query goes through it. It has just
two fields: a read-write lock and a two-level "provider → model ID → model" map:
type ModelRegistry struct {
mu sync.RWMutex
models map[string]map[string]Model // provider → model ID → Model
}
The outer key is the provider, the inner key a model ID under it. Providers are grouped independently of protocol — models under the same provider may use different protocols and still nest side by side:
models ┌─ "anthropic" ─┬─ "claude-opus-4-8" → Model{protocol: anthropic-messages}
│ └─ "claude-sonnet-4-6" → Model{protocol: anthropic-messages}
│
└─ "deepseek" ──┬─ "deepseek-v4-flash" → Model{protocol: openai-completions}
└─ "deepseek-v4-pro" → Model{protocol: openai-completions}
The sync.RWMutex guards every read and write of models — queries take a read
lock, registrations a write lock — so the registry can be shared across goroutines
without external locking.
Registration and validation¶
Register writes a model, but runs a fixed sequence before it lands in the table:
- Non-empty check — provider, ID, and protocol must all be present; if any is empty it returns an error and registration stops.
- Compatibility check — if the model carries a compatibility configuration
(
Compatibility),validateModelCompatibilitychecks it further: the concrete type has to be one of the known protocol compatibility structs, and the protocol it declares has to match the model's ownProtocol. In other words, a model markedanthropic-messagescannot carry an OpenAI compatibility configuration — such a mismatch is caught at registration time rather than surfacing later at request time. - Acquire the write lock —
mu.Lockmakes concurrent registrations mutually exclusive. - Lazily create the sub-map — if the provider's inner map does not yet exist, it is created first.
- Store a deep copy — what is written is
cloneModel(model), not the caller's original; registering the same ID again under a provider overwrites the existing entry.
All validation comes before the lock and the write: an invalid model is rejected before the registry is touched, so no half-built state is left behind.
Retrieving a model¶
Lookup comes at two levels. A registry instance has three methods, working against any registry:
func (r *ModelRegistry) Get(provider, modelID string) (Model, bool)
func (r *ModelRegistry) Providers() []string
func (r *ModelRegistry) Models(provider string) []Model
Getfetches a single model by provider and model ID: on a hit it returns the model andtrue, on a miss the zero value andfalse.Providerslists every provider ID in the registry, lexically sorted.Modelslists every model under one provider, sorted by model ID.
All three return sorted results, so iteration order is stable and reproducible.
The package-level functions wrap those three methods over the built-in registry
builtInModelRegistry, which most callers use directly — sparing them from holding
a registry instance of their own:
func LookupModel(provider, modelID string) (Model, bool) // Get; false when missing
func GetModel(provider, modelID string) Model // Get; panics when missing
func GetProviders() []string // Providers
func GetModels(provider string) []Model // Models
Fetching a single model comes in two forms, LookupModel and GetModel,
differing only in how they handle a miss: GetModel suits identifiers hard-coded
in source that ought to exist, where a miss is a program error; LookupModel suits
identifiers from config or external input, where the caller must handle a miss
itself. GetProviders and GetModels pass straight through to Providers and
Models with the same semantics.
Whichever entry point is used, what comes back is a deep copy. A Model holds
slices, maps, and pointer fields; handing back the table's original would let a
caller's edits reach other holders, so an independent copy is returned to rule out
that implicit coupling.
Built-in models¶
The built-in models in that registry are not fetched at runtime; they ship with
the binary.
catalog.generated.json
is produced by go generate (cmd/genmodels) from upstream catalog data —
Models.dev as the primary source, plus the live catalogs and
pricing from OpenRouter and Vercel AI Gateway — and emits only models whose
protocol the package implements (openai-completions and anthropic-messages).
The result is committed alongside the source and embedded into the binary with
//go:embed:
So neither the build nor startup depends on the network or the working directory.
The registry is populated during program startup: builtInModelRegistry is a
package-level variable initialized before main — it first decodes the embedded
JSON, then Registers each model into the table:
var builtInModelRegistry = newBuiltInModelRegistry()
func builtInModels() []Model {
var models []Model
if err := json.Unmarshal(generatedCatalogJSON, &models); err != nil {
panic(...) // a broken catalog means there is nothing to start from
}
return models
}
A decode or registration failure panics rather than returns an error. The embedded catalog is a build-time artifact: at runtime it is either intact or the build itself is broken, with no degraded state to run in between — so the program is made to fail early.
Source: types.go,
catalog.go,
and models.go.