
Context window

The maximum number of tokens a model can process in one request.

Frontier models in 2026 commonly support 200k-2M token context windows. Larger windows do not eliminate the need for retrieval — quality of long-context recall degrades meaningfully past 50-100k tokens, costs scale linearly, and latency scales worse than linearly. Treat context window as a soft constraint, not a license to dump everything.
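The "soft constraint" framing can be made concrete with a token budget: rank retrieved content by priority and stop adding it once the budget is spent, instead of filling the whole window. A minimal sketch, assuming a rough heuristic of ~4 characters per token (real counts come from the model's tokenizer; the function names here are illustrative, not a library API):

```python
def approx_tokens(text: str) -> int:
    # Assumption: ~4 characters per token, a common rule of thumb
    # for English text. Use the model's tokenizer for exact counts.
    return max(1, len(text) // 4)

def fit_to_budget(chunks: list[str], budget_tokens: int) -> list[str]:
    """Keep chunks in priority order until the token budget is exhausted,
    rather than dumping everything into the context window."""
    kept, used = [], 0
    for chunk in chunks:
        cost = approx_tokens(chunk)
        if used + cost > budget_tokens:
            break  # budget spent; drop remaining lower-priority chunks
        kept.append(chunk)
        used += cost
    return kept

# Three 400-character chunks (~100 tokens each) against a 250-token budget:
# only the two highest-priority chunks fit.
chunks = ["a" * 400, "b" * 400, "c" * 400]
print(len(fit_to_budget(chunks, 250)))  # → 2
```

Budgeting this way keeps prompt cost and latency predictable even as the model's advertised window grows.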

