Context Window

What is it?

In AI systems, the context window is the maximum amount of text, measured in tokens, that an LLM can consider at once when generating a response. It acts as the model’s “working memory” for a single interaction and includes the current prompt, the conversation history, and any injected documents or retrieved information. If the total input exceeds this limit, the oldest information is typically truncated and drops out of the model’s immediate awareness.
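The truncation behavior above can be sketched in a few lines. This is a minimal illustration, assuming a 4,000-token window and a crude whitespace “tokenizer”; a real system would use the model’s own tokenizer.

```python
# Sketch: fitting a conversation into a fixed context window.
# Assumption: one token per whitespace-separated word (real tokenizers differ).

CONTEXT_WINDOW = 4000

def count_tokens(text: str) -> int:
    # Stand-in for a real tokenizer.
    return len(text.split())

def fit_to_window(system_prompt: str, history: list[str],
                  window: int = CONTEXT_WINDOW) -> list[str]:
    """Keep the system prompt plus as many *recent* turns as fit; drop the oldest."""
    budget = window - count_tokens(system_prompt)
    kept: list[str] = []
    for turn in reversed(history):   # walk from newest to oldest
        cost = count_tokens(turn)
        if cost > budget:
            break                    # older turns are truncated and "forgotten"
        kept.append(turn)
        budget -= cost
    return [system_prompt] + list(reversed(kept))
```

Note that the newest turns are kept and the oldest are discarded, mirroring how truncation erodes the start of a long conversation first.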

Long-term memory complements the context window: it usually involves storing past interactions, facts, or summaries in a database and retrieving them when relevant. It bridges the gap between the fleeting “now” of the context window and the permanent knowledge baked into the model’s training weights.

Why is it Important?

  • Continuity: Allows an AI to maintain a consistent persona or remember user preferences across sessions, whether days or years apart.
  • Personalization: The AI learns about the specific user, not just generic knowledge.
  • Cost Efficiency: Instead of feeding the entire conversation history (which is expensive and limited) into every prompt, we only retrieve relevant memories.
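The cost-efficiency point can be illustrated with a tiny retrieval sketch. This is an assumption-laden toy: relevance is scored by simple word overlap, whereas production systems typically use embedding similarity over a vector store.

```python
import re

def words(text: str) -> set[str]:
    # Lowercase and strip punctuation so "Rust." matches "rust".
    return set(re.findall(r"[a-z']+", text.lower()))

def score(memory: str, query: str) -> int:
    # Toy relevance score: number of shared words.
    return len(words(memory) & words(query))

def retrieve(memories: list[str], query: str, k: int = 2) -> list[str]:
    """Return the top-k stored memories most relevant to the current query."""
    ranked = sorted(memories, key=lambda m: score(m, query), reverse=True)
    return [m for m in ranked[:k] if score(m, query) > 0]

memories = [
    "User prefers concise answers.",
    "User's favorite language is Rust.",
    "User is allergic to peanuts.",
]
print(retrieve(memories, "what is my favorite programming language", k=1))
```

Only the matching memory is injected into the prompt, instead of replaying the entire history on every turn.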

Technical View

Memory is often tiered: short-term memory lives in the context window itself, while long-term memory is persisted in a vector store or database and retrieved on demand.
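The tiering can be sketched as follows. This is a minimal model assuming a fixed-size short-term buffer and a naive keyword-searchable long-term list; a real system would persist the long-term tier in a vector database and search it by embedding similarity.

```python
from collections import deque

class TieredMemory:
    def __init__(self, short_term_size: int = 4):
        self.short_term = deque(maxlen=short_term_size)  # recent turns (context tier)
        self.long_term: list[str] = []                   # archived turns (persistent tier)

    def add(self, turn: str) -> None:
        if len(self.short_term) == self.short_term.maxlen:
            # The oldest turn is about to fall out of the context tier;
            # archive it to the long-term store before it is lost.
            self.long_term.append(self.short_term[0])
        self.short_term.append(turn)

    def recall(self, query: str) -> list[str]:
        # Naive keyword match over the long-term store.
        q = set(query.lower().split())
        return [m for m in self.long_term if q & set(m.lower().split())]
```

The key design choice is the hand-off: turns evicted from the short-term tier are archived rather than discarded, so they remain recallable later.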
