dragon.ai.agent.memory.context_manager.ContextManager

class ContextManager[source]

Bases: object

Turn-based memory manager for the ToolDispatcher agentic loop.

Parameters

config:

A resolved MemoryConfig instance (never None — the caller should check MemoryConfig.resolve() first).

summarizer_engine:

Optional separate LLM engine dedicated to summarization. When provided, maybe_summarize() calls this engine instead of the main agent LLM engine, keeping the primary inference queue free for reasoning and tool-calling.

Typical HPC setup: main agent uses a 70B model on 4 GPUs; summarizer uses an 8B instruct model on 1 GPU — faster, no contention with the agentic loop.

When None (default), summarization reuses the agent’s own llm_engine — same behavior as LangChain / LlamaIndex defaults.

__init__(config: MemoryConfig, summarizer_engine: Any = None) None [source]

Methods

__init__(config[, summarizer_engine])

enforce_window(messages, num_initial)

In-place prune old tool-call turns from the message list.

maybe_summarize(messages, num_initial, ...)

Summarize old turns via the LLM if conditions are met.

should_summarize(messages, num_initial)

Cheap check: would maybe_summarize() actually trigger?

__init__(config: MemoryConfig, summarizer_engine: Any = None) None [source]
enforce_window(messages: List [Dict [str , Any ]], num_initial: int ) None [source]

In-place prune old tool-call turns from the message list.

Classifies messages into three zones:

  • Zone A (indices 0..num_initial-1): System prompt(s) and original user task. Never pruned.

  • Zone B: The most recent max_kept_turns turn-pairs at the tail. Never pruned.

  • Zone C: Everything between Zone A and Zone B — older tool-call exchanges. Replaced with a synthetic note (or summarized if strategy="summarize" — handled by maybe_summarize()).

A turn-pair is one assistant message with tool_calls plus all following role: "tool" messages until the next assistant message.

Parameters

messages:

The copy_prompts list, mutated in-place.

num_initial:

Number of messages in the protected initial zone (system prompts + user task).

should_summarize(messages: List [Dict [str , Any ]], num_initial: int ) bool [source]

Cheap check: would maybe_summarize() actually trigger?

Returns True when strategy is MemoryStrategy.SUMMARIZE and the number of pruneable turns has reached the threshold. This lets callers gate expensive trace spans so that no span is emitted when summarization is a no-op.

async maybe_summarize(messages: List [Dict [str , Any ]], num_initial: int , llm_engine: Any ) dict | None [source]

Summarize old turns via the LLM if conditions are met.

Only active when strategy=MemoryStrategy.SUMMARIZE. Triggers when the number of pruneable turns (Zone C) reaches summarize_after_turns.

When triggered:

  1. Extract Zone C messages as text.

  2. Call the LLM with a summarization prompt (single-shot, no tools).

  3. Replace Zone C messages with a summary system message.

Parameters

messages:

The copy_prompts list, mutated in-place.

num_initial:

Number of messages in the protected initial zone.

llm_engine:

The agent’s main LLM engine (fallback). Used for summarization only when no dedicated summarizer_engine was provided at construction time.

Returns

dict | None

When summarization fires, returns {"input": <context_text>, "output": <summary_text>} so callers (e.g. tracing) can record what was summarized. Returns None when summarization did not trigger.