Agent Framework

The Dragon AI Agent Framework provides a multi-agent orchestration system for executing LLM-powered DAG workflows on HPC clusters. It combines Dragon’s distributed memory, process management, and communication objects with a structured-output agentic loop.

For architecture and design details, see Agent Framework — Developer Guide. For a hands-on tutorial, see Dragon AI Agent Framework — User Guide.

Python Reference

Core Agent Classes

Abstract base class for persistent agents and the concrete SubAgent implementation with LLM tool-calling support.

DragonAgent

Stateless, persistent base agent running on a node.

`SubAgent`	A stateless sub-agent that processes tasks via LLM reasoning + tools.
`create_sub_agent`(config[, tool_registry, ...])	Entry point for a sub-agent process.

Configuration

Dataclasses for agent identity, orchestrator settings, MCP server connections, pipeline DAG definitions, memory strategies, and dispatch metadata.

`AgentConfig`	Immutable configuration for a single Dragon agent.
`MCPServerConfig`	Configuration for a single MCP server connection.
`OrchestratorConfig`	Top-level configuration for `DAGOrchestrator`.

`Pipeline`	User-configurable DAG topology for a multi-agent workflow.
`PipelineNode`	A single node in a user-defined pipeline DAG.
`TaskResult`	Lightweight token passed between Dragon Batch node functions.
`TaskStatus`	Canonical status values for agents (written to DDict) and TaskResult tokens.

`MemoryConfig`	Configures how an agent manages its conversation history (work memory).
`MemoryStrategy`	Memory management strategy for the agentic tool-call loop.

DispatchHeader

Typed header sent by the batch dispatcher to an agent's input queue.

Tool System

Abstract tool interface, automatic callable-to-tool wrapping, a registry with decorator support, and MCP server integration with scoped tool naming.

BaseTool

Base class for all tools available to Dragon agents.

FunctionTool

Wrap any plain Python callable as a BaseTool.

ToolRegistry

Registry of BaseTool instances available to an agent.

MCPServerClient

Client that owns one persistent connection to a single MCP server.

Reasoning

The structured-output agentic loop and Pydantic models for LLM response parsing.

ToolDispatcher

Drives the structured-output agentic loop for a single LLM engine.

`ResponseModel`	Top-level union that forces the LLM to choose between a tool request and a final answer on every turn.
`ToolCall`
`ToolRequest`
`FinalResponse`

Orchestration

DAG-based workflow execution using Dragon Batch, with automatic dependency resolution and lifecycle management.

DAGOrchestrator

Build and execute multi-agent DAG workflows on Dragon Batch.

make_dispatcher_fn(agent_queue, node, ...)

Create a batch-compatible dispatcher closure for node.

Memory

Conversation history management with configurable pruning and summarization strategies.

ContextManager

Turn-based memory manager for the ToolDispatcher agentic loop.

Human-in-the-Loop

Approval gate for tool calls requiring human oversight, with a TCP bridge for external access from outside the Dragon runtime.

request_human_approval(ddict, hitl_queue, ...)

Pause the current coroutine until a human operator approves or rejects.

`HumanApprovalRequest`	Payload sent to the HITL client when an agent requests human approval.
`HumanApprovalResponse`	Payload sent back from the HITL client after the operator decides.

HitlTcpBridge

Bridge between the intra-runtime Dragon HITL Queue and an external TCP client.

Observability

Span-based tracing with DDict-backed storage, TCP streaming to external viewers, and a rich terminal UI.

DictTracingProcessor

Write trace spans to a Dragon Distributed Dictionary.

`TracingProcessor`	Abstract base for trace backends.
`Span`	A single timed operation in the trace tree.
`Trace`	Top-level container for a pipeline run trace.

`SpanKind`	Classification of trace span types.
`MsgType`	Message type verbs for the TCP bridge ↔ trace viewer protocol.

TraceTcpBridge

Bridge between intra-runtime DDict trace data and external TCP viewer.

Communication

Abstract communication protocol and the Dragon Queue-based implementation used for inter-process messaging.

CommunicationProtocol

Abstract communication protocol used by Dragon agents.

DragonQueueProtocol

Communication protocol backed by Dragon Queue and Dragon Distributed Dictionary.

Message

A message sent between dispatchers and agents.

DDict Access

Typed wrapper over raw DDict key operations for structured reads and writes.

DDictAccessor

Thin, error-handled wrapper around a raw Dragon DDict instance.

Errors

Structured exception hierarchy for agent errors, tool failures, and observability warnings.

`AgentError`	Base exception for all Dragon AI Agent internal errors.
`ToolExecutionError`	A tool call (MCP or local) raised an exception.
`AgentLoopError`	The agentic tool-calling loop failed.
`HITLBridgeError`	HITL TCP bridge communication failure.
`CompletionSignalError`	completion_event.set() failed — the batch dispatcher will hang.
`AgentObservabilityWarning`	Non-fatal warning for DDict/trace write failures.