Agent Framework

The Dragon AI Agent Framework provides a multi-agent orchestration system for executing LLM-powered DAG workflows on HPC clusters. It combines Dragon’s distributed memory, process management, and communication objects with a structured-output agentic loop.

For architecture and design details, see Agent Framework — Developer Guide. For a hands-on tutorial, see Dragon AI Agent Framework — User Guide.

Python Reference

Core Agent Classes

Abstract base class for persistent agents and the concrete SubAgent implementation with LLM tool-calling support.

DragonAgent

Stateless, persistent base agent running on a node.

SubAgent

A stateless sub-agent that processes tasks via LLM reasoning + tools.

create_sub_agent(config[, tool_registry, ...])

Entry point for a sub-agent process.

Configuration

Dataclasses for agent identity, orchestrator settings, MCP server connections, pipeline DAG definitions, memory strategies, and dispatch metadata.

AgentConfig

Immutable configuration for a single Dragon agent.

MCPServerConfig

Configuration for a single MCP server connection.

OrchestratorConfig

Top-level configuration for DAGOrchestrator.

Pipeline

User-configurable DAG topology for a multi-agent workflow.

PipelineNode

A single node in a user-defined pipeline DAG.

TaskResult

Lightweight token passed between Dragon Batch node functions.

TaskStatus

Canonical status values for agents (written to DDict) and TaskResult tokens.

MemoryConfig

Configures how an agent manages its conversation history (work memory).

MemoryStrategy

Memory management strategy for the agentic tool-call loop.

DispatchHeader

Typed header sent by the batch dispatcher to an agent's input queue.

Tool System

Abstract tool interface, automatic callable-to-tool wrapping, a registry with decorator support, and MCP server integration with scoped tool naming.

BaseTool

Base class for all tools available to Dragon agents.

FunctionTool

Wrap any plain Python callable as a BaseTool.

ToolRegistry

Registry of BaseTool instances available to an agent.

MCPServerClient

Client that owns one persistent connection to a single MCP server.

Reasoning

The structured-output agentic loop and Pydantic models for LLM response parsing.

ToolDispatcher

Drives the structured-output agentic loop for a single LLM engine.

ResponseModel

Top-level union that forces the LLM to choose between a tool request and a final answer on every turn.

ToolCall

ToolRequest

FinalResponse

Orchestration

DAG-based workflow execution using Dragon Batch, with automatic dependency resolution and lifecycle management.

DAGOrchestrator

Build and execute multi-agent DAG workflows on Dragon Batch.

make_dispatcher_fn(agent_queue, node, ...)

Create a batch-compatible dispatcher closure for node.

Memory

Conversation history management with configurable pruning and summarization strategies.

ContextManager

Turn-based memory manager for the ToolDispatcher agentic loop.

Human-in-the-Loop

Approval gate for tool calls requiring human oversight, with a TCP bridge for external access from outside the Dragon runtime.

request_human_approval(ddict, hitl_queue, ...)

Pause the current coroutine until a human operator approves or rejects.

HumanApprovalRequest

Payload sent to the HITL client when an agent requests human approval.

HumanApprovalResponse

Payload sent back from the HITL client after the operator decides.

HitlTcpBridge

Bridge between the intra-runtime Dragon HITL Queue and an external TCP client.

Observability

Span-based tracing with DDict-backed storage, TCP streaming to external viewers, and a rich terminal UI.

DictTracingProcessor

Write trace spans to a Dragon Distributed Dictionary.

TracingProcessor

Abstract base for trace backends.

Span

A single timed operation in the trace tree.

Trace

Top-level container for a pipeline run trace.

SpanKind

Classification of trace span types.

MsgType

Message type verbs for the TCP bridge ↔ trace viewer protocol.

TraceTcpBridge

Bridge between intra-runtime DDict trace data and external TCP viewer.

Communication

Abstract communication protocol and the Dragon Queue-based implementation used for inter-process messaging.

CommunicationProtocol

Abstract communication protocol used by Dragon agents.

DragonQueueProtocol

Communication protocol backed by Dragon Queue and Dragon Distributed Dictionary.

Message

A message sent between dispatchers and agents.

DDict Access

Typed wrapper over raw DDict key operations for structured reads and writes.

DDictAccessor

Thin, error-handled wrapper around a raw Dragon DDict instance.

Errors

Structured exception hierarchy for agent errors, tool failures, and observability warnings.

AgentError

Base exception for all Dragon AI Agent internal errors.

ToolExecutionError

A tool call (MCP or local) raised an exception.

AgentLoopError

The agentic tool-calling loop failed.

HITLBridgeError

HITL TCP bridge communication failure.

CompletionSignalError

completion_event.set() failed — the batch dispatcher will hang.

AgentObservabilityWarning

Non-fatal warning for DDict/trace write failures.