.. _cbook_agent_full_pipeline:

Full Production Pipeline
++++++++++++++++++++++++++

This is the capstone example, combining **all features** from the previous five
examples into a production-ready reference implementation. It demonstrates:
multi-agent DAG, HITL approval gates, memory strategies, MCP tool servers, and
real-time tracing / observability. This is your template for building complex
agentic systems on Dragon.

**Prerequisites:** Read Examples 01–05 first.

**What you'll learn:**

* How to integrate all five techniques into one system
* How to enable tracing to log every LLM call, tool invocation, approval, and
  memory operation to a structured JSONL file
* How to use the trace viewer to inspect and replay execution
* Best practices for production agent deployment on HPC clusters

**Architecture:**

* Node 0, GPUs [0,1]: Main inference service (large reasoning LLM)
* Node 1, GPU [0]: Dedicated summarizer LLM
* DAG: planner → runner → analyzer → reporter → save_report (function node)
* All agents have memory management, some have HITL approval, one has MCP tools
* Tracing enabled for full observability

Main Code
=========

Below is the complete example:

.. literalinclude:: ../../examples/dragon_ai/ai_agent/06_full_pipeline.py
    :language: python
    :linenos:
    :caption: **06_full_pipeline.py: All features combined with tracing**


Key Concepts
============

**Feature Integration:**

1. **Multi-agent DAG** (from 02): Four agents form a pipeline with automatic
   upstream result flow
2. **HITL Approval** (from 03): Planner and runner require human review for
   tool calls
3. **Memory Management** (from 04): Planner uses sliding window, runner uses
   summarization with dedicated small LLM
4. **MCP Tools** (from 05): Reporter can call tools from remote MCP servers
5. **Tracing** (new feature): Every operation logged to `./traces/{run_id}.jsonl`

**Trace Structure:**

Each line in the trace file is a JSON object:

.. code-block:: json

    {"timestamp": "...", "type": "llm_call", "agent": "planner", "model": "llama-70b", ...}
    {"timestamp": "...", "type": "tool_call", "agent": "planner", "tool": "propose_strategy", ...}
    {"timestamp": "...", "type": "hitl_approval", "agent": "planner", "status": "approved", ...}
    {"timestamp": "...", "type": "memory_op", "agent": "runner", "op": "summarize", ...}
    {"timestamp": "...", "type": "dag_transition", "from": "planner", "to": "runner", ...}

**Trace Viewer:**

Interactively explore the trace to understand agent reasoning, tool execution
paths, approval decisions, and memory operations in real-time.

Installation
============

See Example 01 (same dependencies).

System Description
===================

Tested on HPE Cray EX:

* **Node 0** (main inference): 2 Nvidia A100 GPUs, AMD EPYC 64-core CPU
* **Node 1** (summarizer): 1 Nvidia A100 GPU, AMD EPYC 64-core CPU
* Total: 2 compute nodes required

How to Run
==========

**Step 1: Edit configuration**

Open ``06_full_pipeline.py`` and set:

* ``MODEL_NAME`` (main reasoning model)
* ``SUMMARIZER_MODEL_NAME`` (small model for summarization)
* MCP server URLs (if using external tools)

**Step 2: Generate auth token (if using MCP)**

.. code-block:: console

    echo "your-mcp-server-token" > token.txt

**Step 3: Allocate nodes**

.. code-block:: console

    salloc --nodes=2 --exclusive

**Step 4: Run**

.. code-block:: console

    # Without MCP:
    dragon 06_full_pipeline.py

    # With MCP (e.g., Jupyter server):
    dragon 06_full_pipeline.py http://jupyter-mcp-server:8888

**Example output:**

.. code-block:: console

    $ dragon 06_full_pipeline.py
    Starting 2-node infrastructure...
    Node 0: Main inference service (70B model) initialized
    Node 1: Summarizer service (7B model) initialized
    Agent 'planner' started
    Agent 'runner' started
    Agent 'analyzer' started
    Agent 'reporter' started (with MCP: jupyter)
    Task: 'Execute complex analysis workflow'
    Planner proposing strategy... ✓
    HITL: Approve planner's tool call? [A/R/F]: A
    Runner executing strategy... ✓
    HITL: Approve runner's tool call? [A/R/F]: A
    Analyzer reviewing results... ✓
    Reporter generating final report... ✓
    Function node: save_report() executed
    Workflow completed
    Trace written to: ./traces/20260421-163045.jsonl

**Step 5: Inspect trace (optional, in another terminal)**

.. code-block:: console

    python -m dragon.ai.agent.observability --trace-file ./traces/20260421-163045.jsonl

**HITL approval client (in another terminal, during run)**

.. code-block:: console

    python -m dragon.ai.agent.hitl_client --tcp 127.0.0.1:9000

Use to approve/reject tool calls in real-time or add feedback.

Next Steps
==========

* **Deploy to production** using this example as a blueprint
* **Customize memory strategies** for your task's characteristics
* **Add more MCP servers** for access to domain-specific tools
* **Monitor trace files** to detect failure patterns and refine agents
* **Replay traces** for debugging and testing recovery scenarios