dragon.ai.inference.batching.BatchItem

class BatchItem[source]

Bases: object

A single item to be batched.

Handles both plain-text prompts and chat-format conversations. Chat items carry optional tools, json_schema_override, and continue_final_message; plain-text items leave them at their defaults.

__init__(user_prompt: str | List [Dict [str , Any ]], formatted_prompt: str | List [Dict [str , Any ]], response_queue: Queue, latency_metrics: Tuple [float , float , float ], tools: List [Dict [str , Any ]] | None = None, json_schema_override: dict | None = None, continue_final_message: bool = False)[source]

Initialize a BatchItem instance.

Parameters:
  • user_prompt (str ) – Raw user input.

  • formatted_prompt (str ) – Formatted prompt with system instructions.

  • response_queue (dragon.native.Queue) – dragon.native.Queue used to send the response.

  • latency_metrics (tuple [float , float , float ]) – Tuple of (entry_time, cpu_latency, guard_latency).

  • tools (list [dict [str , Any]] | None) – Optional tool definitions for chat requests.

  • json_schema_override (dict | None) – Per-request JSON schema for guided decoding.

  • continue_final_message (bool ) – Whether to continue the final assistant message instead of adding a generation prompt.

Methods

__init__(user_prompt, formatted_prompt, ...)

Initialize a BatchItem instance.

__init__(user_prompt: str | List [Dict [str , Any ]], formatted_prompt: str | List [Dict [str , Any ]], response_queue: Queue, latency_metrics: Tuple [float , float , float ], tools: List [Dict [str , Any ]] | None = None, json_schema_override: dict | None = None, continue_final_message: bool = False)[source]

Initialize a BatchItem instance.

Parameters:
  • user_prompt (str ) – Raw user input.

  • formatted_prompt (str ) – Formatted prompt with system instructions.

  • response_queue (dragon.native.Queue) – dragon.native.Queue used to send the response.

  • latency_metrics (tuple [float , float , float ]) – Tuple of (entry_time, cpu_latency, guard_latency).

  • tools (list [dict [str , Any]] | None) – Optional tool definitions for chat requests.

  • json_schema_override (dict | None) – Per-request JSON schema for guided decoding.

  • continue_final_message (bool ) – Whether to continue the final assistant message instead of adding a generation prompt.