dragon.ai.inference.batching.BatchItem
- class BatchItem[source]
Bases:
objectA single item to be batched.
Handles both plain-text prompts and chat-format conversations. Chat items carry optional
tools,json_schema_override, andcontinue_final_message; plain-text items leave them at their defaults.- __init__(user_prompt: str | List [Dict [str , Any ]], formatted_prompt: str | List [Dict [str , Any ]], response_queue: Queue, latency_metrics: Tuple [float , float , float ], tools: List [Dict [str , Any ]] | None = None, json_schema_override: dict | None = None, continue_final_message: bool = False)[source]
Initialize a BatchItem instance.
- Parameters:
user_prompt (str ) – Raw user input.
formatted_prompt (str ) – Formatted prompt with system instructions.
response_queue (dragon.native.Queue) – dragon.native.Queue used to send the response.
latency_metrics (tuple [float , float , float ]) – Tuple of (entry_time, cpu_latency, guard_latency).
tools (list [dict [str , Any]] | None) – Optional tool definitions for chat requests.
json_schema_override (dict | None) – Per-request JSON schema for guided decoding.
continue_final_message (bool ) – Whether to continue the final assistant message instead of adding a generation prompt.
Methods
__init__(user_prompt, formatted_prompt, ...)Initialize a BatchItem instance.
- __init__(user_prompt: str | List [Dict [str , Any ]], formatted_prompt: str | List [Dict [str , Any ]], response_queue: Queue, latency_metrics: Tuple [float , float , float ], tools: List [Dict [str , Any ]] | None = None, json_schema_override: dict | None = None, continue_final_message: bool = False)[source]
Initialize a BatchItem instance.
- Parameters:
user_prompt (str ) – Raw user input.
formatted_prompt (str ) – Formatted prompt with system instructions.
response_queue (dragon.native.Queue) – dragon.native.Queue used to send the response.
latency_metrics (tuple [float , float , float ]) – Tuple of (entry_time, cpu_latency, guard_latency).
tools (list [dict [str , Any]] | None) – Optional tool definitions for chat requests.
json_schema_override (dict | None) – Per-request JSON schema for guided decoding.
continue_final_message (bool ) – Whether to continue the final assistant message instead of adding a generation prompt.