dragon.ai.inference.batching.Batch

class Batch

Bases: object

A collection of items to be processed together.

__init__(items: List[BatchItem], batch_id: int, created_at: float)

Initialize a Batch instance.

Parameters:

items (list[BatchItem]) – List of BatchItem instances.
batch_id (int) – Unique identifier for the batch.
created_at (float) – Timestamp when the batch was created.
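
The constructor arguments above can be illustrated with a minimal sketch. `SketchItem` and `SketchBatch` are hypothetical stand-ins, not the Dragon classes: the field names on `SketchItem` are assumptions chosen to mirror the attribute names on this page, and using `time.time()` for `created_at` is also an assumption, since the reference only says the value is a float timestamp.

```python
import time
from dataclasses import dataclass
from typing import List


@dataclass
class SketchItem:
    """Hypothetical stand-in for BatchItem; field names are assumptions."""
    user_prompt: str
    formatted_prompt: str


@dataclass
class SketchBatch:
    """Hypothetical stand-in mirroring the documented constructor arguments."""
    items: List[SketchItem]
    batch_id: int
    created_at: float

    @property
    def size(self) -> int:
        # Number of items in the batch.
        return len(self.items)

    @property
    def user_prompts(self) -> List[str]:
        # One entry per item, in item order.
        return [item.user_prompt for item in self.items]


items = [
    SketchItem("What is 2 + 2?", "<user>What is 2 + 2?</user>"),
    SketchItem("Name a prime.", "<user>Name a prime.</user>"),
]
# created_at is taken from the wall clock here (an assumption).
batch = SketchBatch(items=items, batch_id=0, created_at=time.time())
print(batch.size, batch.user_prompts)
```

In this sketch every list-valued property is derived from `items` in order, so index `i` in one list refers to the same request as index `i` in any other.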

Methods

__init__(items, batch_id, created_at)

Initialize a Batch instance.

Attributes

`continue_final_message_list`	Extract chat-template continuation flags from the batch.
`formatted_prompts`	Extract formatted prompts from batch items.
`json_schema_list`	Extract per-request JSON schema overrides from the batch.
`latency_metrics`	Extract latency metrics from batch items.
`response_queues`	Extract response queues from batch items.
`size`	Get the batch size.
`tools_list`	Extract per-request tool definitions from the batch.
`user_prompts`	Extract user prompts from batch items.

property size: int 

Get the batch size.

Returns: Number of items in the batch.
Return type: int

property user_prompts: List[str]

Extract user prompts from batch items.

Returns: List of user prompts.
Return type: list[str]

property formatted_prompts: List[str]

Extract formatted prompts from batch items.

Returns: List of formatted prompts.
Return type: list[str]

property response_queues: List[Queue]

Extract response queues from batch items.

Returns: List of response queues.
Return type: list[dragon.native.Queue]

property latency_metrics: List[Tuple[float, float, float]]

Extract latency metrics from batch items.

Returns: List of latency metric tuples.
Return type: list[tuple[float, float, float]]

property tools_list: List[List[Dict[str, Any]] | None]

Extract per-request tool definitions from the batch.

Chat requests may carry OpenAI-style tool schemas. The batch preserves one entry per request so the LLM process can apply each request’s tool definitions when formatting chat messages.

Returns: One entry per request: a list of tool definitions, or None for a request without tools.
Return type: list[list[dict] | None]
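
To make the shape of this value concrete, here is a small sketch of what `tools_list` might hold for a batch of three requests. The tool definition follows the general OpenAI function-tool layout; the `get_weather` tool and the `tool_names` helper are made up for illustration and are not part of the Dragon API.

```python
from typing import Any, Dict, List, Optional

# An OpenAI-style function tool definition (illustrative content only).
weather_tool: Dict[str, Any] = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Hypothetical tools_list value: the first and third requests carry tools,
# the second request has none.
tools_list: List[Optional[List[Dict[str, Any]]]] = [
    [weather_tool],
    None,
    [weather_tool],
]


def tool_names(tools: Optional[List[Dict[str, Any]]]) -> List[str]:
    """Hypothetical helper: names of the tools attached to one request."""
    if tools is None:
        return []
    return [tool["function"]["name"] for tool in tools]


names_per_request = [tool_names(tools) for tools in tools_list]
print(names_per_request)
```

Keeping a `None` placeholder for tool-less requests, rather than dropping them, keeps the list aligned index-for-index with the other per-request lists.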

property json_schema_list: List[dict | None]

Extract per-request JSON schema overrides from the batch.

A schema entry enables vLLM guided or structured decoding for that request. None means the request should use free-form generation.

Returns: One entry per request: a JSON schema dictionary, or None for a free-form request.
Return type: list[dict | None]
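
As a rough illustration of how a consumer could read this list, the sketch below separates request indices into those with a schema and those without. `split_by_decoding_mode` is a hypothetical helper, not a Dragon or vLLM function, and it deliberately stops short of calling any vLLM API.

```python
from typing import List, Optional, Tuple

# Hypothetical json_schema_list value for a batch of two requests.
json_schema_list: List[Optional[dict]] = [
    {
        "type": "object",
        "properties": {"answer": {"type": "integer"}},
        "required": ["answer"],
    },
    None,  # free-form generation for this request
]


def split_by_decoding_mode(
    schemas: List[Optional[dict]],
) -> Tuple[List[int], List[int]]:
    """Return (guided, free_form) request indices. Hypothetical helper."""
    guided: List[int] = []
    free_form: List[int] = []
    for index, schema in enumerate(schemas):
        # Test against None explicitly: an empty dict is still a JSON schema.
        if schema is None:
            free_form.append(index)
        else:
            guided.append(index)
    return guided, free_form


guided, free_form = split_by_decoding_mode(json_schema_list)
print(guided, free_form)
```

The explicit `is None` check is a design choice in this sketch: `{}` is a valid JSON schema, so a truthiness test would misclassify it. Whether the library itself treats an empty schema this way is not stated in this reference.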

property continue_final_message_list: List[bool]

Extract chat-template continuation flags from the batch.

When True, the LLM process asks the tokenizer to continue the final assistant message instead of adding a new generation prompt.

Returns: One continuation flag per batch item.
Return type: list[bool]
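
The relationship between a continuation flag and the generation prompt can be sketched as follows. This assumes the tokenizer accepts the keyword names used by Hugging Face `apply_chat_template` (`continue_final_message` and `add_generation_prompt`); this reference does not name the tokenizer API, and `template_kwargs` is a hypothetical helper.

```python
from typing import Dict, List


def template_kwargs(continue_final_message: bool) -> Dict[str, bool]:
    """Build chat-template keyword arguments for one request.

    Hypothetical helper. The two options are treated as mutually exclusive:
    continuing the final assistant message means no new generation prompt
    is appended.
    """
    return {
        "continue_final_message": continue_final_message,
        "add_generation_prompt": not continue_final_message,
    }


# Hypothetical continue_final_message_list value for two requests.
continue_final_message_list: List[bool] = [False, True]

kwargs_per_request = [
    template_kwargs(flag) for flag in continue_final_message_list
]
print(kwargs_per_request)
```

Under that assumption, each dictionary could be unpacked into the chat-template call for the matching request, for example `tokenizer.apply_chat_template(messages, tokenize=False, **kwargs)`.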