dragon.ai.inference.batching.Batch
- class Batch
Bases: object
A collection of items to be processed together.
- __init__(items: List[BatchItem], batch_id: int, created_at: float)
Initialize a Batch instance.
Methods
__init__(items, batch_id, created_at)
Initialize a Batch instance.
Attributes
Per-request continue_final_message flags.
Extract formatted prompts from batch items.
Per-request JSON schema overrides for guided decoding.
Extract latency metrics from batch items.
Extract response queues from batch items.
Get the batch size.
Per-request tool definitions.
Extract user prompts from batch items.
- property response_queues: List[Queue]
Extract response queues from batch items.
- Returns:
List of response queues.
- Return type:
list[dragon.native.Queue]
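The following is a minimal sketch of how a property like response_queues could be consumed; it is not the Dragon implementation. The BatchItem fields (user_prompt, response_queue) are hypothetical, the Batch class here is a simplified stand-in that only mirrors the documented constructor signature, and the standard library queue.Queue substitutes for dragon.native.Queue so the example runs without Dragon installed.

```python
import time
from dataclasses import dataclass
from queue import Queue  # stdlib stand-in for dragon.native.Queue (assumption)
from typing import List


@dataclass
class BatchItem:
    """Hypothetical stand-in; the real BatchItem fields are not shown in this page."""
    user_prompt: str
    response_queue: Queue


class Batch:
    """Simplified stand-in that mirrors the documented constructor signature."""

    def __init__(self, items: List[BatchItem], batch_id: int, created_at: float):
        self.items = items
        self.batch_id = batch_id
        self.created_at = created_at

    @property
    def response_queues(self) -> List[Queue]:
        # One queue per item, in the same order as the items.
        return [item.response_queue for item in self.items]


# Build a batch of three items, each with its own response queue.
items = [BatchItem(f"prompt {i}", Queue()) for i in range(3)]
batch = Batch(items, batch_id=0, created_at=time.time())

# Route one reply to each request through its queue.
for i, q in enumerate(batch.response_queues):
    q.put(f"reply {i}")

# Each item reads its own reply back from its own queue.
replies = [item.response_queue.get() for item in batch.items]
```

Because the queues are returned in item order, a worker can pair the i-th model output with the i-th queue without any extra bookkeeping.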
- property latency_metrics: List[Tuple[float, float, float]]
Extract latency metrics from batch items.
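As an illustration of working with a value of this shape, the sketch below averages each position of the three-float tuples across a batch. This page does not say what the three floats represent, so the helper treats them as opaque columns; column_means and the sample values are hypothetical and are not part of the Dragon API.

```python
from typing import List, Tuple

LatencyTriple = Tuple[float, float, float]


def column_means(metrics: List[LatencyTriple]) -> LatencyTriple:
    """Average each of the three positions across all items (hypothetical helper)."""
    if not metrics:
        raise ValueError("metrics must not be empty")
    n = len(metrics)
    # zip(*metrics) transposes the list of triples into three columns.
    first, second, third = zip(*metrics)
    return (sum(first) / n, sum(second) / n, sum(third) / n)


# Made-up sample shaped like List[Tuple[float, float, float]].
sample = [(1.0, 2.0, 4.0), (3.0, 4.0, 8.0)]
means = column_means(sample)
```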