dragon.ai.inference.llm_proxy.ResponseQueuePool

class ResponseQueuePool

Bases: object

Bounded pool of reusable, minimal Dragon Queues.

Lazily allocates up to pool_size single-slot queues (maxsize=1, with block_size defaulting to 2048) and hands them out on demand. When all queues are in use, acquire() awaits until one is returned via release(), which provides natural backpressure without an external semaphore.

Async-safe via asyncio.Queue.
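The behaviour described above can be modelled with a small stand-in. The sketch below is not the Dragon implementation: `SketchQueuePool` is a hypothetical name, and `queue.Queue(maxsize=1)` stands in for a single-slot Dragon Queue so the example runs without Dragon installed. It only illustrates the pattern of lazy allocation up to a bound, with an `asyncio.Queue` of idle items supplying the backpressure.

```python
import asyncio
import queue


class SketchQueuePool:
    """Illustrative stand-in for a bounded, lazily filled pool (assumption)."""

    def __init__(self, pool_size: int = 32) -> None:
        self._pool_size = pool_size
        self._idle: asyncio.Queue = asyncio.Queue()  # queues ready for reuse
        self._created = 0  # how many queues have been allocated so far

    @property
    def pool_available(self) -> int:
        return self._idle.qsize()

    async def acquire(self) -> queue.Queue:
        # 1. Reuse an idle queue if one is waiting.
        try:
            return self._idle.get_nowait()
        except asyncio.QueueEmpty:
            pass
        # 2. Below the bound: allocate a new queue off the event loop.
        if self._created < self._pool_size:
            self._created += 1  # reserved before awaiting, so the bound holds
            return await asyncio.to_thread(queue.Queue, 1)
        # 3. Pool exhausted: wait until release() hands a queue back.
        return await self._idle.get()

    async def release(self, q: queue.Queue) -> None:
        await self._idle.put(q)


async def demo() -> dict:
    pool = SketchQueuePool(pool_size=2)
    first = await pool.acquire()
    second = await pool.acquire()
    # Both queues are checked out, so a third acquire() has to wait.
    waiter = asyncio.create_task(pool.acquire())
    await asyncio.sleep(0.05)
    blocked = not waiter.done()
    await pool.release(first)  # unblocks the waiting acquire()
    third = await waiter
    await pool.release(second)
    return {
        "blocked_while_exhausted": blocked,
        "reused_released_queue": third is first,
        "idle_after_release": pool.pool_available,
    }
```

Because the waiting `acquire()` simply awaits the idle queue, callers slow down on their own when the pool is exhausted, which is the backpressure the description refers to.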

Parameters:
  • pool_size (int) – Maximum number of queues to keep alive.

  • block_size (int) – Block size for each pooled queue’s backing channel.

__init__(pool_size: int = 32, block_size: int = 2048) → None

Methods

  • __init__([pool_size, block_size])

  • acquire() – Return a Dragon Queue from the pool.

  • release(queue) – Return queue to the pool for reuse.

  • shutdown() – Destroy all pooled queues.

Attributes

  • pool_available – Number of idle pooled queues ready for immediate reuse.

property pool_available: int

Number of idle pooled queues ready for immediate reuse.

async acquire()

Return a Dragon Queue from the pool.

  • If an idle queue is available, it is returned immediately.

  • If the pool has not yet reached pool_size, a new queue is created (off the event loop via asyncio.to_thread).

  • If the pool is exhausted (all queues in use), awaits until a queue is returned via release().

Returns:

A Dragon Queue ready for a single get/put cycle.

Return type:

dragon.native.Queue

async release(queue) → None

Return queue to the pool for reuse.

Parameters:

queue – The Dragon Queue to release.
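Since a queue that is never released permanently shrinks the pool, it may help to pair every acquire() with a release() in a finally block. The sketch below wraps that pairing in an async context manager. `borrowed_queue` and `_StubPool` are hypothetical names introduced for illustration, and `queue.Queue` stands in for a Dragon Queue, assuming it exposes put/get in the same style.

```python
import asyncio
import queue
from contextlib import asynccontextmanager


@asynccontextmanager
async def borrowed_queue(pool):
    """Acquire a queue from `pool` and always release it afterwards."""
    q = await pool.acquire()
    try:
        yield q
    finally:
        # Runs on normal exit and on exceptions, so the queue is not leaked.
        await pool.release(q)


class _StubPool:
    """Hypothetical stand-in with the documented acquire()/release() shape."""

    def __init__(self) -> None:
        self._q = queue.Queue(maxsize=1)
        self.released = []

    async def acquire(self):
        return self._q

    async def release(self, q) -> None:
        self.released.append(q)


async def demo():
    pool = _StubPool()
    async with borrowed_queue(pool) as q:
        q.put("response")  # one put ...
        item = q.get()     # ... and one get: a single cycle
    return item, len(pool.released)


async def demo_failure():
    pool = _StubPool()
    try:
        async with borrowed_queue(pool):
            raise RuntimeError("worker failed")
    except RuntimeError:
        pass
    return len(pool.released)
```

With a real pool the put and get calls may block, so running them off the event loop (for example with asyncio.to_thread) is worth considering.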

async shutdown() → None

Destroy all pooled queues. Call once during process teardown.