dragon.native.pool.Pool

class Pool

Bases: object

A Dragon native Pool relying on native Process Group and Queues

The interface resembles the Python multiprocessing.Pool interface and focuses on the most generic functionality. The asynchronous API is supported directly; the synchronous versions of these calls are obtained by calling get on the objects returned from the asynchronous functions. By using a Dragon native Process Group to coordinate the worker processes, this implementation addresses scalability limitations of the patched base implementation of multiprocessing.Pool.

At this time, signal.SIGUSR2 is used to signal workers to shut down. Users should not use or register a handler for this signal if they want to use close to shut down the pool. Using close guarantees that all work submitted to the pool is finished before the signal is sent, while terminate sends the signal immediately. If workers do not exit promptly, terminate sends escalating signals (SIGINT, SIGTERM, then SIGKILL), waiting patience time for processes to exit before sending the next signal. The user is expected to call join following either of these calls. If join is not called, zombie processes may remain and leave the runtime in a corrupted state.
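A minimal end-to-end sketch of this lifecycle is shown below. It assumes the script is launched under the Dragon runtime (e.g. via the dragon launcher); the square function is illustrative only, and only methods documented on this page are used.

    from dragon.native.pool import Pool

    def square(x):
        return x * x

    if __name__ == "__main__":
        pool = Pool(processes=4)
        try:
            # asynchronous submission; the synchronous result comes from get()
            result = pool.map_async(square, range(8))
            print(result.get())   # blocks until all work has completed
        finally:
            pool.close()          # finish outstanding work, then signal workers
            pool.join()           # always join to avoid zombie processes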

__init__(processes: int = None, initializer: callable = None, initargs: tuple = (), maxtasksperchild: int = None, *, policy: Policy = None, processes_per_policy: int = None)

Init method

Parameters:
  • processes (int , optional) – number of worker processes, defaults to None

  • initializer (callable, optional) – initializer function, defaults to None

  • initargs (tuple , optional) – arguments for initializer function, defaults to ()

  • maxtasksperchild (int , optional) – maximum tasks each worker will perform, defaults to None

  • policy (dragon.infrastructure.policy.Policy or list of dragon.infrastructure.policy.Policy) – determines the placement and resources of processes. If a list of policies is given, then processes_per_policy must be specified.

  • processes_per_policy (int ) – determines the number of processes to be placed with a specific policy if a list of policies is provided

Raises:

ValueError – raised if number of worker processes is less than 1
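A construction sketch using the optional worker initializer follows. setup_worker and its argument are illustrative stand-ins for user code, and the assumption that the initializer runs once in each worker mirrors the multiprocessing.Pool convention.

    from dragon.native.pool import Pool

    def setup_worker(log_level):
        # illustrative initializer, assumed to run once in every worker
        import logging
        logging.basicConfig(level=log_level)

    if __name__ == "__main__":
        pool = Pool(
            processes=8,               # number of worker processes
            initializer=setup_worker,  # initializer function
            initargs=("INFO",),        # arguments for the initializer
            maxtasksperchild=100,      # maximum tasks each worker will perform
        )
        pool.close()
        pool.join()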

Methods

__init__([processes, initializer, initargs, ...])

Init method

apply_async(func[, args, kwds, callback, ...])

Equivalent to calling func(*args, **kwds) in a non-blocking way.

close()

This method starts a thread that waits for all submitted jobs to finish.

join([patience])

Waits for all workers to return.

kill()

This sends signal.SIGKILL and sets the threading event immediately.

map_async(func, iterable[, chunksize, ...])

Apply func to each element in iterable.

terminate([patience])

This sets the threading event immediately and sends signal.SIGINT, signal.SIGTERM, and signal.SIGKILL successively, with patience time between signals, until all processes have exited.

__init__(processes: int = None, initializer: callable = None, initargs: tuple = (), maxtasksperchild: int = None, *, policy: Policy = None, processes_per_policy: int = None)

Init method

Parameters:
  • processes (int , optional) – number of worker processes, defaults to None

  • initializer (callable, optional) – initializer function, defaults to None

  • initargs (tuple , optional) – arguments for initializer function, defaults to ()

  • maxtasksperchild (int , optional) – maximum tasks each worker will perform, defaults to None

  • policy (dragon.infrastructure.policy.Policy or list of dragon.infrastructure.policy.Policy) – determines the placement and resources of processes. If a list of policies is given, then processes_per_policy must be specified.

  • processes_per_policy (int ) – determines the number of processes to be placed with a specific policy if a list of policies is provided

Raises:

ValueError – raised if number of worker processes is less than 1

terminate(patience: float = 60) None

This sets the threading event immediately and sends signal.SIGINT, signal.SIGTERM, and signal.SIGKILL successively, with patience time between signals, until all processes have exited. Calling this method before blocking on results may leave some submitted work unfinished. If all work should complete before the workers are stopped, use close instead.

Parameters:

patience (float , optional) – timeout to wait for processes to join after each signal, defaults to 60 to prevent indefinite hangs
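A sketch of an early shutdown follows; slow_task is illustrative, and the queued work may be abandoned.

    import time
    from dragon.native.pool import Pool

    def slow_task(x):
        time.sleep(60)
        return x

    if __name__ == "__main__":
        pool = Pool(processes=2)
        pool.apply_async(slow_task, (1,))  # this work may never complete
        pool.terminate(patience=30)        # SIGINT, SIGTERM, SIGKILL with a
                                           # patience of 30 between signals
        pool.join()                        # join is still required afterwards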

kill() None

This sends signal.SIGKILL and sets the threading event immediately. This is the most dangerous way to shut down pool workers and will likely leave the runtime in a corrupted state. Calling this method before blocking on results may leave some submitted work unfinished. If all work should complete before the workers are stopped, use close instead.

close() None

This method starts a thread that waits for all submitted jobs to finish. That thread then sends signal.SIGUSR2 to the process group and sets the threading event to close the handle_results thread. Waiting and sending the signal inside the thread allows close to return before all submitted work has finished.

Raises:

ValueError – raised if close or terminate have been previously called

join(patience: float = None) None

Waits for all workers to return. The user must have called close or terminate prior to calling join. By default this blocks indefinitely, waiting for processes to exit. If a patience is given, then once the patience has passed, signal.SIGINT, signal.SIGTERM, and signal.SIGKILL are sent successively, with patience time between signals. If patience is set to a value other than None, the user should assume the runtime is corrupted after join completes.

Parameters:

patience (float , optional) – timeout to wait for processes to join after each signal, defaults to None

Raises:
  • ValueError – raised if close or terminate were not previously called

  • ValueError – raised if join has already been called and the process group is closed
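A sketch of a join with a finite patience follows; work is illustrative, and per the caveat above the runtime should be treated as suspect once a finite patience has been used.

    from dragon.native.pool import Pool

    def work(x):
        return x + 1

    if __name__ == "__main__":
        pool = Pool(processes=2)
        pool.map_async(work, range(4)).get()
        pool.close()            # close (or terminate) must precede join
        pool.join(patience=10)  # escalate signals if workers outlive the patience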

apply_async(func: callable, args: tuple = (), kwds: dict = {}, callback: callable = None, error_callback: callable = None) ApplyResult

Equivalent to calling func(*args, **kwds) in a non-blocking way. A result is immediately returned and then updated when func(*args, **kwds) has completed.

Parameters:
  • func (callable) – user function to be called by worker function

  • args (tuple , optional) – input args to func, defaults to ()

  • kwds (dict , optional) – input kwds to func, defaults to {}

  • callback (callable, optional) – user provided callback function, defaults to None

  • error_callback (callable, optional) – user provided error callback function, defaults to None

Raises:

ValueError – raised if pool has already been closed or terminated

Returns:

A result that has a get method to retrieve the result of func(*args, **kwds)

Return type:

ApplyResult
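A usage sketch follows; add, record, and fail are illustrative stand-ins for user code.

    from dragon.native.pool import Pool

    def add(a, b=0):
        return a + b

    def record(value):
        print(f"callback got {value}")

    def fail(exc):
        print(f"error callback got {exc!r}")

    if __name__ == "__main__":
        pool = Pool(processes=2)
        res = pool.apply_async(add, args=(3,), kwds={"b": 4},
                               callback=record, error_callback=fail)
        print(res.get())   # blocking get() gives the synchronous equivalent
        pool.close()
        pool.join()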

map_async(func: callable, iterable: Iterable , chunksize: int = None, callback: callable = None, error_callback: callable = None) MapResult

Apply func to each element in iterable. A result is returned immediately; calling its get method returns the list of collected outputs. Calling get immediately on the returned result is equivalent to map.

Parameters:
  • func (callable) – user provided function to call on elements of iterable

  • iterable (iterable) – input args to func

  • chunksize (int , optional) – number of iterable elements grouped into each work item submitted to the input work queue, defaults to None

  • callback (callable, optional) – user provided callback function, defaults to None

  • error_callback (callable, optional) – user provided error callback function, defaults to None

Returns:

A result that has a get method that returns an iterable of the output from applying func to each element of iterable.

Return type:

MapResult
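A usage sketch follows; double is illustrative, and the chunksize value is an arbitrary choice.

    from dragon.native.pool import Pool

    def double(x):
        return 2 * x

    if __name__ == "__main__":
        pool = Pool(processes=4)
        res = pool.map_async(double, range(100), chunksize=10)
        print(res.get())   # calling get immediately makes this a blocking map
        pool.close()
        pool.join()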