dragon.infrastructure.policy

Classes

Policy

The purpose of the Policy object is to enable fine-tuning of process distribution across nodes, devices, cores, etc.

class Policy

The purpose of the Policy object is to enable fine-tuning of process distribution across nodes, devices, cores, etc. For example, a policy can restrict a spawned process to certain CPU cores, give it access only to specific GPU devices on a node, or control how a collection of processes is spread over nodes (e.g., round-robin or block).

A default global Policy is always in place. It applies no core or accelerator affinity and places processes across nodes in a round-robin fashion. A user-supplied Policy fine-tunes this default: any attribute left unspecified takes the default policy's value.
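The fallback behaviour described above can be pictured with a small sketch. It is not Dragon's implementation: the dictionary keys, the `DEFAULT` sentinel string, and the `resolve` helper are hypothetical names chosen for illustration, and the global values simply restate the defaults described in this section.

```python
# Illustrative only: a plain-dict model of "unspecified attributes fall back
# to the global default policy". None of these names come from Dragon.
DEFAULT = "DEFAULT"  # hypothetical sentinel meaning "not specified"

# Values restate the description above: anywhere, round-robin, no affinity.
GLOBAL_POLICY = {
    "placement": "ANYWHERE",
    "distribution": "ROUNDROBIN",
    "affinity": "ANY",
}


def resolve(user_policy: dict) -> dict:
    """Return the global policy with every explicitly set attribute overridden."""
    resolved = dict(GLOBAL_POLICY)
    for key, value in user_policy.items():
        if value != DEFAULT:
            resolved[key] = value
    return resolved


# Only the distribution is fine-tuned; everything else is inherited.
effective = resolve({"distribution": "BLOCK", "affinity": DEFAULT})
print(effective)
```

Here only `distribution` changes, while `placement` and `affinity` keep the global values.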

For a full example of the setup for launching processes through Global Services, see examples/dragon_gs_client/pi_demo.py. When launching a process, simply create a policy with the desired fine-tuning and pass it in as a parameter.

class Placement

Which node to assign a policy to.

LOCAL and ANYWHERE will be useful later for multi-system communication. Right now, Placement has little effect unless HOST_NAME or HOST_ID is used, in which case Dragon will try to place the policy on the specified node.

LOCAL - Local to current system of nodes
ANYWHERE - Place anywhere
HOST_NAME - Place on node with specific name
HOST_ID - Place on node with specific ID
DEFAULT - Defaults to ANYWHERE

class Distribution

Pattern to use to distribute policies across nodes

ROUNDROBIN
BLOCK
DEFAULT - Defaults to ROUNDROBIN
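The two distribution patterns are not defined further here, so the following is a sketch of their conventional HPC meaning rather than Dragon's scheduler. It assumes that round-robin cycles through the nodes one process at a time, and that block fills a node before moving to the next. The function names and the `slots_per_node` parameter are hypothetical.

```python
def round_robin(num_procs: int, nodes: list[str]) -> list[str]:
    """Assign process i to node (i mod number_of_nodes)."""
    return [nodes[i % len(nodes)] for i in range(num_procs)]


def block(num_procs: int, nodes: list[str], slots_per_node: int) -> list[str]:
    """Fill each node with slots_per_node processes before moving on.

    slots_per_node is a hypothetical capacity; the assignment wraps around
    to the first node if there are more processes than total slots.
    """
    return [nodes[(i // slots_per_node) % len(nodes)] for i in range(num_procs)]


nodes = ["node0", "node1", "node2"]  # hypothetical node names
rr = round_robin(6, nodes)
bl = block(6, nodes, slots_per_node=2)
print(rr)  # node0, node1, node2, node0, node1, node2
print(bl)  # node0, node0, node1, node1, node2, node2
```

Under these assumptions, round-robin spreads neighbouring processes across nodes, while block keeps them together on the same node.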

class Device

Which type of device the affinity policy will apply to

class Affinity

Device Affinity distribution

ANY - Place on any available core or GPU device
SPECIFIC - Place on specific CPU cores or GPU devices
DEFAULT - Defaults to ANY

class WaitMode

Channel WaitMode type

__init__(
    placement: Policy.Placement = Placement.DEFAULT,
    host_name: str = '',
    host_id: int = -1,
    distribution: Policy.Distribution = Distribution.DEFAULT,
    device: Policy.Device = Device.DEFAULT,
    affinity: Policy.Affinity = Affinity.DEFAULT,
    cpu_affinity: list[int] = <factory>,
    gpu_env_str: str = '',
    gpu_affinity: list[int] = <factory>,
    wait_mode: Policy.WaitMode = WaitMode.DEFAULT,
    refcounted: bool = True
) -> None
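A `<factory>` default in a signature like the one above is how a dataclass field declared with `default_factory` is rendered. The sketch below shows what that implies for callers. `PolicySketch` is a stand-in, not Dragon's `Policy`: it copies a subset of the documented parameter names so the snippet runs without Dragon installed. It also assumes the factory is `list`, so an omitted affinity is an empty list. The enum values and the host name are arbitrary.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Placement(Enum):
    # Member names come from the documentation above; values are arbitrary.
    LOCAL = auto()
    ANYWHERE = auto()
    HOST_NAME = auto()
    HOST_ID = auto()
    DEFAULT = auto()


class Affinity(Enum):
    ANY = auto()
    SPECIFIC = auto()
    DEFAULT = auto()


@dataclass
class PolicySketch:
    """Stand-in with a subset of the documented fields; NOT dragon's Policy."""

    placement: Placement = Placement.DEFAULT
    host_name: str = ""
    host_id: int = -1
    affinity: Affinity = Affinity.DEFAULT
    # Assumed to be default_factory=list, which a signature shows as <factory>.
    cpu_affinity: list[int] = field(default_factory=list)
    gpu_affinity: list[int] = field(default_factory=list)


# Pin to CPU cores 0 and 1 on a named node (the host name is hypothetical).
pinned = PolicySketch(
    placement=Placement.HOST_NAME,
    host_name="node0",
    affinity=Affinity.SPECIFIC,
    cpu_affinity=[0, 1],
)

# Each instance gets its own list, so mutating one leaves later defaults empty.
unpinned = PolicySketch()
unpinned.cpu_affinity.append(7)
fresh = PolicySketch()
print(pinned.cpu_affinity, unpinned.cpu_affinity, fresh.cpu_affinity)
```

With Dragon installed, the equivalent call would presumably use the same keyword names on the real `Policy` class, with the enums accessed as nested classes such as `Policy.Placement.HOST_NAME`.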