dragon.infrastructure.policy

Classes

Policy

The purpose of the Policy object is to enable fine-tuning of process distribution across nodes, devices, cores, etc.

class Policy

Bases: object

The purpose of the Policy object is to enable fine-tuning of process distribution across nodes, devices, cores, etc. For example, a policy can restrict a spawned process to certain CPU cores, give it access to only specific GPU devices on a node, or control how a collection of processes is spread over nodes (e.g. round-robin, block).

There is a default global Policy in place that uses no core or accelerator affinity and places processes across nodes in a round-robin fashion. A Policy is meant to fine-tune this default, and any attributes left unspecified take the default policy's values.
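The fallback to default values can be pictured with a small sketch. It does not use Dragon's own code: `MiniPolicy` is a trimmed, hypothetical stand-in with three of the documented fields, `fill_from_global` is a hypothetical helper, and treating "still equal to the constructor default" as "unspecified" is an assumption made for illustration.

```python
import copy
from dataclasses import dataclass, field, fields, replace
from enum import IntEnum


class Distribution(IntEnum):
    # Values copied from the Distribution enum documented in this section.
    ROUNDROBIN = 1
    BLOCK = 2
    DEFAULT = 3


@dataclass
class MiniPolicy:
    # Hypothetical, trimmed stand-in for Policy (not the Dragon class).
    host_name: str = ""
    distribution: Distribution = Distribution.DEFAULT
    cpu_affinity: list[int] = field(default_factory=list)


# Assumed shape of the global default: round-robin, no core affinity.
GLOBAL = MiniPolicy(distribution=Distribution.ROUNDROBIN)


def fill_from_global(policy, global_policy=GLOBAL):
    """Return a copy of policy where untouched fields take the global values."""
    unset = MiniPolicy()
    updates = {
        f.name: copy.copy(getattr(global_policy, f.name))
        for f in fields(policy)
        if getattr(policy, f.name) == getattr(unset, f.name)
    }
    return replace(policy, **updates)


resolved = fill_from_global(MiniPolicy(cpu_affinity=[0, 2]))
```

Here `resolved` keeps the explicit `cpu_affinity` and picks up round-robin distribution from the global stand-in.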

For a full example of the setup for launching processes through Global Services, see examples/dragon_gs_client/pi_demo.py. When launching a process, simply create a policy with the desired fine-tuning and pass it in as a parameter.
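As a rough illustration of "create a policy with the desired fine-tuning", the sketch below builds two policies from the constructor arguments listed in this section. To stay runnable without Dragon it defines a local stand-in that mirrors a subset of the documented fields; with Dragon installed, the import shown in the comment would presumably replace it. The host name "node-1" is a made-up value, and how the policy is then handed to a process launch is not shown because this section does not specify it.

```python
from dataclasses import dataclass, field
from enum import IntEnum


# Local stand-in mirroring the documented fields so the sketch runs anywhere.
# With Dragon installed, use instead:
#     from dragon.infrastructure.policy import Policy
@dataclass
class Policy:
    class Placement(IntEnum):
        LOCAL = -5
        ANYWHERE = -4
        HOST_NAME = -3
        HOST_ID = -2
        DEFAULT = -1

    class Distribution(IntEnum):
        ROUNDROBIN = 1
        BLOCK = 2
        DEFAULT = 3

    placement: Placement = Placement.DEFAULT
    host_name: str = ""
    host_id: int = -1
    distribution: Distribution = Distribution.DEFAULT
    cpu_affinity: list[int] = field(default_factory=list)
    gpu_affinity: list[int] = field(default_factory=list)


# Pin to a named node ("node-1" is hypothetical) and to CPU cores 0 and 1.
pinned = Policy(
    placement=Policy.Placement.HOST_NAME,
    host_name="node-1",
    cpu_affinity=[0, 1],
)

# Restrict only GPU visibility; every other attribute stays at its default.
gpu_only = Policy(gpu_affinity=[0])
```

Only the attributes that differ from the defaults need to be passed; the rest keep the values shown in the attribute list.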

class Placement

Bases: IntEnum

Which node to assign a policy to.

LOCAL and ANYWHERE will be useful later for multi-system communication. Right now Placement has little effect unless HOST_NAME or HOST_ID is used, either of which will try to place a policy on the specified node.

LOCAL - Local to current system of nodes
ANYWHERE - Place anywhere
HOST_NAME - Place on node with specific name
HOST_ID - Place on node with specific ID
DEFAULT - Defaults to ANYWHERE

LOCAL = -5

ANYWHERE = -4

HOST_NAME = -3

HOST_ID = -2

DEFAULT = -1
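One way to read the Placement values is as a filter over the available nodes. The sketch below is an illustration of that reading only, not Dragon's placement logic: `candidate_nodes` and the node inventory (a dict from host ID to host name) are hypothetical, and LOCAL, ANYWHERE, and DEFAULT are treated as unrestricted, in line with the note that they currently have little effect.

```python
from enum import IntEnum


class Placement(IntEnum):
    # Values copied from the Placement enum documented above.
    LOCAL = -5
    ANYWHERE = -4
    HOST_NAME = -3
    HOST_ID = -2
    DEFAULT = -1


def candidate_nodes(placement, nodes, host_name="", host_id=-1):
    """Return the host IDs a policy could land on (hypothetical helper).

    nodes maps host ID to host name.
    """
    if placement == Placement.HOST_NAME:
        return [hid for hid, name in nodes.items() if name == host_name]
    if placement == Placement.HOST_ID:
        return [hid for hid in nodes if hid == host_id]
    # LOCAL, ANYWHERE, DEFAULT: no restriction in this sketch.
    return list(nodes)


inventory = {0: "node-a", 1: "node-b"}  # made-up inventory
by_name = candidate_nodes(Placement.HOST_NAME, inventory, host_name="node-b")
by_id = candidate_nodes(Placement.HOST_ID, inventory, host_id=0)
anywhere = candidate_nodes(Placement.DEFAULT, inventory)
```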

class Distribution

Bases: IntEnum

Pattern to use to distribute policies across nodes.

ROUNDROBIN
BLOCK
DEFAULT - Defaults to ROUNDROBIN

ROUNDROBIN = 1

BLOCK = 2

DEFAULT = 3
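The difference between the two patterns can be sketched in a few lines. This is a generic illustration of round-robin versus block assignment, not Dragon's scheduler: the function names, the node names, and the `slots_per_node` parameter (how many processes fill a node before moving on) are assumptions, since this section does not say how block sizes are chosen.

```python
def round_robin(num_procs, nodes):
    """Assign process i to node i mod len(nodes), cycling through the nodes."""
    return [nodes[i % len(nodes)] for i in range(num_procs)]


def block(num_procs, nodes, slots_per_node):
    """Fill each node with slots_per_node processes before moving to the next."""
    return [nodes[(i // slots_per_node) % len(nodes)] for i in range(num_procs)]


nodes = ["n0", "n1"]  # made-up node names
rr = round_robin(5, nodes)
blk = block(5, nodes, slots_per_node=3)
```

With five processes on two nodes, `rr` alternates between `n0` and `n1`, while `blk` places the first three processes on `n0` and the remaining two on `n1`.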

class WaitMode

Bases: IntEnum

Channel WaitMode type

IDLE = 1

SPIN = 2

DEFAULT = 3

placement: Placement = -1

host_name: str = ''

host_id: int = -1

distribution: Distribution = 3

cpu_affinity: list[int]

gpu_env_str: str = ''

gpu_affinity: list[int]

wait_mode: WaitMode = 3

refcounted: bool = True

property sdesc

get_sdict()

classmethod from_sdict(sdict)

classmethod global_policy()

__init__(placement: Policy.Placement = Placement.DEFAULT, host_name: str = '', host_id: int = -1, distribution: Policy.Distribution = Distribution.DEFAULT, cpu_affinity: list[int] = <factory>, gpu_env_str: str = '', gpu_affinity: list[int] = <factory>, wait_mode: Policy.WaitMode = WaitMode.DEFAULT, refcounted: bool = True) → None