dragon.infrastructure.policy
Classes
The purpose of the Policy object is to enable fine-tuning of process distribution across nodes, devices, cores, and so on.
- class Policy
Bases: object
The purpose of the Policy object is to enable fine-tuning of process distribution across nodes, devices, cores, and so on. For example, a policy can restrict a process to certain CPU cores, give it access to only specific GPU devices on a node, or control how a collection of processes is spread over nodes (e.g. round-robin or block).
There is a default global Policy in place that uses no core or accelerator affinity and places processes across nodes in a round-robin fashion. A Policy object is meant to fine-tune this default, and any attributes left unspecified use the default policy values.
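The fallback to default values can be pictured with a small stand-alone sketch. It does not import Dragon: `PolicySketch` and `fill_from_default` are hypothetical stand-ins written for illustration, and the rule "a field still at its constructor default counts as unspecified" is an assumption, not a description of Dragon's internals.

```python
from dataclasses import dataclass, field, fields
from enum import IntEnum


class Distribution(IntEnum):
    # Values copied from the Distribution listing in this reference.
    ROUNDROBIN = 1
    BLOCK = 2
    DEFAULT = 3


@dataclass
class PolicySketch:
    # Hypothetical stand-in carrying a subset of the documented Policy fields.
    host_name: str = ""
    distribution: Distribution = Distribution.DEFAULT
    cpu_affinity: list[int] = field(default_factory=list)
    gpu_affinity: list[int] = field(default_factory=list)


def fill_from_default(custom: PolicySketch, default: PolicySketch) -> PolicySketch:
    """Return a policy where every field left at its constructor default
    in `custom` is taken from `default` instead (assumed merge rule)."""
    blank = PolicySketch()
    merged = {}
    for f in fields(PolicySketch):
        value = getattr(custom, f.name)
        unspecified = value == getattr(blank, f.name)
        merged[f.name] = getattr(default, f.name) if unspecified else value
    return PolicySketch(**merged)


# A global default with round-robin placement and no affinity.
global_default = PolicySketch(distribution=Distribution.ROUNDROBIN)

# A fine-tuned policy that only pins CPU cores.
pinned = PolicySketch(cpu_affinity=[0, 1])

effective = fill_from_default(pinned, global_default)
print(effective)
```

Only `cpu_affinity` comes from the fine-tuned policy here; the distribution is inherited from the default.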
For a full example of setup for launching processes through Global Services, see examples/dragon_gs_client/pi_demo.py.
When launching a process, simply create a policy with the desired fine-tuning and pass it in as a parameter.
- class Placement
Bases: IntEnum
Which node to assign a policy to.
LOCAL and ANYWHERE will be useful later for multi-system communication. Right now Placement has little effect unless HOST_NAME or HOST_ID is used, in which case Dragon tries to place the policy on the specified node.
LOCAL - Local to current system of nodes
ANYWHERE - Place anywhere
HOST_NAME - Place on node with specific name
HOST_ID - Place on node with specific ID
DEFAULT - Defaults to ANYWHERE
- LOCAL = -5
- ANYWHERE = -4
- HOST_NAME = -3
- HOST_ID = -2
- DEFAULT = -1
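As a rough illustration of how these members relate, the sketch below recreates the enum with the values listed above rather than importing Dragon. The helpers `effective_placement` and `needs_host` are hypothetical names, added only to restate the documented rules in code.

```python
from enum import IntEnum


class Placement(IntEnum):
    # Stand-in copy of the documented values, not the Dragon class itself.
    LOCAL = -5
    ANYWHERE = -4
    HOST_NAME = -3
    HOST_ID = -2
    DEFAULT = -1


def effective_placement(placement: Placement) -> Placement:
    """DEFAULT is documented as defaulting to ANYWHERE."""
    placement = Placement(placement)
    return Placement.ANYWHERE if placement is Placement.DEFAULT else placement


def needs_host(placement: Placement) -> bool:
    """Only HOST_NAME and HOST_ID name a specific target node."""
    return Placement(placement) in (Placement.HOST_NAME, Placement.HOST_ID)


print(effective_placement(Placement.DEFAULT).name)
print(needs_host(Placement.HOST_ID), needs_host(Placement.LOCAL))
```

Because the class is an IntEnum, a raw integer such as `-3` converts back to its member with `Placement(-3)`.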
- class Distribution
Bases: IntEnum
Pattern to use to distribute policies across nodes.
ROUNDROBIN
BLOCK
DEFAULT - Defaults to ROUNDROBIN
- ROUNDROBIN = 1
- BLOCK = 2
- DEFAULT = 3
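The difference between the two patterns can be sketched without Dragon. In the example below, `assign` is a hypothetical helper and the node names are made up. The BLOCK branch assumes "contiguous, roughly even chunks per node", which this reference does not spell out, so treat it as one plausible reading rather than Dragon's actual algorithm.

```python
from enum import IntEnum


class Distribution(IntEnum):
    # Stand-in copy of the documented values.
    ROUNDROBIN = 1
    BLOCK = 2
    DEFAULT = 3


def assign(n_procs: int, nodes: list[str],
           distribution: Distribution = Distribution.DEFAULT) -> list[str]:
    """Return the node chosen for each of n_procs processes, in order."""
    distribution = Distribution(distribution)
    if distribution is Distribution.DEFAULT:
        # Documented behaviour: DEFAULT falls back to round-robin.
        distribution = Distribution.ROUNDROBIN

    if distribution is Distribution.ROUNDROBIN:
        # Cycle through the nodes one process at a time.
        return [nodes[i % len(nodes)] for i in range(n_procs)]

    # BLOCK (assumed semantics): fill each node with a contiguous chunk,
    # giving the first `extra` nodes one additional process.
    per_node, extra = divmod(n_procs, len(nodes))
    placement = []
    for idx, node in enumerate(nodes):
        placement.extend([node] * (per_node + (1 if idx < extra else 0)))
    return placement


nodes = ["n0", "n1"]  # hypothetical node names
print(assign(5, nodes, Distribution.ROUNDROBIN))
print(assign(5, nodes, Distribution.BLOCK))
```

Round-robin interleaves neighbouring processes across nodes, while the block reading keeps neighbouring processes on the same node.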
- distribution: Distribution = 3
- property sdesc
- get_sdict()
- classmethod from_sdict(sdict)
- classmethod global_policy()
- __init__(placement: Placement = Placement.DEFAULT, host_name: str = '', host_id: int = -1, distribution: Distribution = Distribution.DEFAULT, cpu_affinity: list[int] = <factory>, gpu_env_str: str = '', gpu_affinity: list[int] = <factory>, wait_mode: WaitMode = WaitMode.DEFAULT, refcounted: bool = True) -> None
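The `<factory>` defaults in the signature are how a dataclass field declared with `default_factory` is displayed: each instance gets its own fresh list. The sketch below shows this with a hypothetical stand-in, `PolicyFieldsSketch`, that copies only some of the documented fields and defaults. It is not the Dragon class, and the host name is made up.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyFieldsSketch:
    # Hypothetical stand-in mirroring part of the documented signature.
    host_name: str = ""
    host_id: int = -1
    cpu_affinity: list[int] = field(default_factory=list)  # shown as <factory>
    gpu_env_str: str = ""
    gpu_affinity: list[int] = field(default_factory=list)  # shown as <factory>
    refcounted: bool = True


# Only the attributes being tuned need to be passed.
first = PolicyFieldsSketch()
second = PolicyFieldsSketch(host_name="node-01", cpu_affinity=[0, 1])

# Mutating one instance's list does not leak into later instances,
# because the factory builds a new list for every object.
first.cpu_affinity.append(7)
third = PolicyFieldsSketch()

print(first.cpu_affinity, second.cpu_affinity, third.cpu_affinity)
```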