dragon.infrastructure.parameters
Launch parameters for Dragon infrastructure and managed processes
Dragon processes that get launched need a number of parameters explaining the size of the allocation, which node they are on, and other information needed to bootstrap communication to the infrastructure itself.
It isn’t reasonable to provide these on the command line, so they are instead passed through environment variables managed by this module.
Classes
LaunchParameters: Launch Parameters for Dragon processes.
Policy: Used to encapsulate policy decisions.
TypedParm: Defines a parameter and how it is to be checked.
- class TypedParm
Bases: object
Defines a parameter and how it is to be checked.
- Attributes:
name: string name of the parameter; as an environment variable it appears as DRAGON_name
cast: callable that casts the string parameter to its intended type
check: callable applied to the output of cast; it must return True
default: default value for the parameter; defaults to None
- __init__(name, cast, check, default=None)
- nocast(x)
- nocheck(dummy)
- nonnegative(x)
- positive(x)
- check_mode(mode)
- check_pool(pool)
- cast_wait_mode(wait_mode=WaitMode(0))
- cast_return_when_mode(return_when=ReturnWhen(1))
- check_base64(strdata)
- typecast(ty)
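The relationship between name, cast, check, and default can be illustrated with a minimal sketch. This is not the module's implementation: `TypedParmSketch`, `env_name`, `parse`, and the two checker functions below are hypothetical stand-ins written only from the attribute descriptions above.

```python
from dataclasses import dataclass
from typing import Any, Callable


def nonnegative_sketch(x: int) -> bool:
    """Hypothetical checker: accept zero and positive integers."""
    return x >= 0


def positive_sketch(x: int) -> bool:
    """Hypothetical checker: accept strictly positive integers."""
    return x > 0


@dataclass(frozen=True)
class TypedParmSketch:
    """Stand-in for a typed parameter: a name plus cast and check callables."""

    name: str
    cast: Callable[[str], Any]
    check: Callable[[Any], bool]
    default: Any = None

    @property
    def env_name(self) -> str:
        # The environment variable is the parameter name with a DRAGON_ prefix.
        return "DRAGON_" + self.name

    def parse(self, raw: str) -> Any:
        # Cast the raw string first, then run the check on the cast result.
        value = self.cast(raw)
        if not self.check(value):
            raise ValueError(f"{self.env_name}={raw!r} failed its check")
        return value


# Two parameters taken from the documented list, with their documented defaults.
index_parm = TypedParmSketch("INDEX", int, nonnegative_sketch, default=0)
gw_capacity_parm = TypedParmSketch("GW_CAPACITY", int, positive_sketch, default=2048)
```

In this sketch, `index_parm.parse("3")` returns the integer 3, while `gw_capacity_parm.parse("0")` raises `ValueError` because the capacity must be positive.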
- class LaunchParameters
Bases: object
Launch Parameters for Dragon processes.
This class contains the basic runtime parameters for Dragon infrastructure and managed processes.
The parameter names are available in the environment of every Dragon managed process with the prefix DRAGON_, e.g. DRAGON_MY_PUID for MY_PUID. Setting some of these parameters will alter runtime behaviour. For example, setting DRAGON_DEBUG=1 in the environment before starting the runtime will set up the runtime in debug mode and result in verbose log files.
The parameter names are available as attributes with lower-cased names in Python.
The following launch parameters are defined:
- Parameters:
MODE (str) – Startup mode of the infrastructure, one of 'crayhpe-multi', 'single', 'test', defaults to 'test'.
INDEX (int) – Internal non-negative index of the current node, defaults to 0.
DEFAULT_PD (str) – Base64 encoded serialized descriptor of the default user memory pool, defaults to ''.
INF_PD (str) – Base64 encoded serialized descriptor of the default infrastructure memory pool, defaults to ''.
LOCAL_SHEP_CD (str) – Base64 encoded serialized descriptor of the channel to send to the local services on the same node, defaults to ''.
LOCAL_BE_CD (str) – Base64 encoded serialized descriptor of the channel to send to the launcher backend on the same node, defaults to ''.
GS_CD (str) – Base64 encoded serialized descriptor of the channel to send to global services, defaults to ''.
GS_RET_CD (str) – Base64 encoded serialized descriptor of the channel to receive from global services, defaults to ''.
SHEP_RET_CD (str) – Base64 encoded serialized descriptor of the channel to receive from the local services on the same node, defaults to ''.
DEFAULT_SEG_SZ (int) – Size of the default user managed memory pool in bytes, defaults to 2**32.
INF_SEG_SZ (int) – Size of the default infrastructure managed memory pool in bytes, defaults to 2**30.
MY_PUID (int) – Unique process ID (puid) given to this process by global services, defaults to 1.
TEST (int) – if in test mode, defaults to 0.
DEBUG (int) – if in debug mode, defaults to 0.
BE_CUID (int) – Unique channel ID (cuid) of the launcher backend on this node, defaults to 2**60.
INF_WAIT_MODE (str) – Default wait mode of infrastructure objects. One of 'IDLE_WAIT', 'SPIN_WAIT', or 'ADAPTIVE_WAIT'. Defaults to 'IDLE_WAIT'.
USER_WAIT_MODE (str) – Default wait mode of user objects. One of 'IDLE_WAIT', 'SPIN_WAIT', or 'ADAPTIVE_WAIT'. Defaults to 'ADAPTIVE_WAIT'.
INF_RETURN_WHEN_MODE (str) – Default return mode for infrastructure objects. One of 'WHEN_IMMEDIATE', 'WHEN_BUFFERED', 'WHEN_DEPOSITED', 'WHEN_RECEIVED', defaults to 'WHEN_BUFFERED'.
USER_RETURN_WHEN_MODE (str) – Default return mode for user objects. One of 'WHEN_IMMEDIATE', 'WHEN_BUFFERED', 'WHEN_DEPOSITED', 'WHEN_RECEIVED', defaults to 'WHEN_DEPOSITED'.
GW_CAPACITY (int) – Positive capacity of gateway channels, defaults to 2048.
HSTA_MAX_EJECTION_MB (int) – Size in MB of buffers used for network receive operations. This controls network ejection rate, defaults to 8.
HSTA_MAX_GETMSG_MB (int) – Size in MB of buffers used for local 'getmsg' operations. This controls memory consumption rate for messages with 'GETMSG' protocol, defaults to 8.
HSTA_FABRIC_BACKEND (str) – Select the fabric backend to use. Possible options are 'ofi', 'ucx', and 'mpi', defaults to 'ofi'.
HSTA_TRANSPORT_TYPE (str) – Select if HSTA uses point-to-point operations or RDMA Get to send large messages. Possible options are 'p2p' and 'rma', defaults to 'p2p'.
PMOD_COMMUNICATION_TIMEOUT (int) – Timeout in seconds for PMOD communication with child MPI processes, defaults to 30.
BASEPOOL (str) – Default implementation of multiprocessing pool. One of 'NATIVE' or 'PATCHED', defaults to 'NATIVE'.
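The DRAGON_ prefix convention and the lower-cased attribute names can be sketched with the standard library alone. The helper `read_dragon_vars` below is hypothetical and is not part of this module; it only strips the prefix and lower-cases the name, and it leaves every value as a string rather than applying the per-parameter casts and checks.

```python
import os
from typing import Dict, Mapping, Optional

PREFIX = "DRAGON_"


def read_dragon_vars(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect DRAGON_-prefixed variables, keyed by lower-cased parameter name."""
    # Fall back to the real process environment when no mapping is supplied.
    source = os.environ if environ is None else environ
    return {
        key[len(PREFIX):].lower(): value
        for key, value in source.items()
        if key.startswith(PREFIX)
    }


# An illustrative environment; a real managed process would use os.environ.
example_env = {"DRAGON_MY_PUID": "7", "DRAGON_DEBUG": "1", "PATH": "/usr/bin"}
params = read_dragon_vars(example_env)
```

Here `params` holds the entries `my_puid` and `debug`, while the unrelated `PATH` variable is ignored.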
- classmethod init_class_vars()
- __init__(**kwargs)
- classmethod all_parm_names()
- classmethod remove_node_local_evars(env)
- static from_env(init_env=None)
- env()
Returns dictionary of environment variables for this object.
An entry is created for each parameter that is not None and is listed in LaunchParameters.PARMS, keyed by the correctly prefixed name of its environment variable.
This lets a process or actor construct the launch parameters it wants for another process it is about to start.
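The behaviour described for env() can be approximated as follows. This is a sketch under assumptions, not the method's source: `build_env` and the shortened `PARM_NAMES` tuple are hypothetical, and converting values with `str` is an assumption about how they are serialized.

```python
from typing import Any, Dict

# A small, illustrative subset of the documented parameter names.
PARM_NAMES = ("MODE", "INDEX", "MY_PUID", "DEBUG")


def build_env(values: Dict[str, Any]) -> Dict[str, str]:
    """Return DRAGON_-prefixed entries for known parameters that are not None."""
    env: Dict[str, str] = {}
    for name in PARM_NAMES:
        # Parameters are looked up by their lower-cased attribute-style name.
        value = values.get(name.lower())
        if value is not None:
            env["DRAGON_" + name] = str(value)
    return env


# my_puid is None and 'unrelated' is not a known parameter, so both are skipped.
child_env = build_env(
    {"mode": "single", "index": 0, "my_puid": None, "debug": 1, "unrelated": "x"}
)
```

A dictionary like `child_env` could then be merged into a copy of `os.environ` and passed as the `env` argument of `subprocess.Popen` when starting the other process.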
- set_num_gateways_per_node(num_gateways=0)
- NODE_LOCAL_PARAMS = frozenset({'BE_CUID', 'DEFAULT_PD', 'INF_PD', 'LOCAL_BE_CD', 'LOCAL_SHEP_CD'})
- PARMS = [<dragon.infrastructure.parameters.TypedParm object>, ...]
- gw_env_vars = frozenset({})
- reload_this_process()
- class Policy
Bases: object
Used to encapsulate policy decisions.
An instance of this class can be used to configure the policy decisions made within code. Right now it includes the wait mode, but it is intended to include other configurable policy decisions in the future, such as affinity.
Two constant instances of Policy, POLICY_INFRASTRUCTURE and POLICY_USER, are created as defaults for infrastructure and user program resources, respectively.
- __init__(wait_mode, return_from_send_when)
- property wait_mode
- property return_when
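The shape of Policy, a constructor taking two values and exposing them as read-only properties, can be sketched as below. Everything here is a hypothetical stand-in: the enum classes, their numeric values, and the two constants are illustrative, and the sketch assumes the constants mirror the INF_* and USER_* defaults listed in the launch parameters.

```python
from enum import Enum, auto


class WaitModeSketch(Enum):
    """Stand-in wait modes; the numeric values are arbitrary."""

    IDLE_WAIT = auto()
    SPIN_WAIT = auto()
    ADAPTIVE_WAIT = auto()


class ReturnWhenSketch(Enum):
    """Stand-in return modes; the numeric values are arbitrary."""

    WHEN_IMMEDIATE = auto()
    WHEN_BUFFERED = auto()
    WHEN_DEPOSITED = auto()
    WHEN_RECEIVED = auto()


class PolicySketch:
    """Holds policy decisions and exposes them through read-only properties."""

    def __init__(self, wait_mode, return_from_send_when):
        self._wait_mode = wait_mode
        self._return_from_send_when = return_from_send_when

    @property
    def wait_mode(self):
        return self._wait_mode

    @property
    def return_when(self):
        return self._return_from_send_when


# Assumed to follow the documented INF_* and USER_* parameter defaults.
POLICY_INFRASTRUCTURE_SKETCH = PolicySketch(
    WaitModeSketch.IDLE_WAIT, ReturnWhenSketch.WHEN_BUFFERED
)
POLICY_USER_SKETCH = PolicySketch(
    WaitModeSketch.ADAPTIVE_WAIT, ReturnWhenSketch.WHEN_DEPOSITED
)
```

Because the properties define no setters, assigning to `wait_mode` or `return_when` on an instance raises `AttributeError`, which keeps the shared default instances effectively constant.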