Infrastructure Messages
Services in the Dragon runtime interact with each other using messages transported by a variety of means (mostly Channels). Although there will typically be a Python API to construct and send these messages, the messages themselves constitute the true internal interface. In that sense, they are a convention.
Here we document this internal message API.
Generalities about Messages
Many messages are part of a request/response pair. At the level discussed here, successful and error responses are carried by the same response type but the fields present or valid may change according to whether the response is successful or an error.
All messages are serialized using JSON; the reason for this is to allow non-Python actors to interact with the Dragon infrastructure as well as allowing the possibility for the different components of the Dragon runtime to be implemented in something other than Python.
The serialization method may change in the future as an optimization; JSON is chosen here as a reasonable cross language starting point. In particular, the fields discussed here for basic message parsing (type and instance tagging, and errors) are defined in terms of integer valued enumerations suitable for a fixed field location and width binary determination.
The top level of the JSON structure will be a dictionary (map) where the ‘_tc’ key’s value is an integer corresponding to an element of the dragon.messages.MsgTypes enumeration class. The other fields of this map will be defined on a message by message basis.
The canonical form of the message will be taken on the JSON level, not the string level. This means that messages should not be compared or sorted as strings. Internally the Python interface will construct inbound messages into class objects of distinct type according to the ‘_tc’ field in the initializer by using a factory function. The reason to have a lot of different message types instead of just one type differentiated by the value of the ‘_tc’ field is that this allows IDEs and documentation tools to work better by having explicit knowledge of what fields and methods are expected in any particular message.
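The factory-function dispatch and the JSON-level canonical form described above can be sketched as follows. This is a minimal illustration, not the Dragon implementation: the numeric type codes, the constructor signatures, and the `parse` helper are assumptions made for the example.

```python
import json
from enum import IntEnum


class MsgTypes(IntEnum):
    # Placeholder values for illustration; not the runtime's real type codes.
    GS_PROCESS_LIST = 1
    GS_PROCESS_LIST_RESPONSE = 2


class GSProcessList:
    _tc = MsgTypes.GS_PROCESS_LIST

    def __init__(self, tag, p_uid, r_c_uid):
        self.tag = tag
        self.p_uid = p_uid
        self.r_c_uid = r_c_uid


class GSProcessListResponse:
    _tc = MsgTypes.GS_PROCESS_LIST_RESPONSE

    def __init__(self, tag, ref, err, plist=()):
        self.tag = tag
        self.ref = ref
        self.err = err
        self.plist = list(plist)


# Dispatch table from type code to message class.
_CLASS_BY_TC = {cls._tc: cls for cls in (GSProcessList, GSProcessListResponse)}


def parse(serialized):
    """Hypothetical factory: build a typed message object from a JSON string."""
    fields = json.loads(serialized)
    cls = _CLASS_BY_TC[MsgTypes(fields.pop("_tc"))]
    return cls(**fields)


# Two serializations of the same message: different strings, same JSON value.
a = '{"_tc": 1, "tag": 7, "p_uid": 42, "r_c_uid": 3}'
b = '{"r_c_uid": 3, "p_uid": 42, "tag": 7, "_tc": 1}'
msg = parse(a)
```

Here `a` and `b` differ as strings but decode to equal dictionaries, which is why messages should be compared after decoding rather than as strings.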
Common Fields
These fields are common to every message described below depending on their category of classification.
The _tc TypeCode Field
The _tc typecode field is used during parsing to determine the type of a received message. The message format is JSON. The _tc field can also be used to check that all expected fields of a message are indeed present, verifying its format.
Beyond the _tc typecode field, there are other fields expected to belong to every message.
The tag Field
Every message has a 64 bit positive unsigned integer tag field. Together with the identity of the sender, which is implicit or explicitly identified in some other field, this serves to uniquely identify the message.
Ideally, this uniqueness would hold throughout the life of the runtime, but it’s enough to say that no two messages should be simultaneously relevant with the same (sender, tag).
Message Categories
There are three general categories of messages:
- request messages, where one entity asks another one to do something
- response messages, returning the result to the requester; the name of these messages normally ends in “Response”
- other messages, mostly to do with infrastructure sequencing
Every request message will contain p_uid and r_c_uid fields. These fields denote the unique process ID of the entity that created the request and the unique channel ID to send the response to. See Conventional IDs for more details on process identification.
The ref Field
Every response message generated in reply to a request message will contain a ref field that echoes the tag field of the request message that caused it to be generated. If a response message is generated asynchronously (for instance, as part of a startup notification or some event such as a process exiting unexpectedly) then the ref field will be 0, which is an illegal value for the tag field.
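One way a requester might use tag and ref together is sketched below. The `PendingRequests` class and the dictionary-shaped messages are hypothetical and only illustrate the correlation rule; they are not part of the Dragon API.

```python
ASYNC_REF = 0  # ref value of a response that was not caused by a request


class PendingRequests:
    """Hypothetical helper that pairs responses with outstanding requests."""

    def __init__(self):
        self._next_tag = 1  # tags start at 1 because 0 is not a legal tag
        self._pending = {}

    def register(self, request):
        """Assign a fresh tag to the request and remember it."""
        tag = self._next_tag
        self._next_tag += 1
        request["tag"] = tag
        self._pending[tag] = request
        return tag

    def match(self, response):
        """Return the request this response answers, or None if asynchronous.

        Raises KeyError if ref names a tag that is not outstanding.
        """
        ref = response["ref"]
        if ref == ASYNC_REF:
            return None
        return self._pending.pop(ref)


pending = PendingRequests()
request = {"p_uid": 42, "r_c_uid": 3}
pending.register(request)
```

Because a matched request is removed from the table, a tag only has to be unique among the requests that are outstanding at the same time, which mirrors the (sender, tag) rule above.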
The err Field
Every response message will also have an err field, holding an integer enumeration particular to the response. This enumeration will be the class attribute Errors in the response message class. The enumeration element with value 0 will always indicate success. Which fields are present in the response message may depend on the enumeration value.
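As a rough illustration of the Errors class attribute, the sketch below models a response whose available fields depend on err. The enumeration values match those listed for GSProcessQueryResponse later in this document; the constructor signature is an assumption made for the example.

```python
from enum import IntEnum


class GSProcessQueryResponse:
    class Errors(IntEnum):
        SUCCESS = 0  # value 0 always indicates success
        UNKNOWN = 1

    def __init__(self, tag, ref, err, desc=None, err_info=""):
        self.tag = tag
        self.ref = ref
        self.err = self.Errors(err)
        # Only the fields valid for this err value are set on the object.
        if self.err == self.Errors.SUCCESS:
            self.desc = desc
        else:
            self.err_info = err_info


ok = GSProcessQueryResponse(tag=5, ref=2, err=0, desc={"p_uid": 17})
bad = GSProcessQueryResponse(tag=6, ref=3, err=1, err_info="no such process")
```

Converting the raw integer with `self.Errors(err)` also rejects values outside the enumeration, since `IntEnum` raises `ValueError` for unknown members.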
Format of Messages List
Class Name
- type enum
member of the dragon.messages.MsgTypes enum class
- purpose
succinct description of purpose, sender and receiver, where and when typically seen. Starts with ‘Request’ if it is a request message and ‘Response’ if it is a response message.
- fields
Fields in the message, together with a description of how they are to be parsed out. Some fields may themselves be dictionary structures used to initialize member objects.
Fields in common to every message (such as tag, ref, and err) are not mentioned here.
An expected type will be mentioned on each field where it is an obvious primitive (such as a string or an integer). Where a type isn’t mentioned, it is assumed to be another key-value map to be interpreted as the initializer for some more complex type.
- response
If a request kind of message, gives the class name of the corresponding response message, if applicable
- request
If a response kind of message, gives the class name of the corresponding request message.
- see also
Other related messages
Global Services Messages API
These messages are the ones underlying the Global Services API and are sent only to and from Global Services.
All global services request messages have an integer p_uid field that contains the process UID of the requester and an integer r_c_uid field that contains the channel ID to send the response to. This field is not echoed in the corresponding response. For details on IDs, see Conventional IDs.
GS Process Messages
GSProcessCreate
- type enum
GS_PROCESS_CREATE
- purpose
Request to global services to create a new managed process.
- fields
- exe
string
executable name
- args
list of strings
arguments to the executable. [exe] + args == argv
- env
map, key=string value=string
Environment variables to be exported before running.
Note that this is additive to, and overrides, the shepherd environment. It may need to be fancier if it turns out we need to add to paths.
- rundir
string, absolute path
Indicates where the process is to be run from. If empty, does not matter.
- user_name
string
requested user name for the process
- options
options
initializer for other process options object
output_c_uid is the channel id where output, both stdout and stderr, should be forwarded, if present, in an SHFwdOutput message. If this field is not present in the options, then output will be forwarded to the Backend. The SHFwdOutput message indicates whether it was from stdout or stderr.
- response
GSProcessCreateResponse
- see also
GSProcessDestroy
refer to the Common Fields section for additional request message fields
implementation(s):
Python
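To make the field list concrete, here is a sketch of what a GSProcessCreate message could look like as a JSON-ready dictionary. The numeric type code and the `make_process_create` helper are hypothetical; only the field names come from the description above.

```python
import json

GS_PROCESS_CREATE = 100  # hypothetical type code, for illustration only


def make_process_create(tag, p_uid, r_c_uid, exe, args=(), env=None,
                        rundir="", user_name="", options=None):
    """Hypothetical helper assembling the fields of a GSProcessCreate message."""
    return {
        "_tc": GS_PROCESS_CREATE,
        "tag": tag,
        "p_uid": p_uid,        # requester's process UID
        "r_c_uid": r_c_uid,    # channel to send the response to
        "exe": exe,
        "args": list(args),    # argv is [exe] followed by these
        "env": dict(env or {}),
        "rundir": rundir,      # empty means "does not matter"
        "user_name": user_name,
        "options": dict(options or {}),
    }


msg = make_process_create(tag=1, p_uid=42, r_c_uid=3,
                          exe="/bin/echo", args=["hello"])
wire = json.dumps(msg)
```

The resulting dictionary contains only JSON-native types, so it survives a round trip through `json.dumps` and `json.loads` unchanged.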
GSProcessCreateResponse
- type enum
GS_PROCESS_CREATE_RESPONSE
- purpose
Response to process creation request.
- fields
Alternatives on err:
- SUCCESS (= 0)
A new process was created
- desc
map
initializer for ProcessDescriptor - includes at a minimum p_uid of new process and assigned user name.
- FAIL
No process was created
- err_info
string
explanation of what went wrong
Might add some more here, e.g. is this a GS related problem or a Shepherd related problem.
- ALREADY
The user name was already in use
- desc
map
init for the ProcessDescriptor of the process that already exists with that user name
- request
GSProcessCreate
implementation(s):
Python
GSProcessList
- type enum
GS_PROCESS_LIST
- purpose
Return a list of the p_uid for all the processes currently being managed
- fields
None additional
- response
GSProcessListResponse
- see also
GSProcessQuery
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSProcessListResponse
- type enum
GS_PROCESS_LIST_RESPONSE
- purpose
Responds with a list of the p_uid for all the processes currently being managed
- fields
- plist
list of nonnegative integers for all the processes
- request
GSProcessList
- see also
GSProcessQuery
implementation(s):
Python
GSProcessQuery
- type enum
GS_PROCESS_QUERY
- purpose
Request the ProcessDescriptor for a managed process
- fields
One of the t_p_uid and user_name fields must be present.
If both are specified, then the user_name field is ignored.
- t_p_uid
integer
target process UID
- user_name
string
user supplied name for the target process
- response
GSProcessQueryResponse
refer to the Common Fields section for additional request message fields
implementation(s):
Python
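The "one of the two must be present, and the UID wins" rule recurs in several requests in this document (with t_p_uid, m_uid, or c_uid as the UID field). A minimal sketch of that rule follows; the `resolve_target` helper is hypothetical, and treating a JSON null the same as a missing field is an assumption.

```python
def resolve_target(msg, uid_key="t_p_uid"):
    """Hypothetical helper: decide how a request identifies its target.

    Returns ("uid", value) or ("name", value). A field that is missing or
    null is treated as absent (an assumption for this sketch).
    """
    if msg.get(uid_key) is not None:
        # The UID takes precedence; user_name is ignored if both are given.
        return ("uid", msg[uid_key])
    if msg.get("user_name") is not None:
        return ("name", msg["user_name"])
    raise ValueError(f"one of {uid_key!r} and 'user_name' must be present")


by_uid = resolve_target({"t_p_uid": 17, "user_name": "worker"})
by_name = resolve_target({"user_name": "worker"})
```

Passing `uid_key="m_uid"` or `uid_key="c_uid"` applies the same rule to the pool and channel requests.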
GSProcessQueryResponse
- type enum
GS_PROCESS_QUERY_RESPONSE
- purpose
Response to request for ProcessDescriptor for a managed process
- fields
Alternatives on err:
- SUCCESS (= 0)
Process has been found
- desc
map
initializer for ProcessDescriptor - includes at a minimum p_uid of new process and assigned user name.
- UNKNOWN (= 1)
No such process is known.
- err_info
string
explanation of what went wrong
- request
GSProcessQuery
- see also
GSProcessList
implementation(s):
Python
GSProcessKill
- type enum
GS_PROCESS_KILL
- purpose
Request a managed process get killed
- fields
One of the t_p_uid and user_name fields must be present.
If both are specified, then the user_name field is ignored.
- t_p_uid
integer
target process UID
- user_name
string
user supplied name for the target process
- sig
integer
signal to kill the process with
- response
GSProcessKillResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSProcessKillResponse
- type enum
GS_PROCESS_KILL_RESPONSE
- purpose
Response to GSProcessKill message
- fields
Alternatives on err:
- SUCCESS (= 0)
Process has been found and has been killed.
- exit_code
integer
exit code the process returned
- UNKNOWN (= 1)
No such process is known.
- err_info
string
explanation of what went wrong
- FAIL_KILL (= 2)
Something went wrong in killing the process.
- err_info
string
explanation of what went wrong
- DEAD (= 3)
The process is already dead.
No other fields.
- request
GSProcessKill
- see also
GSProcessQuery
implementation(s):
Python
GSProcessJoin
- type enum
GS_PROCESS_JOIN
- purpose
Request notification when a given process exits
- fields
One of the t_p_uid and user_name fields must be present.
If both are specified, then the user_name field is ignored.
- t_p_uid
integer
target process UID
- user_name
string
user supplied name for the target process
- timeout
integer, interpreted as microseconds
timeout value; any value < 0 interpreted as infinite timeout
- response
GSProcessJoinResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSProcessJoinResponse
- type enum
GS_PROCESS_JOIN_RESPONSE
- purpose
Response to request for notification when a process exits.
- fields
Alternatives on err:
- SUCCESS (= 0)
Target process has exited. If the target process has already exited, this is not an error.
- exit_code
integer
the Unix return value of the process
- UNKNOWN (= 1)
No such process is known.
- err_info
string
explanation of what went wrong
- TIMEOUT (= 2)
The timer has expired and the process still hasn’t exited.
No fields.
- request
GSProcessJoin
- see also
GSProcessQuery
implementation(s):
Python
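The timeout convention (integer microseconds, negative meaning infinite) and the three join outcomes can be sketched as below. Both helper functions and the dictionary-shaped response are hypothetical; the enumeration values are the ones listed for GSProcessJoinResponse.

```python
from enum import IntEnum


class JoinErrors(IntEnum):
    SUCCESS = 0
    UNKNOWN = 1
    TIMEOUT = 2


def timeout_to_seconds(timeout_us):
    """Convert a message timeout to seconds, or None for an infinite wait."""
    if timeout_us < 0:
        return None
    return timeout_us / 1_000_000


def join_result(response):
    """Hypothetical handler: return the exit code, or None on timeout."""
    err = JoinErrors(response["err"])
    if err is JoinErrors.SUCCESS:
        return response["exit_code"]
    if err is JoinErrors.TIMEOUT:
        return None
    raise LookupError(response.get("err_info", "unknown process"))


half_second = timeout_to_seconds(500_000)
forever = timeout_to_seconds(-1)
```

Returning None for an infinite wait matches the convention used by many Python blocking calls, which accept `timeout=None` to wait indefinitely.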
GSProcessJoinList
- type enum
GS_PROCESS_JOIN_LIST
- purpose
Request notification when any/all processes from a given list exit.
- fields
Both t_p_uid_list and user_name_list fields could be present. Any process in the list could be identified by either p_uid or user_name.
- t_p_uid_list
list of integers
list of target process UIDs
- user_name_list
list of strings
list of user supplied names for the target processes
- timeout
integer, interpreted as microseconds
timeout value; any value < 0 interpreted as infinite timeout
- join_all
bool
default False, request notification when any process from the list exits. When True, request notification when all processes exit.
- response
GSProcessJoinListResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSProcessJoinListResponse
- type enum
GS_PROCESS_JOIN_LIST_RESPONSE
- purpose
Response to request for notification when any/all processes from a given list exit.
- fields
Reflect the status of every process in the list:
- puid_status
dictionary with the status of every process
key: p_uid
value: tuple (status, info)
The status of a process in the list can be one of the following:
- SUCCESS (= 0)
Target process has exited. If the target process has already exited, this is not an error.
- info:
integer
the Unix return value of the process
- UNKNOWN (= 1)
No such process is known.
- info:
string
explanation of what went wrong
- TIMEOUT (= 2)
The timer has expired and the process still hasn’t exited.
- info:
None
- SELF (= 3)
Attempt to join itself.
- info:
string
explanation of what went wrong
- PENDING (= 4)
The process is still pending (not exited, no timeout).
- info:
None
- request
GSProcessJoinList
implementation(s):
Python
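A sketch of how a receiver might interpret puid_status is given below. The helper names are hypothetical. Because JSON object keys are always strings and JSON has no tuple type, the sketch assumes the p_uid keys may arrive as strings and each (status, info) pair as a two-element list.

```python
from enum import IntEnum


class JoinStatus(IntEnum):
    SUCCESS = 0
    UNKNOWN = 1
    TIMEOUT = 2
    SELF = 3
    PENDING = 4


def exited_processes(puid_status):
    """Map p_uid to exit code for every process reported as exited."""
    return {
        int(p_uid): info
        for p_uid, (status, info) in puid_status.items()
        if JoinStatus(status) is JoinStatus.SUCCESS
    }


def all_exited(puid_status):
    """True when every listed process has exited (the join_all condition)."""
    return all(
        JoinStatus(status) is JoinStatus.SUCCESS
        for status, _info in puid_status.values()
    )


status = {"17": [0, 0], "18": [4, None], "19": [1, "unknown p_uid"]}
done = exited_processes(status)
```

With join_all left at its default of False, a response like `status` above already satisfies the request, since one process has exited while another is still pending.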
Pool Messages
These messages concern the lifecycle of named memory pools. All requests related to pools will include the p_uid field of the requesting entity and the r_c_uid field for the channel to send the response to.
Pools are identified in the runtime system with a nonnegative globally unique ID number, the ‘memory UID’, normally abbreviated as m_uid.
GSPoolCreate
- type enum
GS_POOL_CREATE
- purpose
Requests that a new user memory pool be created.
- fields
- user_name
string
optional user specified name for the pool. If absent or empty, Global Services will select a name.
This may not necessarily be the same name used on the node to name the pool object to the OS.
- options
map
Initializer for globalservices.pool.PoolOptions object. This is an open type intended to encompass what is in a dragonMemoryPoolAttr_t but also will carry system level options such as what node to create the pool on.
- size
integer > 0
size of pool requested, in bytes. Note that the actual size of the pool delivered may be larger than this
- response
GSPoolCreateResponse
- see also
SHPoolCreate
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSPoolCreateResponse
- type enum
GS_POOL_CREATE_RESPONSE
- purpose
Response to request for a pool creation
- fields
Alternatives on err:
- SUCCESS (= 0)
- desc
map
Initializer for a globalservices.pool.PoolDescriptor object. This object will contain the assigned m_uid and name for the pool, as well as the (serialized) dragonMemoryPoolSerial_t library descriptor.
- FAIL (= 1)
Something went wrong in pool creation. This can be an error on the node level (found when the call to allocate the resources is actually made) or a logical error with the request.
- err_info
string
explanation of what went wrong
- err_code
integer
The error code from the library call, if applicable. Not all errors will be revealed when the call is made; if the error isn’t due to the underlying API call
- ALREADY (= 2)
The user name for the pool was already in use; the current descriptor is returned.
- desc
map
Initializer for a globalservices.pool.PoolDescriptor object. This object will contain the assigned m_uid and name for the pool as well as the (serialized) dragonMemoryPoolSerial_t library descriptor.
- request
GSPoolCreate
- see also
SHPoolCreate, SHPoolCreateResponse
implementation(s):
Python
GSPoolDestroy
- type enum
GS_POOL_DESTROY
- purpose
Request destruction of a managed memory pool.
- fields
One of the m_uid and user_name fields must be present.
If both are specified, then the user_name field is ignored.
- m_uid
integer
target memory pool ID
- user_name
string
user supplied name for the target pool
- response
GSPoolDestroyResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSPoolDestroyResponse
- type enum
GS_POOL_DESTROY_RESPONSE
- purpose
Response to GSPoolDestroy message
- fields
Alternatives on err:
- SUCCESS (= 0)
Pool has been found and has been removed.
No other fields.
- UNKNOWN (= 1)
No such pool is known.
- err_info
string
explanation of what went wrong
- FAIL (= 2)
Something went wrong in removing the pool. This could be for a number of reasons.
- err_info
string
explanation of what went wrong
- err_code
integer
The error code from the library call, if applicable. Not all errors will be revealed when the call is made; if the error isn’t due to the underlying API call
- GONE (= 3)
The pool is already destroyed
No other fields.
- PENDING (= 4)
Pool creation is pending, and it can’t be destroyed. Wait until the GSPoolCreateResponse message has been received.
- request
GSPoolDestroy
implementation(s):
Python
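A caller might fold the five GSPoolDestroyResponse outcomes into three actions, as in the sketch below. The grouping (treating GONE like SUCCESS and PENDING as "try again later") is one possible reading of the descriptions above, and the `classify_destroy` helper with its dictionary-shaped response is hypothetical.

```python
from enum import IntEnum


class PoolDestroyErrors(IntEnum):
    SUCCESS = 0
    UNKNOWN = 1
    FAIL = 2
    GONE = 3
    PENDING = 4


def classify_destroy(response):
    """Hypothetical handler: return "gone" or "retry", or raise on failure."""
    err = PoolDestroyErrors(response["err"])
    if err in (PoolDestroyErrors.SUCCESS, PoolDestroyErrors.GONE):
        return "gone"   # either way, the pool no longer exists
    if err is PoolDestroyErrors.PENDING:
        return "retry"  # creation has not completed yet
    # UNKNOWN and FAIL both carry an err_info explanation.
    raise RuntimeError(f"{err.name}: {response.get('err_info', '')}")


outcome = classify_destroy({"err": 3})
```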
GSPoolList
- type enum
GS_POOL_LIST
- purpose
Return a list of tuples of m_uid for all pools currently alive.
- fields
None additional
- response
GSPoolListResponse
- see also
GSPoolQuery
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSPoolListResponse
- type enum
GS_POOL_LIST_RESPONSE
- purpose
Responds with a list of m_uid for all the pools currently alive
- fields
- mlist
list of nonnegative integers
- request
GSPoolList
- see also
GSPoolQuery
implementation(s):
Python
GSPoolQuery
- type enum
GS_POOL_QUERY
- purpose
Request the PoolDescriptor for a managed memory pool.
- fields
One of the m_uid and user_name fields must be present.
If both are specified, then the user_name field is ignored.
- m_uid
integer
target pool UID
- user_name
string
user supplied name for the target pool
- response
GSPoolQueryResponse
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSPoolQueryResponse
- type enum
GS_POOL_QUERY_RESPONSE
- purpose
Response to request for PoolDescriptor for a managed memory pool. This object carries a serialized library pool descriptor and which node the pool is found on.
- fields
Alternatives on err:
- SUCCESS (= 0)
Pool has been found
- desc
map
initializer for PoolDescriptor
- UNKNOWN (= 1)
No such pool is known.
- err_info
string
explanation of what went wrong
- request
GSPoolQuery
- see also
GSPoolList
implementation(s):
Python
Channel Messages
These messages are related to the lifecycle of channels. All requests related to channels will include the p_uid field of the requesting process and the r_c_uid field for the channel to send the response to.
Channels are identified in the runtime system with a nonnegative globally unique ID number, the ‘channel UID’, normally abbreviated as c_uid. Note that every channel is associated with a memory pool which itself has its own m_uid.
GSChannelCreate
- type enum
GS_CHANNEL_CREATE
- purpose
Requests that a new channel be created.
- fields
- user_name
string
optional user specified name for the channel. If absent or empty, Global Services will select a name.
- options
map
Initializer for a (global) ChannelOptions object. This is an open type intended to completely describe the functionality of the channel. It contains fields pertaining to the global function and local characteristics of the channel.
- m_uid
integer
m_uid of the pool to create the channel in
if 0, use the default pool on the target node.
- response
GSChannelCreateResponse
- see also
GSChannelQuery
refer to the Common Fields section for additional request message fields
- implementation(s):
Python
GSChannelCreateResponse
- type enum
GS_CHANNEL_CREATE_RESPONSE
- purpose
Response to channel creation request.
- fields
Alternatives on err:
- SUCCESS (= 0)
- desc
map
Initializer for a ChannelDescriptor object. This object will contain the assigned c_uid and name for the channel and the serialized channel descriptor as well as the m_uid for the pool associated with the channel.
- FAIL (= 1)
Something went wrong in channel creation.
- err_info
string
explanation of what went wrong
- ALREADY (= 2)
The user name was already in use
- desc
map
Initializer for a ChannelDescriptor object. This object will contain the assigned c_uid and name for the channel as well as the serialized channel descriptor.
- request
GSChannelCreate
implementation(s):
Python
GSChannelList
- type enum
GS_CHANNEL_LIST
- purpose
Request list of currently active channels.
- fields
None other than mandatory fields. To-do: add something to make this more selective.
- response
GSChannelListResponse
- see also
GSChannelQuery
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSChannelListResponse
- type enum
GS_CHANNEL_LIST_RESPONSE
- purpose
Response to a request to list the currently active channels, with a list of tuples of (c_uid, name) for all the channels currently being managed
- fields
- clist
list of integers
The list contains the currently active c_uids
- request
GSChannelList
- see also
GSChannelQuery
implementation(s):
Python
GSChannelQuery
- type enum
GS_CHANNEL_QUERY
- purpose
Request the descriptor for an already created channel by channel UID or user name.
- fields
One of the c_uid and user_name fields must be present.
If both are specified, then the user_name field is ignored.
- c_uid
integer
target channel UID
- user_name
string
user supplied name for the target channel
- inc_refcnt
bool
whether to increment refcnt on the Channel. Default False.
- response
GSChannelQueryResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSChannelQueryResponse
- type enum
GS_CHANNEL_QUERY_RESPONSE
- purpose
Response to request for a channel descriptor
- fields
Alternatives on err:
- SUCCESS (= 0)
- desc
map
Initializer for a ChannelDescriptor object. This object will contain the assigned c_uid and name for the channel as well as the m_uid of the pool associated with the channel. It also includes the serialized library level descriptor.
- UNKNOWN (= 1)
No such channel is active.
- request
GSChannelQuery
implementation(s):
Python
GSChannelDestroy
- type enum
GS_CHANNEL_DESTROY
- purpose
Request that a channel be destroyed or a refcount on the channel be released.
- fields
One of the c_uid and user_name fields must be present.
If both are specified, then the user_name field is ignored.
- c_uid
integer
target channel UID
- user_name
string
user supplied name for the target channel
- reply_req
bool
default True, reply to this message requested
- dec_ref
bool
default False, release refcount on this channel
- response
GSChannelDestroyResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
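The two boolean flags and their defaults can be illustrated with a small constructor sketch. The type code and the `make_channel_destroy` helper are hypothetical; the field names and defaults are taken from the list above.

```python
GS_CHANNEL_DESTROY = 200  # hypothetical type code, for illustration only


def make_channel_destroy(tag, p_uid, r_c_uid, c_uid=None, user_name=None,
                         reply_req=True, dec_ref=False):
    """Hypothetical helper assembling a GSChannelDestroy message."""
    if c_uid is None and user_name is None:
        raise ValueError("one of c_uid and user_name must be present")
    msg = {
        "_tc": GS_CHANNEL_DESTROY,
        "tag": tag,
        "p_uid": p_uid,
        "r_c_uid": r_c_uid,
        "reply_req": reply_req,  # True: a response message is requested
        "dec_ref": dec_ref,      # True: release a refcount on the channel
    }
    # Send only the identifier that will be used; c_uid takes precedence.
    if c_uid is not None:
        msg["c_uid"] = c_uid
    else:
        msg["user_name"] = user_name
    return msg


destroy = make_channel_destroy(tag=4, p_uid=42, r_c_uid=3, c_uid=9)
release = make_channel_destroy(tag=5, p_uid=42, r_c_uid=3,
                               user_name="results", dec_ref=True)
```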
GSChannelDestroyResponse
- type enum
GS_CHANNEL_DESTROY_RESPONSE
- purpose
Response to request to destroy a channel.
- fields
Alternatives on err:
- SUCCESS (= 0)
No fields.
- UNKNOWN (= 1)
No such channel is known.
- FAIL (= 2)
The channel exists but
- err_info
string
explanation of what went wrong
- err_code
integer
error code from the library call that failed. If the problem is not related to a library call, this value is 0.
- BUSY (= 3)
Some entity is currently using the channel in some way that prevents its destruction.
- err_info
string
explanation of what went wrong
- request
GSChannelDestroy
implementation(s):
Python
GSChannelJoin
Request to be notified when a channel with a given name is created.
Note: when we refactor the messages, we may want to get rid of GSChannelJoin and its response and ride all this on top of GSChannelQuery by adding a timeout.
- type enum
GS_CHANNEL_JOIN
- purpose
Sometimes two processes want to communicate through a channel but we want to create the channel lazily with either the send side or receive side doing it. This lets the other side meet the creation by name in global services without polling.
- fields
- identifier
string
name of channel to join to
- timeout
integer, interpreted as microseconds
timeout value; any value < 0 interpreted as infinite timeout
- response
GSChannelJoinResponse
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSChannelJoinResponse
- type enum
GS_CHANNEL_JOIN_RESPONSE
- purpose
Response to request to join a channel
- fields
Alternatives on err:
- SUCCESS (= 0)
- desc
map
Initializer for a ChannelDescriptor object. This object will contain the assigned c_uid and name for the channel as well as the m_uid of the pool associated with the channel. It also includes the serialized library level descriptor.
- TIMEOUT (= 1)
The specified timeout has expired.
No fields
- DEAD (= 2)
The channel used to exist, but it has been destroyed.
- request
GSChannelJoin
implementation(s):
Python
GSChannelDetach
PLACEHOLDER
This probably needs to mean giving back all the send and receive handles currently acquired.
- type enum
GS_CHANNEL_DETACH
- purpose
Request local detachment from channel.
- fields
- c_uid
integer
id of channel to detach from
- response
GSChannelDetachResponse
- see also
GSChannelAttach
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSChannelDetachResponse
PLACEHOLDER
- type enum
GS_CHANNEL_DETACH_RESPONSE
- purpose
Response to request to detach from a channel.
- fields
Alternatives on err:
- SUCCESS (= 0)
No fields.
- UNKNOWN (= 1)
No such process is known.
No fields
- UNKNOWN_CHANNEL (= 2)
No such channel is known.
No fields
- NOT_ATTACHED (= 3)
Requested to detach from something you weren’t attached to to begin with.
- request
GSChannelDetach
implementation(s):
Python
GSChannelGetSendh
PLACEHOLDER
- type enum
GS_CHANNEL_GET_SENDH
- purpose
Request send handle for channel
- fields
- c_uid
integer
id of channel to get a send handle for
- response
GSChannelGetSendhResponse
- see also
GSChannelGetRecvh
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSChannelGetSendhResponse
PLACEHOLDER
- type enum
GS_CHANNEL_GET_SENDH_RESPONSE
- purpose
Response to request to get a channel send handle
- fields
Alternatives on err:
- SUCCESS (= 0)
- sendh
map
info needed to set up the send handle locally
- UNKNOWN (= 1)
No such process is known.
No fields
- UNKNOWN_CHANNEL (= 2)
No such channel is known.
No fields
- NOT_ATTACHED (= 3)
Descriptor not attached, can’t get send handle.
- CANT (= 4)
Can’t get a send handle - too many already gotten or some other reason related to the type.
- err_info
string
explanation of why not
- request
GSChannelGetSendh
implementation(s):
Python
GSChannelGetRecvh
PLACEHOLDER
- type enum
GS_CHANNEL_GET_RECVH
- purpose
Request recv handle for channel
- fields
- c_uid
integer
id of channel to get a recv handle for
- response
GSChannelGetRecvhResponse
- see also
GSChannelGetSendh
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSChannelGetRecvhResponse
PLACEHOLDER
- type enum
GS_CHANNEL_GET_RECVH_RESPONSE
- purpose
Response to request to get a channel recv handle
- fields
Alternatives on err:
- SUCCESS (= 0)
- recvh
map
info needed to set up the recv handle locally
- UNKNOWN (= 1)
No such process is known.
No fields
- UNKNOWN_CHANNEL (= 2)
No such channel is known.
No fields
- NOT_ATTACHED (= 3)
Descriptor not attached, can’t get recv handle.
- CANT (= 4)
Can’t get a recv handle - too many already gotten or some other reason related to the type.
- err_info
string
explanation of why not
- request
GSChannelGetRecvh
implementation(s):
Python
Node Messages
These messages are related to hardware transactions with Global Services, e.g. querying the number of cpus in the system.
GSNodeList
- type enum
GS_NODE_LIST
- purpose
Return a list of tuples of h_uid for all nodes currently registered.
- fields
None additional
- response
GSNodeListResponse
- see also
GSNodeQuery
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSNodeListResponse
- type enum
GS_NODE_LIST_RESPONSE
- purpose
Responds with a list of h_uid for all the nodes currently registered.
- fields
- hlist
list of nonnegative integers
- request
GSNodeList
- see also
GSNodeQuery
implementation(s):
Python
GSNodeQuery
- type enum
GS_NODE_QUERY
- purpose
Ask Global Services for a node descriptor of the hardware Dragon is running on.
- fields
- name
str
Name of the node to query
- h_uid
int
Unique host identifier (h_uid) held by GS.
The host running the Launcher Frontend has h_uid 0
The host running Global Services has h_uid 1
- response
GSNodeQueryResponse
implementation(s):
Python
GSNodeQueryResponse
- type enum
GS_NODE_QUERY_RESPONSE
- purpose
Return the machine descriptor after a GSNodeQuery.
- fields
Alternatives on err:
- SUCCESS (= 0)
The machine descriptor was successfully constructed
- desc
NodeDescriptor
Contains the status of the node Dragon is running on.
- UNKNOWN (= 1)
An unknown error has occurred.
- see also
GSMachineQuery
implementation(s):
Python
GSNodeQueryAll
- type enum
GS_NODE_QUERY_ALL
- purpose
Asks Global Services to return the NodeDescriptor for every node that is part of the system.
- fields
- response
GSNodeQueryAllResponse
implementation(s):
Python
GSNodeQueryAllResponse
- type enum
GS_NODE_QUERY_ALL_RESPONSE
- purpose
Return the machine descriptor for every node after GSNodeQueryAll.
- fields
Alternatives on err:
- SUCCESS (= 0)
The machine descriptor was successfully constructed
- descriptors
List of NodeDescriptors, one for each node in the system
- UNKNOWN (= 1)
An unknown error has occurred.
- see also
GSMachineQuery
implementation(s):
Python
GSNodeQueryTotalCPUCount
- type enum
GS_NODE_QUERY_TOTAL_CPU_COUNT
- purpose
Asks GS to return the total number of CPUs belonging to all of the registered nodes.
- response
GSNodeQueryTotalCPUCountResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSNodeQueryTotalCPUCountResponse
- type enum
GS_NODE_QUERY_TOTAL_CPU_COUNT_RESPONSE
- purpose
Return the total number of CPUs belonging to all of the registered nodes.
- fields
Alternatives on err:
- SUCCESS (= 0)
The machine descriptor was successfully constructed
- total_cpus
total number of CPUs belonging to all of the registered nodes.
- UNKNOWN (= 1)
An unknown error has occurred.
- request
GSNodeQueryTotalCPUCount
- see also
GSNodeQuery
implementation(s):
Python
GSGroupDestroy
- type enum
GS_GROUP_DESTROY
- purpose
Ask Global Services to destroy a group of resources. This means destroy all the member-resources, as well as the container/group.
- fields
- g_uid
integer
group UID
- user_name
string
user supplied name for the group
- response
GSGroupDestroyResponse
implementation(s):
Python
GSGroupDestroyResponse
- type enum
GS_GROUP_DESTROY_RESPONSE
- purpose
Response to GSGroupDestroy message
- fields
Alternatives on err:
- SUCCESS (= 0)
Group has been found and has been destroyed.
- desc
base64 encoded string
The group descriptor
- UNKNOWN (= 1)
No such group is known.
- err_info
string
explanation of what went wrong
- DEAD (= 2)
The group is already dead.
No other fields.
- PENDING (= 3)
The group is pending.
- err_info
string
informing message that the group is still pending
- request
GSGroupDestroy
implementation(s):
Python
GSGroupAddTo
- type enum
GS_GROUP_ADD_TO
- purpose
Ask Global Services to add specific resources to an existing group of resources. The resources are already created.
- fields
- g_uid
integer
group UID
- user_name
string
user supplied name for the group
- items
list of strings or integers
list of user supplied names or UIDs for the resources
- response
GSGroupAddToResponse
implementation(s):
Python
GSGroupAddToResponse
- type enum
GS_GROUP_ADD_TO_RESPONSE
- purpose
Response to GSGroupAddTo message
- fields
Alternatives on err:
- SUCCESS (= 0)
Group has been found and the resources have been added to it.
- desc
base64 encoded string
The group descriptor
- UNKNOWN (= 1)
No such group is known.
- err_info
string
explanation of what went wrong
- FAIL (= 2)
The request has failed. This can happen when at least one of the resources to be added to the group is dead or not found.
- err_info
string
informing message with the UIDs of the resources that were not found or are dead
- DEAD (= 3)
The group is already dead.
No other fields.
- PENDING (= 4)
The group is pending.
- err_info
string
informing message that the group is still pending
- request
GSGroupAddTo
implementation(s):
Python
GSGroupRemoveFrom
- type enum
GS_GROUP_REMOVE_FROM
- purpose
Ask Global Services to remove specific resources from an existing group of resources.
fields
- g_uid
integer
group UID
- user_name
string
user supplied name for the group
- items
list of strings or integers
list of user supplied names or UIDs for the resources
- response
GSGroupRemoveFromResponse
implementation(s):
Python
GSGroupRemoveFromResponse
- type enum
GS_GROUP_REMOVE_FROM_RESPONSE
- purpose
Response to GSGroupRemoveFrom message
- fields
Alternatives on
err
:
- SUCCESS (= 0)
Group has been found and the resources have been removed from it.
- desc
base64 encoded string
The group descriptor
- UNKNOWN (= 1)
No such group is known.
- err_info
string
explanation of what went wrong
- FAIL (= 2)
The request has failed. This can happen when at least one of the resources to be removed from the group was not found.
- err_info
string
informing message with the UIDs of those resources that were not found
- DEAD (= 3)
The group is already dead.
No other fields.
- PENDING (= 4)
The group is pending.
- err_info
string
informing message that the group is still pending
- request
GSGroupRemoveFrom
implementation(s):
Python
GSGroupCreate
- type enum
GS_GROUP_CREATE
- purpose
Ask Global Services to create a group of resources.
fields
- user_name
string
user supplied name for the group
- items
list[tuple[int, dragon.infrastructure.messages.Message]]
- list of tuples where each tuple contains a replication factor and the Dragon
create message
- policy
infrastructure.policy.Policy or dict
policy object that controls the placement of the resources
- response
GSGroupCreateResponse
implementation(s):
Python
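The items field pairs a replication factor with a create message. Because JSON has no tuple type, Python's json module writes each tuple as a two-element array. The sketch below shows that round trip. The message contents, the '_tc' placeholder value, and the group name are made up for illustration, and how Dragon itself encodes the nested create messages is not specified here.

```python
import json

# Hypothetical stand-in for a create message; real messages carry a '_tc'
# type code from dragon.messages.MsgTypes plus their own fields.
create_msg = {"_tc": -1, "exe": "/bin/true", "args": []}

# Each entry is (replication factor, create message).
items = [(4, create_msg), (1, create_msg)]

wire = json.dumps({"user_name": "workers", "items": items})
decoded = json.loads(wire)

# Tuples come back as lists, so restore them explicitly if tuple semantics matter.
restored = [(count, msg) for count, msg in decoded["items"]]
total_resources = sum(count for count, _ in restored)
```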
GSGroupCreateResponse
- type enum
GS_GROUP_CREATE_RESPONSE
- purpose
Response to GSGroupCreate message
fields
- desc
base64 encoded string
The group descriptor
- request
GSGroupCreate
implementation(s):
Python
GSGroupKill
- type enum
GS_GROUP_KILL
- purpose
Ask Global Services to send the processes belonging to a specified group a specified signal.
fields
- g_uid
integer
group UID
- user_name
string
user supplied name for the group
- sig
int
signal to use to kill the process, default=signal.SIGKILL
- response
GSGroupKillResponse
implementation(s):
Python
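Since sig is an integer on the wire and defaults to signal.SIGKILL, a sender can take the default from Python's signal module and convert it explicitly. This is a small sketch under that assumption; the helper name group_kill_fields is hypothetical, it covers only the message-specific fields listed above, and it assumes a POSIX platform where signal.SIGKILL is defined.

```python
import json
import signal


def group_kill_fields(g_uid=None, user_name="", sig=signal.SIGKILL):
    """Collect the GSGroupKill-specific fields (helper name is hypothetical)."""
    # signal.SIGKILL is an IntEnum member; int() makes the JSON value explicit.
    return {"g_uid": g_uid, "user_name": user_name, "sig": int(sig)}


default_kill = group_kill_fields(g_uid=17)
gentle_kill = group_kill_fields(user_name="workers", sig=signal.SIGTERM)
wire = json.dumps(default_kill)
```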
GSGroupKillResponse
- type enum
GS_GROUP_KILL_RESPONSE
- purpose
Response to GSGroupKill message
- fields
Alternatives on
err
:
- SUCCESS (= 0)
Group has successfully sent the kill requests.
- desc
base64 encoded string
The group descriptor
- UNKNOWN (= 1)
No such group is known.
- err_info
string
explanation of what went wrong
- DEAD (= 2)
The group is already dead.
No other fields.
- PENDING (= 3)
The group is pending.
- err_info
string
informing message that the group is still pending
- ALREADY (= 4)
All the processes of the group were already dead.
- desc
base64 encoded string
The group descriptor
- request
GSGroupKill
implementation(s):
Python
GSGroupCreateAddTo
- type enum
GS_GROUP_CREATE_ADD_TO
- purpose
Ask Global Services to create and add resources to an existing group of resources.
fields
- g_uid
integer
group UID
- user_name
string
user supplied name for the group
- items
list of strings or integers
list of user supplied names or UIDs for the resources
- policy
policy object that controls the placement of the resources
- response
GSGroupCreateAddToResponse
implementation(s):
Python
GSGroupCreateAddToResponse
- type enum
GS_GROUP_CREATE_ADD_TO_RESPONSE
- purpose
Response to GSGroupCreateAddTo message
- fields
Alternatives on
err
:
- SUCCESS (= 0)
Group has been found and the resources have been created and added to it.
- desc
base64 encoded string
The group descriptor
- UNKNOWN (= 1)
No such group is known.
- err_info
string
explanation of what went wrong
- DEAD (= 2)
The group is already dead.
No other fields.
- PENDING (= 3)
The group is pending.
- err_info
string
informing message that the group is still pending
- request
GSGroupCreateAddTo
implementation(s):
Python
GSGroupDestroyRemoveFrom
- type enum
GS_GROUP_DESTROY_REMOVE_FROM
- purpose
Ask Global Services to destroy and remove specific resources from an existing group of resources.
fields
- g_uid
integer
group UID
- user_name
string
user supplied name for the group
- items
list of strings or integers
list of user supplied names or UIDs for the resources
- response
GSGroupDestroyRemoveFromResponse
implementation(s):
Python
GSGroupDestroyRemoveFromResponse
- type enum
GS_GROUP_DESTROY_REMOVE_FROM_RESPONSE
- purpose
Response to GSGroupDestroyRemoveFrom message
- fields
Alternatives on
err
:
- SUCCESS (= 0)
Group has been found and the resources have been removed from it.
- desc
base64 encoded string
The group descriptor
- UNKNOWN (= 1)
No such group is known.
- err_info
string
explanation of what went wrong
- FAIL (= 2)
The request has failed. This can happen when at least one of the resources to be removed from the group was not found.
- err_info
string
informing message with the UIDs of those resources that were not found
- DEAD (= 3)
The group is already dead.
No other fields.
- PENDING (= 4)
The group is pending.
- err_info
string
informing message that the group is still pending
- request
GSGroupDestroyRemoveFrom
implementation(s):
Python
GSGroupList
- type enum
GS_GROUP_LIST
- purpose
Request a list of the g_uid for all the groups of resources currently being managed
- fields
None additional
- response
GSGroupListResponse
- see also
GSGroupQuery
refer to the Common Fields section for additional request message fields
implementation(s):
Python
GSGroupListResponse
- type enum
GS_GROUP_LIST_RESPONSE
- purpose
Response to GSGroupList message
- fields
- glist
list of nonnegative integers for all the groups
- request
GSGroupList
implementation(s):
Python
GSGroupQuery
- type enum
GS_GROUP_QUERY
- purpose
Request the GroupDescriptor for a managed group of resources
- fields
One of the g_uid and user_name fields must be present. If both are specified, then the user_name field is ignored.
- g_uid
integer
target group UID
- user_name
string
user supplied name for the target group
- response
GSGroupQueryResponse
refer to the Common Fields section for additional request message fields
implementation(s):
Python
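The rule that one of g_uid and user_name must be present, with g_uid taking precedence, can be sketched as a small resolver. The function name resolve_group_target and the choice to raise ValueError are assumptions for illustration; the document does not say how Global Services reports a message that carries neither field.

```python
def resolve_group_target(msg: dict):
    """Return which identifier to look the group up by, preferring g_uid."""
    g_uid = msg.get("g_uid")
    user_name = msg.get("user_name")
    if g_uid is not None:
        # g_uid wins; any user_name supplied alongside it is ignored.
        return ("g_uid", g_uid)
    if user_name:
        return ("user_name", user_name)
    raise ValueError("one of g_uid or user_name must be present")
```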
GSGroupQueryResponse
- type enum
GS_GROUP_QUERY_RESPONSE
- purpose
Response to request for GroupDescriptor for a managed group
- fields
Alternatives on
err
:
- SUCCESS (= 0)
Group has been found
- desc
map
initializer for GroupDescriptor - includes at a minimum g_uid of new group and assigned user name.
- UNKNOWN (= 1)
No such group is known.
- err_info
string
explanation of what went wrong
- request
GSGroupQuery
- see also
GSGroupList
implementation(s):
Python
Other Messages
These messages are related to other transactions on Global Services, for example ones related to sequencing runtime startup and teardown.
AbnormalTermination
- type enum
ABNORMAL_TERMINATION
- purpose
Error result for startup and teardown messages, as well as for reporting problems that happen during execution that are not tied to any specific request.
Morally speaking, this message backs an exception class related to things going wrong in Dragon Run-time Services. These are generally nonrecoverable errors, but it is desirable to propagate an explanation of what went wrong, back to the front end when possible - users won’t want to collect disparate log files for basic bug reports.
- fields
- err_info
string
explanation of what went wrong.
implementation(s):
Python
GSStarted
- type enum
GS_STARTED
- purpose
Confirm to Launcher that the Global Services process (and if applicable, subsidiary processes) are started.
- fields
None
implementation(s):
Python
GSPingSH
- type enum
GS_PING_SH
- purpose
Confirm to Shepherd(s) that Global Services has started and request handshake response.
- fields
None
- see also
SHPingGS
implementation(s):
Python
GSIsUp
- type enum
GS_IS_UP
- purpose
Confirm to Launcher that Global Services is completely up
- fields
None
implementation(s):
Python
GSHeadExit
- type enum
GS_HEAD_EXIT
- purpose
Notify Launcher that the head process has exited. At this point the Launcher can start teardown or start a new head process.
- fields
- exit_code
integer
exit code the process left with
implementation(s):
Python
GSChannelRelease
- type enum
GS_CHANNEL_RELEASE
- purpose
Tell the Shepherd(s) that Global Services is exiting and will no longer be sending anything through Channels
- fields
None
implementation(s):
Python
GSHalted
- type enum
GS_HALTED
- purpose
Notify Launcher that Global Services is halted. The Global Services processes are expected to exit after sending this message, having done everything possible to clean up.
- fields
None
implementation(s):
Python
GSTeardown
- type enum
GS_TEARDOWN
- purpose
Direct Global Services to do a clean exit - clean up any remaining global structures and its own sub-workers, if any, and detach from channels.
This should only be given when there isn’t an active head process. If there is such a thing, we should probably take everything down anyway but complain about it.
implementation(s):
Python
GSPingProc
- type enum
GS_PING_PROC
- purpose
When a new managed process wanting to use Global Services comes up, it must wait to receive this message on its Global Services return channel before sending any messages to Global Services.
This message also carries ‘argument’ data, which is defined as binary data to be delivered to the target at startup. This can be used in whatever way the target would like. For example, the implementation of the Python Process object uses this to deliver the Python target function and arguments.
Remark. Even if no arguments are needed, this message must be sent to solve a race with the local Shepherd’s notification to Global Services that the process was successfully started.
fields
- mode
integer
explains how the argdata field should be interpreted
- Alternatives:
NONE (=0) No arguments are carried
PYTHON_IMMEDIATE (=1) The arguments are found in fields of the argdata field.
PYTHON_CHANNEL (=2) The argdata has fields for a serialized channel to attach to and get the args from. This is meant for arguments that would otherwise challenge the available space in the pool the argument delivery channel is found in.
- argdata
base64 encoded string, fixme, with the binary argument data to be delivered, or nil.
implementation(s):
Python
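A receiver has to look at mode before touching argdata. The sketch below decodes the base64 payload and returns it together with the mode, leaving the mode-specific interpretation of the bytes to the caller. The names ArgMode and decode_argdata are hypothetical, and treating a missing argdata as empty bytes is an assumption.

```python
import base64
import enum


class ArgMode(enum.IntEnum):
    """Mode values as listed for GSPingProc (class name is hypothetical)."""
    NONE = 0
    PYTHON_IMMEDIATE = 1
    PYTHON_CHANNEL = 2


def decode_argdata(mode: int, argdata):
    """Return (mode, raw bytes); interpreting the bytes depends on the mode."""
    mode = ArgMode(mode)
    if mode is ArgMode.NONE or argdata is None:
        return mode, b""  # no arguments are carried
    return mode, base64.b64decode(argdata)
```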
GSDumpState
- type enum
GS_DUMP_STATE
- purpose
Primarily debugging. Makes global services dump its state in human readable form to a file specified in the message.
Remark. May in the future try for a formal serialization of GS state once more of the design is concrete. May also have a reply to this message.
- fields
- filename
string
file to open and write the dump to
implementation(s):
Python
GSUnexpected
- type enum
GS_UNEXPECTED
- purpose
Whenever GS gets a message that is not expected as an input to GS (which can depend on the state it is in) and a valid reply handle can be found corresponding to that message, this message gets returned as a response.
If a valid reply handle can’t be found, then GS will log an error but continue to run.
This kind of thing can happen when issuing interactive commands to a global services process from the launcher. If the head process leaves unexpectedly there can be a race; this message allows handling this without putting GS into error.
TODO: put this in GS documentation and xref, post GS doc update.
- fields
- ref
integer
matches the tag value of the unexpected message, if it has one.
implementation(s):
Python
ExceptionlessAbort
- type enum
EXCEPTIONLESS_ABORT
- purpose
It can be beneficial to pass a message through the services runtime that communicates that an abnormal termination has been encountered without immediately raising an exception. This message can thus be used to coordinate tearing down of resources before any exception is raised.
- fields
None other than default
implementation(s):
Python
Local Services Messages API
These messages are the ones underlying the Local Services API and are sent only to and from the Shepherd.
Like all request messages, these Shepherd request messages have a p_uid
field denoting the process that
sent the request message and r_c_uid
field denoting the channel ID the response should be returned to.
LS Process Messages
SHProcessCreate
- type enum
SH_PROCESS_CREATE
- purpose
Request to Shepherd to launch a process locally.
- fields
- t_p_uid
integer
process UID of the target process - Global Services will use this to refer to the process in other operations.
- exe
string
executable name
- args
list of strings
arguments to the executable. [exe, args] == argv
- env
map, key=string value=string
Environment variables to be exported before running.
Note that this is additive/override to the shepherd environment - may need to be more fancy here if it turns out we need to add to paths
- rundir
string, absolute path
Indicates where the process is to be run from. If empty, does not matter.
options
- output_c_uid is the channel id where output, both stdout
and stderr, should be forwarded if present in an SHFwdOutput message. If this field is not present in the options, then output will be forwarded to the Backend. The SHFwdOutput message indicates whether it was from stdout or stderr.
- response
SHProcessCreateResponse
- see also
SHProcessKill
refer to the Common Fields section for additional request message fields
implementation(s):
Python
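Two details of the fields above are easy to get wrong: argv is the executable followed by args, and env is layered on top of the Shepherd's own environment rather than replacing it. A minimal sketch of how a receiver might derive launch parameters follows. The function name launch_parameters is hypothetical, and mapping an empty rundir to None (meaning "inherit the current directory") is an assumption about what "does not matter" could mean in practice.

```python
import os


def launch_parameters(msg: dict, base_env=None):
    """Derive (argv, env, cwd) from SHProcessCreate-style fields."""
    base_env = dict(os.environ if base_env is None else base_env)
    # [exe, args] == argv: the executable comes first, then its arguments.
    argv = [msg["exe"], *msg.get("args", [])]
    # Additive/override: message values win over inherited ones.
    env = {**base_env, **msg.get("env", {})}
    cwd = msg.get("rundir") or None
    return argv, env, cwd
```

The returned triple has the shape expected by subprocess.Popen(argv, env=env, cwd=cwd), although nothing here implies that the Shepherd launches processes that way.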
SHProcessCreateResponse
- type enum
SH_PROCESS_CREATE_RESPONSE
- purpose
Response to process creation request.
- fields
Alternatives on
err
:
- SUCCESS (= 0)
no fields
- FAIL (= 1)
- err_info
string
what went wrong
- request
SHProcessCreate
- see also
SHProcessKill
implementation(s):
Python
SHMultiProcessKill
- type enum
SH_MULTI_PROCESS_KILL
- purpose
Request to Shepherd to efficiently kill multiple processes.
- fields
- procs
list of SHProcessKill messages representing the processes to be killed.
- response
SHMultiProcessKillResponse
- see also
SHProcessKill
refer to the Common Fields section for additional request message fields
implementation(s):
Python
SHMultiProcessKillResponse
- type enum
SH_MULTI_PROCESS_KILL_RESPONSE
- purpose
Response to multi process kill request.
- fields
Alternatives on
err
:
- responses
list of SHProcessKillResponse messages representing the results of each SHProcessKill request.
- SUCCESS (= 0)
- responses
list of SHProcessKillResponse messages for each previously requested process.
- FAIL (= 1)
- err_info
string
what went wrong
- request
SHMultiProcessKill
- see also
SHProcessKill
implementation(s):
Python
SHProcessKill
- type enum
SH_PROCESS_KILL
- purpose
Request to kill a process owned by this shepherd, with a specified signal.
- fields
- t_p_uid
integer
process UID of the target process
- sig
integer, valid signal number
signal to send to the process
- response
SHProcessExit Note difference in nomenclature.
- see also
SHProcessCreate
refer to the Common Fields section for additional request message fields
implementation(s):
Python
SHMultiProcessCreate
- type enum
SH_MULTI_PROCESS_CREATE
- purpose
Request to Shepherd to launch multiple processes locally.
- fields
- pmi_group_info
Optional PMIGroupInfo structure.
Contains common PMI/MPI values needed to start the requested PMI enabled applications, including the nid_list, host_list.
- procs
list of SHProcessCreate messages representing the processes to be started.
- response
SHMultiProcessCreateResponse
- see also
SHProcessKill
refer to the Common Fields section for additional request message fields
implementation(s):
Python
SHMultiProcessCreateResponse
- type enum
SH_MULTI_PROCESS_CREATE_RESPONSE
- purpose
Response to multi process creation request.
- fields
Alternatives on
err
:
- SUCCESS (= 0)
- responses
list of SHProcessCreateResponse messages for each previously requested process.
- FAIL (= 1)
- err_info
string
what went wrong
- request
SHMultiProcessCreate
- see also
SHProcessKill
implementation(s):
Python
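A multi-process response has two levels of error: the err of the outer message, and the err of each nested SHProcessCreateResponse. The sketch below separates them. It assumes the nested responses have already been decoded into dictionaries, and the function name failed_creates is hypothetical.

```python
def failed_creates(multi_response: dict) -> list:
    """Return the err_info of every nested create that failed."""
    if multi_response["err"] != 0:
        # Outer FAIL (= 1): the request as a whole could not be handled.
        raise RuntimeError(multi_response["err_info"])
    # Outer SUCCESS (= 0): individual creates may still have failed.
    return [r["err_info"] for r in multi_response["responses"] if r["err"] != 0]
```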
SHProcessKillResponse
- type enum
SH_PROCESS_KILL_RESPONSE
- purpose
Response to request to kill a process owned by this shepherd. Returns confirmation that the signal was delivered only. The process’s exit comes automatically via SHProcessExit.
- fields
Alternatives on
err
:
- SUCCESS (= 0)
no fields
- FAIL (= 1)
- err_info
string
what went wrong
- request
SHProcessKill
- see also
SHProcessCreate SHProcessExit
refer to the Common Fields section for additional request message fields
implementation(s):
Python
SHProcessExit
- type enum
SH_PROCESS_EXIT
- purpose
This message is sent to Global Services when a managed process exits. There are no ref or err fields.
- fields
- exit_code
int
numerical exit code.
- p_uid
int
the Process ID of the process that exited. When a process exits of its own accord, this is the
only thing to identify which one it is.
- see also
SHProcessKill
implementation(s):
Python
Pool Messages
These messages are directives to a Local Services process to execute pool lifecycle commands. They are posed in the form of individual messages instead of generic operation completions because the Shepherd holds these OS resources.
SHPoolCreate
- type enum
SH_POOL_CREATE
- purpose
Create a new memory pool.
- fields
- m_uid
positive integer
system wide unique identifier for this pool. All future operations on this pool will refer to it using the m_uid.
- name
string
name of the pool to use when creating it. This name is guaranteed to be globally unique.
- size
positive integer
size in bytes of the pool.
- attr
string
this string is the base64 encoding of a serialized pool attribute object to use when creating the pool
- response
SHPoolCreateResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
SHPoolCreateResponse
- type enum
SH_POOL_CREATE_RESPONSE
- purpose
Response to request to create a new memory pool.
- fields
Alternatives on
err
:
- SUCCESS (= 0)
The pool was created successfully.
- desc
string
the string is the base64 encoding of a serialized pool descriptor for the pool that just got created
- FAIL (= 1)
The pool was not created successfully. This could be because the pool creation call failed or because there was something wrong with the pool creation message.
- err_info
string
what went wrong
- request
SHPoolCreate
implementation(s):
Python
SHPoolDestroy
- type enum
SH_POOL_DESTROY
- purpose
Request to destroy a memory pool.
- fields
- m_uid
positive integer
ID of the pool to destroy
- response
SHPoolDestroyResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
SHPoolDestroyResponse
- type enum
SH_POOL_DESTROY_RESPONSE
- purpose
Response to request to destroy a memory pool
- fields
Alternatives on
err
:
- SUCCESS (= 0)
The pool was destroyed. No other fields
- FAIL (= 1)
The pool was not destroyed successfully. This could be because the pool destroy call failed or because there was something wrong with the pool destroy message.
- err_info
string
what went wrong
- request
SHPoolDestroy
implementation(s):
Python
SHExecMemRequest
- type enum
SH_EXEC_MEM_REQUEST
- purpose
Request to execute a memory request. This message contains a serialized operation given to the Shepherd to execute on behalf of the requesting process using dragon_memory_exec_req or dragon_memory_pool_exec_request.
- fields
- kind
integer, legal values are 0 or 1
If 0, the request is a serialized
dragonMemoryPoolRequest
If 1, the request is a serialized
dragonMemoryRequest
- request
string
this is the base64 encoding of the request.
- response
SHExecMemResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
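The kind field tells the receiver which serialized structure the base64 payload holds. A minimal sketch of unpacking the two fields follows. The function name unpack_exec_mem_request and the lookup table are illustrative, and the decoded bytes are returned as-is because deserializing them belongs to the Dragon library.

```python
import base64

# Legal values of 'kind' and the structure each one announces.
KIND_NAMES = {0: "dragonMemoryPoolRequest", 1: "dragonMemoryRequest"}


def unpack_exec_mem_request(msg: dict):
    """Return (structure name, raw request bytes) for an SHExecMemRequest."""
    kind = msg["kind"]
    if kind not in KIND_NAMES:
        raise ValueError(f"illegal kind {kind!r}; legal values are 0 or 1")
    return KIND_NAMES[kind], base64.b64decode(msg["request"])
```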
SHExecMemResponse
- type enum
SH_EXEC_MEM_RESPONSE
- purpose
Response to request to execute a memory request. Note that there is no need for this message to include what kind of request it was - the sender knows this.
- fields
Alternatives on
err
:
- SUCCESS (= 0)
The memory request call completed. Even if the call did not complete with a success return value, the response has the details about what went wrong including the error code and this will be apparent when the original requester calls the completion function on the response.
- response
string
this is the base64 encoding of the serialized response
- FAIL (= 1)
The request could not be completed.
- err_info
string
what went wrong
- err_code
integer
if the error is due to a library call, this should carry the return value of the call.
Otherwise its value should be 0.
implementation(s):
Python
Channel Messages
SHChannelCreate
- type enum
SH_CHANNEL_CREATE
- purpose
Request to create a channel in a memory pool known to this shepherd.
- fields
- m_uid
integer
Unique id of the pool to create the channel inside.
- c_uid
integer
Unique id of the channel to be created. Future operations will use this number to refer to the channel
- options
map
Initializer for a (local) ChannelOptions object. This is an open type intended to completely describe the controllable parameters for channel creation at the library level. Will contain attributes string from library.
- response
SHChannelCreateResponse
- see also
refer to the Common Fields section for additional request message fields
- implementation(s):
Python
SHChannelCreateResponse
- type enum
SH_CHANNEL_CREATE_RESPONSE
- purpose
Response to channel allocation request.
- fields
Alternatives on
err
:
- SUCCESS (= 0)
- desc
string
serialized and ascii encoded library channel descriptor
- FAIL (= 1)
- err_info
string
what went wrong
- request
SHChannelCreate
implementation(s):
Python
SHChannelDestroy
- type enum
SH_CHANNEL_DESTROY
- purpose
Request to free a previously allocated channel.
- fields
- c_uid
integer
unique id of the channel to be destroyed.
- response
SHChannelDestroyResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
SHChannelDestroyResponse
- type enum
SH_CHANNEL_DESTROY_RESPONSE
- purpose
Response to request to free a previously allocated channel.
- fields
Alternatives on
err
:
- SUCCESS (= 0)
No additional fields
- FAIL (= 1)
- err_info
string
what went wrong
- request
SHChannelDestroy
implementation(s):
Python
SHLockChannel
PLACEHOLDER
- type enum
SH_LOCK_CHANNEL
- purpose
Request to lock a channel
TODO: should only something attached to a channel be allowed to lock it?
- fields
- read_lock
bool
locking this for reading
- write_lock
bool
locking this for writing
- response
SHLockChannelResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
SHLockChannelResponse
PLACEHOLDER
- type enum
SH_LOCK_CHANNEL_RESPONSE
- purpose
Response to request to lock a channel
- fields
Alternatives on
err
:
- SUCCESS (= 0)
Successfully locked.
- ALREADY (= 1)
Some other entity has already locked the channel.
- request
SHLockChannel
- see also
x
implementation(s):
Python
Memory Allocation Messages
SHAllocMsg
PLACEHOLDER - maybe OBE
- type enum
SH_ALLOC_MSG
- purpose
Request a shared memory allocation for a large message
- fields
x
- response
SHAllocMsgResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
SHAllocMsgResponse
PLACEHOLDER - OBE?
- type enum
SH_ALLOC_MSG_RESPONSE
- purpose
Response to a requested allocation for a large message
- fields
Alternatives on
err
:
- SUCCESS (= 0)
- request
SHAllocMsg
- see also
x
implementation(s):
Python
SHAllocBlock
PLACEHOLDER - OBE?
- type enum
SH_ALLOC_BLOCK
- purpose
Request a shared memory allocation for generic memory
- fields
- p_uid
integer
Process UID of the requesting process.
Note that the Shepherd may need to know this to do cleanup if the process goes away. There probably needs to be an option as to which is the owning process.
- len
integer
length of allocation desired
- response
SHAllocBlockResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
SHAllocBlockResponse
PLACEHOLDER - OBE?
- type enum
SH_ALLOC_BLOCK_RESPONSE
- purpose
Response to a requested allocation for generic memory
- fields
Alternatives on
err
:
- SUCCESS (= 0)
Successfully allocated the block. Do we want a reference ID for the block to let other processes claim ownership? E.g. we could have the shepherd count ref on such blocks.
- seg_name
string
name of shared memory segment the block is found in.
- offset
nonnegative integer
byte offset of the allocated block within segment
- UNKNOWN (= 1)
The p_uid of the requester is not a process currently running and being managed by the shepherd
- FAIL (= 2)
The block could not be allocated.
- err_info
string
what went wrong
- request
SHAllocBlock
- see also
x
implementation(s):
Python
Other Messages
These Local Services messages are for other purposes, such as sequencing
bringup and teardown. Where appropriate they have the idx
field, indicating
which Shepherd instance this is. This is an integer in [0,N),
where N is the number of compute nodes in the allocation. It has the
value 0 in the single node case.
SHChannelsUp
- type enum
SH_CHANNELS_UP
- purpose
Notify Launcher that this Shepherd has allocated the shared memory for the channels and initialized whatever default channels there are.
- fields
- node_desc
dragon.infrastructure.node_desc.NodeDescriptor
or its serialized descriptor
- idx
integer
which Shepherd in the allocation this is. The index is 0 in the single node case and lies in [0,N) for a multi-node back end allocation of N nodes.
- gs_cd
base64 encoded string
When idx is the primary node, then this is the channel descriptor of global services. Otherwise it is ignored and should be set to an empty string.
- implementation(s):
Python
SHPingGS
- type enum
SH_PING_GS
- purpose
Acknowledge to Global Services that this Shepherd is up and ready.
- fields
- idx
integer
which Shepherd in the allocation this is. The index is 0 in the single node case and lies in [0,N) for a multi-node back end allocation of N nodes.
- node_sdesc
dict
serialized
NodeDescriptor
of the node the Shepherd is running on. Note that the h_uid in the descriptor will be None in this message. It is set later by GS.
implementation(s):
Python
SHHalted
- type enum
SH_HALTED
- purpose
Notify launcher that this Shepherd is halted.
- fields
- idx
integer
which Shepherd in the allocation this is. The index is 0 in the single node case and lies in [0,N) for a multi-node back end allocation of N nodes.
implementation(s):
Python
SHTeardown
- type enum
SH_TEARDOWN
- purpose
Direct Shepherd to do a clean teardown.
- fields
None other than default
implementation(s):
Python
SHPingBE
- type enum
SH_PING_BE
- purpose
Shepherd handshake with MRNet backend
- fields
shep_cd
Channel Descriptor for the Shepherd (serialized, base64 encoded)
be_cd
The Launcher Back End Channel Descriptor (serialized, base64 encoded)
gs_cd
The Global Services Channel Descriptor (serialized, base64 encoded)
default_pd
The Default Pool Descriptor (serialized, base64 encoded)
inf_pd
The Infrastructure Pool Descriptor (serialized, base64 encoded)
implementation(s):
Python
SHHaltBE
- type enum
SH_HALT_BE
- purpose
Shepherd telling the MRNet backend to exit.
- fields
None other than default
implementation(s):
Python
SHFwdInput
- type enum
SH_FWD_INPUT
- purpose
Message carrying data intended to be written into the standard input of a process managed by this shepherd.
If the process is present and the stdin can be written without any problems, there is no response message.
- fields
- t_p_uid
integer
target p_uid
- input
string
input to be sent to target process
The max length of any input to be sent through this mechanism in one message is 1024 bytes.
- confirm
boolean
request confirmation of forwarded input.
- response
SHFwdInputErr
implementation(s):
Python
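Because a single SHFwdInput message may carry at most 1024 bytes of input, longer input has to be split across several messages. The sketch below does this under the assumption that the limit applies to the UTF-8 encoding of the input string, and it never splits inside a character. The names split_input and fwd_input_fields are hypothetical, and only the message-specific fields are produced.

```python
MAX_INPUT_BYTES = 1024  # per-message limit stated for the input field


def split_input(text: str, limit: int = MAX_INPUT_BYTES) -> list:
    """Split text into pieces whose UTF-8 encoding fits within limit bytes."""
    chunks, current, size = [], [], 0
    for ch in text:
        n = len(ch.encode("utf-8"))
        if size + n > limit and current:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += n
    if current:
        chunks.append("".join(current))
    return chunks


def fwd_input_fields(t_p_uid: int, text: str, confirm: bool = False) -> list:
    """One field dictionary per chunk, in order of delivery."""
    return [{"t_p_uid": t_p_uid, "input": c, "confirm": confirm}
            for c in split_input(text)]
```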
SHFwdInputErr
- type enum
SH_FWD_INPUT_ERR
- purpose
Error response to a forward input message. This message is returned to the sender if (for instance) the target process isn’t present at the time of delivery.
fields
All messages:
- idx
integer
which Shepherd in the allocation this is. The index is 0 in the single node case and lies in [0,N) for a multi-node back end allocation of N nodes.
Alternatives on
err
:
- SUCCESS (= 0)
No fields other than defaults.
No error. This message isn’t expected ever to be emitted, but is called out here to reserve the ‘0’ code for future use in case there needs to be positive confirmation of message delivery.
- FAIL (= 1)
- err_info
string
explanation of what went wrong.
- request
SHFwdInput
implementation(s):
Python
SHFwdOutput
- type enum
SH_FWD_OUTPUT
- purpose
Message carrying data from either stdout or stderr of a process managed by this shepherd. This comes from the Shepherd and no response is expected.
fields
- fdnum
Indicates which stream, stdout or stderr, is being forwarded.
Alternatives for fdnum: stdout (= 1), stderr (= 2)
- data
string
The data read from the stream and being forwarded
The max length of any output per message is 1024 bytes.
- idx
integer
which Shepherd in the allocation this is. The index is 0 in the single node case and lies in [0,N) for a multi-node back end allocation of N nodes.
- p_uid
integer
The process UID this came from
implementation(s):
Python
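On the receiving side, fdnum selects the stream the data belongs to. A minimal sketch of that routing, assuming the message has already been decoded into a dictionary; the function name write_forwarded is hypothetical and the streams are passed in so that the example does not depend on the real stdout and stderr.

```python
import io
import sys


def write_forwarded(msg: dict, out=sys.stdout, err=sys.stderr):
    """Write the forwarded data to the stream named by fdnum (1 or 2)."""
    streams = {1: out, 2: err}  # stdout (= 1), stderr (= 2)
    stream = streams[msg["fdnum"]]
    stream.write(msg["data"])
    return stream


captured_out, captured_err = io.StringIO(), io.StringIO()
write_forwarded({"fdnum": 2, "data": "oops\n", "idx": 0, "p_uid": 9},
                captured_out, captured_err)
```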
SHHaltTA
- type enum
SH_HALT_TA
- purpose
Message coming from Launcher to the Shepherd, telling it to tell TA to halt. The Shepherd forwards this same message to TA.
- fields
None other than default
implementation(s):
Python
SHDumpState
- type enum
SH_DUMP_STATE
- purpose
Primarily debugging. Makes the Shepherd dump its state in human readable form to a file specified in the message or dump to the log if no filename is provided.
- fields
- filename
string
file to open and write the dump to. Defaults to None. If None, then the dump goes to the log file for the Shepherd.
implementation(s):
Python
Launcher Messages API
These messages go to the Launcher frontend in standard and server mode via the LA Channel.
LABroadcast
type enum
LA_BROADCAST
purpose
This message is sent by the launcher to the Launch Router. This message contains another message in the data field. The message in the data field is to be broadcast to the Shepherd on all nodes through the Backend. The LABroadcast message is sent to the backend. The Backend extracts the embedded message from the LABroadcast message and sets the embedded message’s r_c_uid as the return channel id in the message and forwards the embedded message to the Shepherd’s channel. All broadcast messages are sent to the Shepherd on each node.
fields
data
A string to be broadcast to all nodes. For the Dragon infrastructure this data field will itself be a valid, serialized message.
implementation(s):
Python. The Network Front End has a dependency on the type enum being 68.
LAPassThruFB
type enum
LA_PASS_THRU_FB
purpose
This message is a pass thru message from the Launcher to the Backend.
When sent to the Launcher Router, it forwards the message to the Backend on the primary node.
This message type is used by the Dragon Kernel Server to communicate its channel id to the Dragon Kernel indicating that the Dragon Kernel is up and ready to accept requests.
fields
c_uid
The channel to send this message to running on a compute node.
data
A user-defined serialized message that is unwrapped by the receiving end.
NOTE
The r_c_uid is set in this message by the Backend to the Backend channel id before it is forwarded on to its destination on the compute node.
implementation(s):
Python
LAPassThruBF
type enum
LA_PASS_THRU_BF
purpose
This message is a pass thru message from the Backend to the Launcher.
Note: One use of this message type is by a server back end to communicate its channel id to a server front end in the r_c_uid field. In this case it serves as an indicator to the front end that its corresponding back end is ready to accept requests (see Launcher Server Mode).
fields
data
A user-defined serialized message that is unwrapped by the
receiving end.
NOTE
The r_c_uid is set to the channel id of the source of the
passthrough message on a compute node.
implementation(s):
Python
LAServerMode
type enum
LA_SERVER_MODE
purpose
This message is used internally in the Launcher to set up Server Mode. It is never to be sent between components in the Dragon run-time.
fields
frontend
A string with a file specification for the server front end to be run on the login node.
backend
A string with a file specification for the server back end to be run on the primary compute node.
frontendargs
A list of strings containing any command-line arguments to the front end.
backendargs
A list of strings containing any command-line arguments to the back end.
implementation(s):
Python
LAServerModeExit
type enum
LA_SERVER_MODE_EXIT
purpose
This message is used internally in the Launcher to exit Server Mode. It is never to be sent between components in the Dragon run-time.
fields
None other than default
implementation(s):
Python
LAProcessDict
- type enum
LA_PROCESS_DICT
- purpose
Return a dictionary of process information for all the processes that were started in the launcher via the run command.
- fields
None additional
- response
LAProcessDictResponse
- see also
refer to the Common Fields section for additional request message fields
implementation(s):
Python
LAProcessDictResponse
- type enum
LA_PROCESS_DICT_RESPONSE
- purpose
Responds with a dictionary for all the processes that were started in the launcher via the run command. The dictionary includes fields for exe, args, env, run directory, user name, options, create return code, and exit code.
- fields
- pdict
dictionary of p_uid to process information as described.
- request
LAProcessDict
implementation(s):
Python
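One practical consequence of carrying pdict in JSON is that object keys are always strings, so integer p_uid keys do not survive a round trip unchanged. The sketch below illustrates this with a made-up entry; the per-process key names and values are hypothetical, and whether the Python implementation converts keys back to integers is not stated here.

```python
import json

# Hypothetical process entry keyed by an integer p_uid.
pdict = {4001: {"exe": "/bin/echo", "args": ["hi"], "exit_code": 0}}

# JSON object keys are strings, so the p_uid comes back as "4001".
round_tripped = json.loads(json.dumps({"pdict": pdict}))["pdict"]

# A receiver that wants integer p_uid keys has to convert explicitly.
restored = {int(p_uid): info for p_uid, info in round_tripped.items()}
```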
LADumpState
- type enum
LA_DUMP_STATE
- purpose
Primarily for debugging. Makes the Launcher dump its state in human-readable form to a file specified in the message, or to the log if no filename is provided.
- fields
- filename
string
file to open and write the dump to. Defaults to None. If None, then the dump goes to the log file for the Launcher.
implementation(s):
Python
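The filename handling described above might look roughly like the following on the receiving side. This is a sketch only: the function name, logger name, and state text are assumptions, and the real Launcher state is richer than a single string.

```python
import logging
import os
import tempfile

log = logging.getLogger("launcher")  # hypothetical logger name


def dump_state(state_text, filename=None):
    """Write state_text to filename, or to the log when filename is None."""
    if filename is None:
        log.info("launcher state dump:\n%s", state_text)
        return "log"
    with open(filename, "w") as f:
        f.write(state_text)
    return filename


# Dump to an explicit file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dump.txt")
    dump_state("n_procs=3", path)
    with open(path) as f:
        written = f.read()

# With no filename the dump goes to the log instead.
destination = dump_state("n_procs=3")
```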
LANodeIdxSH
- type enum
LA_NODEINDEX_SH
- purpose
Communicates the node index from the Launcher Back End to the Shepherd during bootstrapping of the environment.
- fields
- node_idx
integer
The index of this node in the allocation upon which the Dragon runtime services are running.
implementation(s):
Python
LAChannelsInfo
- type enum
LA_CHANNELS_INFO
- purpose
Broadcast to all nodes to provide hostnames, node indices, and channel descriptors to all nodes in the allocation so services like global services and the inter-node transport service (HSTA, TCP or otherwise) can be started.
When the launcher backend receives this message, it forwards it to the local shepherd. The local shepherd provides this message to the inter-node transport service when it is started and to global services (on the primary node) when it is started.
- fields
- nodes_desc
dictionary with keys corresponding to string node indices and values of class dragon.infrastructure.node_desc.NodeDescriptor (e.g., the hostname info for node 5 is accessed via la_channels_msg.nodes_desc['5'].host_name).
Example attributes for each value of dragon.infrastructure.node_desc.NodeDescriptor:
‘host_name’: hostname string
‘host_id’: integer which is determined by calling the POSIX gethostid function
‘ip_addr’: IP address string
‘shep_cd’: base64 encoded shepherd channel descriptor
- gs_cd
string
The base64 encoded channel descriptor of global services
- num_gw_channels
int
The number of gateway channels per node
- implementation(s):
Python
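Because the keys of nodes_desc are node indices rendered as strings, a lookup by integer index needs an explicit conversion. The sketch below shows this with a stand-in dataclass; NodeDescriptorSketch, the hostnames, and the helper function are hypothetical and only mirror the attribute names listed above.

```python
from dataclasses import dataclass


@dataclass
class NodeDescriptorSketch:
    """Hypothetical stand-in for dragon.infrastructure.node_desc.NodeDescriptor."""
    host_name: str
    host_id: int
    ip_addr: str
    shep_cd: str  # base64 encoded shepherd channel descriptor


# Keys are string node indices, not integers.
nodes_desc = {
    str(idx): NodeDescriptorSketch(f"node{idx:02d}", 1000 + idx, f"10.0.0.{idx + 1}", "")
    for idx in range(3)
}


def host_name_for(nodes_desc, node_idx):
    """Look up a hostname by integer node index, converting to the string key."""
    return nodes_desc[str(node_idx)].host_name


first = host_name_for(nodes_desc, 0)
```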
Breakpoint
- type enum
BREAKPOINT
- purpose
Inform front end that a managed process has reached a breakpoint for the first time and wants to establish a debug connection.
- fields
- p_uid
int
process uid that has just reached a breakpoint
- index
int
node that process is on
- out_desc
string
encoded binary serialized channel descriptor for debug connection outbound
- in_desc
string
encoded binary serialized channel descriptor for debug connection inbound
implementation(s):
Python
Launcher Backend Messages APIs
These messages are used by a combination of the Launcher and the Launcher Backend to support the launcher in both regular and server mode. They are communicated through the BELA Channel.
BEIsUp
- type enum
BE_IS_UP
- purpose
Confirm to Launcher Frontend that the Backend is up and send its serialized channel descriptor and host_id. This message is not communicated through the BELA Channel, as this is not set up yet.
fields
be_ch_desc
Backend serialized channel descriptor. The frontend will attach to
this channel and establish a connection with the backend.
host_id
integer
the hostid of the node sending this message.
implementation(s):
Python
BEPingSH
- type enum
BE_PING_SH
- purpose
MRNet backend handshake with Shepherd
- fields
None other than default
implementation(s):
Python
BEHalted
- type enum
BE_HALTED
- purpose
Indicate that the MRNet backend instance on this node has exited normally.
fields
None other than default
implementation(s):
Python
BENodeIdxSH
- type enum
BE_NODE_IDX_SH
purpose
This message is sent from the Launcher Backend to Local Services during initialization to tell Local Services its node index and related host info for the current run. All nodes are numbered 0 to n-1 where n is the number of nodes in the allocation.
fields
node_idx
An integer representing the node index value for this node.
host_name
string
name of host as returned by MRNet backend
- ip_addr
string
IP address as returned by DNS lookup of hostname
- primary
string
hostname of primary node
- logger_sdesc
dragon.utils.B64
serialized descriptor of backend logging channel
implementation(s):
Python
Launcher Frontend Messages APIs
These messages are used by a combination of the Launcher and the Launcher Frontend to support the launcher in both regular and server mode.
FENodeIdxBE
- type enum
FE_NODE_IDX_BE
- purpose
The Frontend sends the node index to the Backend, based on the backend’s host_id.
fields
- node_index
integer
The index of this node in the allocation upon which the Dragon runtime services are running.
implementation(s):
Python
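As a rough sketch of the idea, the Frontend can keep a mapping from each backend's host_id (received in BEIsUp) to a node index in the range 0 to n-1 and answer each backend with its own index. The assignment order used here (order of arrival) and both function names are assumptions; the actual policy is not specified in this section.

```python
def assign_node_indices(host_ids):
    """Map each host_id to a node index in 0..n-1 (assumed: order of arrival)."""
    mapping = {}
    for host_id in host_ids:
        if host_id not in mapping:
            mapping[host_id] = len(mapping)
    return mapping


def make_fe_node_idx_be_fields(mapping, host_id):
    """Build the field payload of an FENodeIdxBE-style message for one backend."""
    return {"node_index": mapping[host_id]}


# Hypothetical host ids reported by three backends.
mapping = assign_node_indices([8323329, 8323330, 8323331])
fields = make_fe_node_idx_be_fields(mapping, 8323330)
```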
HaltOverlay
- type enum
HALT_OVERLAY
- purpose
Indicate that monitoring of the overlay network should cease
- fields
None other than default
implementation(s):
Python
HaltLoggingInfra
- type enum
HALT_LOGGING_INFRA
- purpose
Indicate that monitoring of logging messages from the backend should cease
- fields
None other than default
implementation(s):
Python
LAExit
- type enum
LA_EXIT
- purpose
Indicate the launcher should exit. Use in case the launcher teardown was unable to complete correctly.
- fields
None other than default
implementation(s):
Python
Logging Messages API
In order to transmit messages from backend services to the launcher frontend, the logging infrastructure packs logging strings into messages that are passed from a given backend service to the MRNet server backend, which aggregates any logging messages it has received and sends them to the launcher frontend to be officially logged to file or terminal.
LoggingMsg
- type enum
LOGGING_MSG
- purpose
To take logging strings provided via Python logging (e.g., log.info('message')) and pack all information that’s necessary to make log messages useful on the frontend.
fields
All fields are auto-populated by the logging infrastructure found in LoggingInfrastructure
- name
string
name of the logger the message came from, typically formatted as f'{service}.{func}'
- msg
string
log message
- time
string
time generated by the sending logger’s formatter
- func
string
name of function the logging msg was made in
- hostname
string
hostname of sender
- ip_address
string
IP address of sender
- port
string
Port of sender (placeholder for now but likely useful in single-node testing)
- service
string
service as defined in
Python
- level
int
logging level as defined by
dragonMemoryPoolAttr_t
and is consistent with Python’s logging int values
implementation(s)
Python
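To make the field list concrete, the sketch below collects the same fields from a standard logging.LogRecord. It is not the Dragon logging infrastructure: the function name, the service value "LS", and the empty ip_address and port defaults are assumptions for illustration.

```python
import logging
import socket


def record_to_logging_msg_fields(record, service, ip_address="", port=""):
    """Collect the LoggingMsg fields listed above from a logging.LogRecord."""
    formatter = logging.Formatter()
    return {
        "name": record.name,
        "msg": record.getMessage(),
        "time": formatter.formatTime(record),
        "func": record.funcName,
        "hostname": socket.gethostname(),
        "ip_address": ip_address,
        "port": port,
        "service": service,
        "level": record.levelno,  # same integers as Python's logging levels
    }


# Build a record by hand so the example does not depend on handler setup.
record = logging.LogRecord(
    name="LS.main", level=logging.INFO, pathname="example.py", lineno=1,
    msg="started %s", args=("ok",), exc_info=None, func="main",
)
fields = record_to_logging_msg_fields(record, service="LS")
```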
LoggingMsgList
- type enum
LOGGING_MSG_LIST
- purpose
Takes a list of LoggingMsg and aggregates them into a single message the MRNet server backend can send up to the MRNet frontend server
- fields
- records
list of LoggingMsg messages with attributes as defined therein
implementation(s)
Python
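The aggregation step can be pictured as a small buffer that collects individual log messages and emits them as one records list. The class below is a hypothetical sketch of that pattern using plain dictionaries; it does not reflect how the MRNet server backend is actually implemented.

```python
class LogAggregator:
    """Buffer LoggingMsg-style dicts and emit them as one LoggingMsgList-style dict."""

    def __init__(self):
        self._pending = []

    def add(self, logging_msg):
        self._pending.append(logging_msg)

    def flush(self):
        """Return {"records": [...]} and clear the buffer, or None if empty."""
        if not self._pending:
            return None
        batch = {"records": self._pending}
        self._pending = []
        return batch


agg = LogAggregator()
agg.add({"name": "LS.main", "msg": "first", "level": 20})
agg.add({"name": "LS.main", "msg": "second", "level": 30})
batch = agg.flush()
empty = agg.flush()
```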
LogFlushed
- type enum
LOG_FLUSHED
- purpose
Sent by the MRNet server backend to the frontend after it has completed its final flushing and transmission of logs to the frontend. Part of the transaction diagram for teardown in Multi Node Teardown.
- fields
None other than default
implementation(s)
Python
Overlay/Transport Agent Messages API
In multi node cases, one or more transport agent processes live on each node and they need a few messages related to setup and teardown control through the TA Channel.
TAPingSH
- type enum
TA_PING_SH
- purpose
Indicate that the TA instance on this node has come up and succeeded in getting communication with the other TA instances.
- fields
None other than default
implementation(s):
Python
TAHalted
- type enum
TA_HALTED
- purpose
Indicate that the TA instance on this node has exited normally.
- fields
- idx
integer
which node in the allocation this is.
implementation(s):
Python
TAUp
- type enum
TA_UP
- purpose
Indicate that the TA instance on this node is up and ready.
- fields
None other than default
implementation(s):
Python
OverlayPingBE
- type enum
OVERLAY_PING_BE
- purpose
Indicate that the Overlay instance on this backend node is up and ready.
- fields
None other than default
implementation(s):
Python
OverlayPingLA
- type enum
OVERLAY_PING_LA
- purpose
Indicate that the Overlay instance on this frontend node is up and ready.
- fields
None other than default
implementation(s):
Python
LAHaltOverlay
- type enum
LA_HALT_OVERLAY
- purpose
Launcher frontend requests a shutdown of Overlay agent
- fields
None other than default
implementation(s):
Python
BEHaltOverlay
- type enum
BE_HALT_OVERLAY
- purpose
This backend node requests a shutdown of its Overlay agent
- fields
None other than default
implementation(s):
Python
OverlayHalted
- type enum
OVERLAY_HALTED
- purpose
This overlay instance has shut down
- fields
None other than default
implementation(s):
Python
Python Infrastructure Messages Implementation
The following subsections contain structures relating to the Python implementation of the Dragon infrastructure message definitions. The specification of these message types is in the previous section and is cross-referenced from these two sections for convenience.
Python Implementation of Infrastructure Messages
Alphabetical listing of all message classes in the Python implementation of the infrastructure messages.
Dragon infrastructure messages are the internal API used for service communication.
- class PMIGroupInfo
Bases:
object
Required information to enable the launching of pmi based applications.
- classmethod fromdict(d)
- class PMIProcessInfo
Bases:
object
Required information to enable the launching of pmi based applications.
- classmethod fromdict(d)
- class InfraMsg
Bases:
object
Common base for all messages.
This common base type for all messages sets up the default fields and the serialization strategy for now.
- __init__(tag, ref=None, err=None)
- get_sdict()
- property tc
- classmethod tcv()
- property tag
- property ref
- property err
- classmethod from_sdict(sdict)
- classmethod deserialize(msg)
- uncompressed_serialize()
- serialize()
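The relationship between get_sdict, from_sdict, and type-code based parsing can be sketched with a small stand-in hierarchy. Everything below is hypothetical: MsgSketch is not InfraMsg, the type codes 1001 and 1002 are invented, and any compression implied by the separate uncompressed_serialize method is left out.

```python
import json


class MsgSketch:
    """Hypothetical stand-in for a common message base; not the Dragon class."""
    _tc = None  # each subclass sets its own type code

    def __init__(self, tag, ref=None, err=None):
        self.tag, self.ref, self.err = tag, ref, err

    def get_sdict(self):
        rv = {"_tc": self._tc, "tag": self.tag}
        if self.ref is not None:
            rv["ref"] = self.ref
        if self.err is not None:
            rv["err"] = self.err
        return rv

    @classmethod
    def from_sdict(cls, sdict):
        fields = {k: v for k, v in sdict.items() if k != "_tc"}
        return cls(**fields)

    def serialize(self):
        return json.dumps(self.get_sdict())


class PingSketch(MsgSketch):
    _tc = 1001  # hypothetical type code


class PongSketch(MsgSketch):
    _tc = 1002  # hypothetical type code


_REGISTRY = {cls._tc: cls for cls in (PingSketch, PongSketch)}


def parse(serialized):
    """Factory: pick the class from '_tc' and build a typed message object."""
    sdict = json.loads(serialized)
    return _REGISTRY[sdict["_tc"]].from_sdict(sdict)


msg = parse(PongSketch(tag=7, ref=3, err=0).serialize())

# Messages are compared on the JSON level, not as strings.
same = json.loads('{"_tc": 1001, "tag": 1}') == json.loads('{"tag": 1, "_tc": 1001}')
```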
- class CapNProtoMsg
Bases:
object
Common base for all capnproto messages.
This common base type for all messages sets up the default fields and the serialization strategy for messages to be exchanged between C and Python.
- Errors
alias of
DragonError
- __init__(tag)
- classmethod from_sdict(sdict)
- classmethod deserialize(msg_str)
- serialize()
- get_sdict()
- builder()
- property capnp_name
- property tc
- property tag
- class CapNProtoResponseMsg
Bases:
CapNProtoMsg
Common base for all capnproto response messages.
This provides some support for code common to all response messages.
- __init__(tag, ref, err, errInfo)
- get_sdict()
- builder()
- property ref
- property err
- property errInfo
- class SHCreateProcessLocalChannel
Bases:
CapNProtoMsg
- __init__(tag, puid, respFLI)
- get_sdict()
- builder()
- property respFLI
- property puid
- class SHCreateProcessLocalChannelResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='', serChannel='')
- get_sdict()
- builder()
- property serialized_channel
- class SHPushKVL
Bases:
CapNProtoMsg
- __init__(tag, key, value, respFLI)
- get_sdict()
- builder()
- property key
- property value
- property respFLI
- class SHPushKVLResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class SHPopKVL
Bases:
CapNProtoMsg
- __init__(tag, key, value, respFLI)
- get_sdict()
- builder()
- property key
- property value
- property respFLI
- class SHPopKVLResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class SHGetKVL
Bases:
CapNProtoMsg
- __init__(tag, key, respFLI)
- get_sdict()
- builder()
- property key
- property respFLI
- class SHGetKVLResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='', values=[])
- get_sdict()
- builder()
- property values
- class SHSetKV
Bases:
CapNProtoMsg
- __init__(tag, key, value, respFLI)
- get_sdict()
- builder()
- property key
- property value
- property respFLI
- class SHSetKVResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class SHGetKV
Bases:
CapNProtoMsg
- __init__(tag, key, respFLI)
- get_sdict()
- builder()
- property key
- property respFLI
- class SHGetKVResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='', value=None)
- get_sdict()
- builder()
- property value
- class DDCreate
Bases:
CapNProtoMsg
- __init__(tag, respFLI, args)
- get_sdict()
- builder()
- property respFLI
- property args
- class DDCreateResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class DDGetRandomManager
Bases:
CapNProtoMsg
- __init__(tag, respFLI)
- get_sdict()
- builder()
- property respFLI
- class DDGetRandomManagerResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, manager, errInfo='')
- get_sdict()
- builder()
- property manager
- class DDRegisterClient
Bases:
CapNProtoMsg
- __init__(tag, respFLI, bufferedRespFLI)
- get_sdict()
- builder()
- property respFLI
- property bufferedRespFLI
- class DDRegisterClientResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, clientID, numManagers, managerID, managerNodes, timeout, errInfo='')
- get_sdict()
- builder()
- property clientID
- property numManagers
- property managerID
- property managerNodes
- property timeout
- class DDConnectToManager
Bases:
CapNProtoMsg
- __init__(tag, clientID, managerID)
- get_sdict()
- builder()
- property clientID
- property managerID
- class DDConnectToManagerResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, manager, errInfo='')
- get_sdict()
- builder()
- property manager
- class DDDestroy
Bases:
CapNProtoMsg
- __init__(tag, clientID, respFLI)
- get_sdict()
- builder()
- property respFLI
- property clientID
- class DDDestroyResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class DDRegisterManager
Bases:
CapNProtoMsg
- __init__(tag, mainFLI, respFLI, hostID)
- get_sdict()
- builder()
- property mainFLI
- property respFLI
- property hostID
- class DDRegisterManagerResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, managerID, errInfo='', managers=[], managerNodes=[])
- get_sdict()
- builder()
- property managerID
- property managers
- property managerNodes
- class DDRegisterClientID
Bases:
CapNProtoMsg
- __init__(tag, clientID, respFLI, bufferedRespFLI)
- get_sdict()
- builder()
- property clientID
- property respFLI
- property bufferedRespFLI
- class DDRegisterClientIDResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class DDDestroyManager
Bases:
CapNProtoMsg
- __init__(tag, respFLI)
- get_sdict()
- builder()
- property respFLI
- class DDDestroyManagerResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class DDPut
Bases:
CapNProtoMsg
- __init__(tag, clientID, chkptID=0, persist=True)
- get_sdict()
- builder()
- property clientID
- property chkptID
- property persist
- class DDPutResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class DDGet
Bases:
CapNProtoMsg
- __init__(tag, clientID, chkptID=0)
- get_sdict()
- builder()
- property clientID
- property chkptID
- class DDGetResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class DDPop
Bases:
CapNProtoMsg
- __init__(tag, clientID, chkptID=0)
- get_sdict()
- builder()
- property clientID
- property chkptID
- class DDPopResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class DDContains
Bases:
CapNProtoMsg
- __init__(tag, clientID, chkptID=0)
- get_sdict()
- builder()
- property clientID
- property chkptID
- class DDContainsResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class DDGetLength
Bases:
CapNProtoMsg
- __init__(tag, clientID, respFLI, chkptID=0)
- get_sdict()
- builder()
- property clientID
- property respFLI
- property chkptID
- class DDGetLengthResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='', length=0)
- get_sdict()
- builder()
- property length
- class DDClear
Bases:
CapNProtoMsg
- __init__(tag, clientID, chkptID, respFLI)
- get_sdict()
- builder()
- property clientID
- property chkptID
- property respFLI
- class DDClearResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class DDManagerStats
Bases:
CapNProtoMsg
- __init__(tag, respFLI)
- get_sdict()
- builder()
- property respFLI
- class DDManagerStatsResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='', data='')
- get_sdict()
- builder()
- property data
- class DDManagerGetNewestChkptID
Bases:
CapNProtoMsg
- __init__(tag, respFLI)
- get_sdict()
- builder()
- property respFLI
- class DDManagerGetNewestChkptIDResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='', managerID=0, chkptID=0)
- get_sdict()
- builder()
- property chkptID
- property managerID
- class DDGetIterator
Bases:
CapNProtoMsg
- __init__(tag, clientID, chkptID=0)
- get_sdict()
- builder()
- property clientID
- property chkptID
- class DDGetIteratorResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='', iterID=0)
- get_sdict()
- builder()
- property iterID
- class DDIteratorNext
Bases:
CapNProtoMsg
- __init__(tag, clientID, iterID)
- get_sdict()
- builder()
- property clientID
- property iterID
- class DDIteratorNextResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class DDKeys
Bases:
CapNProtoMsg
- __init__(tag, clientID, chkptID=0)
- get_sdict()
- builder()
- property clientID
- property chkptID
- class DDKeysResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class DDDeregisterClient
Bases:
CapNProtoMsg
- __init__(tag, clientID, respFLI)
- get_sdict()
- builder()
- property clientID
- property respFLI
- class DDDeregisterClientResponse
Bases:
CapNProtoResponseMsg
- __init__(tag, ref, err, errInfo='')
- class GSProcessCreate
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, exe, args, env=None, rundir='', user_name='', options=None, stdin=None, stdout=None, stderr=None, group=None, user=None, umask=-1, pipesize=None, pmi_required=False, _pmi_info=None, layout=None, policy=None, restart=False, resilient=False, head_proc=False, _tc=None)
- property options
- get_sdict()
- class GSProcessCreateResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
Process was created
- FAIL = 1
Process was not created
- ALREADY = 2
Process exists already
- __init__(tag, ref, err, desc=None, err_info='', _tc=None)
- property desc
- get_sdict()
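A caller typically branches on the err value of the response. The sketch below mirrors the three Errors values listed above in a local enum and maps them to outcomes; the enum name, the function, and the returned strings are assumptions, and real code would use GSProcessCreateResponse.Errors directly.

```python
from enum import Enum


class CreateErrors(Enum):
    """Local mirror of the GSProcessCreateResponse.Errors values listed above."""
    SUCCESS = 0
    FAIL = 1
    ALREADY = 2


def describe_create_response(err, err_info=""):
    """Translate an integer err code into a short outcome description."""
    err = CreateErrors(err)  # raises ValueError for codes outside the enum
    if err is CreateErrors.SUCCESS:
        return "created"
    if err is CreateErrors.ALREADY:
        return "already exists"
    return f"failed: {err_info}"


outcome = describe_create_response(1, "no such executable")
```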
- class GSProcessList
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, _tc=None)
- get_sdict()
- class GSProcessListResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, plist=None, _tc=None)
- get_sdict()
- class GSProcessQuery
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, t_p_uid=None, user_name='', _tc=None)
- get_sdict()
- class GSProcessQueryResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, desc=None, err_info='', _tc=None)
- property desc
- get_sdict()
- class GSProcessKill
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, sig, t_p_uid=None, hide_stderr=False, user_name='', _tc=None)
- get_sdict()
- class GSProcessKillResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
- UNKNOWN = 1
- FAIL_KILL = 2
- DEAD = 3
- PENDING = 4
- __init__(tag, ref, err, exit_code=0, err_info='', _tc=None)
- get_sdict()
- class GSProcessJoin
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, timeout=-1, t_p_uid=None, user_name='', _tc=None)
- get_sdict()
- class GSProcessJoinResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, exit_code=0, err_info='', _tc=None)
- get_sdict()
- class GSProcessJoinList
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, timeout=-1, t_p_uid_list=None, user_name_list=None, join_all=False, return_on_bad_exit=False, _tc=None)
- get_sdict()
- class GSProcessJoinListResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
- UNKNOWN = 1
- TIMEOUT = 2
- SELF = 3
- PENDING = 4
- __init__(tag, ref, puid_status, _tc=None)
- get_sdict()
- class GSPoolCreate
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, size, user_name='', options=None, _tc=None)
- property options
- get_sdict()
- class GSPoolCreateResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
Pool was created
- FAIL = 1
Pool was not created
- ALREADY = 2
Pool exists already
- __init__(tag, ref, err, err_code=0, desc=None, err_info='', _tc=None)
- property desc
- get_sdict()
- class GSPoolList
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, _tc=None)
- get_sdict()
- class GSPoolListResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, mlist=None, _tc=None)
- get_sdict()
- class GSPoolQuery
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, m_uid=None, user_name='', _tc=None)
- get_sdict()
- class GSPoolQueryResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, desc=None, err_info='', _tc=None)
- property desc
- get_sdict()
- class GSPoolDestroy
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, m_uid=0, user_name='', _tc=None)
- get_sdict()
- class GSPoolDestroyResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
- UNKNOWN = 1
- FAIL = 2
- GONE = 3
- PENDING = 4
- __init__(tag, ref, err, err_code=0, err_info='', _tc=None)
- get_sdict()
- class GSGroupCreate
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, items=None, policy=None, user_name='', _tc=None)
- get_sdict()
- class GSGroupCreateResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, desc=None, _tc=None)
- property desc
- get_sdict()
- class GSGroupList
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, _tc=None)
- get_sdict()
- class GSGroupListResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, glist=None, _tc=None)
- get_sdict()
- class GSGroupQuery
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', _tc=None)
- get_sdict()
- class GSGroupQueryResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, desc=None, err_info='', _tc=None)
- property desc
- get_sdict()
- class GSGroupKill
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, sig, g_uid=None, user_name='', hide_stderr=False, _tc=None)
- get_sdict()
- class GSGroupKillResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
IntEnum
An enumeration.
- SUCCESS = 0
- UNKNOWN = 1
- DEAD = 2
- PENDING = 3
- ALREADY = 4
- __new__(value)
- __init__(tag, ref, err, err_info='', desc=None, _tc=None)
- property desc
- get_sdict()
- class GSGroupDestroy
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', _tc=None)
- get_sdict()
- class GSGroupDestroyResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
IntEnum
An enumeration.
- SUCCESS = 0
- UNKNOWN = 1
- DEAD = 3
- PENDING = 4
- __new__(value)
- __init__(tag, ref, err, err_info='', desc=None, _tc=None)
- property desc
- get_sdict()
- class GSGroupRebootRuntime
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, h_uid, r_c_uid, _tc=None)
- get_sdict()
- class GSGroupRebootRuntimeResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, err_info='', desc=None, _tc=None)
- get_sdict()
- class GSGroupAddTo
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', items=None, _tc=None)
- get_sdict()
- class GSGroupAddToResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
- UNKNOWN = 1
- FAIL = 2
- DEAD = 3
- PENDING = 4
- __init__(tag, ref, err, err_info='', desc=None, _tc=None)
- property desc
- get_sdict()
- class GSGroupCreateAddTo
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', items=None, policy=None, _tc=None)
- get_sdict()
- class GSGroupCreateAddToResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, err_info='', desc=None, _tc=None)
- property desc
- get_sdict()
- class GSGroupRemoveFrom
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', items=None, _tc=None)
- get_sdict()
- class GSGroupRemoveFromResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
- UNKNOWN = 1
- FAIL = 2
- DEAD = 3
- PENDING = 4
- __init__(tag, ref, err, err_info='', desc=None, _tc=None)
- property desc
- get_sdict()
- class GSGroupDestroyRemoveFrom
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', items=None, _tc=None)
- get_sdict()
- class GSGroupDestroyRemoveFromResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
- UNKNOWN = 1
- FAIL = 2
- DEAD = 3
- PENDING = 4
- __init__(tag, ref, err, err_info='', desc=None, _tc=None)
- property desc
- get_sdict()
- class GSChannelCreate
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, m_uid, options=None, user_name='', _tc=None)
- get_sdict()
- property options
- class GSChannelCreateResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, desc=None, err_info='', _tc=None)
- property desc
- get_sdict()
- class GSChannelList
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, _tc=None)
- get_sdict()
- class GSChannelListResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, clist=None, _tc=None)
- get_sdict()
- class GSChannelQuery
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, c_uid=None, user_name='', inc_refcnt=False, _tc=None)
- get_sdict()
- class GSChannelQueryResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, desc=None, err_info='', _tc=None)
- property desc
- get_sdict()
- class GSChannelDestroy
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, c_uid=None, user_name='', reply_req=True, dec_ref=False, _tc=None)
- get_sdict()
- class GSChannelDestroyResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, err_info='', _tc=None)
- get_sdict()
- class GSChannelJoin
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, name, timeout=-1, _tc=None)
- get_sdict()
- class GSChannelJoinResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, desc=None, _tc=None)
- property desc
- get_sdict()
- class GSChannelDetach
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, c_uid, _tc=None)
- get_sdict()
- class GSChannelDetachResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
- UNKNOWN = 1
- UNKNOWN_CHANNEL = 2
- NOT_ATTACHED = 3
- __init__(tag, ref, err, _tc=None)
- get_sdict()
- class GSChannelGetSendH
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, c_uid, _tc=None)
- get_sdict()
- class GSChannelGetSendHResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
- UNKNOWN = 1
- UNKNOWN_CHANNEL = 2
- NOT_ATTACHED = 3
- CANT = 4
- __init__(tag, ref, err, sendh=None, err_info='', _tc=None)
- property sendh
- get_sdict()
- class GSChannelGetRecvH
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, c_uid, _tc=None)
- get_sdict()
- class GSChannelGetRecvHResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
- UNKNOWN = 1
- UNKNOWN_CHANNEL = 2
- NOT_ATTACHED = 3
- CANT = 4
- __init__(tag, ref, err, recvh=None, err_info='', _tc=None)
- property recvh
- get_sdict()
- class GSNodeList
Bases:
InfraMsg
- type enum
GS_NODE_LIST (= 102)
- purpose
Return a list of tuples of h_uid for all nodes currently registered.
- fields
None additional
- response
GSNodeListResponse
- see also
GSNodeQuery
- see also
refer to the Common Fields section for additional request message fields
- __init__(tag, p_uid, r_c_uid, _tc=None)
- get_sdict()
- class GSNodeListResponse
Bases:
InfraMsg
- type enum
GS_NODE_LIST_RESPONSE (= 103)
- purpose
Responds with a list of h_uid for all the nodes currently registered.
- fields
- hlist
list of nonnegative integers
- request
GSNodeList
- see also
GSNodeQuery
- __init__(tag, ref, err, hlist=None, _tc=None)
- get_sdict()
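To make the field list above concrete, here is a minimal sketch of what the JSON-level form of a GSNodeListResponse could look like. The helper name node_list_response_sdict is hypothetical, and the type code is passed in by the caller rather than hard-coded, since this sketch does not assume which integer the runtime actually uses.

```python
import json


def node_list_response_sdict(tag, ref, err, hlist, tc):
    """Build a dictionary with the fields named in the __init__ signature above.

    Hypothetical helper for illustration; `tc` is whatever integer the
    runtime assigns to GS_NODE_LIST_RESPONSE.
    """
    # The field description says hlist is a list of nonnegative integers.
    if any((not isinstance(h, int)) or h < 0 for h in hlist):
        raise ValueError("hlist must contain only nonnegative integers")
    return {"_tc": tc, "tag": tag, "ref": ref, "err": err, "hlist": list(hlist)}


def same_message(serialized_a, serialized_b):
    """Compare two serialized messages on the JSON level, not as strings."""
    return json.loads(serialized_a) == json.loads(serialized_b)
```

Comparing the parsed structures rather than the strings means that key order and whitespace in the serialized text do not affect equality.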
- class GSNodeQuery
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- get_sdict()
- class GSNodeQueryResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, desc=None, err_info='', _tc=None)
- property desc
- get_sdict()
- class GSNodeQueryAll
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- get_sdict()
- class GSNodeQueryAllResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, descriptors: List[dict | NodeDescriptor] | None = None, err_info='', _tc=None)
- get_sdict()
- class GSNodeQueryTotalCPUCount
Bases:
InfraMsg
- type enum
GS_NODE_QUERY_TOTAL_CPU_COUNT (= 104)
- purpose
Asks GS to return the total number of CPUs belonging to all of the registered nodes.
- see also
refer to the Common Fields section for additional request message fields
- response
GSNodeQueryTotalCPUCountResponse
- get_sdict()
- class GSNodeQueryTotalCPUCountResponse
Bases:
InfraMsg
- type enum
GS_NODE_QUERY_TOTAL_CPU_COUNT_RESPONSE (= 105)
- purpose
Return the total number of CPUs belonging to all of the registered nodes.
- fields
Alternatives on err:
- SUCCESS (= 0)
The machine descriptor was successfully constructed
- total_cpus
total number of CPUs belonging to all of the registered nodes.
- UNKNOWN (= 1)
An unknown error has occurred.
- request
GSNodeQueryTotalCPUCount
- see also
GSNodeQuery
- __init__(tag, ref, err, total_cpus=0, err_info='', _tc=None)
- property total_cpus
- get_sdict()
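The "alternatives on err" convention described above could be handled along the following lines. This is only a sketch operating on an already-parsed dictionary: total_cpus_or_raise is a hypothetical helper, and the assumption that err_info carries the error text on failure is taken from the __init__ signature rather than from a documented guarantee.

```python
SUCCESS = 0  # value listed above for a successful response
UNKNOWN = 1  # value listed above for an unknown error


def total_cpus_or_raise(sdict):
    """Return total_cpus on success, otherwise raise with the error text.

    `sdict` is assumed to be the parsed JSON dictionary of a
    GSNodeQueryTotalCPUCountResponse (hypothetical usage).
    """
    err = sdict["err"]
    if err == SUCCESS:
        # total_cpus is only meaningful when err is SUCCESS.
        return sdict["total_cpus"]
    if err == UNKNOWN:
        raise RuntimeError(sdict.get("err_info") or "unknown error")
    raise ValueError(f"unexpected err value: {err!r}")
```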
- class AbnormalTermination
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, err_info='', host_id=0, _tc=None)
- get_sdict()
- class ExceptionlessAbort
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class GSStarted
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class GSPingSH
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class GSIsUp
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class GSPingProc
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, mode=None, argdata=None, _tc=None)
- get_sdict()
- class GSDumpState
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, filename, _tc=None)
- get_sdict()
- class GSHeadExit
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, exit_code=0, _tc=None)
- get_sdict()
- class GSTeardown
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class GSUnexpected
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, _tc=None)
- get_sdict()
- class GSChannelRelease
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class GSHalted
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class SHProcessCreate
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
initial_stdin is a string which, if non-empty, is written to the stdin of the newly created process, followed by a terminating newline character.
stdin, stdout, and stderr are each either None or an instance of SHChannelCreate to be processed by the local services component.
- __init__(tag, p_uid, r_c_uid, t_p_uid, exe, args, env=None, rundir='', options=None, initial_stdin='', stdin=None, stdout=None, stderr=None, group=None, user=None, umask=-1, pipesize=None, stdin_msg=None, stdout_msg=None, stderr_msg=None, pmi_info=None, layout=None, gs_ret_chan_msg=None, _tc=None)
- property options
- get_sdict()
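The initial_stdin behavior described above can be pictured with the standard library alone. The sketch below is not how local services is implemented; it only demonstrates the stated semantics (a non-empty string is written to the child's stdin followed by a newline) using subprocess, and start_with_initial_stdin is a hypothetical name.

```python
import subprocess
import sys


def start_with_initial_stdin(cmd, initial_stdin=""):
    """Start `cmd` and, if initial_stdin is non-empty, feed it one line."""
    proc = subprocess.Popen(
        cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    if initial_stdin:
        # Write the string plus the terminating newline character.
        proc.stdin.write(initial_stdin + "\n")
        proc.stdin.flush()
    return proc


if __name__ == "__main__":
    # The child reads one line from stdin and echoes it in upper case.
    child = [sys.executable, "-c", "print(input().upper())"]
    proc = start_with_initial_stdin(child, "hello")
    out, _ = proc.communicate()
    print(out.strip())
```

An empty initial_stdin leaves the child's stdin untouched, which matches the "if non-empty" condition in the description.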
- class SHProcessCreateResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, err_info='', stdin_resp=None, stdout_resp=None, stderr_resp=None, gs_ret_chan_resp=None, _tc=None)
- get_sdict()
- class SHProcessKill
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, t_p_uid, sig, hide_stderr=False, _tc=None)
- get_sdict()
- class SHProcessKillResponse
Bases:
InfraMsg
- __init__(tag, ref, err, err_info='', _tc=None)
- get_sdict()
- class SHMultiProcessKill
Bases:
InfraMsg
- __init__(tag, r_c_uid, procs: List[Dict | SHProcessKill], _tc=None)
- get_sdict()
- class SHMultiProcessKillResponse
Bases:
InfraMsg
- __init__(tag, ref, err, err_info='', exit_code=0, responses: List[Dict | SHProcessKillResponse] | None = None, failed: bool = False, _tc=None)
- get_sdict()
- class SHProcessExit
Bases:
InfraMsg
Refer to definition and to Common Fields for a description of the message structure.
- __init__(tag, p_uid, creation_msg_tag=None, exit_code=0, _tc=None)
- get_sdict()
- class SHMultiProcessCreate
Bases:
InfraMsg
- __init__(tag, r_c_uid, procs: List[Dict | SHProcessCreate], pmi_group_info: PMIGroupInfo | None = None, _tc=None)
- get_sdict()
- class SHMultiProcessCreateResponse
Bases:
InfraMsg
- __init__(tag, ref, err, err_info='', exit_code=0, responses: List[Dict | SHProcessCreateResponse] | None = None, failed: bool = False, _tc=None)
- get_sdict()
- class SHPoolCreate
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, size, m_uid, name, attr='', _tc=None)
- get_sdict()
- class SHPoolCreateResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- class Errors
Bases:
Enum
An enumeration.
- SUCCESS = 0
Pool was created
- FAIL = 1
Pool was not created
- __init__(tag, ref, err, desc=None, err_info='', _tc=None)
- get_sdict()
- class SHPoolDestroy
Bases:
InfraMsg
Refer to definition and to the Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, m_uid, _tc=None)
- get_sdict()
- class SHPoolDestroyResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, err_info='', _tc=None)
- get_sdict()
- class SHExecMemRequest
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, kind, request, _tc=None)
- get_sdict()
- class SHExecMemResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, err_code=0, err_info='', response=None, _tc=None)
- get_sdict()
- class SHChannelCreate
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, m_uid, c_uid, options=None, _tc=None)
- get_sdict()
- property options
- class SHChannelCreateResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, desc=None, err_info='', _tc=None)
- get_sdict()
- class SHChannelDestroy
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, c_uid, _tc=None)
- get_sdict()
- class SHChannelDestroyResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, err_info='', _tc=None)
- get_sdict()
- class SHLockChannel
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, _tc=None)
- get_sdict()
- class SHLockChannelResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, _tc=None)
- get_sdict()
- class SHAllocMsg
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, _tc=None)
- get_sdict()
- class SHAllocMsgResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, _tc=None)
- get_sdict()
- class SHAllocBlock
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, r_c_uid, _tc=None)
- get_sdict()
- class SHAllocBlockResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, seg_name='', offset=0, err_info='', _tc=None)
- get_sdict()
- class SHChannelsUp
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, node_desc, gs_cd, idx=0, net_conf_key=0, _tc=None)
- get_sdict()
- class SHPingGS
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, idx=0, node_sdesc=None, _tc=None)
- get_sdict()
- class SHTeardown
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class SHPingBE
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- EMPTY = ''
- __init__(tag, shep_cd='', be_cd='', gs_cd='', default_pd='', inf_pd='', _tc=None)
- get_sdict()
- class SHHaltTA
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class SHHaltBE
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class SHHalted
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, idx=0, _tc=None)
- get_sdict()
- class SHFwdInput
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- MAX = 1024
- __init__(tag, p_uid, r_c_uid, t_p_uid=None, input='', confirm=False, _tc=None)
- get_sdict()
- class SHFwdInputErr
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, idx=0, err_info='', _tc=None)
- get_sdict()
- class SHFwdOutput
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- MAX = 1024
- __init__(tag, p_uid, idx, fd_num, data, _tc=None, pid=-1, hostname='NONE')
- get_sdict()
- class SHDumpState
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, filename=None, _tc=None)
- get_sdict()
- class BENodeIdxSH
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, node_idx, host_name=None, ip_addrs=None, primary=None, logger_sdesc=None, net_conf_key=None, _tc=None)
- property logger_sdesc
- get_sdict()
- class BEPingSH
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class BEHalted
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class LABroadcast
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure. The _tc value of this message must be set to 68. The Launcher Network Front End has this as an external dependency.
- __init__(tag, data, _tc=None)
- get_sdict()
- class LAPassThruFB
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, c_uid, data, _tc=None)
- get_sdict()
- class LAPassThruBF
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, data, _tc=None)
- get_sdict()
- class LAServerMode
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, frontend, backend, frontend_args=None, backend_args=None, backend_env=None, backend_run_dir='', backend_user_name='', backend_options=None, _tc=None)
- get_sdict()
- class LAServerModeExit
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, _tc=None)
- get_sdict()
- class LAProcessDict
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class LAProcessDictResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, ref, err, pdict=None, _tc=None)
- get_sdict()
- class LADumpState
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, filename=None, _tc=None)
- get_sdict()
- class LAChannelsInfo
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, nodes_desc, gs_cd, num_gw_channels, port=None, transport='tcp', _tc=None, *, fe_ext_ip_addr=None)
- get_sdict()
- class LoggingMsg
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, name, msg, time, func, hostname, ip_address, port, service, level, _tc=None)
- get_sdict()
- get_logging_dict()
- class LoggingMsgList
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, records, _tc=None)
- property records
- get_sdict()
- class LogFlushed
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class TAPingSH
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class TAHalted
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class TAUp
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
The test_channels list is empty unless the DRAGON_TRANSPORT_TEST environment variable is set to some value. When it is set, local services creates two channels and provides their base64-encoded serialized channel descriptors in the test_channels field to the launcher front end, which then disseminates them to the test program started on each node.
- __init__(tag, idx=0, test_channels=[], _tc=None)
- get_sdict()
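A minimal sketch of the test_channels rule described above follows. It assumes the serialized channel descriptors are available as bytes objects; make_test_channels is a hypothetical helper, and the real local services code may gate on the environment variable differently.

```python
import base64
import os


def make_test_channels(serialized_descriptors, environ=None):
    """Return base64-encoded descriptors only when the test variable is set.

    `serialized_descriptors` is an iterable of bytes (stand-ins for
    serialized channel descriptors); `environ` defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    if "DRAGON_TRANSPORT_TEST" not in environ:
        # Variable not set: the field stays empty.
        return []
    return [base64.b64encode(s).decode("ascii") for s in serialized_descriptors]
```

Encoding to base64 text keeps the binary descriptors representable inside a JSON message.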
- class Breakpoint
Bases:
InfraMsg
- __init__(tag, p_uid, index, out_desc, in_desc, _tc=None)
- get_sdict()
- class BEIsUp
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, be_ch_desc, host_id, _tc=None)
- get_sdict()
- class FENodeIdxBE
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, node_index, net_conf_key_mapping=None, forward: dict[str, NodeDescriptor | dict] | None = None, send_desc: B64 | str | None = None, _tc=None)
- property forward
- property send_desc
- get_sdict()
- class HaltLoggingInfra
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class HaltOverlay
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class OverlayHalted
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class BEHaltOverlay
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class LAHaltOverlay
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class OverlayPingBE
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class OverlayPingLA
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class LAExit
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, sigint=False, _tc=None)
- get_sdict()
- class RuntimeDesc
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, gs_cd, gs_ret_cd, ls_cd, ls_ret_cd, fe_ext_ip_addr, head_node_ip_addr, oob_port, env, _tc=None)
- get_sdict()
- class UserHaltOOB
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, _tc=None)
- get_sdict()
- class TAUpdateNodes
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, nodes: list[NodeDescriptor | dict], _tc=None)
- property nodes
- get_sdict()
- class PGRegisterClient
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, resp_cd, _tc=None)
- get_sdict()
- class PGUnregisterClient
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, p_uid, _tc=None)
- get_sdict()
- class PGClientResponse
Bases:
InfraMsg
Refer to definition and Common Fields for a description of the message structure.
- __init__(tag, src_tag, error=None, ex=None, payload=None, _tc=None)
- get_sdict()
- class PGAddProcessTemplates
Bases:
InfraMsg
- __init__(tag, p_uid, templates, _tc=None)
- get_sdict()
- class PGPuids
Bases:
InfraMsg
- __init__(tag, p_uid, active=True, inactive=False, _tc=None)
- get_sdict()
- type_filter(the_msg_types)
- camel_case_msg_name(msg_id)
- mk_all_message_classes_set()
- parse(serialized, restrict=None)
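The factory idea behind parse(serialized, restrict=None) can be sketched as a lookup from the '_tc' value to a message class. Everything below is a simplified stand-in: the class names ending in Sketch, the lookup table, and the interpretation of restrict as a set of permitted type codes are assumptions made for illustration, not the Dragon implementation. The two type codes are taken from the MessageTypes listing (GS_STARTED = 28, GS_HALTED = 33).

```python
import json


class InfraMsgSketch:
    """Hypothetical base class holding only the common tag field."""

    _tc = 0

    def __init__(self, tag, _tc=None):
        self.tag = tag

    def get_sdict(self):
        return {"_tc": self._tc, "tag": self.tag}


class GSStartedSketch(InfraMsgSketch):
    _tc = 28


class GSHaltedSketch(InfraMsgSketch):
    _tc = 33


# Lookup table from type code to class (assumed design).
_CLASSES_BY_TC = {cls._tc: cls for cls in (GSStartedSketch, GSHaltedSketch)}


def parse_sketch(serialized, restrict=None):
    """Deserialize JSON text and build the class selected by '_tc'."""
    sdict = json.loads(serialized)
    tc = sdict["_tc"]
    # Assumption: `restrict` is an optional collection of allowed type codes.
    if restrict is not None and tc not in restrict:
        raise ValueError(f"message type {tc} not allowed here")
    if tc not in _CLASSES_BY_TC:
        raise ValueError(f"unknown message type {tc}")
    # The remaining keys map directly onto the initializer arguments.
    return _CLASSES_BY_TC[tc](**sdict)
```

Returning a distinct class per type code is what lets IDEs and documentation tools know which fields to expect on each message.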
Python Message Type Value Enumeration
This contains the message type _tc definition for the Python implementation of the infrastructure messages.
- class MessageTypes
Bases:
Enum
These are the enumerated values of message type identifiers within the Dragon infrastructure messages.
- DRAGON_MSG = 0
Deliberately invalid
- GS_PROCESS_CREATE = 1
- GS_PROCESS_CREATE_RESPONSE = 2
- GS_PROCESS_LIST = 3
- GS_PROCESS_LIST_RESPONSE = 4
- GS_PROCESS_QUERY = 5
- GS_PROCESS_QUERY_RESPONSE = 6
- GS_PROCESS_KILL = 7
- GS_PROCESS_KILL_RESPONSE = 8
- GS_PROCESS_JOIN = 9
- GS_PROCESS_JOIN_RESPONSE = 10
- GS_CHANNEL_CREATE = 11
- GS_CHANNEL_CREATE_RESPONSE = 12
- GS_CHANNEL_LIST = 13
- GS_CHANNEL_LIST_RESPONSE = 14
- GS_CHANNEL_QUERY = 15
- GS_CHANNEL_QUERY_RESPONSE = 16
- GS_CHANNEL_DESTROY = 17
- GS_CHANNEL_DESTROY_RESPONSE = 18
- GS_CHANNEL_JOIN = 19
- GS_CHANNEL_JOIN_RESPONSE = 20
- GS_CHANNEL_DETACH = 21
- GS_CHANNEL_DETACH_RESPONSE = 22
- GS_CHANNEL_GET_SENDH = 23
- GS_CHANNEL_GET_SENDH_RESPONSE = 24
- GS_CHANNEL_GET_RECVH = 25
- GS_CHANNEL_GET_RECVH_RESPONSE = 26
- ABNORMAL_TERMINATION = 27
- GS_STARTED = 28
- GS_PING_SH = 29
- GS_IS_UP = 30
- GS_HEAD_EXIT = 31
- GS_CHANNEL_RELEASE = 32
- GS_HALTED = 33
- SH_PROCESS_CREATE = 34
- SH_PROCESS_CREATE_RESPONSE = 35
- SH_MULTI_PROCESS_CREATE = 36
- SH_MULTI_PROCESS_CREATE_RESPONSE = 37
- SH_MULTI_PROCESS_KILL = 38
- SH_PROCESS_KILL = 39
- SH_PROCESS_EXIT = 40
- SH_CHANNEL_CREATE = 41
- SH_CHANNEL_CREATE_RESPONSE = 42
- SH_CHANNEL_DESTROY = 43
- SH_CHANNEL_DESTROY_RESPONSE = 44
- SH_LOCK_CHANNEL = 45
- SH_LOCK_CHANNEL_RESPONSE = 46
- SH_ALLOC_MSG = 47
- SH_ALLOC_MSG_RESPONSE = 48
- SH_ALLOC_BLOCK = 49
- SH_ALLOC_BLOCK_RESPONSE = 50
- SH_CHANNELS_UP = 51
- SH_PING_GS = 52
- SH_HALTED = 53
- SH_FWD_INPUT = 54
- SH_FWD_INPUT_ERR = 55
- SH_FWD_OUTPUT = 56
- GS_TEARDOWN = 57
- SH_TEARDOWN = 58
- SH_PING_BE = 59
- BE_PING_SH = 60
- TA_PING_SH = 61
- SH_HALT_TA = 62
- TA_HALTED = 63
- SH_HALT_BE = 64
- BE_HALTED = 65
- TA_UP = 66
- GS_PING_PROC = 67
- GS_DUMP_STATE = 68
- SH_DUMP_STATE = 69
- LA_BROADCAST = 70
- LA_PASS_THRU_FB = 71
- LA_PASS_THRU_BF = 72
- GS_POOL_CREATE = 73
- GS_POOL_CREATE_RESPONSE = 74
- GS_POOL_DESTROY = 75
- GS_POOL_DESTROY_RESPONSE = 76
- GS_POOL_LIST = 77
- GS_POOL_LIST_RESPONSE = 78
- GS_POOL_QUERY = 79
- GS_POOL_QUERY_RESPONSE = 80
- SH_POOL_CREATE = 81
- SH_POOL_CREATE_RESPONSE = 82
- SH_POOL_DESTROY = 83
- SH_POOL_DESTROY_RESPONSE = 84
- SH_CREATE_PROCESS_LOCAL_CHANNEL = 85
- SH_CREATE_PROCESS_LOCAL_CHANNEL_RESPONSE = 86
- SH_PUSH_KVL = 87
- SH_PUSH_KVL_RESPONSE = 88
- SH_POP_KVL = 89
- SH_POP_KVL_RESPONSE = 90
- SH_GET_KVL = 91
- SH_GET_KVL_RESPONSE = 92
- SH_SET_KV = 93
- SH_SET_KV_RESPONSE = 94
- SH_GET_KV = 95
- SH_GET_KV_RESPONSE = 96
- SH_EXEC_MEM_REQUEST = 97
- SH_EXEC_MEM_RESPONSE = 98
- GS_UNEXPECTED = 99
- LA_SERVER_MODE = 100
- LA_SERVER_MODE_EXIT = 101
- LA_PROCESS_DICT = 102
- LA_PROCESS_DICT_RESPONSE = 103
- LA_DUMP_STATE = 104
- BE_NODE_IDX_SH = 105
- LA_CHANNELS_INFO = 106
- SH_MULTI_PROCESS_KILL_RESPONSE = 107
- SH_PROCESS_KILL_RESPONSE = 108
- BREAKPOINT = 109
- GS_PROCESS_JOIN_LIST = 110
- GS_PROCESS_JOIN_LIST_RESPONSE = 111
- GS_NODE_QUERY = 112
- GS_NODE_QUERY_RESPONSE = 113
- GS_NODE_QUERY_ALL = 114
- GS_NODE_QUERY_ALL_RESPONSE = 115
- LOGGING_MSG = 116
- LOGGING_MSG_LIST = 117
- LOG_FLUSHED = 118
- GS_NODE_LIST = 119
- GS_NODE_LIST_RESPONSE = 120
- GS_NODE_QUERY_TOTAL_CPU_COUNT = 121
- GS_NODE_QUERY_TOTAL_CPU_COUNT_RESPONSE = 122
- BE_IS_UP = 123
- FE_NODE_IDX_BE = 124
- HALT_OVERLAY = 125
- HALT_LOGGING_INFRA = 126
- OVERLAY_PING_BE = 127
- OVERLAY_PING_LA = 128
- LA_HALT_OVERLAY = 129
- BE_HALT_OVERLAY = 130
- OVERLAY_HALTED = 131
- EXCEPTIONLESS_ABORT = 132
Communicate abnormal termination without raising an exception
- LA_EXIT = 133
- GS_GROUP_LIST = 134
- GS_GROUP_LIST_RESPONSE = 135
- GS_GROUP_QUERY = 136
- GS_GROUP_QUERY_RESPONSE = 137
- GS_GROUP_DESTROY = 138
- GS_GROUP_DESTROY_RESPONSE = 139
- GS_GROUP_ADD_TO = 140
- GS_GROUP_ADD_TO_RESPONSE = 141
- GS_GROUP_REMOVE_FROM = 142
- GS_GROUP_REMOVE_FROM_RESPONSE = 143
- GS_GROUP_CREATE = 144
- GS_GROUP_CREATE_RESPONSE = 145
- GS_GROUP_KILL = 146
- GS_GROUP_KILL_RESPONSE = 147
- GS_GROUP_CREATE_ADD_TO = 148
- GS_GROUP_CREATE_ADD_TO_RESPONSE = 149
- GS_GROUP_DESTROY_REMOVE_FROM = 150
- GS_GROUP_DESTROY_REMOVE_FROM_RESPONSE = 151
- GS_GROUP_REBOOT_RUNTIME = 152
- GS_GROUP_REBOOT_RUNTIME_RESPONSE = 153
- TA_UPDATE_NODES = 154
- RUNTIME_DESC = 155
- USER_HALT_OOB = 156
- DD_REGISTER_CLIENT = 157
- DD_REGISTER_CLIENT_RESPONSE = 158
- DD_DESTROY = 159
- DD_DESTROY_RESPONSE = 160
- DD_REGISTER_MANAGER = 161
- DD_REGISTER_MANAGER_RESPONSE = 162
- DD_REGISTER_CLIENT_ID = 163
- DD_REGISTER_CLIENT_ID_RESPONSE = 164
- DD_DESTROY_MANAGER = 165
- DD_DESTROY_MANAGER_RESPONSE = 166
- DD_PUT = 167
- DD_PUT_RESPONSE = 168
- DD_GET = 169
- DD_GET_RESPONSE = 170
- DD_POP = 171
- DD_POP_RESPONSE = 172
- DD_CONTAINS = 173
- DD_CONTAINS_RESPONSE = 174
- DD_GET_LENGTH = 175
- DD_GET_LENGTH_RESPONSE = 176
- DD_CLEAR = 177
- DD_CLEAR_RESPONSE = 178
- DD_GET_ITERATOR = 179
- DD_GET_ITERATOR_RESPONSE = 180
- DD_ITERATOR_NEXT = 181
- DD_ITERATOR_NEXT_RESPONSE = 182
- DD_KEYS = 183
- DD_KEYS_RESPONSE = 184
- DD_DEREGISTER_CLIENT = 185
- DD_DEREGISTER_CLIENT_RESPONSE = 186
- DD_CREATE = 187
- DD_CREATE_RESPONSE = 188
- DD_CONNECT_TO_MANAGER = 189
- DD_CONNECT_TO_MANAGER_RESPONSE = 190
- DD_GET_RANDOM_MANAGER = 191
- DD_GET_RANDOM_MANAGER_RESPONSE = 192
- DD_MANAGER_STATS = 193
- DD_MANAGER_STATS_RESPONSE = 194
- DD_MANAGER_GET_NEWEST_CHKPT_ID = 195
- DD_MANAGER_GET_NEWEST_CHKPT_ID_RESPONSE = 196
- PG_REGISTER_CLIENT = 197
- PG_UNREGISTER_CLIENT = 198
- PG_CLIENT_RESPONSE = 199
- PG_SET_PROPERTIES = 200
- PG_STOP_RESTART = 201
- PG_ADD_PROCESS_TEMPLATES = 202
- PG_START = 203
- PG_JOIN = 204
- PG_SIGNAL = 205
- PG_STATE = 206
- PG_PUIDS = 207
- PG_STOP = 208
- PG_CLOSE = 209
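For readers unfamiliar with Python enumerations, the sketch below shows how an integer '_tc' value can be mapped back to its symbolic name. MessageTypesSketch reproduces only a handful of the members listed above and is a stand-in for the real class; type_name is a hypothetical helper.

```python
import enum


@enum.unique
class MessageTypesSketch(enum.Enum):
    """Small excerpt of the listing above, for illustration only."""

    DRAGON_MSG = 0  # deliberately invalid
    GS_PROCESS_CREATE = 1
    GS_NODE_LIST = 119
    GS_NODE_LIST_RESPONSE = 120
    PG_CLOSE = 209


def type_name(tc):
    """Return the symbolic name for an integer type code, or None."""
    try:
        return MessageTypesSketch(tc).name
    except ValueError:
        # Raised by Enum when no member has this value.
        return None
```

The enum.unique decorator rejects duplicate values at class-definition time, which guards against two message types accidentally sharing a code.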