Infrastructure Messages

Services in the Dragon runtime interact with each other using messages transported by a variety of means (mostly Channels). Although there will typically be a Python API to construct and send these messages, the messages themselves constitute the true internal interface. As such, they are a convention.

Here we document this internal message API.

Generalities about Messages

Many messages are part of a request/response pair. At the level discussed here, successful and error responses are carried by the same response type but the fields present or valid may change according to whether the response is successful or an error.

All messages are serialized using JSON; the reason for this is to allow non-Python actors to interact with the Dragon infrastructure as well as allowing the possibility for the different components of the Dragon runtime to be implemented in something other than Python.

The serialization method may change in the future as an optimization; JSON is chosen here as a reasonable cross-language starting point. In particular, the fields discussed here for basic message parsing (type and instance tagging, and errors) are defined as integer-valued enumerations, which suit a binary encoding with fixed field locations and widths.

The top level of the JSON structure will be a dictionary (map) where the ‘_tc’ key’s value is an integer corresponding to an element of the dragon.messages.MsgTypes enumeration class. The other fields of this map will be defined on a message by message basis.

The canonical form of the message will be taken on the JSON level, not the string level. This means that messages should not be compared or sorted as strings. Internally the Python interface will construct inbound messages into class objects of distinct type according to the ‘_tc’ field in the initializer by using a factory function. The reason to have a lot of different message types instead of just one type differentiated by the value of the ‘_tc’ field is that this allows IDEs and documentation tools to work better by having explicit knowledge of what fields and methods are expected in any particular message.
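
As a minimal sketch of the factory idea described above, the following maps the ‘_tc’ value to a message class and constructs the object from the parsed JSON map. The MsgTypes values, the two cut-down classes, and the parse function are illustrative assumptions, not the definitions found in dragon.messages. The last lines show why comparison is done on the JSON level rather than the string level.

```python
import json
from enum import IntEnum


class MsgTypes(IntEnum):
    # Placeholder values for illustration only; the real enumeration is
    # dragon.messages.MsgTypes and its values are not reproduced here.
    GS_PROCESS_LIST = 1
    GS_PROCESS_LIST_RESPONSE = 2


class GSProcessList:
    _tc = MsgTypes.GS_PROCESS_LIST

    def __init__(self, tag, p_uid, r_c_uid, _tc=None):
        self.tag = tag
        self.p_uid = p_uid
        self.r_c_uid = r_c_uid


class GSProcessListResponse:
    _tc = MsgTypes.GS_PROCESS_LIST_RESPONSE

    def __init__(self, tag, ref, err, plist=None, _tc=None):
        self.tag = tag
        self.ref = ref
        self.err = err
        self.plist = plist


# One distinct class per typecode, looked up by the '_tc' value.
_CLASS_BY_TC = {cls._tc: cls for cls in (GSProcessList, GSProcessListResponse)}


def parse(serialized):
    """Hypothetical factory: JSON text in, typed message object out."""
    fields = json.loads(serialized)
    cls = _CLASS_BY_TC[MsgTypes(fields["_tc"])]
    return cls(**fields)


a = '{"_tc": 2, "tag": 7, "ref": 3, "err": 0, "plist": [10, 11]}'
b = '{"plist": [10, 11], "err": 0, "ref": 3, "tag": 7, "_tc": 2}'

msg = parse(a)
print(type(msg).__name__, msg.plist)

# The same message on the JSON level, but different strings.
print(a == b, json.loads(a) == json.loads(b))
```

Because each typecode has its own class with an explicit initializer, a missing or misspelled field surfaces as a TypeError at construction time instead of later during use.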

Common Fields

These fields are common to every message described below, depending on the category the message belongs to.

The _tc TypeCode Field

The _tc typecode field is used during parsing to determine the type of a received message. The message format is JSON. Once the type is known, the _tc field can also be used to verify the format of the message by checking that all fields expected for that type are indeed present.

Beyond the _tc typecode field, there are other fields expected to belong to every message.

The tag Field

Every message has a 64 bit positive unsigned integer tag field. Together with the identity of the sender, which is either implicit or explicitly identified in some other field, this serves to uniquely identify the message.

Ideally, this uniqueness would hold throughout the life of the runtime, but it’s enough to say that no two messages with the same (sender, tag) should be simultaneously relevant.

Message Categories

There are three general categories of messages:
  • request messages, where one entity asks another one to do something

  • response messages, returning the result to the requester
    • the name of these messages normally ends “Response”

  • other messages, mostly to do with infrastructure sequencing

Every request message will contain p_uid and r_c_uid fields. These fields denote the unique process ID of the entity that created the request and the unique channel ID to send the response to. See Conventional IDs for more details on process identification.

The ref Field

Every response message generated in reply to a request message will contain a ref field that echoes the tag field of the request message that caused it to be generated. If a response message is generated asynchronously - for instance, as part of a startup notification or some event such as a process exiting unexpectedly - then the ref field will be 0, which is an illegal value for the tag field.

The err Field

Every response message will also have an err field, holding an integer enumeration particular to the response. This enumeration will be the class attribute Errors in the response message class. The enumeration element with value 0 will always indicate success. Which fields are present in the response message may depend on the enumeration value.
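
The interplay of tag, ref, and err can be sketched as follows. PendingRequests is a hypothetical helper written for this illustration, not part of the Dragon API; it assumes messages are handled as plain maps and that the sender hands out tags from a simple counter.

```python
import itertools


class PendingRequests:
    """Hypothetical helper that pairs responses with requests by tag/ref."""

    def __init__(self):
        # Tags start at 1 because 0 is not a legal tag value.
        self._tags = itertools.count(1)
        self._pending = {}

    def register(self, request):
        """Assign a fresh tag to a request map and remember it."""
        request["tag"] = next(self._tags)
        self._pending[request["tag"]] = request
        return request

    def match(self, response):
        """Return the request that caused this response, or None if the
        response was generated asynchronously (ref == 0)."""
        if response["ref"] == 0:
            return None
        return self._pending.pop(response["ref"])


pending = PendingRequests()
req = pending.register({"p_uid": 5, "r_c_uid": 9})

reply = {"tag": 41, "ref": req["tag"], "err": 0}
matched = pending.match(reply)
succeeded = reply["err"] == 0  # element 0 of Errors always means success

notification = {"tag": 42, "ref": 0, "err": 0}
unsolicited = pending.match(notification) is None
```

Removing the entry once its response arrives keeps the set of simultaneously relevant (sender, tag) pairs small, which is the uniqueness requirement stated for the tag field.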

Format of Messages List

  1. Class Name

    type enum

    member of dragon.messages.MsgTypes enum class

    purpose

    succinct description of purpose, sender and receiver, where and when typically seen. Starts with ‘Request’ if it is a request message and ‘Response’ if it is a response message.

    fields

    Fields in the message, together with description of how they are to get parsed out. Some fields may themselves be dictionary structures used to initialize member objects.

    Fields in common to every message (such as tag, ref, and err) are not mentioned here.

    An expected type will be mentioned on each field where it is an obvious primitive (such as a string or an integer). Where a type isn’t mentioned, it is assumed to be another key-value map to be interpreted as the initializer for some more complex type.

    response

    If a request kind of message, gives the class name of the corresponding response message, if applicable

    request

    If a response kind of message, gives the class name of the corresponding request message.

    see also

    Other related messages

Global Services Messages API

These messages are the ones underlying the Global Services API and are sent only to and from Global Services.

All global services request messages have an integer p_uid field that contains the process UID of the requester and an integer r_c_uid field that contains the channel ID to send the response to. These fields are not echoed in the corresponding response. For details on IDs, see Conventional IDs.

GS Process Messages

  1. GSProcessCreate

    type enum

    GS_PROCESS_CREATE

    purpose

    Request to global services to create a new managed process.

    fields

    exe
    • string

    • executable name

    args
    • list of strings

    • arguments to the executable. [exe, args] == argv

    env
    • map, key=string value=string

    • Environment variables to be exported before running.

      Note that this is additive/override to the shepherd environment - may need to be more fancy here if it turns out we need to add to paths

    rundir
    • string, absolute path

    • Indicates where the process is to be run from. If empty, does not matter.

    user_name
    • string

    • requested user name for the process

    options
    • options

      • initializer for other process options object

      • output_c_uid is the channel id where output, both stdout and stderr, should be forwarded if present in an SHFwdOutput message. If this field is not present in the options, then output will be forwarded to the Backend. The SHFwdOutput message indicates whether it was from stdout or stderr.

    response

    GSProcessCreateResponse

    see also

    GSProcessDestroy

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSProcessCreateResponse

    type enum

    GS_PROCESS_CREATE_RESPONSE

    purpose

    Response to process creation request.

    fields

    Alternatives on err:

    SUCCESS ( = 0)

    A new process was created

    desc
    • map

    • initializer for ProcessDescriptor - includes at a minimum p_uid of new process and assigned user name.

    FAIL

    No process was created

    err_info
    • string

    • explanation of what went wrong

    Might add some more here, e.g. is this a GS related problem or a Shepherd related problem.

    ALREADY

    The user name was already in use

    desc
    • map

    • init for the ProcessDescriptor of the process that already exists with that user name

    request

    GSProcessCreate

    implementation(s): Python

  1. GSProcessList

    type enum

    GS_PROCESS_LIST

    purpose

    Return a list of the p_uid for all the processes currently being managed

    fields

    None additional

    response

    GSProcessListResponse

    see also

    GSProcessQuery

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSProcessListResponse

    type enum

    GS_PROCESS_LIST_RESPONSE

    purpose

    Responds with a list of the p_uid for all the processes currently being managed

    fields
    plist
    • list of nonnegative integers for all the processes

    request

    GSProcessList

    see also

    GSProcessQuery

    implementation(s): Python

  1. GSProcessQuery

    type enum

    GS_PROCESS_QUERY

    purpose

    Request the ProcessDescriptor for a managed process

    fields

    One of the t_p_uid and user_name fields must be present.

    If both are specified, then the user_name field is ignored.

    t_p_uid
    • integer

    • target process UID

    user_name
    • string

    • user supplied name for the target process

    response

    GSProcessQueryResponse

    refer to the Common Fields section for additional request message fields

    implementation(s): Python
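
The rule for choosing between the UID and the user name, which recurs in several requests below, might be applied on the receiving side roughly as follows. The function name is hypothetical, and the sketch assumes that a field counts as present when its key exists with a value other than None, which the text above does not spell out.

```python
def select_target(fields, uid_key="t_p_uid"):
    """Return (key, value) naming the target of a query-style request.

    Assumption: a field is "present" when its key exists and its value
    is not None.
    """
    uid = fields.get(uid_key)
    if uid is not None:
        return uid_key, uid  # the UID wins; user_name is ignored
    user_name = fields.get("user_name")
    if user_name is not None:
        return "user_name", user_name
    raise ValueError(f"one of {uid_key} and user_name must be present")


both = select_target({"t_p_uid": 17, "user_name": "worker"})
by_name = select_target({"user_name": "worker"})
```

The uid_key parameter lets the same check serve requests that identify their target with a different UID field.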

  1. GSProcessQueryResponse

    type enum

    GS_PROCESS_QUERY_RESPONSE

    purpose

    Response to request for ProcessDescriptor for a managed process

    fields

    Alternatives on err:

    SUCCESS (= 0)

    Process has been found

    desc
    • map

    • initializer for ProcessDescriptor - includes at a minimum p_uid of new process and assigned user name.

    UNKNOWN (= 1)

    No such process is known.

    err_info
    • string

    • explanation of what went wrong

    request

    GSProcessQuery

    see also

    GSProcessList

    implementation(s): Python

  1. GSProcessKill

    type enum

    GS_PROCESS_KILL

    purpose

    Request a managed process get killed

    fields

    One of the t_p_uid and user_name fields must be present.

    If both are specified, then the user_name field is ignored.

    t_p_uid
    • integer

    • target process UID

    user_name
    • string

    • user supplied name for the target process

    sig
    • integer

    • signal to kill the process with

    response

    GSProcessKillResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSProcessKillResponse

    type enum

    GS_PROCESS_KILL_RESPONSE

    purpose

    Response to GSProcessKill message

    fields

    Alternatives on err:

    SUCCESS (= 0)

    Process has been found and has been killed.

    exit_code
    • integer

    • exit code the process returned

    UNKNOWN (= 1)

    No such process is known.

    err_info
    • string

    • explanation of what went wrong

    FAIL_KILL (= 2)

    Something went wrong in killing the process.

    err_info
    • string

    • explanation of what went wrong

    DEAD (= 3)

    The process is already dead.

    No other fields.

    request

    GSProcessKill

    see also

    GSProcessQuery

    implementation(s): Python
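
To illustrate how the Errors class attribute and the “alternatives on err” fit together, here is a cut-down sketch of this response. The enumeration values are the ones listed above; the constructor signature and the choice to leave invalid fields unset are assumptions made for the example, not the behavior of the class in dragon.messages.

```python
from enum import IntEnum


class GSProcessKillResponse:
    """Cut-down illustration; not the class from dragon.messages."""

    class Errors(IntEnum):
        SUCCESS = 0
        UNKNOWN = 1
        FAIL_KILL = 2
        DEAD = 3

    def __init__(self, tag, ref, err, exit_code=None, err_info=None):
        self.tag = tag
        self.ref = ref
        self.err = self.Errors(err)
        # Only keep the fields that are valid for this err value.
        if self.err == self.Errors.SUCCESS:
            self.exit_code = exit_code
        elif self.err in (self.Errors.UNKNOWN, self.Errors.FAIL_KILL):
            self.err_info = err_info
        # DEAD carries no other fields.


ok = GSProcessKillResponse(tag=8, ref=3, err=0, exit_code=-9)
bad = GSProcessKillResponse(tag=9, ref=4, err=1, err_info="no such p_uid")
```

A caller would branch on the err attribute first and only then read exit_code or err_info.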

  1. GSProcessJoin

    type enum

    GS_PROCESS_JOIN

    purpose

    Request notification when a given process exits

    fields

    One of the t_p_uid and user_name fields must be present.

    If both are specified, then the user_name field is ignored.

    t_p_uid
    • integer

    • target process UID

    user_name
    • string

    • user supplied name for the target process

    timeout
    • integer, interpreted as microseconds

    • timeout value; any value < 0 interpreted as infinite timeout

    response

    GSProcessJoinResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python
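
The timeout convention used here (integer microseconds, negative meaning infinite) can be converted to the seconds-or-None form that many Python blocking calls accept. The helper name is hypothetical.

```python
def timeout_to_seconds(timeout_us):
    """Convert a message timeout (integer microseconds) to seconds.

    Any negative value means an infinite timeout and is returned as None.
    """
    if timeout_us < 0:
        return None
    return timeout_us / 1_000_000


wait_forever = timeout_to_seconds(-1)
two_and_a_half = timeout_to_seconds(2_500_000)
```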

  1. GSProcessJoinResponse

    type enum

    GS_PROCESS_JOIN_RESPONSE

    purpose

    Response to request for notification when a process exits.

    fields

    Alternatives on err:

    SUCCESS (= 0)

    Target process has exited. If the target process has already exited, this is not an error.

    exit_code
    • integer

    • the Unix return value of the process

    UNKNOWN (= 1)

    No such process is known.

    err_info
    • string

    • explanation of what went wrong

    TIMEOUT (= 2)

    The timer has expired and the process still hasn’t exited.

    No fields.

    request

    GSProcessJoin

    see also

    GSProcessQuery

    implementation(s): Python

  1. GSProcessJoinList

    type enum

    GS_PROCESS_JOIN_LIST

    purpose

    Request notification when any/all process from a given list exits.

    fields

    Both t_p_uid_list and user_name_list fields could be present. Any process in the list could be identified by either p_uid or user_name.

    t_p_uid_list
    • list of integers

    • list of target processes UID

    user_name_list
    • list of strings

    • list of user supplied names for the target processes

    timeout
    • integer, interpreted as microseconds

    • timeout value; any value < 0 interpreted as infinite timeout

    join_all
    • bool

    • default False, request notification when any process from the list exits. When True, request notification when all processes exit.

    response

    GSProcessJoinListResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSProcessJoinListResponse

    type enum

    GS_PROCESS_JOIN_LIST_RESPONSE

    purpose

    Response to request for notification when any/all process from a given list exits.

    fields

    Reflect the status of every process in the list:

    puid_status
    • dictionary with the status of every process

    • key: p_uid

    • value: tuple (status, info)

    The status of a process in the list can be one of the following:

    SUCCESS (= 0)

    Target process has exited. If the target process has already exited, this is not an error.

    info:
    • integer

    • the Unix return value of the process

    UNKNOWN (= 1)

    No such process is known.

    info:
    • string

    • explanation of what went wrong

    TIMEOUT (= 2)

    The timer has expired and the process still hasn’t exited.

    info:

    None

    SELF (= 3)

    Attempt to join itself.

    info:
    • string

    • explanation of what went wrong

    PENDING (= 4)

    The process is still pending (not exited, no timeout).

    info:

    None

    request

    GSProcessJoinList

    implementation(s): Python
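
Since messages are serialized as JSON, a map keyed by p_uid arrives with string keys and each (status, info) tuple arrives as a two-element list. The following sketch shows one way a receiver could restore the shape described above. JoinStatus and decode_puid_status are hypothetical names, and the assumption that p_uid keys travel as decimal strings follows from how Python's json module encodes integer keys rather than from this document.

```python
import json
from enum import IntEnum


class JoinStatus(IntEnum):
    # Values as listed for the per-process status above.
    SUCCESS = 0
    UNKNOWN = 1
    TIMEOUT = 2
    SELF = 3
    PENDING = 4


def decode_puid_status(raw):
    """Turn the JSON form back into {p_uid: (status, info)}."""
    return {
        int(p_uid): (JoinStatus(status), info)
        for p_uid, (status, info) in raw.items()
    }


# Round trip through JSON: integer keys become strings, tuples become lists.
wire = json.dumps({"puid_status": {4001: (0, 0), 4002: (4, None)}})
decoded = decode_puid_status(json.loads(wire)["puid_status"])

exited = [p for p, (status, _) in decoded.items() if status == JoinStatus.SUCCESS]
```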

Pool Messages

These messages concern the lifecycle of named memory pools. All requests related to pools will include the p_uid field of the requesting entity and the r_c_uid field for the channel to send the response to.

Pools are identified in the runtime system with a nonnegative globally unique ID number, the ‘memory UID’ normally abbreviated as m_uid.

  1. GSPoolCreate

    type enum

    GS_POOL_CREATE

    purpose

    Requests that a new user memory pool be created.

    fields

    user_name
    • string

    • optional user specified name for the pool. If absent or empty, Global Services will select a name.

    • this may not necessarily be the same name used on the node to name the pool object to the OS

    options
    • map

    • Initializer for globalservices.pool.PoolOptions object. This is an open type intended to encompass what is in a dragonMemoryPoolAttr_t but also will carry system level options such as what node to create the pool on.

    size
    • integer > 0

    • size of pool requested, in bytes. Note that the actual size of the pool delivered may be larger than this

    response

    GSPoolCreateResponse

    see also

    SHPoolCreate

    refer to the Common Fields section for additional request message fields

    implementation(s): Python
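
As a rough sketch, a non-Python-API client could assemble this request as a JSON map using the field names listed above. The typecode value below is a placeholder, and make_pool_create is a hypothetical helper; the real typecode comes from dragon.messages.MsgTypes.

```python
import json

# Placeholder typecode; NOT the real value of MsgTypes.GS_POOL_CREATE.
GS_POOL_CREATE = 1000


def make_pool_create(tag, p_uid, r_c_uid, size, user_name="", options=None):
    """Build the JSON text of a hypothetical GSPoolCreate request."""
    if size <= 0:
        raise ValueError("size must be an integer > 0 (bytes)")
    fields = {
        "_tc": GS_POOL_CREATE,
        "tag": tag,
        "p_uid": p_uid,
        "r_c_uid": r_c_uid,
        "user_name": user_name,    # empty: Global Services selects a name
        "options": options or {},  # initializer for the PoolOptions object
        "size": size,
    }
    return json.dumps(fields)


wire = make_pool_create(tag=12, p_uid=5, r_c_uid=9, size=4 * 1024 * 1024)
```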

  1. GSPoolCreateResponse

    type enum

    GS_POOL_CREATE_RESPONSE

    purpose

    Response to request for a pool creation

    fields

    Alternatives on err:

    SUCCESS (= 0)
    desc
    • map

    • Initializer for a globalservices.pool.PoolDescriptor object. This object will contain the assigned m_uid and name for the pool, as well as the (serialized) dragonMemoryPoolSerial_t library descriptor.

    FAIL (= 1)

    Something went wrong in pool creation. This can be an error on the node level (found when the call to allocate the resources is actually made) or a logical error with the request.

    err_info
    • string

    • explanation of what went wrong

    err_code
    • integer

    • The error code from the library call, if applicable. Not all errors will be revealed when the call is made; if the error isn’t due to the underlying API call

    ALREADY (= 2)

    The user name for the pool was already in use; the current descriptor is returned.

    desc
    • map

    • Initializer for a globalservices.pool.PoolDescriptor object. This object will contain the assigned m_uid and name for the pool as well as the (serialized) dragonMemoryPoolSerial_t library descriptor.

    request

    GSPoolCreate

    see also

    SHPoolCreate, SHPoolCreateResponse

    implementation(s): Python

  1. GSPoolDestroy

    type enum

    GS_POOL_DESTROY

    purpose

    Request destruction of a managed memory pool.

    fields

    One of the m_uid and user_name fields must be present.

    If both are specified, then the user_name field is ignored.

    m_uid
    • integer

    • target memory pool ID

    user_name
    • string

    • user supplied name for the target pool

    response

    GSPoolDestroyResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSPoolDestroyResponse

    type enum

    GS_POOL_DESTROY_RESPONSE

    purpose

    Response to GSPoolDestroy message

    fields

    Alternatives on err:

    SUCCESS (= 0)

    Pool has been found and has been removed.

    No other fields.

    UNKNOWN (= 1)

    No such pool is known.

    err_info
    • string

    • explanation of what went wrong

    FAIL (= 2)

    Something went wrong in removing the pool. This could be for a number of reasons.

    err_info
    • string

    • explanation of what went wrong

    err_code
    • integer

    • The error code from the library call, if applicable. Not all errors will be revealed when the call is made; if the error isn’t due to the underlying API call

    GONE (= 3)

    The pool is already destroyed

    No other fields.

    PENDING (= 4)

    Pool creation is pending, and it can’t be destroyed. Wait until the GSPoolCreateResponse message has been received.

    request

    GSPoolDestroy

    implementation(s): Python

  1. GSPoolList

    type enum

    GS_POOL_LIST

    purpose

    Return a list of the m_uid for all pools currently alive.

    fields

    None additional

    response

    GSPoolListResponse

    see also

    GSPoolQuery

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSPoolListResponse

    type enum

    GS_POOL_LIST_RESPONSE

    purpose

    Responds with a list of m_uid for all the pools currently alive

    fields
    mlist
    • list of nonnegative integers

    request

    GSPoolList

    see also

    GSPoolQuery

    implementation(s): Python

  1. GSPoolQuery

    type enum

    GS_POOL_QUERY

    purpose

    Request the PoolDescriptor for a managed memory pool.

    fields

    One of the m_uid and user_name fields must be present.

    If both are specified, then the user_name field is ignored.

    m_uid
    • integer

    • target pool UID

    user_name
    • string

    • user supplied name for the target pool

    response

    GSPoolQueryResponse

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSPoolQueryResponse

    type enum

    GS_POOL_QUERY_RESPONSE

    purpose

    Response to request for PoolDescriptor for a managed memory pool. This object carries a serialized library pool descriptor and which node the pool is found on.

    fields

    Alternatives on err:

    SUCCESS (= 0)

    Pool has been found

    desc
    • map

    • initializer for PoolDescriptor

    UNKNOWN (= 1)

    No such pool is known.

    err_info
    • string

    • explanation of what went wrong

    request

    GSPoolQuery

    see also

    GSPoolList

    implementation(s): Python

Channel Messages

These messages are related to the lifecycle of channels. All requests related to channels will include the p_uid field of the requesting process and the r_c_uid field for the channel to send the response to.

Channels are identified in the runtime system with a nonnegative globally unique ID number, the ‘channel UID’ normally abbreviated as c_uid. Note that every channel is associated with a memory pool which itself has its own m_uid.

  1. GSChannelCreate

    type enum

    GS_CHANNEL_CREATE

    purpose

    Requests that a new channel be created.

    fields
    user_name
    • string

    • optional user specified name for the channel. If absent or empty, Global Services will select a name.

    options
    • map

    • Initializer for a (global) ChannelOptions object. This is an open type intended to completely describe the functionality of the channel. It contains fields pertaining to the global function and local characteristics of the channel.

    m_uid
    • integer

    • m_uid of the pool to create the channel in

    • if 0, use the default pool on the target node.

    response

    GSChannelCreateResponse

    see also

    GSChannelQuery

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSChannelCreateResponse

    type enum

    GS_CHANNEL_CREATE_RESPONSE

    purpose

    Response to channel creation request.

    fields

    Alternatives on err:

    SUCCESS (= 0)
    desc
    • map

    • Initializer for a ChannelDescriptor object. This object will contain the assigned c_uid and name for the channel and the serialized channel descriptor as well as the m_uid for the pool associated with the channel.

    FAIL (= 1)

    Something went wrong in channel creation.

    err_info
    • string

    • explanation of what went wrong

    ALREADY (= 2)

    The user name was already in use

    desc
    • map

    • Initializer for a ChannelDescriptor object. This object will contain the assigned c_uid and name for the channel as well as the serialized channel descriptor.

    request

    GSChannelCreate

    implementation(s): Python

  1. GSChannelList

    type enum

    GS_CHANNEL_LIST

    purpose

    Request list of currently active channels.

    fields

    None other than mandatory fields. To-do: add something to make this more selective.

    response

    GSChannelListResponse

    see also

    GSChannelQuery

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSChannelListResponse

    type enum

    GS_CHANNEL_LIST_RESPONSE

    purpose

    Response to a request to list the currently active channels, carrying a list of the c_uid for all the channels currently being managed

    fields
    clist
    • list of integers

    • The list contains the currently active c_uids

    request

    GSChannelList

    see also

    GSChannelQuery

    implementation(s): Python

  1. GSChannelQuery

    type enum

    GS_CHANNEL_QUERY

    purpose

    Request the descriptor for an already created channel by channel UID or user name.

    fields

    One of the c_uid and user_name fields must be present.

    If both are specified, then the user_name field is ignored.

    c_uid
    • integer

    • target channel UID

    user_name
    • string

    • user supplied name for the target channel

    inc_refcnt
    • bool

    • whether to increment refcnt on the Channel. Default False.

    response

    GSChannelQueryResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSChannelQueryResponse

    type enum

    GS_CHANNEL_QUERY_RESPONSE

    purpose

    Response to request for a channel descriptor

    fields

    Alternatives on err:

    SUCCESS (= 0)
    desc
    • map

    • Initializer for a ChannelDescriptor object. This object will contain the assigned c_uid and name for the channel as well as the m_uid of the pool associated with the channel. It also includes the serialized library level descriptor.

    UNKNOWN (= 1)

    No such channel is active.

    request

    GSChannelQuery

    implementation(s): Python

  1. GSChannelDestroy

    type enum

    GS_CHANNEL_DESTROY

    purpose

    Request that a channel be destroyed or a refcount on the channel be released.

    fields

    One of the c_uid and user_name fields must be present.

    If both are specified, then the user_name field is ignored.

    c_uid
    • integer

    • target channel UID

    user_name
    • string

    • user supplied name for the target channel

    reply_req
    • bool

    • default True, reply to this message requested

    dec_ref
    • bool

    • default False, release refcount on this channel

    response

    GSChannelDestroyResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSChannelDestroyResponse

    type enum

    GS_CHANNEL_DESTROY_RESPONSE

    purpose

    Response to request to destroy a channel.

    fields

    Alternatives on err:

    SUCCESS (= 0)

    No fields.

    UNKNOWN (= 1)

    No such channel is known.

    FAIL (= 2)

    The channel exists but

    err_info
    • string

    • explanation of what went wrong

    err_code
    • integer

    • error code from the library call that failed. If the problem is not related to a library call, this value is 0.

    BUSY (= 3)

    Some entity is currently using the channel in some way that prevents its destruction.

    err_info
    • string

    • explanation of what went wrong

    request

    GSChannelDestroy

    implementation(s): Python

  1. GSChannelJoin

    Request to be notified when a channel with a given name is created.

    Note: when we refactor the messages, we may want to get rid of GSChannelJoin and its response and ride all this on top of GSChannelQuery by adding a timeout.

    type enum

    GS_CHANNEL_JOIN

    purpose

    Sometimes two processes want to communicate through a channel but we want to create the channel lazily with either the send side or receive side doing it. This lets the other side meet the creation by name in global services without polling.

    fields
    identifier
    • string

    • name of channel to join to

    timeout
    • integer, interpreted as microseconds

    • timeout value; any value < 0 interpreted as infinite timeout

    response

    GSChannelJoinResponse

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSChannelJoinResponse

    type enum

    GS_CHANNEL_JOIN_RESPONSE

    purpose

    Response to request to join a channel

    fields

    Alternatives on err:

    SUCCESS (= 0)
    desc
    • map

    • Initializer for a ChannelDescriptor object. This object will contain the assigned c_uid and name for the channel as well as the m_uid of the pool associated with the channel. It also includes the serialized library level descriptor.

    TIMEOUT (= 1)

    The specified timeout has expired.

    No fields

    DEAD (= 2)

    The channel used to exist, but it has been destroyed.

    request

    GSChannelJoin

    implementation(s): Python

  1. GSChannelDetach

    PLACEHOLDER

    This probably needs to mean giving back all the send and receive handles currently acquired.

    type enum

    GS_CHANNEL_DETACH

    purpose

    Request local detachment from channel.

    fields
    c_uid
    • integer

    • id of channel to detach from

    response

    GSChannelDetachResponse

    see also

    GSChannelAttach

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSChannelDetachResponse

    PLACEHOLDER

    type enum

    GS_CHANNEL_DETACH_RESPONSE

    purpose

    Response to request to detach from a channel.

    fields

    Alternatives on err:

    SUCCESS (= 0)

    No fields.

    UNKNOWN (= 1)

    No such process is known.

    No fields

    UNKNOWN_CHANNEL (= 2)

    No such channel is known.

    No fields

    NOT_ATTACHED (= 3)

    Requested to detach from something you weren’t attached to to begin with.

    request

    GSChannelDetach

    implementation(s): Python

  1. GSChannelGetSendh

    PLACEHOLDER

    type enum

    GS_CHANNEL_GET_SENDH

    purpose

    Request send handle for channel

    fields
    c_uid
    • integer

    • id of channel to get a send handle for

    response

    GSChannelGetSendhResponse

    see also

    GSChannelGetRecvh

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSChannelGetSendhResponse

    PLACEHOLDER

    type enum

    GS_CHANNEL_GET_SENDH_RESPONSE

    purpose

    Response to request to get a channel send handle

    fields

    Alternatives on err:

    SUCCESS (= 0)
    sendh
    • map

    • info needed to set up the send handle locally

    UNKNOWN (= 1)

    No such process is known.

    No fields

    UNKNOWN_CHANNEL (= 2)

    No such channel is known.

    No fields

    NOT_ATTACHED (= 3)

    Descriptor not attached, can’t get send handle.

    CANT (= 4)

    Can’t get a send handle - too many already gotten or some other reason related to the type.

    err_info
    • string

    • explanation of why not

    request

    GSChannelGetSendh

    implementation(s): Python

  1. GSChannelGetRecvh

    PLACEHOLDER

    type enum

    GS_CHANNEL_GET_RECVH

    purpose

    Request recv handle for channel

    fields
    c_uid
    • integer

    • id of channel to get a recv handle for

    response

    GSChannelGetRecvhResponse

    see also

    GSChannelGetSendh

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSChannelGetRecvhResponse

    PLACEHOLDER

    type enum

    GS_CHANNEL_GET_RECVH_RESPONSE

    purpose

    Response to request to get a channel recv handle

    fields

    Alternatives on err:

    SUCCESS (= 0)
    recvh
    • map

    • info needed to set up the recv handle locally

    UNKNOWN (= 1)

    No such process is known.

    No fields

    UNKNOWN_CHANNEL (= 2)

    No such channel is known.

    No fields

    NOT_ATTACHED (= 3)

    Descriptor not attached, can’t get recv handle.

    CANT (= 4)

    Can’t get a recv handle - too many already gotten or some other reason related to the type.

    err_info
    • string

    • explanation of why not

    request

    GSChannelGetRecvh

    implementation(s): Python

Node Messages

These messages are related to hardware transactions with Global Services, e.g. querying the number of cpus in the system.

  1. GSNodeList

    type enum

    GS_NODE_LIST

    purpose

    Return a list of h_uid for all nodes currently registered.

    fields

    None additional

    response

    GSNodeListResponse

    see also

    GSNodeQuery

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. GSNodeListResponse

    type enum

    GS_NODE_LIST_RESPONSE

    purpose

    Responds with a list of h_uid for all the nodes currently registered.

    fields
    hlist
    • list of nonnegative integers

    request

    GSNodeList

    see also

    GSNodeQuery

    implementation(s): Python

  1. GSNodeQuery

    type enum

    GS_NODE_QUERY

    purpose

    Ask Global Services for a node descriptor of the hardware Dragon is running on.

    fields

    name
    • str

    • Name of the node to query

    h_uid
    • int

    • Unique host identifier (h_uid) held by GS.

    • The host running the Launcher Frontend has h_uid 0

    • The host running Global Services has h_uid 1

    response

    GSNodeQueryResponse

    implementation(s): Python

  1. GSNodeQueryResponse

    type enum

    GS_NODE_QUERY_RESPONSE

    purpose

    Return the machine descriptor after a GSNodeQuery.

    fields

    Alternatives on err:

    SUCCESS (= 0)

    The machine descriptor was successfully constructed

    desc
    • NodeDescriptor

    • Contains the status of the node Dragon is running on.

    UNKNOWN (= 1)

    An unknown error has occurred.

    see also

    GSMachineQuery

    implementation(s): Python

  1. GSNodeQueryTotalCPUCount

type enum

GS_NODE_QUERY_TOTAL_CPU_COUNT

purpose

Asks GS to return the total number of CPUs belonging to all of the registered nodes.

response

GSNodeQueryTotalCPUCountResponse

see also

refer to the Common Fields section for additional request message fields

implementation(s): Python

  1. GSNodeQueryTotalCPUCountResponse

type enum

GS_NODE_QUERY_TOTAL_CPU_COUNT_RESPONSE

purpose

Return the total number of CPUs belonging to all of the registered nodes.

fields

Alternatives on err:

SUCCESS (= 0)

The total CPU count was successfully computed.

total_cpus
  • total number of CPUs belonging to all of the registered nodes.

UNKNOWN (= 1)

An unknown error has occurred.

request

GSNodeQueryTotalCPUCount

see also

GSNodeQuery

implementation(s): Python

  1. GSGroupDestroy

type enum

GS_GROUP_DESTROY

purpose

Ask Global Services to destroy a group of resources. This means destroy all the member-resources, as well as the container/group.

fields

g_uid
  • integer

  • group UID

user_name
  • string

  • user supplied name for the group

response

GSGroupDestroyResponse

implementation(s): Python

  1. GSGroupDestroyResponse

type enum

GS_GROUP_DESTROY_RESPONSE

purpose

Response to GSGroupDestroy message

fields

Alternatives on err:

SUCCESS (= 0)

Group has been found and has been destroyed.

desc
  • base64 encoded string

  • The group descriptor

UNKNOWN (= 1)

No such group is known.

err_info
  • string

  • explanation of what went wrong

DEAD (= 2)

The group is already dead.

No other fields.

PENDING (= 3)

The group is pending.

err_info
  • string

  • informing message that the group is still pending

request

GSGroupDestroy

implementation(s): Python

  1. GSGroupAddTo

type enum

GS_GROUP_ADD_TO

purpose

Ask Global Services to add specific resources to an existing group of resources. The resources are already created.

fields

g_uid
  • integer

  • group UID

user_name
  • string

  • user supplied name for the group

items
  • list of strings or integers

  • list of user supplied names or UIDs for the resources

response

GSGroupAddToResponse

implementation(s): Python

  1. GSGroupAddToResponse

type enum

GS_GROUP_ADD_TO_RESPONSE

purpose

Response to GSGroupAddTo message

fields

Alternatives on err:

SUCCESS (= 0)

Group has been found and the resources have been added to it.

desc
  • base64 encoded string

  • The group descriptor

UNKNOWN (= 1)

No such group is known.

err_info
  • string

  • explanation of what went wrong

FAIL (= 2)

The request has failed. This can happen when at least one of the resources to be added to the group is dead or not found.

err_info
  • string

  • informing message with the UIDs of the resources that were not found or are dead

DEAD (= 3)

The group is already dead.

No other fields.

PENDING (= 4)

The group is pending.

err_info
  • string

  • informing message that the group is still pending

request

GSGroupAddTo

implementation(s): Python

  1. GSGroupRemoveFrom

type enum

GS_GROUP_REMOVE_FROM

purpose

Ask Global Services to remove specific resources from an existing group of resources.

fields

g_uid
  • integer

  • group UID

user_name
  • string

  • user supplied name for the group

items
  • list of strings or integers

  • list of user supplied names or UIDs for the resources

response

GSGroupRemoveFromResponse

implementation(s): Python

  1. GSGroupRemoveFromResponse

type enum

GS_GROUP_REMOVE_FROM_RESPONSE

purpose

Response to GSGroupRemoveFrom message

fields

Alternatives on err:

SUCCESS (= 0)

Group has been found and the resources have been removed from it.

desc
  • base64 encoded string

  • The group descriptor

UNKNOWN (= 1)

No such group is known.

err_info
  • string

  • explanation of what went wrong

FAIL (= 2)

The request has failed. This can happen when at least one of the resources to be removed from the group was not found.

err_info
  • string

  • informing message with the UIDs of those resources that were not found

DEAD (= 3)

The group is already dead.

No other fields.

PENDING (= 4)

The group is pending.

err_info
  • string

  • informing message that the group is still pending

request

GSGroupRemoveFrom

implementation(s): Python

  1. GSGroupCreate

type enum

GS_GROUP_CREATE

purpose

Ask Global Services to create a group of resources.

fields

user_name
  • string

  • user supplied name for the group

items
  • list[tuple[int, dragon.infrastructure.messages.Message]]

  • list of tuples where each tuple contains a replication factor and the Dragon create message

policy
  • infrastructure.policy.Policy or dict

  • policy object that controls the placement of the resources

response

GSGroupCreateResponse

implementation(s): Python
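The items field pairs each create message with a replication factor. As a rough sketch of how such a body might be assembled before JSON serialization, assuming each create message is already available as a plain dictionary: `build_group_create` and `expected_member_count` are hypothetical helpers, the nested dictionaries are placeholders for real Dragon create messages, and the way Dragon itself encodes nested messages may differ.

```python
import json


def build_group_create(user_name, items, policy=None):
    """Assemble a GSGroupCreate-like body from (replication, create_message) pairs."""
    encoded = []
    for count, create_msg in items:
        # Assumption for this sketch: a replication factor below 1 is a caller error.
        if count < 1:
            raise ValueError("replication factor must be at least 1")
        # JSON has no tuple type, so each pair becomes a two-element list.
        encoded.append([count, create_msg])
    return {"user_name": user_name, "items": encoded, "policy": policy or {}}


def expected_member_count(body):
    """Total number of resources the group would hold once every item is replicated."""
    return sum(count for count, _ in body["items"])


# Placeholder create messages: four workers and one logger.
body = build_group_create("workers", [(4, {"exe": "worker"}), (1, {"exe": "logger"})])
wire = json.dumps(body)
```

Summing the replication factors gives the number of members a caller could expect to find in the resulting group.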

  1. GSGroupCreateResponse

type enum

GS_GROUP_CREATE_RESPONSE

purpose

Response to GSGroupCreate message

fields

desc
  • base64 encoded string

  • The group descriptor

request

GSGroupCreate

implementation(s): Python

  1. GSGroupKill

type enum

GS_GROUP_KILL

purpose

Ask Global Services to send the processes belonging to a specified group a specified signal.

fields

g_uid
  • integer

  • group UID

user_name
  • string

  • user supplied name for the group

sig
  • int

  • signal to use to kill the process, default=signal.SIGKILL

response

GSGroupKillResponse

implementation(s): Python
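Because sig travels as an integer, a Python sender would convert the `signal` module constant before serializing. A minimal sketch follows, assuming a POSIX host where `signal.SIGKILL` is defined. `make_group_kill` is a hypothetical helper, and the common request fields are omitted.

```python
import json
import signal


def make_group_kill(g_uid=None, user_name="", sig=signal.SIGKILL):
    """Build a GSGroupKill-like body; sig defaults to SIGKILL as documented."""
    # signal.Signals members are IntEnum values; int() makes the wire value explicit.
    return json.dumps({"g_uid": g_uid, "user_name": user_name, "sig": int(sig)})


default_kill = json.loads(make_group_kill(g_uid=42))
gentle_kill = json.loads(make_group_kill(user_name="workers", sig=signal.SIGTERM))
```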

  1. GSGroupKillResponse

type enum

GS_GROUP_KILL_RESPONSE

purpose

Response to GSGroupKill message

fields

Alternatives on err:

SUCCESS (= 0)

Group has successfully sent the kill requests.

desc
  • base64 encoded string

  • The group descriptor

UNKNOWN (= 1)

No such group is known.

err_info
  • string

  • explanation of what went wrong

DEAD (= 2)

The group is already dead.

No other fields.

PENDING (= 3)

The group is pending.

err_info
  • string

  • informing message that the group is still pending

ALREADY (= 4)

All the processes of the group were already dead.

desc
  • base64 encoded string

  • The group descriptor

request

GSGroupKill

implementation(s): Python

  1. GSGroupCreateAddTo

type enum

GS_GROUP_CREATE_ADD_TO

purpose

Ask Global Services to create and add resources to an existing group of resources.

fields

g_uid
  • integer

  • group UID

user_name
  • string

  • user supplied name for the group

items
  • list of strings or integers

  • list of user supplied names or UIDs for the resources

policy
  • policy object that controls the placement of the resources

response

GSGroupCreateAddToResponse

implementation(s): Python

  1. GSGroupCreateAddToResponse

type enum

GS_GROUP_CREATE_ADD_TO_RESPONSE

purpose

Response to GSGroupCreateAddTo message

fields

Alternatives on err:

SUCCESS (= 0)

Group has been found and the resources have been created and added to it.

desc
  • base64 encoded string

  • The group descriptor

UNKNOWN (= 1)

No such group is known.

err_info
  • string

  • explanation of what went wrong

DEAD (= 2)

The group is already dead.

No other fields.

PENDING (= 3)

The group is pending.

err_info
  • string

  • informing message that the group is still pending

request

GSGroupCreateAddTo

implementation(s): Python

  1. GSGroupDestroyRemoveFrom

type enum

GS_GROUP_DESTROY_REMOVE_FROM

purpose

Ask Global Services to destroy and remove specific resources from an existing group of resources.

fields

g_uid
  • integer

  • group UID

user_name
  • string

  • user supplied name for the group

items
  • list of strings or integers

  • list of user supplied names or UIDs for the resources

response

GSGroupDestroyRemoveFromResponse

implementation(s): Python

  1. GSGroupDestroyRemoveFromResponse

type enum

GS_GROUP_DESTROY_REMOVE_FROM_RESPONSE

purpose

Response to GSGroupDestroyRemoveFrom message

fields

Alternatives on err:

SUCCESS (= 0)

Group has been found and the resources have been removed from it.

desc
  • base64 encoded string

  • The group descriptor

UNKNOWN (= 1)

No such group is known.

err_info
  • string

  • explanation of what went wrong

FAIL (= 2)

The request has failed. This can happen when at least one of the resources to be removed from the group was not found.

err_info
  • string

  • informing message with the UIDs of those resources that were not found

DEAD (= 3)

The group is already dead.

No other fields.

PENDING (= 4)

The group is pending.

err_info
  • string

  • informing message that the group is still pending

request

GSGroupDestroyRemoveFrom

implementation(s): Python

  1. GSGroupList

type enum

GS_GROUP_LIST

purpose

Request a list of the g_uid for all the groups of resources currently being managed

fields

None additional

response

GSGroupListResponse

see also

GSGroupQuery

refer to the Common Fields section for additional request message fields

implementation(s): Python

  1. GSGroupListResponse

type enum

GS_GROUP_LIST_RESPONSE

purpose

Response to GSGroupList message

fields
glist
  • list of nonnegative integers for all the groups

request

GSGroupList

implementation(s): Python

  1. GSGroupQuery

type enum

GS_GROUP_QUERY

purpose

Request the GroupDescriptor for a managed group of resources

fields

One of the g_uid and user_name fields must be present.

If both are specified, then the user_name field is ignored.

g_uid
  • integer

  • target group UID

user_name
  • string

  • user supplied name for the target group

response

GSGroupQueryResponse

refer to the Common Fields section for additional request message fields

implementation(s): Python
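The precedence rule for g_uid and user_name can be sketched as a small resolver. This is an illustration only: `resolve_group_target` is a hypothetical name, and treating a missing key or `None` as "not specified" is an assumption about how absence is encoded.

```python
def resolve_group_target(msg: dict):
    """Pick the lookup key for a GSGroupQuery-like body.

    g_uid wins when both fields are given; user_name is used only as a fallback.
    """
    g_uid = msg.get("g_uid")
    user_name = msg.get("user_name")
    if g_uid is not None:
        return ("g_uid", g_uid)
    if user_name:
        return ("user_name", user_name)
    raise ValueError("one of g_uid or user_name must be present")


by_both = resolve_group_target({"g_uid": 5, "user_name": "workers"})
by_name = resolve_group_target({"user_name": "workers"})
```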

  1. GSGroupQueryResponse

type enum

GS_GROUP_QUERY_RESPONSE

purpose

Response to request for GroupDescriptor for a managed group

fields

Alternatives on err:

SUCCESS (= 0)

Group has been found

desc
  • map

  • initializer for GroupDescriptor - includes at a minimum g_uid of new group and assigned user name.

UNKNOWN (= 1)

No such group is known.

err_info
  • string

  • explanation of what went wrong

request

GSGroupQuery

see also

GSGroupList

implementation(s): Python

Other Messages

These messages are related to other transactions on Global Services, for example ones related to sequencing runtime startup and teardown.

  1. AbnormalTermination

    type enum

    ABNORMAL_TERMINATION

    purpose

    Error result for startup and teardown messages, as well as for reporting problems that happen during execution that are not tied to any specific request.

    Morally speaking, this message backs an exception class related to things going wrong in Dragon Run-time Services. These are generally nonrecoverable errors, but it is desirable to propagate an explanation of what went wrong, back to the front end when possible - users won’t want to collect disparate log files for basic bug reports.

    fields
    err_info
    • string

    • explanation of what went wrong.

    implementation(s): Python

  1. GSStarted

    type enum

    GS_STARTED

    purpose

    Confirm to Launcher that the Global Services process (and if applicable, subsidiary processes) are started.

    fields

    None

    implementation(s): Python

  1. GSPingSH

    type enum

    GS_PING_SH

    purpose

    Confirm to Shepherd(s) that Global Services has started and request handshake response.

    fields

    None

    see also

    SHPingGS

    implementation(s): Python

  1. GSIsUp

    type enum

    GS_IS_UP

    purpose

    Confirm to Launcher that Global Services is completely up

    fields

    None

    implementation(s): Python

  1. GSHeadExit

    type enum

    GS_HEAD_EXIT

    purpose

    Notify Launcher that the head process has exited. At this point the Launcher can start teardown or start a new head process.

    fields
    exit_code
    • integer

    • exit code the process left with

    implementation(s): Python

  1. GSChannelRelease

    type enum

    GS_CHANNEL_RELEASE

    purpose

    Tell the Shepherd(s) that Global Services is exiting and will no longer be sending anything through Channels

    fields

    None

    implementation(s): Python

  1. GSHalted

    type enum

    GS_HALTED

    purpose

    Notify Launcher that Global Services is halted. The Global Services processes are expected to exit after sending this message, having done everything possible to clean up.

    fields

    None

    implementation(s): Python

  1. GSTeardown

    type enum

    GS_TEARDOWN

    purpose

    Direct Global Services to do a clean exit - clean up any remaining global structures and its own sub-workers, if any, and detach from channels.

    This should only be given when there isn’t an active head process. If there is such a thing, we should probably take everything down anyway but complain about it.

    implementation(s): Python

  1. GSPingProc

    type enum

    GS_PING_PROC

    purpose

    When a new managed process wanting to use Global Services comes up, it must wait to receive this message on its Global Services return channel before sending any messages to Global Services.

    This message also carries ‘argument’ data, which is defined as binary data to be delivered to the target at startup. This can be used in whatever way the target would like. For example, the implementation of the Python Process object uses this to deliver the Python target function and arguments.

    Remark. Even if no arguments are needed, this message must be sent to solve a race with the local Shepherd’s notification to Global Services that the process was successfully started.

    fields

    mode
    • integer

    • explains how the argdata field should be interpreted

    • Alternatives:
      • NONE (=0) No arguments are carried

      • PYTHON_IMMEDIATE (=1) The arguments are found in fields of the argdata field.

      • PYTHON_CHANNEL (=2) The argdata has fields for a serialized channel to attach to and get the args from. This mode is for arguments that would otherwise challenge the available space in the pool the argument delivery channel is found in.

    argdata
    • base64 encoded string, fixme, with the binary argument data to be delivered, or nil.

    implementation(s): Python
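A receiver has to look at mode before touching argdata. The sketch below shows one way to do that, assuming argdata is a standard base64 string or null. `ArgMode` and `decode_argdata` are hypothetical names, and how the decoded bytes are interpreted for each mode is left to the target.

```python
import base64
import enum


class ArgMode(enum.IntEnum):
    """Hypothetical mirror of the documented mode alternatives."""
    NONE = 0
    PYTHON_IMMEDIATE = 1
    PYTHON_CHANNEL = 2


def decode_argdata(msg: dict):
    """Return (mode, raw_bytes_or_None) for a GSPingProc-like body."""
    mode = ArgMode(msg["mode"])
    if mode is ArgMode.NONE or msg.get("argdata") is None:
        return mode, None
    return mode, base64.b64decode(msg["argdata"])


# Placeholder payload standing in for serialized arguments.
payload = base64.b64encode(b"\x00\x01args").decode("ascii")
mode, data = decode_argdata({"mode": 1, "argdata": payload})
```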

  1. GSDumpState

    type enum

    GS_DUMP_STATE

    purpose

    Primarily debugging. Makes global services dump its state in human readable form to a file specified in the message.

    Remark. May in the future try for a formal serialization of GS state once more of the design is concrete. May also have a reply to this message.

    fields
    filename
    • string

    • file to open and write the dump to

    implementation(s): Python

  1. GSUnexpected

    type enum

    GS_UNEXPECTED

    purpose

    Whenever GS gets a message that is not expected as an input to GS (which can depend on the state it is in) and a valid reply handle can be found corresponding to that message, this message gets returned as a response.

    If a valid reply handle can’t be found, then GS will log an error but continue to run.

    This kind of thing can happen when issuing interactive commands to a global services process from the launcher. If the head process leaves unexpectedly there can be a race; this message allows handling this without putting GS into error.

    TODO: put this in GS documentation and xref, post GS doc update.

    fields
    ref
    • integer

    • matches the tag value of the unexpected message, if it has one.

    implementation(s): Python
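On the sending side, the ref field lets a client work out which of its outstanding requests GS rejected. A minimal sketch, assuming the client keeps a dictionary of pending requests keyed by tag: `on_unexpected` and the `pending` table are hypothetical and not part of Dragon.

```python
def on_unexpected(msg: dict, pending: dict):
    """Remove and return the pending request that a GSUnexpected-like body refers to.

    Returns None when ref is missing or does not match anything outstanding.
    """
    return pending.pop(msg.get("ref"), None)


# Placeholder bookkeeping: tag -> name of the request that was sent.
pending = {7: "GSGroupQuery", 8: "GSNodeList"}
rejected = on_unexpected({"ref": 7}, pending)
unmatched = on_unexpected({"ref": 99}, pending)
```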

  1. ExceptionlessAbort

    type enum

    EXCEPTIONLESS_ABORT

    purpose

    It can be beneficial to pass a message through the services runtime that communicates an abnormal termination has been encountered without immediately raising an exception. This message can thus be used to coordinate tearing down of resources without immediately raising an exception.

    fields

    None other than default

    implementation(s): Python

Local Services Messages API

These messages are the ones underlying the Local Services API and are sent only to and from the Shepherd. Like all request messages, these Shepherd request messages have a p_uid field denoting the process that sent the request message and an r_c_uid field denoting the channel ID the response should be returned to.

LS Process Messages

  1. SHProcessCreate

    type enum

    SH_PROCESS_CREATE

    purpose

    Request to Shepherd to launch a process locally.

    fields
    t_p_uid
    • integer

    • process UID of the target process - Global Services will use this to refer to the process in other operations.

    exe
    • string

    • executable name

    args
    • list of strings

    • arguments to the executable. [exe, args] == argv

    env
    • map, key=string value=string

    • Environment variables to be exported before running.

      Note that this is additive/override to the shepherd environment - may need to be more fancy here if it turns out we need to add to paths

    rundir
    • string, absolute path

    • Indicates where the process is to be run from. If empty, does not matter.

    options

    output_c_uid is the channel id where output, both stdout and stderr, should be forwarded if present in an SHFwdOutput message. If this field is not present in the options, then output will be forwarded to the Backend. The SHFwdOutput message indicates whether it was from stdout or stderr.

    response

    SHProcessCreateResponse

    see also

    SHProcessKill

    refer to the Common Fields section for additional request message fields

    implementation(s): Python
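The relationship between exe, args, env and rundir can be sketched as the parameters a launcher would derive from the message. This is an illustration under stated assumptions, not the Shepherd's implementation: `build_launch_params` is a hypothetical helper, and an empty rundir is mapped to `None` to mean "no particular working directory".

```python
import os


def build_launch_params(msg: dict, base_env=None):
    """Derive (argv, env, cwd) from an SHProcessCreate-like body."""
    # argv is the executable followed by its arguments.
    argv = [msg["exe"], *msg.get("args", [])]
    # The message environment is additive: it overrides or extends the base environment.
    env = dict(os.environ if base_env is None else base_env)
    env.update(msg.get("env", {}))
    cwd = msg.get("rundir") or None
    return argv, env, cwd


request = {
    "exe": "python3",
    "args": ["-c", "print('hi')"],
    "env": {"MODE": "test", "PATH": "/opt/bin"},
    "rundir": "",
}
argv, env, cwd = build_launch_params(request, base_env={"PATH": "/usr/bin", "HOME": "/root"})
```

The resulting triple has the shape expected by `subprocess.Popen(argv, env=env, cwd=cwd)`, which is one way a Python launcher could consume it.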

  1. SHProcessCreateResponse

    type enum

    SH_PROCESS_CREATE_RESPONSE

    purpose

    Response to process creation request.

    fields

    Alternatives on err:

    SUCCESS (= 0)

    no fields

    FAIL (= 1)
    err_info
    • string

    • what went wrong

    request

    SHProcessCreate

    see also

    SHProcessKill

    implementation(s): Python

  1. SHProcessKill

    type enum

    SH_PROCESS_KILL

    purpose

    Request to kill a process owned by this shepherd, with a specified signal.

    fields
    t_p_uid
    • integer

    • process UID of the target process

    sig
    • integer, valid signal number

    • signal to send to the process

    response

    SHProcessExit. Note the difference in nomenclature.

    see also

    SHProcessCreate

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. SHMultiProcessCreate

type enum

SH_MULTI_PROCESS_CREATE

purpose

Request to Shepherd to launch multiple processes locally.

fields
pmi_group_info
  • Optional PMIGroupInfo structure.

  • Contains common PMI/MPI values needed to start the requested PMI enabled applications, including the nid_list, host_list.

procs
  • list of SHProcessCreate messages representing the processes to be started.

response

SHMultiProcessCreateResponse

see also

SHProcessKill

refer to the Common Fields section for additional request message fields

implementation(s): Python

  1. SHMultiProcessCreateResponse

type enum

SH_MULTI_PROCESS_CREATE_RESPONSE

purpose

Response to multi process creation request.

fields

Alternatives on err:

SUCCESS (= 0)
responses
  • list of SHProcessCreateResponse messages for each previously requested process.

FAIL (= 1)
err_info
  • string

  • what went wrong

request

SHMultiProcessCreate

see also

SHProcessKill

implementation(s): Python
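Since the SUCCESS alternative carries one SHProcessCreateResponse per requested process, a caller may still want to scan the list for individual failures. The sketch below assumes each nested response is available as a dictionary with its own `err` and optional `err_info`. `failed_creates` is a hypothetical helper, and whether a successful multi-response can contain failed entries is an assumption.

```python
def failed_creates(multi_response: dict):
    """Return (index, err_info) for every nested create response that failed."""
    if multi_response["err"] != 0:
        # The request as a whole failed; there is no per-process list to inspect.
        raise RuntimeError(multi_response.get("err_info", "multi-process create failed"))
    return [
        (index, response.get("err_info", ""))
        for index, response in enumerate(multi_response["responses"])
        if response["err"] != 0
    ]


# Placeholder response: the second of two processes could not be started.
reply = {"err": 0, "responses": [{"err": 0}, {"err": 1, "err_info": "exec not found"}]}
failures = failed_creates(reply)
```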

  1. SHProcessKillResponse

    type enum

    SH_PROCESS_KILL_RESPONSE

    purpose

    Response to request to kill a process owned by this shepherd. Returns confirmation that the signal was delivered only. The process’s exit comes automatically via SHProcessExit.

    fields

    Alternatives on err:

    SUCCESS (= 0)

    no fields

    FAIL (= 1)
    err_info
    • string

    • what went wrong

    request

    SHProcessKill

    see also

    SHProcessCreate SHProcessExit

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. SHProcessExit

    type enum

    SH_PROCESS_EXIT

    purpose

    This message is sent to Global Services when a managed process exits. There are no ref or err fields.

    fields
    exit_code
    • int

    • numerical exit code.

    p_uid
    • int

    • the Process ID of the process that exited. When a process exits of its own accord, this is the only thing to identify which one it is.

    see also

    SHProcessKill

    implementation(s): Python

Pool Messages

These messages are directives to a Local Services process to execute pool lifecycle commands. They are posed in the form of individual messages instead of generic operation completions because the Shepherd holds these OS resources.

  1. SHPoolCreate

    type enum

    SH_POOL_CREATE

    purpose

    Create a new memory pool.

    fields
    m_uid
    • positive integer

    • system wide unique identifier for this pool. All future operations on this pool will refer to it using the m_uid.

    name
    • string

    • name of the pool to use when creating it. This name is guaranteed to be globally unique.

    size
    • positive integer

    • size in bytes of the pool.

    attr
    • string

    • this string is the base64 encoding of a serialized pool attribute object to use when creating the pool

    response

    SHPoolCreateResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. SHPoolCreateResponse

    type enum

    SH_POOL_CREATE_RESPONSE

    purpose

    Response to request to create a new memory pool.

    fields

    Alternatives on err:

    SUCCESS (= 0)

    The pool was created successfully.

    desc
    • string

    • the string is the base64 encoding of a serialized pool descriptor for the pool that just got created

    FAIL (= 1)

    The pool was not created successfully. This could be because the pool creation call failed or because there was something wrong with the pool creation message.

    err_info
    • string

    • what went wrong

    request

    SHPoolCreate

    implementation(s): Python

  1. SHPoolDestroy

type enum

SH_POOL_DESTROY

purpose

Request to destroy a memory pool.

fields
m_uid
  • positive integer

  • ID of the pool to destroy

response

SHPoolDestroyResponse

see also

refer to the Common Fields section for additional request message fields

implementation(s): Python

  1. SHPoolDestroyResponse

type enum

SH_POOL_DESTROY_RESPONSE

purpose

Response to request to destroy a memory pool

fields

Alternatives on err:

SUCCESS (= 0)

The pool was destroyed. No other fields

FAIL (= 1)

The pool was not destroyed successfully. This could be because the pool destroy call failed or because there was something wrong with the pool destroy message.

err_info
  • string

  • what went wrong

request

SHPoolDestroy

implementation(s): Python

  1. SHExecMemRequest

type enum

SH_EXEC_MEM_REQUEST

purpose

Request to execute a memory request. This message contains a serialized operation given to the Shepherd to execute on behalf of the requesting process using dragon_memory_exec_req or dragon_memory_pool_exec_request

fields
kind
  • integer, legal values are 0 or 1

  • If 0, the request is a serialized dragonMemoryPoolRequest

  • If 1, the request is a serialized dragonMemoryRequest

request
  • string

  • this is the base64 encoding of the request.

response

SHExecMemResponse

see also

refer to the Common Fields section for additional request message fields

implementation(s): Python
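The kind and request fields can be illustrated with a small builder. This sketch assumes the serialized operation is already available as bytes. `MemReqKind` and `make_exec_mem_request` are hypothetical names, and the byte string in the example is a placeholder rather than a real serialized request.

```python
import base64
import enum


class MemReqKind(enum.IntEnum):
    """Hypothetical names for the two legal values of kind."""
    POOL_REQUEST = 0    # serialized dragonMemoryPoolRequest
    MEMORY_REQUEST = 1  # serialized dragonMemoryRequest


def make_exec_mem_request(kind: int, serialized: bytes) -> dict:
    """Build an SHExecMemRequest-like body; rejects kinds other than 0 or 1."""
    checked = MemReqKind(kind)  # raises ValueError for any other integer
    return {
        "kind": int(checked),
        "request": base64.b64encode(serialized).decode("ascii"),
    }


body = make_exec_mem_request(1, b"\x10\x20\x30")
```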

  1. SHExecMemResponse

type enum

SH_EXEC_MEM_RESPONSE

purpose

Response to request to execute a memory request. Note that there is no need for this message to include what kind of request it was - the sender knows this.

fields

Alternatives on err:

SUCCESS (= 0)

The memory request call completed. Even if the call did not complete with a success return value, the response has the details about what went wrong including the error code and this will be apparent when the original requester calls the completion function on the response.

response
  • string

  • this is the base64 encoding of the serialized response

FAIL (= 1)

The request could not be completed.

err_info
  • string

  • what went wrong

err_code
  • integer

  • if the error is due to a library call, this should carry the return value of the call. Otherwise its value should be 0.

implementation(s): Python

Channel Messages

  1. SHChannelCreate

    type enum

    SH_CHANNEL_CREATE

    purpose

    Request to create a channel in a memory pool known to this shepherd.

    fields
    m_uid
    • integer

    • Unique id of the pool to create the channel inside.

    c_uid
    • integer

    • Unique id of the channel to be created. Future operations will use this number to refer to the channel

    options
    • map

    • Initializer for a (local) ChannelOptions object. This is an open type intended to completely describe the controllable parameters for channel creation at the library level. Will contain attributes string from library.

    response

    SHChannelCreateResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python


  1. SHChannelCreateResponse

    type enum

    SH_CHANNEL_CREATE_RESPONSE

    purpose

    Response to channel allocation request.

    fields

    Alternatives on err:

    SUCCESS (= 0)
    desc
    • string

    • serialized and ascii encoded library channel descriptor

    FAIL (= 1)
    err_info
    • string

    • what went wrong

    request

    SHChannelCreate

    implementation(s): Python

  1. SHChannelDestroy

    type enum

    SH_CHANNEL_DESTROY

    purpose

    Request to free a previously allocated channel.

    fields
    c_uid
    • integer

    • unique id of the channel to be destroyed.

    response

    SHChannelDestroyResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. SHChannelDestroyResponse

    type enum

    SH_CHANNEL_DESTROY_RESPONSE

    purpose

    Response to request to free a previously allocated channel.

    fields

    Alternatives on err:

    SUCCESS (= 0)

    No additional fields

    FAIL (= 1)
    err_info
    • string

    • what went wrong

    request

    SHChannelDestroy

    implementation(s): Python

  1. SHLockChannel

    PLACEHOLDER

    type enum

    SH_LOCK_CHANNEL

    purpose

    Request to lock a channel

    TODO: should only something attached to a channel be allowed to lock it?

    fields
    read_lock
    • bool

    • locking this for reading

    write_lock
    • bool

    • locking this for writing

    response

    SHLockChannelResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. SHLockChannelResponse

    PLACEHOLDER

    type enum

    SH_LOCK_CHANNEL_RESPONSE

    purpose

    Response to request to lock a channel

    fields

    Alternatives on err:

    SUCCESS (= 0)

    Successfully locked.

    ALREADY (= 1)

    Some other entity has already locked the channel.

    request

    SHLockChannel

    see also

    x

    implementation(s): Python

Memory Allocation Messages

  1. SHAllocMsg

    PLACEHOLDER - maybe OBE

    type enum

    SH_ALLOC_MSG

    purpose

    Request a shared memory allocation for a large message

    fields

    x

    response

    SHAllocMsgResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. SHAllocMsgResponse

    PLACEHOLDER - OBE?

    type enum

    SH_ALLOC_MSG_RESPONSE

    purpose

    Response to a requested allocation for a large message

    fields

    Alternatives on err:

    SUCCESS (= 0)

    request

    SHAllocMsg

    see also

    x

    implementation(s): Python

  1. SHAllocBlock

    PLACEHOLDER - OBE?

    type enum

    SH_ALLOC_BLOCK

    purpose

    Request a shared memory allocation for generic memory

    fields
    p_uid
    • integer

    • Process UID of the requesting process.

    Note that the Shepherd may need to know this to do cleanup if the process goes away. There probably needs to be an option as to which is the owning process.

    len
    • integer

    • length of allocation desired

    response

    SHAllocBlockResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. SHAllocBlockResponse

    PLACEHOLDER - OBE?

    type enum

    SH_ALLOC_BLOCK_RESPONSE

    purpose

    Response to a requested allocation for generic memory

    fields

    Alternatives on err:

    SUCCESS (= 0)

    Successfully allocated the block. Do we want a reference ID for the block to let other processes claim ownership? E.g. we could have the shepherd count ref on such blocks.

    seg_name
    • string

    • name of shared memory segment the block is found in.

    offset
    • nonnegative integer

    • byte offset of the allocated block within segment

    UNKNOWN (= 1)

    The p_uid of the requester is not a process currently running and being managed by the shepherd

    FAIL (= 2)

    The block could not be allocated.

    err_info
    • string

    • what went wrong

    request

    SHAllocBlock

    see also

    x

    implementation(s): Python

Other Messages

These Local Services messages are for other purposes, such as sequencing bringup and teardown. Where appropriate they have the idx field, indicating which Shepherd instance there is. This is an integer inside [0,N), where N is the number of compute nodes in the allocation. It has the value 0 in the single node case.

  1. SHChannelsUp

    type enum

    SH_CHANNELS_UP

    purpose

    Notify Launcher that this Shepherd has allocated the shared memory for the channels and initialized whatever default channels there are.

    fields
    node_desc
    • Python dragon.infrastructure.node_desc.NodeDescriptor object or its serialized descriptor

    idx
    • integer

    • which Shepherd in the allocation this is. Index is 0 in the single node case and lives in [0,N) for N multi node back end allocation

    gs_cd
    • base64 encoded string

    • When idx is the primary node, then this is the channel descriptor of global services. Otherwise it is ignored and should be set to an empty string.

    implementation(s): Python
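The idx range and the gs_cd rule can be sketched as a check a receiver might apply. This is an illustration only: `gs_descriptor_from` is a hypothetical helper, and the index of the primary node is passed in as a parameter because the text here does not say which idx is primary.

```python
def gs_descriptor_from(msg: dict, num_nodes: int, primary_idx: int = 0):
    """Return gs_cd from an SHChannelsUp-like body if it comes from the primary node.

    For any other node the field is ignored and None is returned.
    primary_idx defaulting to 0 is an assumption made for this sketch.
    """
    idx = msg["idx"]
    if not 0 <= idx < num_nodes:
        raise ValueError(f"idx {idx} is outside [0, {num_nodes})")
    return msg["gs_cd"] if idx == primary_idx else None


# Placeholder descriptors: only the primary node carries a non-empty gs_cd.
primary = gs_descriptor_from({"idx": 0, "gs_cd": "QUJD"}, num_nodes=4)
secondary = gs_descriptor_from({"idx": 2, "gs_cd": ""}, num_nodes=4)
```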


  1. SHPingGS

    type enum

    SH_PING_GS

    purpose

    Acknowledge to Global Services that this Shepherd is up and ready.

    fields
    idx
    • integer

    • which Shepherd in the allocation this is. Index is 0 in the single node case and lives in [0,N) for N multi node back end allocation.

    node_sdesc
    • serialized descriptor of the node the Shepherd is running on.

    • Note that the h_uid in the descriptor will be None in this message. It is set later by GS.

    implementation(s): Python

  1. SHHalted

    type enum

    SH_HALTED

    purpose

    Notify launcher that this Shepherd is halted.

    fields
    idx
    • integer

    • which Shepherd in the allocation this is. Index is 0 in the single node case and lives in [0,N) for N multi node back end allocation

    implementation(s): Python

  1. SHTeardown

    type enum

    SH_TEARDOWN

    purpose

    Direct Shepherd to do a clean teardown.

    fields

    None other than default

    implementation(s): Python

  1. SHPingBE

    type enum

    SH_PING_BE

    purpose

    Shepherd handshake with MRNet backend

    fields

    shep_cd

    • Channel Descriptor for the Shepherd (serialized, base64 encoded)

    be_cd

    • The Launcher Back End Channel Descriptor (serialized, base64 encoded)

    gs_cd

    • The Global Services Channel Descriptor (serialized, base64 encoded)

    default_pd

    • The Default Pool Descriptor (serialized, base64 encoded)

    inf_pd

    • The Infrastructure Pool Descriptor (serialized, base64 encoded)

    implementation(s): Python

  1. SHHaltBE

    type enum

    SH_HALT_BE

    purpose

    Shepherd telling the MRNet backend to exit.

    fields

    None other than default

    implementation(s): Python

  1. SHFwdInput

    type enum

    SH_FWD_INPUT

    purpose

    Message carrying data intended to be written into the standard input of a process managed by this shepherd.

    If the process is present and the stdin can be written without any problems, there is no response message.

    fields
    t_p_uid
    • integer

    • target p_uid

    input
    • string

    • input to be sent to target process

    • The max length of any input to be sent through this mechanism in one message is 1024 bytes.

    confirm
    • boolean

    • request confirmation of forwarded input.

    response

    SHFwdInputErr

    implementation(s): Python
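
Because a single SHFwdInput message carries at most 1024 bytes of input, a sender with more data has to split it across several messages. The sketch below shows one way to do that. The function name split_input is hypothetical, and measuring the limit on the UTF-8 encoding of the string is an assumption, since the encoding is not stated here.

```python
MAX_FWD_BYTES = 1024  # per-message limit stated for the input field


def split_input(text, limit=MAX_FWD_BYTES):
    """Split text into pieces whose UTF-8 encoding is at most `limit` bytes each.

    Characters are never split, so each piece is itself a valid string.
    """
    chunks = []
    current = ""
    current_size = 0
    for ch in text:
        size = len(ch.encode("utf-8"))
        # Start a new chunk when adding this character would exceed the limit.
        if current and current_size + size > limit:
            chunks.append(current)
            current, current_size = "", 0
        current += ch
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

Each returned piece would then be placed in the input field of its own SHFwdInput message, all with the same t_p_uid.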

  1. SHFwdInputErr

    type enum

    SH_FWD_INPUT_ERR

    purpose

    Error response to a forward input message. This message is returned to the sender if (for instance) the target process isn’t present at the time of delivery.

    fields

    All messages:

    idx
    • integer

    • which Shepherd in the allocation this is. The index is 0 in the single node case and lies in [0,N) for a multi node back end allocation of N nodes.

    Alternatives on err:

    SUCCESS (= 0)

    No fields other than defaults.

    No error. This message isn’t expected ever to be emitted, but is called out here to reserve the ‘0’ code for future use in case there needs to be positive confirmation of message delivery.

    FAIL (= 1)

    err_info
    • string

    • explanation of what went wrong.

    request

    SHFwdInput

    implementation(s): Python
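
The err alternatives above can be handled with a small enumeration. This is a sketch that works on an already decoded dictionary; the names FwdInputErr and describe_fwd_input_err are hypothetical and are not part of the Dragon Python API.

```python
import enum


class FwdInputErr(enum.IntEnum):
    """Local mirror of the err alternatives listed for SHFwdInputErr."""

    SUCCESS = 0  # reserved; not expected to be emitted
    FAIL = 1


def describe_fwd_input_err(msg):
    """Return a one-line summary for a decoded SHFwdInputErr dictionary."""
    err = FwdInputErr(msg["err"])
    if err is FwdInputErr.FAIL:
        # err_info is only meaningful in the FAIL case.
        return f"shepherd {msg['idx']}: forward input failed: {msg.get('err_info', '')}"
    return f"shepherd {msg['idx']}: input delivered"
```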

  1. SHFwdOutput

    type enum

    SH_FWD_OUTPUT

    purpose

    Message carrying data from either stdout or stderr of a process managed by this shepherd. This comes from the Shepherd and no response is expected.

    fields

    fdnum

    Indicates which stream, stdout or stderr, is being forwarded.

    Alternatives for fdnum:

    stdout (= 1)

    stderr (= 2)

    data
    • string

    • The data read from the stream and being forwarded

    • The max length of any output per message is 1024 bytes.

    idx
    • integer

    • which Shepherd in the allocation this is. The index is 0 in the single node case and lies in [0,N) for a multi node back end allocation of N nodes.

    p_uid
    • integer

    • The process UID this came from

    implementation(s): Python
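
A receiver of SHFwdOutput typically needs to route data to the matching local stream based on fdnum. The following sketch shows that dispatch on a decoded dictionary; route_output and the two constants are names chosen for this example, not part of the Dragon API.

```python
import sys

STDOUT_FD = 1  # fdnum value for stdout
STDERR_FD = 2  # fdnum value for stderr


def route_output(msg, out=None, err=None):
    """Write the data of a decoded SHFwdOutput dictionary to the matching stream."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    if msg["fdnum"] == STDOUT_FD:
        target = out
    elif msg["fdnum"] == STDERR_FD:
        target = err
    else:
        raise ValueError(f"unexpected fdnum {msg['fdnum']!r}")
    target.write(msg["data"])
    return target
```

Passing the streams as parameters keeps the function easy to test and lets a front end redirect output, for example to prefix each line with the p_uid.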

  1. SHHaltTA

    type enum

    SH_HALT_TA

    purpose

    Message coming from Launcher to the Shepherd, telling it to tell TA to halt. The Shepherd forwards this same message to TA.

    fields

    None other than default

    implementation(s): Python

  1. SHDumpState

    type enum

    SH_DUMP_STATE

    purpose

    Primarily debugging. Makes the Shepherd dump its state in human readable form to a file specified in the message or dump to the log if no filename is provided.

    fields
    filename
    • string

    • file to open and write the dump to. Defaults to None. If None, then the dump goes to the log file for the Shepherd.

    implementation(s): Python
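
The filename behavior described above (write to the named file, or fall back to the log when filename is None) could be handled along these lines. This is only a sketch: dump_state, the logger name, and the format of the dump are assumptions, not the Shepherd's actual implementation.

```python
import logging

log = logging.getLogger("shepherd_dump_sketch")  # hypothetical logger name


def dump_state(state_text, filename=None):
    """Write state_text to filename, or to the log when filename is None."""
    if filename is None:
        log.info("state dump:\n%s", state_text)
        return "log"
    with open(filename, "w", encoding="utf-8") as f:
        f.write(state_text)
    return filename
```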

Launcher Messages API

These messages go to the Launcher frontend in standard and server mode via the LA Channel.

  1. LABroadcast

    type enum

    LA_BROADCAST

    purpose

    This message is sent by the launcher to the Launch Router. This message contains another message in the data field. The message in the data field is to be broadcast to the Shepherd on all nodes through the Backend. The LABroadcast message is sent to the backend. The Backend extracts the embedded message from the LABroadcast message and sets the embedded message’s r_c_uid as the return channel id in the message and forwards the embedded message to the Shepherd’s channel. All broadcast messages are sent to the Shepherd on each node.

    fields

    data

    • A string to be broadcast to all nodes. For the Dragon infrastructure this data field will itself be a valid, serialized message.

    implementation(s): Python, Network Front End has a dependency on the type enum being 68.
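
The nesting described above, a serialized message carried inside the data field of another message, can be sketched with plain JSON. The type code 68 is the value noted for LA_BROADCAST; the inner type code 0, the helper names, and stamping r_c_uid with a caller-supplied channel id are assumptions made for illustration.

```python
import json

LA_BROADCAST_TC = 68  # value the Network Front End is documented to depend on
INNER_TC = 0  # placeholder for the embedded message's type code


def wrap_broadcast(inner):
    """Embed a message dictionary, serialized, in an LABroadcast-shaped envelope."""
    return json.dumps({"_tc": LA_BROADCAST_TC, "data": json.dumps(inner)})


def unwrap_broadcast(serialized, return_c_uid):
    """Extract the embedded message and set its return channel id."""
    outer = json.loads(serialized)
    if outer["_tc"] != LA_BROADCAST_TC:
        raise ValueError("not an LABroadcast message")
    inner = json.loads(outer["data"])
    # Assumption: the caller supplies the channel id to use for replies.
    inner["r_c_uid"] = return_c_uid
    return inner
```

Because data is itself a JSON string, the envelope can be parsed and routed without knowing anything about the embedded message type.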

  1. LAPassThruFB

    type enum

    LA_PASS_THRU_FB

    purpose

    This message is a pass thru message from the Launcher to the Backend.

    When sent to the Launcher Router, it forwards the message to the Backend on the primary node.

    This message type is used by the Dragon Kernel Server to communicate its channel id to the Dragon Kernel indicating that the Dragon Kernel is up and ready to accept requests.

    fields

    c_uid

    • The channel to send this message to running on a compute node.

    data

    • A user-defined serialized message that is unwrapped by the receiving end.

    NOTE

    • The r_c_uid is set in this message by the Backend to the Backend channel id before it is forwarded on to its destination on the compute node.

    implementation(s): Python

  1. LAPassThruBF

    type enum

    LA_PASS_THRU_BF

    purpose

    This message is a pass thru message from the Backend to the Launcher.

    Note: One use of this message type is by a server back end to communicate its channel id to a server front end in the r_c_uid field. In this case it indicates to the front end that its corresponding back end is ready to accept requests (see Launcher Server Mode).

    fields

    data

    • A user-defined serialized message that is unwrapped by the receiving end.

    NOTE

    • The r_c_uid is set to the channel id of the source of the passthrough message on a compute node.

    implementation(s): Python

  1. LAServerMode

    type enum

    LA_SERVER_MODE

    purpose

    This is a message that is used internally in the Launcher to set up Server Mode. It is not ever to be sent between components in the Dragon run-time.

    fields

    frontend

    • A string with a file specification for the server front end to be run on the login node.

    backend

    • A string with a file specification for the server back end to be run on the primary compute node.

    frontendargs

    • A list of strings containing any command-line arguments to the front end.

    backendargs

    • A list of strings containing any command-line arguments to the back end.

    implementation(s): Python

  1. LAServerModeExit

    type enum

    LA_SERVER_MODE_EXIT

    purpose

    This is a message that is used internally in the Launcher to exit Server Mode. It is not ever to be sent between components in the Dragon run-time.

    fields

    None other than default

    implementation(s): Python

  1. LAProcessDict

    type enum

    LA_PROCESS_DICT

    purpose

    Return a dictionary of process information for all the processes that were started in the launcher via the run command.

    fields

    None additional

    response

    LAProcessDictResponse

    see also

    refer to the Common Fields section for additional request message fields

    implementation(s): Python

  1. LAProcessDictResponse

    type enum

    LA_PROCESS_DICT_RESPONSE

    purpose

    Responds with a dictionary for all the processes that were started in the launcher via the run command. The dictionary includes fields for exe, args, env, run directory, user name, options, create return code, and exit code.

    fields
    pdict
    • dictionary of p_uid to process information as described.

    request

    LAProcessDict

    implementation(s): Python

  1. LADumpState

    type enum

    LA_DUMP_STATE

    purpose

    Primarily debugging. Makes the Launcher dump its state in human readable form to a file specified in the message or dump to the log if no filename is provided.

    fields
    filename
    • string

    • file to open and write the dump to. Defaults to None. If None, then the dump goes to the log file for the Launcher.

    implementation(s): Python

  1. LANodeIdxSH

    type enum

    LA_NODEINDEX_SH

    purpose

    Communicates the node index from the Launcher Back End to the Shepherd during bootstrapping of the environment.

    fields
    node_idx
    • integer

    • The index of this node in the allocation upon which the Dragon runtime services are running.

    implementation(s): Python

  1. LAChannelsInfo

    type enum

    LA_CHANNELS_INFO

    purpose

    Broadcast to all nodes to provide hostnames, node indices, and channel descriptors to all nodes in the allocation so services like global services and the inter-node transport service (HSTA, TCP or otherwise) can be started.

    When the launcher backend receives this message, it forwards it to the local shepherd. The local shepherd provides this message to the inter-node transport service when it is started and to global services (on the primary node) when it is started.

    fields
    nodes_desc
    • dictionary with keys corresponding to string node indices and values of class dragon.infrastructure.node_desc.NodeDescriptor (e.g., the hostname info for the fifth node, index 4, is accessed via la_channels_msg.nodes_desc['4'].host_name)

    • Example attributes for each value of dragon.infrastructure.node_desc.NodeDescriptor:

    • ‘host_name’: hostname string

    • ‘host_id’: integer which is determined by calling the Posix gethostid function

    • ‘ip_addr’: ip address string

    • ‘shep_cd’: base64 encoded shepherd channel descriptor

    gs_cd
    • string

    • The base64 encoded channel descriptor of global services

    num_gw_channels
    • int

    • The number of gateway channels per node

    implementation(s): Python
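
Since nodes_desc is keyed by string node indices, code that walks the nodes in order needs to sort the keys numerically. The sketch below uses a stand-in dataclass, NodeInfo, holding only the attributes listed above; it is not the real NodeDescriptor class, and hostnames_by_index is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class NodeInfo:
    """Stand-in for NodeDescriptor with only the attributes listed above."""

    host_name: str
    host_id: int
    ip_addr: str
    shep_cd: str


def hostnames_by_index(nodes_desc):
    """Return host names ordered by integer node index (the keys are strings)."""
    return [nodes_desc[key].host_name for key in sorted(nodes_desc, key=int)]
```

Sorting with key=int matters once there are ten or more nodes, because a plain string sort would place '10' before '2'.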

  1. Breakpoint

    type enum

    BREAKPOINT

    purpose

    Inform front end that a managed process has reached a breakpoint for the first time and wants to establish a debug connection.

    fields
    p_uid
    • int

    • process uid that has just reached a breakpoint

    index
    • int

    • node that process is on

    out_desc
    • string

    • encoded binary serialized channel descriptor for debug connection outbound

    in_desc
    • string

    • encoded binary serialized channel descriptor for debug connection inbound

    implementation(s): Python

Launcher Backend Messages APIs

These messages are used by a combination of the Launcher and the Launcher Backend to support the launcher in both regular and server mode. They are communicated through the BELA Channel.

  1. BEIsUp

type enum

BE_IS_UP

purpose

Confirm to Launcher Frontend that the Backend is up and send its serialized channel descriptor and host_id. This message is not communicated through the BELA Channel, as this is not set up yet.

fields

be_ch_desc

  • Backend serialized channel descriptor. The frontend will attach to this channel and establish a connection with the backend.

host_id

  • integer

  • the hostid of the node sending this message.

implementation(s): Python

  1. BEPingSH

    type enum

    BE_PING_SH

    purpose

    MRNet backend handshake with Shepherd

    fields

    None other than default

    implementation(s): Python

  1. BEHalted

    type enum

    BE_HALTED

    purpose

    Indicate that the MRNet backend instance on this node has exited normally.

    fields

    None other than default

    implementation(s): Python

  1. BENodeIdxSH

    type enum

    BE_NODE_IDX_SH

    purpose

    This message is sent from the Launcher Backend to Local Services during initialization to tell Local Services its node index and related host info for the current run. All nodes are numbered 0 to n-1 where n is the number of nodes in the allocation.

    fields

    node_idx

    • An integer representing the node index value for this node.

    host_name

    • string

    • name of host as returned by MRNet backend

    ip_addr
    • string

    • IP address as returned by DNS lookup of hostname

    primary
    • string

    • hostname of primary node

    logger_sdesc
    • dragon.utils.B64

    • serialized descriptor of backend logging channel

    implementation(s): Python

Launcher Frontend Messages APIs

These messages are used by a combination of the Launcher and the Launcher Frontend to support the launcher in both regular and server mode.

  1. FENodeIdxBE

type enum

FE_NODE_IDX_BE

purpose

The Frontend sends the node index to the Backend, based on the backend’s host_id.

fields

node_index
  • integer

  • The index of this node in the allocation upon which the Dragon runtime services are running.

implementation(s): Python

  1. HaltOverlay

type enum

HALT_OVERLAY

purpose

Indicate that monitoring of the overlay network should cease

fields

None other than default

implementation(s): Python

  1. HaltLoggingInfra

type enum

HALT_LOGGING_INFRA

purpose

Indicate that monitoring of logging messages from the backend should cease

fields

None other than default

implementation(s): Python

  1. LAExit

type enum

LA_EXIT

purpose

Indicate the launcher should exit. Use in case the launcher teardown was unable to complete correctly.

fields

None other than default

implementation(s): Python

Logging Messages API

In order to transmit messages from backend services to the launcher frontend, the logging infrastructure packs logging strings into messages that are passed from a given backend service to the MRNet server backend, which aggregates any logging messages it has received and sends them to the launcher frontend to be officially logged to file or terminal.

  1. LoggingMsg

    type enum

    LOGGING_MSG

    purpose

    Take logging strings provided via Python logging (e.g., log.info('message')) and pack all the information necessary to make log messages useful on the frontend.

    fields

    All fields are auto-populated by the logging infrastructure found in LoggingInfrastructure

    name
    • string

    • name of the logger the message came from. Typically formatted as f'{service}.{func}'

    msg
    • string

    • log message

    time
    • string

    • time generated by the sending logger’s formatter

    func
    • string

    • name of function the logging msg was made in

    hostname
    • string

    • hostname of sender

    ip_address
    • string

    • IP address of sender

    port
    • string

    • Port of sender (placeholder for now but likely useful in single-node testing)

    service
    • string

    • service as defined in Python

    level
    • int

    • logging level as defined by dragonMemoryPoolAttr_t; consistent with the integer values of Python’s logging levels

    implementation(s): Python

  1. LoggingMsgList

    type enum

    LOGGING_MSG_LIST

    purpose

    Takes a list of LoggingMsg and aggregates them into a single message the MRNet server backend can send up to the MRNet frontend server

    fields
    records

    list of LoggingMsg messages with attributes as defined therein

    implementation(s): Python
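
The relationship between LoggingMsg and LoggingMsgList can be sketched as follows: each Python log record is mapped onto the LoggingMsg field names, and a batch of them is placed in the records field of one aggregate message. The helper names, the placeholder type code 0, and the use of the record's creation timestamp for the time field are assumptions; in the real infrastructure the fields are populated automatically and time comes from the sending logger's formatter.

```python
import json
import logging

LOGGING_MSG_LIST_TC = 0  # placeholder; the real type code is not stated here


def record_to_fields(record, hostname, ip_address, port, service):
    """Map a logging.LogRecord onto the LoggingMsg field names."""
    return {
        "name": record.name,
        "msg": record.getMessage(),
        "time": str(record.created),  # stand-in for the formatter's time string
        "func": record.funcName,
        "hostname": hostname,
        "ip_address": ip_address,
        "port": port,
        "service": service,
        "level": record.levelno,
    }


def aggregate(records):
    """Bundle already-built LoggingMsg field dictionaries into one JSON message."""
    return json.dumps({"_tc": LOGGING_MSG_LIST_TC, "records": list(records)})
```

Batching in this way means the backend sends one message per flush instead of one per log call.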

  1. LogFlushed

    type enum

    LOG_FLUSHED

    purpose

    Sent by the MRNet server backend to the frontend after it has completed its final flushing and transmission of logs to the frontend. Part of the transaction diagram for teardown in Multi Node Teardown.

    fields

    None other than default

    implementation(s): Python

Overlay/Transport Agent Messages API

In multi node cases, one or more transport agent processes live on each node and they need a few messages related to setup and teardown control through the TA Channel.

  1. TAPingSH

    type enum

    TA_PING_SH

    purpose

    Indicate that the TA instance on this node has come up and succeeded in getting communication with the other TA instances.

    fields

    None other than default

    implementation(s): Python

  1. TAHalted

    type enum

    TA_HALTED

    purpose

    Indicate that the TA instance on this node has exited normally.

    fields
    idx
    • integer

    • which node in the allocation this is.

    implementation(s): Python

  1. TAUp

    type enum

    TA_UP

    purpose

    Indicate that the TA instance on this node is up and ready.

    fields

    None other than default

    implementation(s): Python

  1. OverlayPingBE

type enum

OVERLAY_PING_BE

purpose

Indicate that the Overlay instance on this backend node is up and ready.

fields

None other than default

implementation(s): Python

  1. OverlayPingLA

type enum

OVERLAY_PING_LA

purpose

Indicate that the Overlay instance on this frontend node is up and ready.

fields

None other than default

implementation(s): Python

  1. LAHaltOverlay

type enum

LA_HALT_OVERLAY

purpose

Launcher frontend requests a shutdown of Overlay agent

fields

None other than default

implementation(s): Python

  1. BEHaltOverlay

type enum

BE_HALT_OVERLAY

purpose

This backend node requests a shutdown of its Overlay agent

fields

None other than default

implementation(s): Python

  1. OverlayHalted

type enum

OVERLAY_HALTED

purpose

This overlay instance has shut down

fields

None other than default

implementation(s): Python

Python Infrastructure Messages Implementation

The following subsections contain structures relating to the Python implementation of the Dragon infrastructure message definitions. The specification of these message types is in the previous section and is cross-referenced from these two sections for convenience.

Python Implementation of Infrastructure Messages

Alphabetical listing of all message classes in the Python implementation of the infrastructure messages.

Dragon infrastructure messages are the internal API used for service communication.

class FileDescriptor

An enumeration.

exception AbnormalTerminationError
__init__(msg='')
class PMIGroupInfo

Required information to enable the launching of pmi based applications.

__init__(job_id: int, nnodes: int, nranks: int, nidlist: list[int], hostlist: list[str], control_port: int) None
class PMIProcessInfo

Required information to enable the launching of pmi based applications.

__init__(lrank: int, ppn: int, nid: int, pid_base: int) None
class InfraMsg

Common base for all messages.

This common base type for all messages sets up the default fields and the serialization strategy for now.

class Errors

An enumeration.

__init__(tag, ref=None, err=None)
class CapNProtoMsg

Common base for all capnproto messages.

This common base type for all messages sets up the default fields and the serialization strategy for messages to be exchanged between C and Python.

Errors

alias of DragonError

__init__(tag)
class CapNProtoResponseMsg

Common base for all capnproto response messages.

This provides some support for code common to all response messages.

__init__(tag, ref, err, errInfo)
class SHCreateProcessLocalChannel
__init__(tag, puid, respFLI)
class SHCreateProcessLocalChannelResponse
__init__(tag, ref, err, errInfo='', serChannel='')
class SHPushKVL
__init__(tag, key, value, respFLI)
class SHPushKVLResponse
__init__(tag, ref, err, errInfo='')
class SHPopKVL
__init__(tag, key, value, respFLI)
class SHPopKVLResponse
__init__(tag, ref, err, errInfo='')
class SHGetKVL
__init__(tag, key, respFLI)
class SHGetKVLResponse
__init__(tag, ref, err, errInfo='', values=[])
class SHSetKV
__init__(tag, key, value, respFLI)
class SHSetKVResponse
__init__(tag, ref, err, errInfo='')
class SHGetKV
__init__(tag, key, respFLI)
class SHGetKVResponse
__init__(tag, ref, err, errInfo='', value=None)
class DDCreate
__init__(tag, respFLI, args)
class DDCreateResponse
__init__(tag, ref, err, errInfo='')
class DDGetRandomManager
__init__(tag, respFLI)
class DDGetRandomManagerResponse
__init__(tag, ref, err, manager, errInfo='')
class DDRegisterClient
__init__(tag, respFLI, bufferedRespFLI)
class DDRegisterClientResponse
__init__(tag, ref, err, clientID, numManagers, errInfo='')
class DDConnectToManager
__init__(tag, clientID, managerID)
class DDConnectToManagerResponse
__init__(tag, ref, err, manager, errInfo='')
class DDDestroy
__init__(tag, clientID, respFLI)
class DDDestroyResponse
__init__(tag, ref, err, errInfo='')
class DDRegisterManager
__init__(tag, mainFLI, respFLI)
class DDRegisterManagerResponse
__init__(tag, ref, err, managerID, errInfo='', managers=[])
class DDRegisterClientID
__init__(tag, clientID, respFLI, bufferedRespFLI)
class DDRegisterClientIDResponse
__init__(tag, ref, err, errInfo='')
class DDDestroyManager
__init__(tag, respFLI)
class DDDestroyManagerResponse
__init__(tag, ref, err, errInfo='')
class DDPut
__init__(tag, clientID)
class DDPutResponse
__init__(tag, ref, err, errInfo='')
class DDGet
__init__(tag, clientID)
class DDGetResponse
__init__(tag, ref, err, errInfo='')
class DDPop
__init__(tag, clientID)
class DDPopResponse
__init__(tag, ref, err, errInfo='')
class DDContains
__init__(tag, clientID)
class DDContainsResponse
__init__(tag, ref, err, errInfo='')
class DDGetLength
__init__(tag, clientID)
class DDGetLengthResponse
__init__(tag, ref, err, errInfo='', length=0)
class DDClear
__init__(tag, clientID)
class DDClearResponse
__init__(tag, ref, err, errInfo='')
class DDGetIterator
__init__(tag, clientID)
class DDGetIteratorResponse
__init__(tag, ref, err, errInfo='', iterID=0)
class DDIteratorNext
__init__(tag, clientID, iterID)
class DDIteratorNextResponse
__init__(tag, ref, err, errInfo='')
class DDKeys
__init__(tag, clientID)
class DDKeysResponse
__init__(tag, ref, err, errInfo='')
class DDDeregisterClient
__init__(tag, clientID, respFLI)
class DDDeregisterClientResponse
__init__(tag, ref, err, errInfo='')
class GSProcessCreate

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, exe, args, env=None, rundir='', user_name='', options=None, stdin=None, stdout=None, stderr=None, group=None, user=None, umask=-1, pipesize=None, pmi_required=False, _pmi_info=None, layout=None, policy=None, _tc=None)
class GSProcessCreateResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0

Process was created

FAIL = 1

Process was not created

ALREADY = 2

Process exists already

__init__(tag, ref, err, desc=None, err_info='', _tc=None)
class GSProcessList

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, _tc=None)
class GSProcessListResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0

Always succeeds

__init__(tag, ref, err, plist=None, _tc=None)
class GSProcessQuery

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, t_p_uid=None, user_name='', _tc=None)
class GSProcessQueryResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
__init__(tag, ref, err, desc=None, err_info='', _tc=None)
class GSProcessKill

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, sig, t_p_uid=None, user_name='', _tc=None)
class GSProcessKillResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
FAIL_KILL = 2
DEAD = 3
PENDING = 4
__init__(tag, ref, err, exit_code=0, err_info='', _tc=None)
class GSProcessJoin

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, timeout=-1, t_p_uid=None, user_name='', _tc=None)
class GSProcessJoinResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
TIMEOUT = 2
SELF = 3
__init__(tag, ref, err, exit_code=0, err_info='', _tc=None)
class GSProcessJoinList

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, timeout=-1, t_p_uid_list=None, user_name_list=None, join_all=False, return_on_bad_exit=False, _tc=None)
class GSProcessJoinListResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
TIMEOUT = 2
SELF = 3
PENDING = 4
__init__(tag, ref, puid_status, _tc=None)
class GSPoolCreate

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, size, user_name='', options=None, _tc=None)
class GSPoolCreateResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0

Pool was created

FAIL = 1

Pool was not created

ALREADY = 2

Pool exists already

__init__(tag, ref, err, err_code=0, desc=None, err_info='', _tc=None)
class GSPoolList

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, _tc=None)
class GSPoolListResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0

Always succeeds

__init__(tag, ref, err, mlist=None, _tc=None)
class GSPoolQuery

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, m_uid=None, user_name='', _tc=None)
class GSPoolQueryResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
__init__(tag, ref, err, desc=None, err_info='', _tc=None)
class GSPoolDestroy

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, m_uid=0, user_name='', _tc=None)
class GSPoolDestroyResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
FAIL = 2
GONE = 3
PENDING = 4
__init__(tag, ref, err, err_code=0, err_info='', _tc=None)
class GSGroupCreate

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, items=None, policy=None, user_name='', _tc=None)
class GSGroupCreateResponse

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, ref, desc=None, _tc=None)
class GSGroupList

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, _tc=None)
class GSGroupListResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0

Always succeeds

__init__(tag, ref, err, glist=None, _tc=None)
class GSGroupQuery

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', _tc=None)
class GSGroupQueryResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
__init__(tag, ref, err, desc=None, err_info='', _tc=None)
class GSGroupKill

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, sig, g_uid=None, user_name='', _tc=None)
class GSGroupKillResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
DEAD = 2
PENDING = 3
ALREADY = 4
__init__(tag, ref, err, err_info='', desc=None, _tc=None)
class GSGroupDestroy

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', _tc=None)
class GSGroupDestroyResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
DEAD = 3
PENDING = 4
__init__(tag, ref, err, err_info='', desc=None, _tc=None)
class GSGroupAddTo

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', items=None, _tc=None)
class GSGroupAddToResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
FAIL = 2
DEAD = 3
PENDING = 4
__init__(tag, ref, err, err_info='', desc=None, _tc=None)
class GSGroupCreateAddTo

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', items=None, policy=None, _tc=None)
class GSGroupCreateAddToResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
DEAD = 2
PENDING = 3
__init__(tag, ref, err, err_info='', desc=None, _tc=None)
class GSGroupRemoveFrom

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', items=None, _tc=None)
class GSGroupRemoveFromResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
FAIL = 2
DEAD = 3
PENDING = 4
__init__(tag, ref, err, err_info='', desc=None, _tc=None)
class GSGroupDestroyRemoveFrom

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, g_uid=None, user_name='', items=None, _tc=None)
class GSGroupDestroyRemoveFromResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
FAIL = 2
DEAD = 3
PENDING = 4
__init__(tag, ref, err, err_info='', desc=None, _tc=None)
class GSChannelCreate

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, m_uid, options=None, user_name='', _tc=None)
class GSChannelCreateResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
FAIL = 1
ALREADY = 2
__init__(tag, ref, err, desc=None, err_info='', _tc=None)
class GSChannelList

Refer to definition and to Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, _tc=None)
class GSChannelListResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0

Always succeeds

__init__(tag, ref, err, clist=None, _tc=None)
class GSChannelQuery

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, c_uid=None, user_name='', inc_refcnt=False, _tc=None)
class GSChannelQueryResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
__init__(tag, ref, err, desc=None, err_info='', _tc=None)
class GSChannelDestroy

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, c_uid=None, user_name='', reply_req=True, dec_ref=False, _tc=None)
class GSChannelDestroyResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
UNKNOWN_CHANNEL = 2
BUSY = 3
__init__(tag, ref, err, err_info='', _tc=None)
class GSChannelJoin

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, name, timeout=-1, _tc=None)
class GSChannelJoinResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
TIMEOUT = 1
DEAD = 2
__init__(tag, ref, err, desc=None, _tc=None)
class GSChannelDetach

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, c_uid, _tc=None)
class GSChannelDetachResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
UNKNOWN_CHANNEL = 2
NOT_ATTACHED = 3
__init__(tag, ref, err, _tc=None)
class GSChannelGetSendH

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, c_uid, _tc=None)
class GSChannelGetSendHResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
UNKNOWN_CHANNEL = 2
NOT_ATTACHED = 3
CANT = 4
__init__(tag, ref, err, sendh=None, err_info='', _tc=None)
class GSChannelGetRecvH

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, c_uid, _tc=None)
class GSChannelGetRecvHResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
UNKNOWN_CHANNEL = 2
NOT_ATTACHED = 3
CANT = 4
__init__(tag, ref, err, recvh=None, err_info='', _tc=None)
class GSNodeList
type enum

GS_NODE_LIST (= 115)

purpose

Asks GS to return a list of h_uid for all nodes currently registered.

fields

None additional

response

GSNodeListResponse

see also

GSNodeQuery

see also

refer to the Common Fields section for additional request message fields

__init__(tag, p_uid, r_c_uid, _tc=None)
class GSNodeListResponse
type enum

GS_NODE_LIST_RESPONSE (= 116)

purpose

Responds with a list of h_uid for all the nodes currently registered.

fields
hlist
  • list of nonnegative integers

request

GSNodeList

see also

GSNodeQuery

class Errors

An enumeration.

SUCCESS = 0

Always succeeds

__init__(tag, ref, err, hlist=None, _tc=None)
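
As a rough illustration of the GSNodeList request/response pair above, the following sketch builds the two JSON maps by hand with the standard json module. It is not the Dragon Python API: the helper names are hypothetical, the ids are made-up sample values, the field names are taken from the __init__ signatures, and the _tc values are copied from the MessageTypes enumeration listed in this document. It also assumes that ref carries the tag of the request being answered.

```python
import json

# Values copied from the MessageTypes enumeration in this document.
GS_NODE_LIST = 115
GS_NODE_LIST_RESPONSE = 116


def make_node_list_request(tag, p_uid, r_c_uid):
    """Hypothetical helper: serialize a GSNodeList request as JSON text."""
    return json.dumps(
        {"_tc": GS_NODE_LIST, "tag": tag, "p_uid": p_uid, "r_c_uid": r_c_uid}
    )


def make_node_list_response(tag, request_text, hlist):
    """Hypothetical helper: answer a GSNodeList request.

    Assumption: `ref` echoes the tag of the request being answered.
    """
    request = json.loads(request_text)
    if request["_tc"] != GS_NODE_LIST:
        raise ValueError("not a GSNodeList request")
    return json.dumps(
        {
            "_tc": GS_NODE_LIST_RESPONSE,
            "tag": tag,
            "ref": request["tag"],
            "err": 0,  # SUCCESS; this response always succeeds
            "hlist": list(hlist),  # list of nonnegative integers
        }
    )


# Sample ids below are invented for the example.
request_text = make_node_list_request(tag=7, p_uid=4001, r_c_uid=12)
response = json.loads(
    make_node_list_response(tag=99, request_text=request_text, hlist=[0, 1, 2])
)
```

Comparing the parsed maps rather than the JSON strings follows the rule that the canonical form of a message is taken at the JSON level.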
class GSNodeQuery

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid: int, r_c_uid: int, name: str = '', h_uid=None, _tc=None)
class GSNodeQueryResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
__init__(tag, ref, err, desc=None, err_info='', _tc=None)
class GSNodeQueryTotalCPUCount
type enum

GS_NODE_QUERY_TOTAL_CPU_COUNT (= 117)

purpose

Asks GS to return the total number of CPUs belonging to all of the registered nodes.

see also

refer to the Common Fields section for additional request message fields

response

GSNodeQueryTotalCPUCountResponse

__init__(tag, p_uid: int, r_c_uid: int, _tc=None)
class GSNodeQueryTotalCPUCountResponse
type enum

GS_NODE_QUERY_TOTAL_CPU_COUNT_RESPONSE (= 118)

purpose

Returns the total number of CPUs belonging to all of the registered nodes.

fields

Alternatives on err:

SUCCESS (= 0)

The machine descriptor was successfully constructed

total_cpus
  • total number of CPUs belonging to all of the registered nodes.

UNKNOWN (= 1)

An unknown error has occurred.

request

GSNodeQueryTotalCPUCount

see also

GSNodeQuery

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
__init__(tag, ref, err, total_cpus=0, err_info='', _tc=None)
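
The alternatives on err described above suggest how a receiver might interpret this response. The sketch below is one possible reading, not the Dragon implementation: the helper name and the sample messages are hypothetical, the local Errors class only mirrors the enumeration listed above, and the _tc value is copied from the MessageTypes listing in this document.

```python
import enum
import json


class Errors(enum.IntEnum):
    """Local mirror of the Errors enumeration listed above."""

    SUCCESS = 0
    UNKNOWN = 1


def total_cpus_from_response(text):
    """Hypothetical helper: return total_cpus, or raise if err is UNKNOWN."""
    msg = json.loads(text)
    err = Errors(msg["err"])
    if err is Errors.SUCCESS:
        return msg["total_cpus"]
    # On error, total_cpus is not meaningful; err_info describes the failure.
    raise RuntimeError(msg.get("err_info", ""))


# Invented sample responses; 118 is GS_NODE_QUERY_TOTAL_CPU_COUNT_RESPONSE
# in the MessageTypes listing.
ok = json.dumps({"_tc": 118, "tag": 5, "ref": 4, "err": 0, "total_cpus": 256})
bad = json.dumps(
    {"_tc": 118, "tag": 6, "ref": 4, "err": 1, "err_info": "sample failure text"}
)
```

Branching on err before touching total_cpus reflects the convention that which fields are valid depends on whether the response reports success or an error.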
class AbnormalTermination

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, err_info='', host_id=0, _tc=None)
class ExceptionlessAbort

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class GSStarted

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class GSPingSH

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class GSIsUp

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class GSPingProc

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, mode=None, argdata=None, _tc=None)
class GSDumpState

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, filename, _tc=None)
class GSHeadExit

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, exit_code=0, _tc=None)
class GSTeardown

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class GSUnexpected

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, ref, _tc=None)
class GSChannelRelease

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class GSHalted

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class SHProcessCreate

Refer to definition and Common Fields for a description of the message structure.

initial_stdin is a string which, if non-empty, is written to the stdin of the newly created process, followed by a terminating newline character.

stdin, stdout, and stderr are each either None or an instance of SHChannelCreate to be processed by the local services component.

__init__(tag, p_uid, r_c_uid, t_p_uid, exe, args, env=None, rundir='', options=None, initial_stdin='', stdin=None, stdout=None, stderr=None, group=None, user=None, umask=-1, pipesize=None, stdin_msg=None, stdout_msg=None, stderr_msg=None, pmi_info=None, layout=None, gs_ret_chan_msg=None, _tc=None)
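
The initial_stdin behavior described above can be pictured with the standard subprocess module. This is only a sketch of the described semantics (non-empty text plus a terminating newline written to the child's stdin), not how local services actually starts processes; the helper name and the child command are hypothetical.

```python
import subprocess
import sys


def start_with_initial_stdin(argv, initial_stdin=""):
    """Hypothetical helper: start argv and, if initial_stdin is non-empty,
    write it followed by a terminating newline to the child's stdin."""
    proc = subprocess.Popen(
        argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    if initial_stdin:
        proc.stdin.write(initial_stdin + "\n")
        proc.stdin.flush()
    return proc


# The child reads one line from stdin and echoes it back in upper case.
child = start_with_initial_stdin(
    [sys.executable, "-c", "import sys; print(sys.stdin.readline().strip().upper())"],
    initial_stdin="hello",
)
out, _ = child.communicate()  # closes stdin, then collects stdout
```

Because the newline is appended by the sender, a child that reads line by line sees a complete first line without the caller having to add one.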
class SHProcessCreateResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
FAIL = 1
__init__(tag, ref, err, err_info='', stdin_resp=None, stdout_resp=None, stderr_resp=None, gs_ret_chan_resp=None, _tc=None)
class SHProcessKill

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, t_p_uid, sig, _tc=None)
class SHProcessKillResponse
class Errors

An enumeration.

__init__(tag, ref, err, err_info='', _tc=None)
class SHProcessExit

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
__init__(tag, p_uid, exit_code=0, _tc=None)
class SHMultiProcessCreate
__init__(tag, r_c_uid, procs: List[Union[Dict, SHProcessCreate]], pmi_group_info: Optional[PMIGroupInfo] = None, _tc=None)
class SHMultiProcessCreateResponse
class Errors

An enumeration.

__init__(tag, ref, err, err_info='', exit_code=0, responses: Optional[List[Union[Dict, SHProcessCreateResponse]]] = None, failed: bool = False, _tc=None)
class SHPoolCreate

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, size, m_uid, name, attr='', _tc=None)
class SHPoolCreateResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0

Pool was created

FAIL = 1

Pool was not created

__init__(tag, ref, err, desc=None, err_info='', _tc=None)
class SHPoolDestroy

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, m_uid, _tc=None)
class SHPoolDestroyResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
FAIL = 1
__init__(tag, ref, err, err_info='', _tc=None)
class SHExecMemRequest

Refer to definition and Common Fields for a description of the message structure.

class KINDS

An enumeration.

__init__(tag, p_uid, r_c_uid, kind, request, _tc=None)
class SHExecMemResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
FAIL = 1
__init__(tag, ref, err, err_code=0, err_info='', response=None, _tc=None)
class SHChannelCreate

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, m_uid, c_uid, options=None, _tc=None)
class SHChannelCreateResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
FAIL = 1
__init__(tag, ref, err, desc=None, err_info='', _tc=None)
class SHChannelDestroy

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, c_uid, _tc=None)
class SHChannelDestroyResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
FAIL = 1
__init__(tag, ref, err, err_info='', _tc=None)
class SHLockChannel

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, _tc=None)
class SHLockChannelResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
ALREADY = 1
__init__(tag, ref, err, _tc=None)
class SHAllocMsg

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, _tc=None)
class SHAllocMsgResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
__init__(tag, ref, err, _tc=None)
class SHAllocBlock

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, _tc=None)
class SHAllocBlockResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
UNKNOWN = 1
FAIL = 2
__init__(tag, ref, err, seg_name='', offset=0, err_info='', _tc=None)
class SHChannelsUp

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, node_desc, gs_cd, idx=0, _tc=None)
class SHPingGS

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, idx=0, node_sdesc=None, _tc=None)
class SHTeardown

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class SHPingBE

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, shep_cd='', be_cd='', gs_cd='', default_pd='', inf_pd='', _tc=None)
class SHHaltTA

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class SHHaltBE

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class SHHalted

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, idx=0, _tc=None)
class SHFwdInput

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, p_uid, r_c_uid, t_p_uid=None, input='', confirm=False, _tc=None)
class SHFwdInputErr

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
FAIL = 1
__init__(tag, ref, err, idx=0, err_info='', _tc=None)
class SHFwdOutput

Refer to definition and Common Fields for a description of the message structure.

class FDNum

An enumeration.

__init__(tag, p_uid, idx, fd_num, data, _tc=None, pid=-1, hostname='NONE')
class SHDumpState

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, filename=None, _tc=None)
class BENodeIdxSH

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, node_idx, host_name=None, ip_addrs=None, primary=None, logger_sdesc=None, _tc=None)
class BEPingSH

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class BEHalted

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class LABroadcast

Refer to definition and Common Fields for a description of the message structure. The _tc value of this message must be set to 68. The Launcher Network Front End has this as an external dependency.

__init__(tag, data, _tc=None)
class LAPassThruFB

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, c_uid, data, _tc=None)
class LAPassThruBF

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, data, _tc=None)
class LAServerMode

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, frontend, backend, frontend_args=None, backend_args=None, backend_env=None, backend_run_dir='', backend_user_name='', backend_options=None, _tc=None)
class LAServerModeExit

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0
FAIL = 1
__init__(tag, ref, err, _tc=None)
class LAProcessDict

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class LAProcessDictResponse

Refer to definition and Common Fields for a description of the message structure.

class Errors

An enumeration.

SUCCESS = 0

Always succeeds

__init__(tag, ref, err, pdict=None, _tc=None)
class LADumpState

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, filename=None, _tc=None)
class LAChannelsInfo

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, nodes_desc, gs_cd, num_gw_channels, port=7575, transport='tcp', _tc=None)
class LoggingMsg

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, name, msg, time, func, hostname, ip_address, port, service, level, _tc=None)
class LoggingMsgList

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, records, _tc=None)
class LogFlushed

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class TAPingSH

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class TAHalted

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class TAUp

Refer to definition and Common Fields for a description of the message structure.

test_channels is empty unless the DRAGON_TRANSPORT_TEST environment variable is set to some value. When it is set, local services creates two channels and provides their base64-encoded serialized channel descriptors in the test_channels field to the launcher front end, which then disseminates them to the test program started on each node.

__init__(tag, idx=0, test_channels=[], _tc=None)
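
To make the shape of test_channels concrete, the sketch below shows one way the field could be filled and read back using the standard base64 module. It is an illustration under assumptions, not the local services code: both helper names are hypothetical, the byte strings are placeholders rather than real serialized channel descriptors, and an empty DRAGON_TRANSPORT_TEST value is treated as "set".

```python
import base64
import os


def encode_test_channels(serialized_descriptors, environ=os.environ):
    """Hypothetical helper: build the test_channels field.

    Returns [] unless DRAGON_TRANSPORT_TEST is present in environ;
    otherwise returns the base64 text form of each descriptor (bytes).
    """
    if "DRAGON_TRANSPORT_TEST" not in environ:
        return []
    return [base64.b64encode(d).decode("ascii") for d in serialized_descriptors]


def decode_test_channels(test_channels):
    """Hypothetical helper: recover the serialized descriptors as bytes."""
    return [base64.b64decode(s) for s in test_channels]


# Placeholder bytes standing in for two serialized channel descriptors.
fake_descriptors = [b"\x00\x01chan-a", b"\x00\x02chan-b"]

unset = encode_test_channels(fake_descriptors, environ={})
encoded = encode_test_channels(
    fake_descriptors, environ={"DRAGON_TRANSPORT_TEST": "1"}
)
decoded = decode_test_channels(encoded)
```

Encoding the descriptors as base64 text keeps the field representable in JSON, which has no native byte-string type.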
class Breakpoint
__init__(tag, p_uid, index, out_desc, in_desc, _tc=None)
class BEIsUp

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, be_ch_desc, host_id, _tc=None)
class FENodeIdxBE

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, node_index, forward: Optional[dict['str', Union[dragon.infrastructure.node_desc.NodeDescriptor, dict]]] = None, send_desc: Optional[Union[B64, str]] = None, _tc=None)
class HaltLoggingInfra

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class HaltOverlay

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class OverlayHalted

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class BEHaltOverlay

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class LAHaltOverlay

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class OverlayPingBE

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class OverlayPingLA

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class LAExit

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, sigint=False, _tc=None)
class RuntimeDesc

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, gs_cd, gs_ret_cd, ls_cd, ls_ret_cd, fe_ext_ip_addr, head_node_ip_addr, oob_port, env, _tc=None)
class UserHaltOOB

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, _tc=None)
class TAUpdateNodes

Refer to definition and Common Fields for a description of the message structure.

__init__(tag, nodes: list[Union[dragon.infrastructure.node_desc.NodeDescriptor, dict]], _tc=None)

Python Message Type Value Enumeration

This section contains the message type (_tc) definitions for the Python implementation of the infrastructure messages.
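
As a small sketch of how the _tc integer relates to this enumeration, the example below defines a four-member excerpt and maps a parsed message to its type. The excerpt class and both helper functions are hypothetical stand-ins, not the Dragon classes; the values are copied from the listing in this section.

```python
import enum
import json


class MessageTypes(enum.Enum):
    """Excerpt only; values copied from the listing in this section."""

    GS_CHANNEL_CREATE = 11
    GS_CHANNEL_CREATE_RESPONSE = 12
    SH_PROCESS_KILL = 38
    SH_PROCESS_KILL_RESPONSE = 106


def message_type(text):
    """Hypothetical helper: map a serialized message to its MessageTypes member."""
    return MessageTypes(json.loads(text)["_tc"])


def response_type(request_type):
    """Hypothetical helper: find the response type by name.

    Looking up NAME + "_RESPONSE" avoids assuming the response value is
    always the request value plus one.
    """
    return MessageTypes[request_type.name + "_RESPONSE"]


kind = message_type('{"_tc": 38, "tag": 1}')
reply_kind = response_type(kind)
```

The lookup is done by name because the numbering is not uniform: GS_CHANNEL_CREATE_RESPONSE directly follows its request, while SH_PROCESS_KILL_RESPONSE (106) is far from SH_PROCESS_KILL (38).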

class MessageTypes

These are the enumerated values of message type identifiers within the Dragon infrastructure messages.

DRAGON_MSG = 0

Deliberately invalid

GS_PROCESS_CREATE = 1
GS_PROCESS_CREATE_RESPONSE = 2
GS_PROCESS_LIST = 3
GS_PROCESS_LIST_RESPONSE = 4
GS_PROCESS_QUERY = 5
GS_PROCESS_QUERY_RESPONSE = 6
GS_PROCESS_KILL = 7
GS_PROCESS_KILL_RESPONSE = 8
GS_PROCESS_JOIN = 9
GS_PROCESS_JOIN_RESPONSE = 10
GS_CHANNEL_CREATE = 11
GS_CHANNEL_CREATE_RESPONSE = 12
GS_CHANNEL_LIST = 13
GS_CHANNEL_LIST_RESPONSE = 14
GS_CHANNEL_QUERY = 15
GS_CHANNEL_QUERY_RESPONSE = 16
GS_CHANNEL_DESTROY = 17
GS_CHANNEL_DESTROY_RESPONSE = 18
GS_CHANNEL_JOIN = 19
GS_CHANNEL_JOIN_RESPONSE = 20
GS_CHANNEL_DETACH = 21
GS_CHANNEL_DETACH_RESPONSE = 22
GS_CHANNEL_GET_SENDH = 23
GS_CHANNEL_GET_SENDH_RESPONSE = 24
GS_CHANNEL_GET_RECVH = 25
GS_CHANNEL_GET_RECVH_RESPONSE = 26
ABNORMAL_TERMINATION = 27
GS_STARTED = 28
GS_PING_SH = 29
GS_IS_UP = 30
GS_HEAD_EXIT = 31
GS_CHANNEL_RELEASE = 32
GS_HALTED = 33
SH_PROCESS_CREATE = 34
SH_PROCESS_CREATE_RESPONSE = 35
SH_MULTI_PROCESS_CREATE = 36
SH_MULTI_PROCESS_CREATE_RESPONSE = 37
SH_PROCESS_KILL = 38
SH_PROCESS_EXIT = 39
SH_CHANNEL_CREATE = 40
SH_CHANNEL_CREATE_RESPONSE = 41
SH_CHANNEL_DESTROY = 42
SH_CHANNEL_DESTROY_RESPONSE = 43
SH_LOCK_CHANNEL = 44
SH_LOCK_CHANNEL_RESPONSE = 45
SH_ALLOC_MSG = 46
SH_ALLOC_MSG_RESPONSE = 47
SH_ALLOC_BLOCK = 48
SH_ALLOC_BLOCK_RESPONSE = 49
SH_CHANNELS_UP = 50
SH_PING_GS = 51
SH_HALTED = 52
SH_FWD_INPUT = 53
SH_FWD_INPUT_ERR = 54
SH_FWD_OUTPUT = 55
GS_TEARDOWN = 56
SH_TEARDOWN = 57
SH_PING_BE = 58
BE_PING_SH = 59
TA_PING_SH = 60
SH_HALT_TA = 61
TA_HALTED = 62
SH_HALT_BE = 63
BE_HALTED = 64
TA_UP = 65
GS_PING_PROC = 66
GS_DUMP_STATE = 67
SH_DUMP_STATE = 68
LA_BROADCAST = 69
LA_PASS_THRU_FB = 70
LA_PASS_THRU_BF = 71
GS_POOL_CREATE = 72
GS_POOL_CREATE_RESPONSE = 73
GS_POOL_DESTROY = 74
GS_POOL_DESTROY_RESPONSE = 75
GS_POOL_LIST = 76
GS_POOL_LIST_RESPONSE = 77
GS_POOL_QUERY = 78
GS_POOL_QUERY_RESPONSE = 79
SH_POOL_CREATE = 80
SH_POOL_CREATE_RESPONSE = 81
SH_POOL_DESTROY = 82
SH_POOL_DESTROY_RESPONSE = 83
SH_CREATE_PROCESS_LOCAL_CHANNEL = 84
SH_CREATE_PROCESS_LOCAL_CHANNEL_RESPONSE = 85
SH_PUSH_KVL = 86
SH_PUSH_KVL_RESPONSE = 87
SH_POP_KVL = 88
SH_POP_KVL_RESPONSE = 89
SH_GET_KVL = 90
SH_GET_KVL_RESPONSE = 91
SH_SET_KV = 92
SH_EXEC_MEM_REQUEST = 96
SH_EXEC_MEM_RESPONSE = 97
GS_UNEXPECTED = 98
LA_SERVER_MODE = 99
LA_SERVER_MODE_EXIT = 100
LA_PROCESS_DICT = 101
LA_PROCESS_DICT_RESPONSE = 102
LA_DUMP_STATE = 103
BE_NODE_IDX_SH = 104
LA_CHANNELS_INFO = 105
SH_PROCESS_KILL_RESPONSE = 106
BREAKPOINT = 107
GS_PROCESS_JOIN_LIST = 108
GS_PROCESS_JOIN_LIST_RESPONSE = 109
GS_NODE_QUERY = 110
GS_NODE_QUERY_RESPONSE = 111
LOGGING_MSG = 112
LOGGING_MSG_LIST = 113
LOG_FLUSHED = 114
GS_NODE_LIST = 115
GS_NODE_LIST_RESPONSE = 116
GS_NODE_QUERY_TOTAL_CPU_COUNT = 117
GS_NODE_QUERY_TOTAL_CPU_COUNT_RESPONSE = 118
BE_IS_UP = 119
FE_NODE_IDX_BE = 120
HALT_OVERLAY = 121
HALT_LOGGING_INFRA = 122
OVERLAY_PING_BE = 123
OVERLAY_PING_LA = 124
LA_HALT_OVERLAY = 125
BE_HALT_OVERLAY = 126
OVERLAY_HALTED = 127
EXCEPTIONLESS_ABORT = 128

Communicates abnormal termination without raising an exception

LA_EXIT = 129
GS_GROUP_LIST = 130
GS_GROUP_LIST_RESPONSE = 131
GS_GROUP_QUERY = 132
GS_GROUP_QUERY_RESPONSE = 133
GS_GROUP_DESTROY = 134
GS_GROUP_DESTROY_RESPONSE = 135
GS_GROUP_ADD_TO = 136
GS_GROUP_ADD_TO_RESPONSE = 137
GS_GROUP_REMOVE_FROM = 138
GS_GROUP_REMOVE_FROM_RESPONSE = 139
GS_GROUP_CREATE = 140
GS_GROUP_CREATE_RESPONSE = 141
GS_GROUP_KILL = 142
GS_GROUP_KILL_RESPONSE = 143
GS_GROUP_CREATE_ADD_TO = 144
GS_GROUP_CREATE_ADD_TO_RESPONSE = 145
GS_GROUP_DESTROY_REMOVE_FROM = 146
GS_GROUP_DESTROY_REMOVE_FROM_RESPONSE = 147
TA_UPDATE_NODES = 148
RUNTIME_DESC = 149
USER_HALT_OOB = 150
DD_REGISTER_CLIENT = 151
DD_REGISTER_CLIENT_RESPONSE = 152
DD_DESTROY = 153
DD_DESTROY_RESPONSE = 154
DD_REGISTER_MANAGER = 155
DD_REGISTER_MANAGER_RESPONSE = 156
DD_REGISTER_CLIENT_ID = 157
DD_REGISTER_CLIENT_ID_RESPONSE = 158
DD_DESTROY_MANAGER = 159
DD_DESTROY_MANAGER_RESPONSE = 160
DD_PUT = 161
DD_PUT_RESPONSE = 162
DD_GET = 163
DD_GET_RESPONSE = 164
DD_POP = 165
DD_POP_RESPONSE = 166
DD_CONTAINS = 167
DD_CONTAINS_RESPONSE = 168
DD_GET_LENGTH = 169
DD_GET_LENGTH_RESPONSE = 170
DD_CLEAR = 171
DD_CLEAR_RESPONSE = 172
DD_GET_ITERATOR = 173
DD_GET_ITERATOR_RESPONSE = 174
DD_ITERATOR_NEXT = 175
DD_ITERATOR_NEXT_RESPONSE = 176
DD_KEYS = 177
DD_KEYS_RESPONSE = 178
DD_DEREGISTER_CLIENT = 179
DD_DEREGISTER_CLIENT_RESPONSE = 180
DD_CREATE = 181
DD_CREATE_RESPONSE = 182
DD_CONNECT_TO_MANAGER = 183
DD_CONNECT_TO_MANAGER_RESPONSE = 184
DD_GET_RANDOM_MANAGER = 185
DD_GET_RANDOM_MANAGER_RESPONSE = 186