Logging

Overview of the structure, classes, and functions that enable orderly logging of Dragon’s multiple backend components.

General Design

The goal of the infrastructure is to make all service logging easy to parse with grep or similar command-line tools, and to make the extracted information meaningful. With that target in mind, the goals are:

  • Get all logging messages from all backend services to the frontend’s file system and/or stderr

  • Make it easy to see which service and host/IP/port a logging message was generated from

  • Make sure timestamps are actually meaningful

  • Keep it simple for other devs to plug into their infrastructure and not irreparably break already existing logging infrastructure.

  • Stick to Python’s logging module as much as possible

Frontend Services

The MRNet frontend service is the ultimate destination for all the logging messages created by backend services.

Those messages are received via the MRNet server’s callback function, which also delivers all the infrastructure messages unrelated to logging. To avoid clobbering infrastructure messages, the dragon.launcher.util.logging_queue_monitor() decorator pulls any logging messages received from the callback and puts them in a separate queue, so the MRNet server can write them to the log as infrastructure requests allow. All logging messages received by the frontend server must be an instance of dragon.infrastructure.messages.LoggingMsgList.
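The diversion of logging messages into a separate queue can be sketched with the standard library alone. This is a minimal illustration of the pattern, not the actual logging_queue_monitor() implementation: the LoggingMsgList stand-in class, divert_logging(), and recv_msg() are hypothetical names introduced here.

```python
import queue
from dataclasses import dataclass
from functools import wraps


@dataclass
class LoggingMsgList:
    """Hypothetical stand-in for dragon.infrastructure.messages.LoggingMsgList."""
    records: list


def divert_logging(log_queue):
    """Decorator factory (hypothetical): park logging messages in log_queue."""
    def decorator(callback):
        @wraps(callback)
        def wrapper(*args, **kwargs):
            msg = callback(*args, **kwargs)
            if isinstance(msg, LoggingMsgList):
                log_queue.put(msg)  # written to the log later, when time allows
                return None         # nothing for the infrastructure path to handle
            return msg              # infrastructure messages pass through untouched
        return wrapper
    return decorator


log_queue = queue.SimpleQueue()
incoming = iter([LoggingMsgList(["hello"]), "BENodeIdxSH"])


@divert_logging(log_queue)
def recv_msg():
    """Pretend to receive the next message from the overlay network."""
    return next(incoming)


first = recv_msg()   # a logging message: diverted into log_queue
second = recv_msg()  # an infrastructure message: returned to the caller
```

The caller only ever sees infrastructure messages, while the logging messages wait in log_queue until they can be drained.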

To keep management of all the messages sane and easily parsable, the launcher frontend has a separate Python log for each service. Each log is referenced by a name defined as an attribute of dragon.dlogging.util.DragonLoggingServices, and each message is logged under the scope of the service it came from. Many child loggers can be created from a parent log, but each should belong to a given parent (i.e., a service log).

Listing 49 This is essentially what dragon.dlogging.util.log_FE_msg_list() performs
import logging
from dragon.dlogging.util import DragonLoggingServices

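# Log a global services message under the GS service logger.
# level, log_msg, and dict_of_extra_record_attributes come from the received message.
log = logging.getLogger(DragonLoggingServices.GS)
log.log(level, log_msg, extra=dict_of_extra_record_attributes)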
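The parent/child relationship between a service log and its children can be tried out with only the standard logging module. The sketch below assumes a stand-in Services enum in place of DragonLoggingServices and a hypothetical ListHandler that keeps records in memory. The extra dictionary and its hostname key are illustrative as well.

```python
import logging
from enum import Enum


class Services(str, Enum):
    """Hypothetical stand-in for DragonLoggingServices."""
    GS = "GS"
    LS = "LS"


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps every record in a list for inspection."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


handler = ListHandler()
gs_log = logging.getLogger(Services.GS.value)  # one parent logger per service
gs_log.setLevel(logging.DEBUG)
gs_log.addHandler(handler)
gs_log.propagate = False  # keep the sketch away from the root logger

# A child logger inherits the service scope through its dotted name.
child = gs_log.getChild("main")
child.info("process created", extra={"hostname": "node-1"})

captured = handler.records[0]
```

Because the child is named GS.main, a grep for "GS" in the resulting log finds messages from the service and from all of its children.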

Backend Services

During bring-up, the MRNet server backend creates a DragonLogger instance and passes its serialized descriptor to all backend services on the same node as it: global services (for primary node), local services, transport agent, and the launcher backend.

All the listed services then attach to the MRNet server’s DragonLogger instance. Each service’s log messages pass through a DragonLoggingHandler, which simply puts them into the MRNet backend’s DragonLogger channel queue. Timestamps are generated when a given service emits its message, not when the MRNet backend plucks it from the queue, so log messages reflect the time they were emitted by the backend service.

Listing 50 Launcher backend attaching to a logging channel and logging
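import logging
import os

import dragon.dlogging.util as dlog

# logger_sdesc is the serialized DragonLogger descriptor passed in by the MRNet server backend
level, fname = dlog.setup_BE_logging(service=dlog.DragonLoggingServices.LA_BE, logger_sdesc=logger_sdesc)
log = logging.getLogger(dlog.DragonLoggingServices.LA_BE)
log.info(f'start in pid {os.getpid()}, pgid {os.getpgid(0)}')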
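The handler behavior described in this section, where a record is queued at emit time and carries its own emission timestamp, can be approximated without Dragon. In this sketch a queue.Queue stands in for the DragonLogger channel, and QueueingHandler is a hypothetical name, not the real DragonLoggingHandler.

```python
import logging
import queue


class QueueingHandler(logging.Handler):
    """Hypothetical handler: push each record into a queue instead of a stream."""

    def __init__(self, channel):
        super().__init__()
        self.channel = channel

    def emit(self, record):
        # record.created is set when the record is made by the emitting service,
        # so the timestamp is independent of when the queue is drained.
        self.channel.put((record.created, record.levelno, self.format(record)))


channel = queue.Queue()  # stand-in for the DragonLogger channel queue
log = logging.getLogger("LA_BE_sketch")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(QueueingHandler(channel))

log.info("start")

# Whoever drains the queue later still sees the original emission time.
created, levelno, text = channel.get_nowait()
```

The standard library's logging.handlers.QueueHandler follows the same idea and may be a better starting point for experiments.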

Formatting

Log entries posted by the launcher frontend contain hostname, service, and function name the log entry came from:

2022-07-28 13:07:49,988 INFO     LA_BE.get_host_info (pinoak0202) :: sockets returned ip_addr 172.23.0.11 for pinoak0202
2022-07-28 13:07:49,988 INFO     LA_BE.get_host_info (pinoak0202) :: primary_hostname = pinoak0201 | hostname = pinoak0202
2022-07-28 13:07:49,988 INFO     LA_BE.get_host_info (pinoak0202) :: primary node status = False
2022-07-28 13:07:49,988 INFO     LA_BE.main (pinoak0202) :: sent BENodeIdxSH(node_idx=1) - m2.1
2022-07-28 13:07:50,006 DEBUG    LS.multinode (pinoak0202) :: dragon logging initiated on pid=17735
2022-07-28 13:07:50,006 INFO     LS.multinode (pinoak0202) :: got BENodeIdxSH (id=1, ip=172.23.0.11, host=pinoak0202, primmary=False) - m2.1
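A format string that produces entries shaped like the ones above can be written with the standard logging.Formatter. The format string here is reconstructed from the sample output and is an assumption, not a copy of dragon.dlogging.util.default_FE_fmt. The hostname attribute is set by hand on the record for the sake of the example.

```python
import logging

# Assumed layout: timestamp, padded level, logger name, (hostname), message
FE_FMT = "%(asctime)s %(levelname)-8s %(name)s (%(hostname)s) :: %(message)s"
formatter = logging.Formatter(FE_FMT)

# Build a record by hand so the example does not depend on logger configuration.
record = logging.LogRecord(
    name="LA_BE.get_host_info",
    level=logging.INFO,
    pathname="sketch.py",
    lineno=0,
    msg="primary node status = %s",
    args=(False,),
    exc_info=None,
)
record.hostname = "pinoak0202"  # normally supplied by a filter or the extra dict

line = formatter.format(record)
print(line)
```

Every attribute named in the format string must exist on every record, which is why the frontend needs a filter or an extra dictionary to supply fields such as the hostname.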

Module Classes and functions

Logging setup for Dragon runtime infrastructure

class LoggingValue

Bases: Action

__init__(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)
class DragonLoggingServices

Bases: str, Enum

Enumerator for logging names of backend services

LA_FE = 'LA_FE'
LA_BE = 'LA_BE'
GS = 'GS'
TA = 'TA'
ON = 'ON'
OOB = 'OOB'
LS = 'LS'
DD = 'DD'
TEST = 'TEST'
PERF = 'PERF'
PG = 'PG'
next_tag()
class DragonFEFilter

Bases: Filter

__init__() → None

Create logging filter to add record attributes to FE logger

Adds hostname, IP address, port, and service name to all FE logging entries. These are needed to satisfy the dragon.dlogging.util.default_FE_fmt format string

filter(record: LogRecord) → bool

Filter out BE messages and let through all FE messages while defining record attributes

Args:

record (logging.LogRecord): record that will be populated

Returns:

bool: Log message if True. Don’t if False
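A filter that both screens records and attaches extra attributes can be sketched with logging.Filter. How DragonFEFilter actually recognizes backend messages is not stated here, so the from_backend marker below is purely hypothetical, as is the AddHostInfo class name.

```python
import logging
import socket


class AddHostInfo(logging.Filter):
    """Hypothetical filter: drop backend records, annotate frontend ones."""

    def __init__(self, service):
        super().__init__()
        self.service = service
        self.hostname = socket.gethostname()

    def filter(self, record):
        # Assumption: backend records carry a 'from_backend' marker attribute.
        if getattr(record, "from_backend", False):
            return False
        # Populate the attributes that a format string would reference.
        record.hostname = self.hostname
        record.service = self.service
        return True


fe_filter = AddHostInfo("LA_FE")
fe_record = logging.makeLogRecord({"msg": "frontend up"})
be_record = logging.makeLogRecord({"msg": "backend up", "from_backend": True})

fe_passed = fe_filter.filter(fe_record)  # True, and the record is annotated
be_passed = fe_filter.filter(be_record)  # False, the record is rejected
```

Attaching the filter to a handler rather than a logger applies it to every record that reaches the handler, including records from child loggers.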

class DragonLoggingHandler

Bases: StreamHandler

__init__(serialized_log_descr: B64, mpool: MemoryPool | None = None, hostname: str | None = None, ip_address: str | None = None, port: str | None = None, service: DragonLoggingServices | str | None = None) → None

Logging handler to enable python logging to be sent to the Dragon Logging channel on BE

Ultimately a logging.StreamHandler that sends stream output to /dev/null and also emits log records into a dragon.infrastructure.messages.LoggingMsg object that is serialized into a dragon.dlogging.logger.DragonLogger

Parameters:
  • serialized_log_descr (B64) – serialized descriptor from dragon.dlogging.logger.DragonLogger

  • mpool (MemoryPool, optional) – Memory pool to use for attaching to logging channel, defaults to None

  • hostname (str, optional) – hostname of logging service, defaults to None

  • ip_address (str, optional) – IP address of logging service, defaults to None

  • port (str, optional) – Port of logging service, defaults to None

  • service (Union[DragonLoggingServices, str], optional) – Name of logging service. Acceptable names are defined in DragonLoggingServices

emit(record: LogRecord) → None

Emit a dragon.infrastructure.messages.LoggingMsg message

Args:

record (logging.LogRecord): Record populated by the class’s format method

Raises:

DragonLoggingError: message could not be emitted into the logging channel

setup_logging(*, basename: str | None = None, basedir: str | None = None, level: int = 20, force: bool = False, add_console: bool = False)

Original dragon logging function to configure root logger.

Args:

basename (str, optional): Logging file basename. Date and time will be appended. Defaults to None, in which case ‘dragon’ is used.

basedir (str, optional): Directory to save the log file to. Defaults to None, in which case os.getcwd() is used.

level (int, optional): Logging level. Defaults to logging.INFO.

force (bool, optional): Remove any handlers attached to the root logger. Defaults to False.

add_console (bool, optional): Enable logging to stderr. Defaults to False.

setup_FE_logging(log_device_level_map: dict = {}, basename: str | None = None, basedir: str | None = None, force: bool = False, add_console: bool = False, test_BE_dir: str | None = None, multi_node_mode: bool = False) → Tuple[int, str]

Set-up launcher frontend logging infrastructure. Will log to file.

Configures the individual service loggers the FE will use to log all messages sent to it from the backend services of the MRNet overlay and creates the FileHandler they will all write to

Args:

basename (str, optional): Logging file basename. Date and time will be appended. Defaults to ‘dragon’.

basedir (str, optional): Directory to save log file to. Defaults to pwd

force (bool, optional): Remove all existing handlers from root logger. Defaults to False.

add_console (bool, optional): Turn on logging to stderr. Defaults to False.

test_BE_dir (str, optional): Directory to save backend log files to. Used for testing. Defaults to None.

multi_node_mode (bool, optional): Enable multi-node filtering and formatting.

Returns:

int: logging level

str: full path and filename of logging file

close_FE_logging()

Close individual service loggers used for logging on frontend

setup_dragon_logging(node_index: int) → DragonLogger

Create a dragon.dlogging.logger.DragonLogger instance

Args:

node_index (int): node index logger is being created on

Returns:

DragonLogger: created DragonLogger instance

detach_from_dragon_handler(service: DragonLoggingServices | str)

Detach logger from dragon handler

This only needs to be called for processes that will outlive the MRNet backend service, which “owns” the logging memory pool and relevant backend infrastructure

Args:

service (str): logging service, options defined in DragonLoggingServices

setup_BE_logging(service: DragonLoggingServices | str, logger_sdesc: B64 | None = None, mpool: MemoryPool | None = None, hostname: str | None = None, ip_address: str | None = None, port: str | None = None, fname: str | None = None) → Tuple[int, str]

Configure backend logging for requested backend service

Create a handler using the dragon.dlogging.logger.DragonLogger serialized descriptor and add it to the given service’s logger

Parameters:
  • service (Union[DragonLoggingServices, str]) – logging service, options defined in DragonLoggingServices

  • logger_sdesc (B64, optional) – serialized logging descriptor, defaults to None

  • mpool (MemoryPool, optional) – memory pool to use for attaching to channel, defaults to None

  • hostname (str, optional) – Hostname of logging service, defaults to None

  • ip_address (str, optional) – IP address of logging service, defaults to None

  • port (str, optional) – Port of logging service, defaults to None

  • fname (str, optional) – Filename for file logging, defaults to None

Returns:

logging level, filename logging will be saved to

Return type:

Tuple[int, str]

class DragonLogger

Bases: object

Python interface to C-level Dragon logging infrastructure

__init__()

Create a DragonLogger instance for send/recv of infrastructure logs

Args:

mpool (MemoryPool): memory pool dedicated to logging infrastructure messages

lattrs (NULL, optional): Logging attributes. Not implemented

uid (default: 0): Unique ID to use for the logger.

If default is provided a semi-random value will be used starting at BASE_LOG_CUID (see facts.py)

mode (dragonLoggingMode, default: FIRST): What mode to run the logger in.

  • DRAGON_LOGGING_LOSSLESS - Block and wait on a full channel until a message can be inserted

  • DRAGON_LOGGING_FIRST - Drop message on a full channel

  • DRAGON_LOGGING_LAST - Remove oldest message to insert newest on a full channel !!! NOT IMPLEMENTED !!!

Raises:

RuntimeError: Input memory pool was None

DragonLoggingError: Logging was not successfully initialized
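The difference between the LOSSLESS and FIRST modes on a full channel can be illustrated with a bounded queue.Queue. This is a behavioral sketch only: the Mode enum and put_with_mode() are hypothetical and do not call into the Dragon C library.

```python
import queue
from enum import Enum


class Mode(Enum):
    """Hypothetical stand-in for the logger modes described above."""
    LOSSLESS = "lossless"  # block until there is room
    FIRST = "first"        # keep what is already queued, drop the new message


def put_with_mode(channel, msg, mode, timeout=None):
    """Insert msg into channel; return True if stored, False if dropped."""
    if mode is Mode.LOSSLESS:
        channel.put(msg, timeout=timeout)  # waits while the channel is full
        return True
    try:
        channel.put_nowait(msg)
        return True
    except queue.Full:
        return False  # FIRST mode: the newest message is discarded


channel = queue.Queue(maxsize=2)  # tiny capacity to force the full-channel case
results = [put_with_mode(channel, f"msg {i}", Mode.FIRST) for i in range(3)]
kept = [channel.get_nowait() for _ in range(channel.qsize())]
```

In FIRST mode the third message is lost but the sender never stalls, whereas LOSSLESS trades that guarantee of progress for completeness of the log.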

classmethod attach(serialized_bytes: bytes, mpool=None) → DragonLogger

Attach to a logging channel given a serialized descriptor

Parameters:
  • serialized_bytes (bytes) – serialized logging channel descriptor

  • mpool (MemoryPool, optional) – memory pool to construct messages from. If None, tries to use the pool associated with the channel, defaults to None

Returns:

DragonLogger instance attached to the given channel and backed by the optional mpool

Return type:

DragonLogger

destroy(destroy_pool: bool = True)

Destroy the DragonLogger object and its underlying memory pool

Should only be called when it is known that no other processes or services will be writing to it

Parameters:

destroy_pool (bool, optional) – whether to destroy the memory pool supporting the logging channel, defaults to True

Raises:

DragonLoggingError – unable to destroy the logging object as requested

get(priority=20, timeout=None)

Get a message out of logging channel queue of level priority

Args:

priority (int): minimum level the returned message should have (eg: logging.INFO)

timeout (int, float): maximum time in seconds to wait for a message
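One way to read the priority and timeout arguments is as a minimum level plus an overall deadline. The sketch below implements that reading against a plain queue.Queue. Discarding lower-priority messages and returning None on timeout are assumptions made for illustration, and get_at_least() is a hypothetical helper, not DragonLogger.get().

```python
import logging
import queue
import time


def get_at_least(channel, priority=logging.INFO, timeout=None):
    """Return the next (level, text) with level >= priority, or None on timeout.

    Assumption: messages below the requested priority are discarded.
    With timeout=None this waits indefinitely for a qualifying message.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        remaining = None if deadline is None else max(0.0, deadline - time.monotonic())
        try:
            level, text = channel.get(timeout=remaining)
        except queue.Empty:
            return None
        if level >= priority:
            return level, text


channel = queue.Queue()
channel.put((logging.DEBUG, "noise"))
channel.put((logging.WARNING, "disk nearly full"))

found = get_at_least(channel, priority=logging.INFO, timeout=0.1)
missing = get_at_least(channel, priority=logging.INFO, timeout=0.1)
```

Tracking a single deadline across retries keeps the total wait bounded by timeout even when several low-priority messages are skipped.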

num_logs()

Get number of logs in queue

Returns:

int: number of logs currently in channel queue

put(msg: str, priority: int = 20)

Put a logging message into the logging channel queue

Args:

msg (str)

priority (int): logging level with values matching those in Python’s logging module (eg: logging.INFO)

serialize()

Return a serialized descriptor of the logging channel

For an already created logger, return its serialized descriptor so other services can attach to it

Returns:

bytes: serialized descriptor

exception DragonLoggingError

Bases: Exception

class Errors

Bases: Enum

An enumeration.

SUCCESS = 0
FAIL = 1
__init__(lib_err, msg)