module: `dagrunner.utils`

class: `CaptureProcMemory`

Call Signature:

CaptureProcMemory(interval=1.0, pid=None)

Capture maximum process memory statistics.

See get_proc_mem_stat for more information.

function: `__enter__`

Source

Call Signature:

__enter__(self)

function: `__exit__`

Source

Call Signature:

__exit__(self, exc_type, exc_value, traceback)

function: `__init__`

Source

Call Signature:

__init__(self, interval=1.0, pid=None)

Initialize the memory capture.

Args:

interval: Time interval in seconds to capture memory statistics. Note that memory statistics are captured by reading /proc files. It is advised not to reduce the interval too much, otherwise we increase the overhead of reading the files.
pid: Process id. Optional. Default is the current process.

function: `max`

Source

Call Signature:

max(self)

Return maximum memory statistics.

Returns:

Dictionary with memory statistics in MB.
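The sampling pattern described above (poll a statistics function at a fixed interval and keep the largest value seen per field) can be sketched as follows. This is an illustration only, not the dagrunner implementation: `MaxSampler`, its `stat_func` argument and the use of a background thread are all assumptions made for this example.

```python
import threading


class MaxSampler:
    """Hypothetical sampler: polls `stat_func` and keeps per-field maxima."""

    def __init__(self, stat_func, interval=1.0):
        self._stat_func = stat_func  # callable returning a dict of numbers
        self._interval = interval
        self._max = {}
        self._stop = threading.Event()
        self._thread = None

    def _update(self):
        # Keep the largest value seen so far for every field.
        for key, value in self._stat_func().items():
            self._max[key] = max(value, self._max.get(key, value))

    def _run(self):
        # Event.wait doubles as an interruptible sleep between samples.
        while not self._stop.wait(self._interval):
            self._update()

    def __enter__(self):
        self._update()  # sample once immediately
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._stop.set()
        self._thread.join()
        self._update()  # final sample so that short blocks are still measured

    def max(self):
        return dict(self._max)
```

A shorter interval gives a better chance of catching brief peaks, at the cost of calling `stat_func` (and therefore reading the /proc files) more often.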

class: `CaptureSysMemory`

Source

Call Signature:

CaptureSysMemory(interval=1.0, **kwargs)

Capture maximum system memory statistics.

See get_sys_mem_stat for more information.

function: `__enter__`

Source

Call Signature:

__enter__(self)

function: `__exit__`

Source

Call Signature:

__exit__(self, exc_type, exc_value, traceback)

function: `__init__`

Source

Call Signature:

__init__(self, interval=1.0, **kwargs)

Initialize the memory capture.

Args:

interval: Time interval in seconds to capture memory statistics. Note that memory statistics are captured by reading /proc files. It is advised not to reduce the interval too much, otherwise we increase the overhead of reading the files.

function: `max`

Source

Call Signature:

max(self)

Return maximum memory statistics.

Returns:

Dictionary with memory statistics in MB.

class: `KeyValueAction`

Source

Call Signature:

KeyValueAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None, deprecated=False)

Information about how to convert command line strings to Python objects.

Action objects are used by an ArgumentParser to represent the information needed to parse a single argument from one or more strings from the command line. The keyword arguments to the Action constructor are also all attributes of Action instances.

Keyword Arguments:

- option_strings -- A list of command-line option strings which
    should be associated with this action.

- dest -- The name of the attribute to hold the created object(s)

- nargs -- The number of command-line arguments that should be
    consumed. By default, one argument will be consumed and a single
    value will be produced.  Other values include:
        - N (an integer) consumes N arguments (and produces a list)
        - '?' consumes zero or one arguments
        - '*' consumes zero or more arguments (and produces a list)
        - '+' consumes one or more arguments (and produces a list)
    Note that the difference between the default and nargs=1 is that
    with the default, a single value will be produced, while with
    nargs=1, a list containing a single value will be produced.

- const -- The value to be produced if the option is specified and the
    option uses an action that takes no values.

- default -- The value to be produced if the option is not specified.

- type -- A callable that accepts a single string argument, and
    returns the converted value.  The standard Python types str, int,
    float, and complex are useful examples of such callables.  If None,
    str is used.

- choices -- A container of values that should be allowed. If not None,
    after a command-line argument has been converted to the appropriate
    type, an exception will be raised if it is not a member of this
    collection.

- required -- True if the action must always be specified at the
    command line. This is only meaningful for optional command-line
    arguments.

- help -- The help string describing the argument.

- metavar -- The name to be used for the option's argument with the
    help string. If None, the 'dest' value will be used as the name.

function: `__call__`

Source

Call Signature:

__call__(self, parser, namespace, values, option_string=None)

Call self as a function.
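The docstring above is the generic one inherited from `argparse.Action` and does not say what `KeyValueAction` does with its values. As an illustration of how a custom action uses the `__call__(self, parser, namespace, values, option_string=None)` hook, here is a minimal sketch that collects `KEY=VALUE` strings into a dictionary. The class name `KeyValueSketch`, the `KEY=VALUE` syntax and the `--opt` flag are assumptions suggested by the class name, not confirmed behaviour.

```python
import argparse


class KeyValueSketch(argparse.Action):
    """Hypothetical action: collect KEY=VALUE strings into a dict."""

    def __call__(self, parser, namespace, values, option_string=None):
        # `values` is a list when nargs is '+', '*' or an integer; a str otherwise.
        items = values if isinstance(values, list) else [values]
        result = dict(getattr(namespace, self.dest, None) or {})
        for item in items:
            key, sep, value = item.partition("=")
            if not sep:
                parser.error(f"expected KEY=VALUE, got {item!r}")
            result[key] = value
        setattr(namespace, self.dest, result)


parser = argparse.ArgumentParser()
parser.add_argument("--opt", nargs="+", action=KeyValueSketch, default={})
args = parser.parse_args(["--opt", "a=1", "b=two"])
print(args.opt)
```

Values stay as strings in this sketch; any type conversion would be a separate design decision.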

class: `ObjectAsStr`

Source

Call Signature:

ObjectAsStr(obj, name=None)

Hide object under a string.

function: `__hash__`

Source

Call Signature:

__hash__(self)

Return hash(self).

function: `__new__`

Source

Call Signature:

__new__(cls, obj, name=None)

Create and return a new object. See help(type) for accurate signature.

function: `obj_to_name`

Source

Call Signature:

obj_to_name(obj, cls)

class: `Singleton`

Source

Singleton metaclass.

function: `__call__`

Source

Call Signature:

__call__(cls, *args, **kwargs)

Call self as a function.
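A singleton metaclass conventionally works by overriding `__call__`, which is what runs when a class is instantiated. The following is a minimal sketch of that common pattern, not necessarily how dagrunner implements it; `SingletonSketch` and `Config` are hypothetical names.

```python
class SingletonSketch(type):
    """Hypothetical metaclass: every class using it has at most one instance."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Build the instance on the first call only; later calls reuse it.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class Config(metaclass=SingletonSketch):
    def __init__(self, value=0):
        self.value = value


a = Config(1)
b = Config(2)  # arguments are ignored because the instance already exists
print(a is b, b.value)
```

Note that in this sketch the constructor arguments of later calls are silently discarded, which is a common source of confusion with this pattern.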

class: `TimeIt`

Source

Call Signature:

TimeIt(verbose=False)

Timer context manager which can also be used as a standalone timer.

The timer can be queried for the elapsed time in seconds even before it has been stopped.

Example as a context manager:

>>> from time import sleep
>>> with TimeIt() as timer:
...     sleep(0.05)
>>> print(timer)
Elapsed time: 0.05s

Example as a standalone timer:

>>> timer = TimeIt()
>>> timer.start()
>>> sleep(0.05)
>>> print(timer)
Elapsed time: 0.05s

function: `__enter__`

Source

Call Signature:

__enter__(self)

function: `__exit__`

Source

Call Signature:

__exit__(self, *args)

function: `__init__`

Source

Call Signature:

__init__(self, verbose=False)

Initialize self. See help(type(self)) for accurate signature.

function: `__str__`

Source

Call Signature:

__str__(self)

Print elapsed time in seconds.

function: `start`

Source

Call Signature:

start(self)

function: `stop`

Source

Call Signature:

stop(self)
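To show how one class can serve both as a context manager and as a standalone timer, here is a minimal sketch using the method names listed for `TimeIt`. `TimerSketch`, its `elapsed` property and the choice of `time.perf_counter` are assumptions for illustration; the real class may differ.

```python
import time


class TimerSketch:
    """Hypothetical timer usable with `with` or via start()/stop()."""

    def __init__(self, verbose=False):
        self._verbose = verbose
        self._start = None
        self._stop = None

    def start(self):
        self._start = time.perf_counter()
        self._stop = None

    def stop(self):
        self._stop = time.perf_counter()

    @property
    def elapsed(self):
        if self._start is None:
            return 0.0
        # While still running, measure up to "now"; once stopped, the value is frozen.
        end = time.perf_counter() if self._stop is None else self._stop
        return end - self._start

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, *args):
        self.stop()
        if self._verbose:
            print(self)

    def __str__(self):
        return f"Elapsed time: {self.elapsed:.2f}s"
```

Because `elapsed` falls back to the current time while the timer is running, it can be read inside the `with` block as well as after it.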

function: `as_iterable`

Source

Call Signature:

as_iterable(obj)

function: `data_polling`

Source

Call Signature:

data_polling(*args, timeout=120, polling=1, file_count=None, fail_fast=True, verbose=False)

Poll for the availability of files.

Poll for data and return when all data is available; raise an exception if the timeout is reached first.

Args:

*args: Variable-length argument list of file patterns to be checked. The <hostname>:<path> syntax is supported for files on a remote host.
timeout (int): Timeout in seconds (default is 120 seconds).
polling (int): Time interval in seconds between each poll (default is 1 second).
file_count (int): Expected number of files to be found by glob expansion (default is >= 1 file per pattern).
fail_fast (bool): Stop when a file is not found (default is True).
verbose (bool): Print verbose output.
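The polling loop described above can be sketched for local files only. `poll_for_files` is a hypothetical helper, not the dagrunner function: it ignores remote hosts, `fail_fast` and `verbose`, raises `TimeoutError` (the actual exception type is not documented here), and treats `file_count` as a minimum number of matches per pattern, which is an assumption.

```python
import glob
import time


def poll_for_files(*patterns, timeout=120, polling=1, file_count=None):
    """Wait until every glob pattern matches enough local files."""
    minimum = 1 if file_count is None else file_count
    deadline = time.monotonic() + timeout
    while True:
        matches = {pattern: sorted(glob.glob(pattern)) for pattern in patterns}
        missing = [p for p, paths in matches.items() if len(paths) < minimum]
        if not missing:
            return matches
        if time.monotonic() >= deadline:
            raise TimeoutError(f"timed out waiting for: {missing}")
        time.sleep(polling)
```

Using a monotonic clock for the deadline keeps the timeout correct even if the system clock is adjusted while waiting.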

function: `docstring_parse`

Source

Call Signature:

docstring_parse(obj)

function: `function_to_argparse`

Source

Call Signature:

function_to_argparse(func, parser=None, exclude=None)

Generate an argparse parser from a function signature.
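One way to derive a parser from a signature is to walk `inspect.signature` and add one argument per parameter. The sketch below is an assumption about the general approach, not the dagrunner implementation; `signature_to_parser` and `resize` are hypothetical names, and the mapping rules (required parameters become positionals, booleans become flags, other defaults set the type) are choices made for this example.

```python
import argparse
import inspect


def signature_to_parser(func, parser=None, exclude=None):
    """Build an ArgumentParser with one argument per function parameter."""
    if parser is None:
        parser = argparse.ArgumentParser(description=inspect.getdoc(func))
    exclude = set(exclude or ())
    for name, param in inspect.signature(func).parameters.items():
        if name in exclude or param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue
        if param.default is inspect.Parameter.empty:
            parser.add_argument(name)  # no default: required positional
        elif isinstance(param.default, bool):
            action = "store_false" if param.default else "store_true"
            parser.add_argument(f"--{name}", action=action)
        else:
            # Infer the type from the default; fall back to str for None.
            arg_type = str if param.default is None else type(param.default)
            parser.add_argument(f"--{name}", type=arg_type, default=param.default)
    return parser


def resize(path, width=640, keep_ratio=False):
    """Resize an image."""


args = signature_to_parser(resize).parse_args(
    ["img.png", "--width", "800", "--keep_ratio"]
)
print(args)
```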

function: `function_to_argparse_parse_args`

Source

Call Signature:

function_to_argparse_parse_args(*args, **kwargs)

function: `get_proc_mem_stat`

Source

Call Signature:

get_proc_mem_stat(pid=None)

Get process memory statistics from /proc/<pid>/status.

More information can be found at https://github.com/torvalds/linux/blob/master/Documentation/filesystems/proc.txt

Args:

pid: Process id. Optional. Default is the current process.

Returns:

Dictionary with memory statistics in MB. Fields are VmSize, VmRSS, VmPeak and VmHWM.
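For illustration, the fields above can be extracted from the `Name:   value kB` lines of a status file as sketched below. `parse_status` and `read_proc_status` are hypothetical helpers, and dividing the kB figure by 1024 to obtain MB is an assumption about the conversion. Reading the file only works on Linux.

```python
import os

FIELDS = ("VmSize", "VmRSS", "VmPeak", "VmHWM")


def parse_status(text, fields=FIELDS):
    """Pick selected 'Name:  value kB' lines and convert kB to MB."""
    stats = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields:
            stats[name] = int(rest.split()[0]) / 1024.0
    return stats


def read_proc_status(pid=None):
    """Read /proc/<pid>/status, defaulting to the current process (Linux only)."""
    pid = os.getpid() if pid is None else pid
    with open(f"/proc/{pid}/status") as fh:
        return parse_status(fh.read())
```

Keeping the parsing separate from the file read makes the logic testable on systems without a /proc filesystem.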

function: `get_sys_mem_stat`

Source

Call Signature:

get_sys_mem_stat()

Get system memory statistics from /proc/meminfo.

More information can be found at https://github.com/torvalds/linux/blob/master/Documentation/filesystems/proc.txt

Returns:

Dictionary with memory statistics in MB. Fields are Committed_AS, MemFree, Buffers, Cached and MemTotal.

function: `in_notebook`

Source

Call Signature:

in_notebook()

Determine whether we are in a Jupyter notebook.

function: `pairwise`

Source

Call Signature:

pairwise(iterable)

Return successive overlapping pairs taken from the input iterable.

The number of 2-tuples in the output iterator will be one fewer than the number of inputs. It will be empty if the input iterable has fewer than two values.

pairwise('ABCDEFG') → AB BC CD DE EF FG
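The behaviour described above matches the well-known `itertools` recipe, sketched here under the hypothetical name `pairwise_sketch` (Python 3.10 and later also ship `itertools.pairwise`). Whether dagrunner uses this recipe or the standard library function is not stated in this document.

```python
from itertools import tee


def pairwise_sketch(iterable):
    """Yield successive overlapping pairs: s -> (s0, s1), (s1, s2), ..."""
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one element
    return zip(a, b)


print(["".join(pair) for pair in pairwise_sketch("ABCDEFG")])
# ['AB', 'BC', 'CD', 'DE', 'EF', 'FG']
```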

function: `process_path`

Source

Call Signature:

process_path(fpath: str)

Process path.

Args:

fpath: Remote path in the format <host>:<path>. If the host corresponds to the local host, the host element is removed.

Returns:

Processed path
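A minimal sketch of the host-stripping rule follows. `strip_local_host` and its `local_names` parameter are hypothetical, and treating `localhost` plus `socket.gethostname()` as "the local host" is an assumption; the real function may compare hosts differently.

```python
import socket


def strip_local_host(fpath, local_names=None):
    """Drop a leading '<host>:' when the host refers to this machine."""
    if local_names is None:
        local_names = {"localhost", socket.gethostname()}
    host, sep, path = fpath.partition(":")
    if sep and host in local_names:
        return path
    return fpath  # no host prefix, or a genuinely remote host
```

Paths without a colon, and paths whose host is not local, are returned unchanged.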

function: `stage_to_dir`

Source

Call Signature:

stage_to_dir(*args, staging_dir, verbose=False)

Copy input filepaths to a staging area and update paths.

Hard links are preferred (same host); physical copies are made otherwise. File name, size and modification time are used to determine whether the destination file already exists (the matching criteria used by rsync). If it already exists, the copy is skipped. Staged files are named <modification-time>_<file-size>_<filename> to avoid collisions between identically named files.
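The staging rule above can be sketched for a single local file. `stage_file` is a hypothetical helper, not the dagrunner function: it handles no remote hosts, and formatting the modification time as whole seconds is an assumption.

```python
import os
import shutil


def stage_file(src, staging_dir):
    """Hard-link (or copy) `src` into `staging_dir` under a collision-safe name."""
    stat = os.stat(src)
    name = f"{int(stat.st_mtime)}_{stat.st_size}_{os.path.basename(src)}"
    dest = os.path.join(staging_dir, name)
    os.makedirs(staging_dir, exist_ok=True)
    if os.path.exists(dest):
        # The name encodes mtime and size, so an existing file counts as a match.
        return dest
    try:
        os.link(src, dest)  # cheap when both paths are on the same filesystem
    except OSError:
        shutil.copy2(src, dest)  # fall back to a physical copy
    return dest
```

Because the destination name already encodes the matching criteria, checking for an existing file reduces to a single path lookup.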

function: `subset_equality`

Source

Call Signature:

subset_equality(obj_a, obj_b)

Return whether obj_a is a subset of obj_b.

Supports namedtuples and dataclasses; otherwise falls back to equality. Note that a None value in obj_a is treated as a wildcard.
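The wildcard rule can be sketched as a field-by-field comparison. `is_subset`, `_as_mapping` and `Point` are hypothetical names, and details such as how mismatched types are handled are assumptions rather than documented behaviour.

```python
from collections import namedtuple
from dataclasses import fields, is_dataclass


def _as_mapping(obj):
    """Return a field dict for dataclass instances and namedtuples, else None."""
    if is_dataclass(obj) and not isinstance(obj, type):
        return {f.name: getattr(obj, f.name) for f in fields(obj)}
    if isinstance(obj, tuple) and hasattr(obj, "_asdict"):
        return obj._asdict()
    return None


def is_subset(obj_a, obj_b):
    map_a, map_b = _as_mapping(obj_a), _as_mapping(obj_b)
    if map_a is None or map_b is None:
        return obj_a == obj_b  # plain objects: fall back to equality
    missing = object()
    # A None field in obj_a matches anything; other fields must be equal.
    return all(
        value is None or map_b.get(key, missing) == value
        for key, value in map_a.items()
    )


Point = namedtuple("Point", "x y z")
print(is_subset(Point(1, None, 3), Point(1, 2, 3)))  # True
print(is_subset(Point(1, 5, 3), Point(1, 2, 3)))  # False
```

The relation is not symmetric: a None in `obj_b` does not act as a wildcard.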
