bob.med.tb.utils.resources

Tools for interacting with the running computer or GPU

Module Attributes

GB

The number of bytes in a gigabyte

Functions

cpu_constants()

Returns static CPU information about the current system.

gpu_constants()

Returns GPU (static) information using nvidia-smi

gpu_log()

Returns the current (non-static) GPU status using nvidia-smi

run_nvidia_smi(query[, rename])

Returns GPU information from query

Classes

CPULogger([pid])

Logs CPU information using psutil

ResourceMonitor(interval, has_gpu, main_pid, ...)

An external, non-blocking CPU/GPU resource monitor

bob.med.tb.utils.resources.GB = 1073741824.0

The number of bytes in a gigabyte, using the binary definition (2**30 bytes, i.e. one gibibyte)
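As a quick check of the constant, the sketch below redefines GB locally (instead of importing the package) and uses it for two conversions. The helper names bytes_to_gb and mib_to_gb are illustrative only and are not part of this module.

```python
# Local stand-in for bob.med.tb.utils.resources.GB (same value as documented)
GB = float(2**30)


def bytes_to_gb(n_bytes):
    """Converts a byte count to gigabytes as defined by GB (hypothetical helper)"""
    return n_bytes / GB


def mib_to_gb(n_mib):
    """Converts a MiB count to gigabytes as defined by GB (hypothetical helper)"""
    return n_mib * 2**20 / GB


print(GB == 1073741824.0)  # True
print(bytes_to_gb(8 * 2**30))  # 8.0
print(mib_to_gb(16384))  # 16.0
```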

bob.med.tb.utils.resources.run_nvidia_smi(query, rename=None)

Returns GPU information from query

For a comprehensive list of options and help, execute nvidia-smi --help-query-gpu on a host with a GPU.

Parameters
  • query (list) – A list of query strings as defined by nvidia-smi --help-query-gpu

  • rename (list, Optional) – A list of keys to yield in the return value for each entry above. It gives you the opportunity to rewrite some key names for convenience. This list, if provided, must be of the same length as query.

Returns

data – An ordered dictionary (organized as 2-tuples) containing the queried parameters, keyed by the names in rename when it is provided. If nvidia-smi is not available, returns None. Percentage information is left alone; memory information is transformed to gigabytes (floating-point).

Return type

tuple, None
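The behaviour described above can be approximated as follows. This is a minimal sketch and not the module's actual implementation: the function names, the decision to read only the first GPU, and the rule of recognising memory and percentage fields by their query prefix are all assumptions made for illustration. It relies on the nvidia-smi options --query-gpu and --format=csv,noheader,nounits, under which memory values are reported in MiB.

```python
import shutil
import subprocess

GB = float(2**30)  # same value as the documented module attribute


def parse_smi_line(query, rename, line):
    """Turns one CSV line of nvidia-smi output into a tuple of 2-tuples"""
    keys = rename if rename is not None else query
    if len(keys) != len(query):
        raise ValueError("rename must have the same length as query")
    # naive split: assumes no field (e.g. the GPU name) contains a comma
    values = [v.strip() for v in line.split(",")]
    data = []
    for q, k, v in zip(query, keys, values):
        if q.startswith("memory."):
            # with "nounits", memory is reported in MiB; convert to gigabytes
            data.append((k, float(v) * 2**20 / GB))
        elif q.startswith("utilization."):
            data.append((k, float(v)))  # percentage, left alone
        else:
            data.append((k, v))  # free-form text, e.g. name or driver version
    return tuple(data)


def run_nvidia_smi_sketch(query, rename=None):
    """Returns None if nvidia-smi is not on the PATH (hypothetical helper)"""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return None
    out = subprocess.run(
        [smi, "--query-gpu=" + ",".join(query), "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    # this sketch only considers the first GPU listed
    return parse_smi_line(query, rename, out.splitlines()[0])


# Parsing a hand-written line (illustrative values, not real output)
sample = parse_smi_line(
    ["gpu_name", "memory.total"],
    ["gpu_name", "gpu_memory_total"],
    "Some GPU, 16384",
)
print(sample)  # (('gpu_name', 'Some GPU'), ('gpu_memory_total', 16.0))
```

Keeping the parsing separate from the subprocess call makes the conversion logic easy to exercise on machines without a GPU.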

bob.med.tb.utils.resources.gpu_constants()

Returns GPU (static) information using nvidia-smi

See run_nvidia_smi() for operational details.

Returns

data – If nvidia-smi is not available, returns None. Otherwise, returns an ordered dictionary (organized as 2-tuples) containing the following nvidia-smi query information:

  • gpu_name, as gpu_name (str)

  • driver_version, as gpu_driver_version (str)

  • memory.total, as gpu_memory_total (transformed to gigabytes, float)

Return type

tuple, None

bob.med.tb.utils.resources.gpu_log()

Returns the current (non-static) GPU status using nvidia-smi

See run_nvidia_smi() for operational details.

Returns

data – If nvidia-smi is not available, returns None. Otherwise, returns an ordered dictionary (organized as 2-tuples) containing the following nvidia-smi query information:

  • memory.used, as gpu_memory_used (transformed to gigabytes, float)

  • memory.free, as gpu_memory_free (transformed to gigabytes, float)

  • 100*memory.used/memory.total, as gpu_memory_percent (float, in percent)

  • utilization.gpu, as gpu_percent (float, in percent)

Return type

tuple, None
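Because both gpu_constants() and gpu_log() may return None, callers need to handle that case before reading values. The sketch below shows one way to consume a return value shaped as described above. The function format_gpu_log is hypothetical and the numbers are hand-written for illustration, not measured output.

```python
def format_gpu_log(data):
    """Formats a tuple of (key, value) pairs, or reports missing GPU data"""
    if data is None:
        return "nvidia-smi not available"
    # the 2-tuples convert directly to a dict, preserving their order
    entries = dict(data)
    return ", ".join(f"{k}={v:.1f}" for k, v in entries.items())


# Illustrative data only, using the keys documented for gpu_log()
example = (
    ("gpu_memory_used", 4.0),
    ("gpu_memory_free", 12.0),
    ("gpu_memory_percent", 25.0),
    ("gpu_percent", 87.0),
)

print(format_gpu_log(example))
print(format_gpu_log(None))
```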

bob.med.tb.utils.resources.cpu_constants()

Returns static CPU information about the current system.

Returns

data – An ordered dictionary (organized as 2-tuples) containing these entries:

  1. cpu_memory_total (float): total memory available, in gigabytes

  2. cpu_count (int): number of logical CPUs available

Return type

tuple
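The two entries above could be gathered roughly as shown below. This is a sketch under stated assumptions, not the module's code: it uses psutil when that package is installed (psutil.virtual_memory() and psutil.cpu_count()), and otherwise falls back to os.sysconf, which is assumed to be available only on Linux-like systems. The function name cpu_constants_sketch is hypothetical.

```python
import os

GB = float(2**30)  # same value as the documented module attribute


def cpu_constants_sketch():
    """Returns total memory (gigabytes) and logical CPU count as 2-tuples"""
    try:
        import psutil  # optional third-party dependency

        total_bytes = psutil.virtual_memory().total
        count = psutil.cpu_count(logical=True)
    except ImportError:
        # Fallback assumed to work on Linux-like systems only
        total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        count = os.cpu_count()
    return (
        ("cpu_memory_total", total_bytes / GB),
        ("cpu_count", count),
    )


for key, value in cpu_constants_sketch():
    print(key, value)
```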

class bob.med.tb.utils.resources.CPULogger(pid=None)

Bases: object

Logs CPU information using psutil

Parameters

pid (int, Optional) – Process identifier of the main process (parent process) to observe

log()

Returns current information about the process cluster (the observed process and its children)

Returns

data – An ordered dictionary (organized as 2-tuples) containing these entries:

  1. cpu_memory_used (float): total memory used from the system, in gigabytes

  2. cpu_rss (float): RAM currently used by process and children, in gigabytes

  3. cpu_vms (float): total memory (RAM + swap) currently used by process and children, in gigabytes

  4. cpu_percent (float): percentage of the total CPU used by this process and its children (recursively) since the last call. The value returned by the first call should be ignored. This number depends on the number of CPUs in the system and can be greater than 100%.

  5. cpu_processes (int): total number of processes including self and children (recursively)

  6. cpu_open_files (int): total number of open files by self and children

Return type

tuple
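To illustrate how per-process figures could combine into the six entries above, and why cpu_percent can exceed 100%, the sketch below aggregates hand-written samples. It assumes that each entry is a plain sum over the process and its children, which the descriptions suggest but do not state explicitly. ProcSample and aggregate are hypothetical names, and the numbers are not measurements.

```python
from collections import namedtuple

GB = float(2**30)  # same value as the documented module attribute

# One record per process in the cluster (parent or child); memory in bytes
ProcSample = namedtuple("ProcSample", ["rss", "vms", "cpu_percent", "open_files"])


def aggregate(samples, system_memory_used):
    """Combines per-process samples into the entries listed for log()"""
    return (
        ("cpu_memory_used", system_memory_used / GB),  # system-wide value
        ("cpu_rss", sum(s.rss for s in samples) / GB),
        ("cpu_vms", sum(s.vms for s in samples) / GB),
        # summing per-process percentages is what lets this exceed 100%
        ("cpu_percent", sum(s.cpu_percent for s in samples)),
        ("cpu_processes", len(samples)),
        ("cpu_open_files", sum(s.open_files for s in samples)),
    )


# A parent and two children, each keeping most of one core busy
cluster = [
    ProcSample(rss=2**30, vms=2**31, cpu_percent=90.0, open_files=4),
    ProcSample(rss=2**29, vms=2**30, cpu_percent=80.0, open_files=2),
    ProcSample(rss=2**29, vms=2**30, cpu_percent=70.0, open_files=2),
]

log = dict(aggregate(cluster, system_memory_used=6 * 2**30))
print(log["cpu_percent"])  # 240.0
print(log["cpu_rss"])  # 2.0
```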

class bob.med.tb.utils.resources.ResourceMonitor(interval, has_gpu, main_pid, logging_level)

Bases: object

An external, non-blocking CPU/GPU resource monitor

Parameters
  • interval (int, float) – Number of seconds to wait between measurements (may be a floating-point number, as accepted by time.sleep())

  • has_gpu (bool) – A flag indicating whether a GPU is installed on the platform

  • main_pid (int) – The main process identifier to monitor

  • logging_level (int) – The logging level to use for logging from launched processes
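The general idea of sampling at a fixed interval without blocking the caller can be sketched as below. This is a simplified, thread-based stand-in written for illustration. The mention of launched processes above suggests the real class works with separate processes, and nothing here reflects its actual interface. BackgroundSampler and its arguments are hypothetical.

```python
import threading
import time


class BackgroundSampler:
    """Hypothetical thread-based sampler that runs alongside the main code"""

    def __init__(self, interval, sample_fn):
        self.interval = interval  # seconds between measurements
        self.sample_fn = sample_fn  # callable returning one measurement
        self.samples = []
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while True:
            self.samples.append(self.sample_fn())
            # Event.wait() acts as an interruptible sleep; it returns True
            # as soon as the stop event is set
            if self._stop_event.wait(self.interval):
                break

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop_event.set()
        self._thread.join()


# The main code keeps running while samples accumulate in the background
with BackgroundSampler(0.01, time.monotonic) as monitor:
    time.sleep(0.05)  # stands in for the work being monitored

print(len(monitor.samples) >= 1)  # True
```

Using an event for the wait lets the sampler stop promptly when the monitored work finishes, rather than sleeping through a full interval.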

static monitored_keys(has_gpu)