Command-line Interface#
This section contains an overview of command-line applications shipped with this package.
mednet#
Image classification benchmark.
mednet [OPTIONS] COMMAND [ARGS]...
config#
Command for listing, describing and copying configuration resources.
mednet config [OPTIONS] COMMAND [ARGS]...
copy#
Copy a specific configuration resource so it can be modified locally.
mednet config copy [OPTIONS] SOURCE DESTINATION
Options
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
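The four verbosity levels likely correspond to the standard Python logging levels; a minimal sketch of such a mapping (the helper name is illustrative, not part of mednet):

```python
import logging

def setup_logging(verbose: int) -> int:
    """Map a -v count (0 and up) to a Python logging level (illustrative)."""
    levels = [logging.ERROR, logging.WARNING, logging.INFO, logging.DEBUG]
    level = levels[min(verbose, 3)]  # counts beyond 3 saturate at DEBUG
    logging.basicConfig(level=level)
    return level
```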
Arguments
- SOURCE#
Required argument
- DESTINATION#
Required argument
Examples:
$ mednet config copy montgomery -vvv newdataset.py
describe#
Describe a specific configuration file.
mednet config describe [OPTIONS] NAME...
Options
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
Arguments
- NAME#
Required argument(s)
Examples:
mednet config describe montgomery
mednet config describe montgomery -v
list#
List configuration files installed.
mednet config list [OPTIONS]
Options
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
Examples:
mednet config list
mednet config list -v
database#
Command for listing and verifying databases installed.
mednet database [OPTIONS] COMMAND [ARGS]...
check#
Check file access on one or more DataModules.
mednet database check [OPTIONS] SPLIT
Options
- -l, --limit <limit>#
Required Limit the check to the first N samples in each split of the dataset, making the check considerably faster. Set it to zero (default) to check everything.
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
Arguments
- SPLIT#
Required argument
Examples:
Check if all files from the split ‘montgomery-f0’ of the Montgomery database can be loaded:
mednet database check -vv montgomery-f0
list#
List all supported and configured databases.
mednet database list [OPTIONS]
Options
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
Examples:
To configure the location of a database, edit your configuration file (e.g. $HOME/.config/mednet.toml), and add a line like the following:

[datadir]
montgomery = "/path/to/montgomery/files"

Note
This setting is case-sensitive.
$ mednet database list
evaluate#
Evaluate predictions (from a model) on a classification task.
It is possible to pass one or several Python
files (or names of mednet.config entry points or module names) as
CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below)
will override the values of configuration files. You can run this command with
<COMMAND> -H example_config.py to create a template config file.
mednet evaluate [OPTIONS] [CONFIG]...
Options
- -p, --predictions <predictions>#
Required Filename in which predictions are currently stored
- -o, --output <output>#
Required Path to a JSON file in which to save evaluation results (leading directories are created if they do not exist).
- -t, --threshold <threshold>#
Required This value is used to define positives and negatives from probability outputs in predictions, and to report performance measures on binary classification tasks. It should either come from the training set or from a separate validation set, to avoid biasing the analysis. Optionally, if you provide a multi-split set of predictions as input, this may also be the name of an existing split (e.g. validation) from which the threshold will be estimated (by calculating the threshold leading to the highest F1-score on that set) and then applied to the subsequent sets. This value is not used for multi-class classification tasks.
- Default:
0.5
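The a-priori threshold tuning described above (picking the threshold that yields the highest F1-score on a validation split) can be sketched with NumPy; the function below is an illustration, not the package's implementation:

```python
import numpy as np

def best_f1_threshold(scores, labels):
    """Return the threshold (taken from the scores) maximizing the F1-score."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_t, best_f1 = 0.5, -1.0
    for t in np.unique(scores):  # every distinct score is a candidate cut
        pred = scores >= t
        tp = np.sum(pred & labels)
        fp = np.sum(pred & ~labels)
        fn = np.sum(~pred & labels)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Perfectly separable toy scores: the best cut sits at 0.6
t, f1 = best_f1_threshold([0.1, 0.4, 0.6, 0.9], [0, 0, 1, 1])
```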
- -b, --binning <binning>#
Required The binning algorithm to use for computing the bin widths and distribution for histograms. Choose from the algorithms supported by numpy.histogram(), or pass a simple integer indicating the number of bins to use in the interval [0, 1].
- Default:
50
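The two accepted forms of --binning mirror the bins argument of numpy.histogram():

```python
import numpy as np

scores = np.array([0.1, 0.2, 0.2, 0.7, 0.9])

# An integer selects that many equal-width bins over [0, 1]:
counts, edges = np.histogram(scores, bins=50, range=(0.0, 1.0))

# A string selects one of numpy's automatic binning algorithms:
counts_auto, edges_auto = np.histogram(scores, bins="auto")
```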
- -P, --plot, --no-plot#
Required If set, then also produces figures containing the plots of performance curves and score histograms.
- Default:
True
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
Arguments
- CONFIG#
Optional argument(s)
Examples:
Run evaluation on an existing prediction output:
mednet evaluate -vv --predictions=path/to/predictions.json --output=evaluation.json
Run evaluation on an existing prediction output, tune threshold a priori on the validation set:
mednet evaluate -vv --predictions=path/to/predictions.json --output=evaluation.json --threshold=validation
experiment#
Run a complete experiment, from training, to prediction and evaluation.
This script is just a wrapper around the individual scripts for training, running prediction, and evaluating. It organises the output in a preset way:
└─ <output-folder>/
   ├── command.sh
   ├── model/            # the generated model will be here
   ├── predictions.json  # the prediction outputs for the sets
   └── evaluation/       # the outputs of the evaluations for the sets
It is possible to pass one or several Python
files (or names of mednet.config entry points or module names) as
CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below)
will override the values of configuration files. You can run this command with
<COMMAND> -H example_config.py to create a template config file.
mednet experiment [OPTIONS] [CONFIG]...
Options
- -o, --output-folder <output_folder>#
Required Directory in which to store results (created if does not exist)
- -m, --model <model>#
Required A lightning module instance implementing the network to be trained
- -d, --datamodule <datamodule>#
Required A lightning DataModule containing the training and validation sets.
- -b, --batch-size <batch_size>#
Required Number of samples in every batch (this parameter affects memory requirements for the network). If the number of samples in the batch is larger than the total number of samples available for training, this value is truncated. If this number is smaller, then batches of the specified size are created and fed to the network until there are no more new samples to feed (epoch is finished). If the total number of training samples is not a multiple of the batch-size, the last batch will be smaller than the first, unless --drop-incomplete-batch is set, in which case this batch is not used.
- Default:
1
- -c, --batch-chunk-count <batch_chunk_count>#
Required Number of chunks in every batch (this parameter affects memory requirements for the network). The number of samples loaded for every iteration will be batch-size/batch-chunk-count. batch-size needs to be divisible by batch-chunk-count, otherwise an error will be raised. This parameter is used to reduce the number of samples loaded in each iteration, in order to reduce the memory usage in exchange for processing time (more iterations). This is especially interesting when one is training on GPUs with limited RAM. The default of 1 forces the whole batch to be processed at once. Otherwise the batch is broken into batch-chunk-count pieces, and gradients are accumulated to complete each batch.
- Default:
1
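The chunking scheme trades memory for time: with equal-sized chunks, the mean gradient over the full batch equals the average of the per-chunk means, so accumulating over batch-chunk-count pieces reproduces the full-batch update. A toy NumPy sketch of that equivalence (not the package's training loop):

```python
import numpy as np

def chunked_mean(batch, chunk_count):
    """Average per-sample 'gradients' chunk by chunk, as gradient
    accumulation does; len(batch) must be divisible by chunk_count."""
    assert len(batch) % chunk_count == 0
    chunks = np.split(np.asarray(batch, dtype=float), chunk_count)
    acc = np.zeros_like(chunks[0][0])
    for chunk in chunks:  # each pass loads batch_size/chunk_count samples
        acc = acc + chunk.mean(axis=0)
    return acc / chunk_count  # identical to the full-batch mean

grads = np.random.default_rng(0).normal(size=(8, 3))  # 8 samples, 3 params
full = grads.mean(axis=0)
accumulated = chunked_mean(grads, chunk_count=4)
```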
- -D, --drop-incomplete-batch, --no-drop-incomplete-batch#
Required If set, the last batch in an epoch will be dropped if incomplete. If you set this option, you should also consider increasing the total number of epochs of training, as the total number of training steps may be reduced.
- Default:
False
- -e, --epochs <epochs>#
Required Number of epochs (complete training set passes) to train for. If continuing from a saved checkpoint, ensure to provide a greater number of epochs than was saved in the checkpoint to be loaded.
- Default:
1000
- -p, --validation-period <validation_period>#
Required Number of epochs after which validation happens. By default, we run validation after every training epoch (period=1). You can make validation sparser by increasing the validation period. Notice that this affects checkpoint saving: while checkpoints are created after every training step (the last training step always triggers overriding the latest checkpoint) independently of validation runs, the selection of the 'best' model obtained so far depends on validation results and is therefore influenced by this setting.
- Default:
1
- -x, --device <device>#
Required A string indicating the device to use (e.g. “cpu” or “cuda:0”)
- Default:
cpu
- --cache-samples, --no-cache-samples#
Required If set to True, loads the samples into memory; otherwise, loads them at runtime.
- Default:
False
- -s, --seed <seed>#
Seed to use for the random number generator
- Default:
42
- -P, --parallel <parallel>#
Required Use multiprocessing for data loading: if set to -1 (default), disables multiprocessing data loading. Set to 0 to enable as many data loading instances as processing cores available in the system. Set to >= 1 to enable that many multiprocessing instances for data loading.
- Default:
-1
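The semantics of --parallel reduce to a mapping onto a number of data-loading workers; the sketch below illustrates that mapping (the helper is hypothetical, not mednet's code):

```python
import multiprocessing

def num_workers(parallel: int) -> int:
    """-1 disables multiprocessing (0 extra workers); 0 means one worker
    per available processing core; >= 1 means exactly that many workers."""
    if parallel < 0:
        return 0
    if parallel == 0:
        return multiprocessing.cpu_count()
    return parallel
```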
- -I, --monitoring-interval <monitoring_interval>#
Required Time between checks for the use of resources during each training epoch, in seconds. An interval of 5 seconds, for example, will lead to CPU and GPU resources being probed every 5 seconds during each training epoch. Values registered in the training logs correspond to averages (or maxima) observed through possibly many probes in each epoch. Notice that setting a very small value may cause the probing process to become extremely busy, potentially biasing the overall perception of resource usage.
- Default:
5.0
- -B, --balance-classes, -N, --no-balance-classes#
Required If set, balances weights of the random sampler during training so that samples from all sample classes are picked equitably.
- Default:
True
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
Arguments
- CONFIG#
Optional argument(s)
Examples:
$ mednet experiment -vv pasa montgomery --epochs=2
predict#
Run inference (generates scores) on all input images, using a pre-trained model.
It is possible to pass one or several Python
files (or names of mednet.config entry points or module names) as
CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below)
will override the values of configuration files. You can run this command with
<COMMAND> -H example_config.py to create a template config file.
mednet predict [OPTIONS] [CONFIG]...
Options
- -o, --output <output>#
Required Path to a JSON file in which to save predictions for all samples in the input DataModule (leading directories are created if they do not exist).
- -m, --model <model>#
Required A lightning module instance implementing the network architecture (not the weights, necessarily) to be used for prediction.
- -d, --datamodule <datamodule>#
Required A lightning DataModule that will be asked for prediction data loaders. Typically, this includes all configured splits in a DataModule, however this is not a requirement. A DataModule that returns a single dataloader for prediction (wrapped in a dictionary) is acceptable.
- -b, --batch-size <batch_size>#
Required Number of samples in every batch (this parameter affects memory requirements for the network).
- Default:
1
- -x, --device <device>#
Required A string indicating the device to use (e.g. “cpu” or “cuda:0”)
- Default:
cpu
- -w, --weight <weight>#
Required Path or URL to a pretrained model file (.ckpt extension), corresponding to the architecture set with --model. Optionally, you may also pass a directory containing the result of a training session, in which case either the best (lowest validation) or latest model will be loaded.
- -P, --parallel <parallel>#
Required Use multiprocessing for data loading: if set to -1 (default), disables multiprocessing data loading. Set to 0 to enable as many data loading instances as processing cores available in the system. Set to >= 1 to enable that many multiprocessing instances for data loading.
- Default:
-1
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
Arguments
- CONFIG#
Optional argument(s)
Examples:
Run prediction on an existing DataModule configuration:
mednet predict -vv pasa montgomery --weight=path/to/model.ckpt --output=path/to/predictions.json
Enable multi-processing data loading with 6 processes:
mednet predict -vv pasa montgomery --parallel=6 --weight=path/to/model.ckpt --output=path/to/predictions.json
saliency#
Generate, evaluate and view saliency maps.
mednet saliency [OPTIONS] COMMAND [ARGS]...
completeness#
Evaluate saliency map algorithm completeness using RemOve And Debias (ROAD).
For the selected saliency map algorithm, evaluates the completeness of explanations using the RemOve And Debias (ROAD) algorithm, first described in [ROAD-2022]. It estimates the explainability (in the completeness sense) of saliency mapping algorithms by substituting relevant pixels in the input image with a local average, re-running prediction on the altered image, and measuring changes in the output classification score when said perturbations are in place. By substituting the most or least relevant pixels with surrounding averages, the ROAD algorithm estimates the importance of such elements in the produced saliency map. As of 2023, this measurement technique is considered one of the state-of-the-art metrics of explainability.
This program outputs a .json file containing the ROAD evaluations (using most-relevant-first, or MoRF, and least-relevant-first, or LeRF) for each sample in the DataModule. Values for MoRF and LeRF represent averages obtained by removing 20, 40, 60 and 80% of the most or least relevant pixels, respectively, from the image, and averaging results over all these percentiles.
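The core perturbation step (replacing the most or least relevant pixels with a local average) can be sketched with NumPy. This is a simplified illustration that substitutes the global mean; the actual ROAD implementation uses a debiased local (noisy-linear) imputation:

```python
import numpy as np

def perturb(image, saliency, percent, most_relevant=True):
    """Replace `percent`% of the pixels, ranked by saliency, with the
    image mean (ROAD proper uses a debiased local average)."""
    image = np.asarray(image, dtype=float).copy()
    order = np.argsort(saliency, axis=None)  # ascending relevance
    if most_relevant:
        order = order[::-1]  # MoRF: remove most relevant pixels first
    k = int(image.size * percent / 100)
    idx = np.unravel_index(order[:k], image.shape)
    image[idx] = image.mean()
    return image

rng = np.random.default_rng(1)
img = rng.uniform(size=(8, 8))
sal = rng.uniform(size=(8, 8))
# Scores are then averaged over the 20/40/60/80% perturbation levels:
perturbed = [perturb(img, sal, p) for p in (20, 40, 60, 80)]
```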
Note
This application is relatively slow when processing a large DataModule with many (positive) samples.
It is possible to pass one or several Python
files (or names of mednet.config entry points or module names) as
CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below)
will override the values of configuration files. You can run this command with
<COMMAND> -H example_config.py to create a template config file.
mednet saliency completeness [OPTIONS] [CONFIG]...
Options
- -m, --model <model>#
Required A lightning module instance implementing the network architecture (not the weights, necessarily) to be used for inference. Currently, only supports pasa and densenet models.
- -d, --datamodule <datamodule>#
Required A lightning DataModule that will be asked for prediction DataLoaders. Typically, this includes all configured splits in a DataModule, however this is not a requirement. A DataModule that returns a single DataLoader for prediction (wrapped in a dictionary) is acceptable.
- -o, --output-json <output_json>#
Required Directory in which to store the output .json file containing all measures.
- -x, --device <device>#
Required A string indicating the device to use (e.g. “cpu” or “cuda:0”)
- Default:
cpu
- --cache-samples, --no-cache-samples#
Required If set to True, loads the samples into memory; otherwise, loads them at runtime.
- Default:
False
- -w, --weight <weight>#
Required Path or URL to a pretrained model file (.ckpt extension), corresponding to the architecture set with --model. Optionally, you may also pass a directory containing the result of a training session, in which case either the best (lowest validation) or latest model will be loaded.
- -P, --parallel <parallel>#
Required Use multiprocessing for data loading and processing: if set to -1 (default), disables multiprocessing. Set to 0 to enable as many data processing instances as there are processing cores available in the system. Set to >= 1 to enable that many multiprocessing instances. Note that if you activate this option, then you must use --device=cpu, as using a GPU concurrently is not supported.
- Default:
-1
- -s, --saliency-map-algorithm <saliency_map_algorithm>#
Saliency map algorithm to be used.
- Default:
gradcam
- Options:
ablationcam | eigencam | eigengradcam | fullgrad | gradcam | gradcamelementwise | gradcam++ | gradcamplusplus | hirescam | layercam | randomcam | scorecam | xgradcam
- -C, --target-class <target_class>#
This option should only be used with multiclass models. It defines the class to target for saliency estimation. Can be set to either "all" or "highest"; "highest" (the default) means only saliency maps for the class with the highest activation will be generated.
- Options:
highest | all
- -z, --positive-only, -Z, --no-positive-only#
If set, and the model chosen has a single output (binary), then saliency maps will only be generated for samples of the positive class. This option has no effect for multiclass models.
- -e, --percentile <percentile>#
One or more percentiles (percent x100) integer values indicating the proportion of pixels to perturb in the original image to calculate both MoRF and LeRF scores.
- Default:
20, 40, 60, 80
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
Arguments
- CONFIG#
Optional argument(s)
Examples:
Calculate the ROAD scores for an existing dataset configuration and store them in .json files:
mednet saliency completeness -vv pasa tbx11k-v1-healthy-vs-atb --device="cuda" --weight=path/to/model-at-lowest-validation-loss.ckpt --output-json=path/to/completeness-scores.json
evaluate#
Calculate summary statistics for a saliency map algorithm.
It is possible to pass one or several Python
files (or names of mednet.config entry points or module names) as
CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below)
will override the values of configuration files. You can run this command with
<COMMAND> -H example_config.py to create a template config file.
mednet saliency evaluate [OPTIONS] [CONFIG]...
Options
- -e, --entry <entry>#
Required ENTRY is a triplet containing the algorithm name, the path to the scores issued from the completeness analysis (mednet saliency completeness), and the path to the scores issued from the interpretability analysis (mednet saliency interpretability), both in JSON format. Paths to score files must exist before the program is called. Valid values for saliency map algorithms are ablationcam|eigencam|eigengradcam|fullgrad|gradcam|gradcamelementwise|gradcam++|gradcamplusplus|hirescam|layercam|randomcam|scorecam|xgradcam
- -o, --output-folder <output_folder>#
Directory in which to store the analysis result (created if does not exist)
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
Arguments
- CONFIG#
Optional argument(s)
Examples:
Tabulate and generate plots for two saliency map algorithms:
mednet saliency evaluate -vv -e gradcam path/to/gradcam-completeness.json path/to/gradcam-interpretability.json -e gradcam++ path/to/gradcam++-completeness.json path/to/gradcam++-interpretability.json
generate#
Generate saliency maps for locations on input images that affected the prediction.
The quality of saliency information depends on the saliency map algorithm and trained model.
It is possible to pass one or several Python
files (or names of mednet.config entry points or module names) as
CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below)
will override the values of configuration files. You can run this command with
<COMMAND> -H example_config.py to create a template config file.
mednet saliency generate [OPTIONS] [CONFIG]...
Options
- -m, --model <model>#
Required A lightning module instance implementing the network architecture (not the weights, necessarily) to be used for inference. Currently, only supports pasa and densenet models.
- -d, --datamodule <datamodule>#
Required A lightning DataModule that will be asked for prediction data loaders. Typically, this includes all configured splits in a DataModule, however this is not a requirement. A DataModule that returns a single dataloader for prediction (wrapped in a dictionary) is acceptable.
- -o, --output-folder <output_folder>#
Required Directory in which to store saliency maps (created if does not exist)
- -x, --device <device>#
Required A string indicating the device to use (e.g. “cpu” or “cuda:0”)
- Default:
cpu
- --cache-samples, --no-cache-samples#
Required If set to True, loads the samples into memory; otherwise, loads them at runtime.
- Default:
False
- -w, --weight <weight>#
Required Path or URL to a pretrained model file (.ckpt extension), corresponding to the architecture set with --model. Optionally, you may also pass a directory containing the result of a training session, in which case either the best (lowest validation) or latest model will be loaded.
- -P, --parallel <parallel>#
Required Use multiprocessing for data loading: if set to -1 (default), disables multiprocessing data loading. Set to 0 to enable as many data loading instances as processing cores available in the system. Set to >= 1 to enable that many multiprocessing instances for data loading.
- Default:
-1
- -s, --saliency-map-algorithm <saliency_map_algorithm>#
Saliency map algorithm to be used.
- Default:
gradcam
- Options:
ablationcam | eigencam | eigengradcam | fullgrad | gradcam | gradcamelementwise | gradcam++ | gradcamplusplus | hirescam | layercam | randomcam | scorecam | xgradcam
- -C, --target-class <target_class>#
This option should only be used with multiclass models. It defines the class to target for saliency estimation. Can be set to either "all" or "highest"; "highest" (the default) means only saliency maps for the class with the highest activation will be generated.
- Options:
highest | all
- -z, --positive-only, -Z, --no-positive-only#
If set, and the model chosen has a single output (binary), then saliency maps will only be generated for samples of the positive class. This option has no effect for multiclass models.
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
Arguments
- CONFIG#
Optional argument(s)
Examples:
Generate saliency maps for all prediction dataloaders on a DataModule, using a pre-trained DenseNet model, and save them as numpy-pickled objects in the output directory:
mednet saliency generate -vv densenet tbx11k-v1-healthy-vs-atb --weight=path/to/model-at-lowest-validation-loss.ckpt --output-folder=path/to/output
interpretability#
Evaluate saliency map agreement with annotations (human interpretability).
The evaluation happens by comparing saliency maps with ground-truth provided by any other means (typically following a manual annotation procedure).
Note
For obvious reasons, this evaluation is limited to datasets that contain built-in annotations which corroborate classification.
As a result of the evaluation, this application creates a single .json file that resembles the original DataModule, with added information containing the following measures, for each sample:
Proportional Energy: A measure that compares (unthresholded) saliency maps with annotations (based on [SCORECAM-2020]). It estimates how much activation lies within the ground truth boxes compared to the total sum of the activations.
Average Saliency Focus: estimates how much of the ground truth bounding boxes' area is covered by the activations. It is similar to the proportional energy measure in the sense that it does not need explicit thresholding.
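Both measures reduce to ratios of saliency mass and box coverage. A NumPy sketch under the simplifying assumption that ground truth is a single bounding box (the real implementation handles the DataModule's annotation format):

```python
import numpy as np

def proportional_energy(saliency, box):
    """Fraction of total (unthresholded) saliency falling inside the
    ground-truth box; box = (y0, x0, y1, x1), end-exclusive."""
    y0, x0, y1, x1 = box
    total = saliency.sum()
    return saliency[y0:y1, x0:x1].sum() / total if total else 0.0

def average_saliency_focus(saliency, box):
    """Saliency inside the box, normalized by the box area: how much of
    the box is 'covered' by activation (illustrative definition)."""
    y0, x0, y1, x1 = box
    area = (y1 - y0) * (x1 - x0)
    return saliency[y0:y1, x0:x1].sum() / area if area else 0.0

sal = np.zeros((10, 10))
sal[2:5, 2:5] = 1.0  # all activation lies inside the box (2, 2, 5, 5)
```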
It is possible to pass one or several Python
files (or names of mednet.config entry points or module names) as
CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below)
will override the values of configuration files. You can run this command with
<COMMAND> -H example_config.py to create a template config file.
mednet saliency interpretability [OPTIONS] [CONFIG]...
Options
- -m, --model <model>#
Required A lightning module instance implementing the network architecture (not the weights, necessarily) to be used for inference. Currently, only supports pasa and densenet models.
- -d, --datamodule <datamodule>#
Required A lightning DataModule that will be asked for prediction data loaders. Typically, this includes all configured splits in a DataModule, however this is not a requirement. A DataModule that returns a single dataloader for prediction (wrapped in a dictionary) is acceptable.
- -i, --input-folder <input_folder>#
Required Path from where to load saliency maps. You can generate saliency maps with
mednet saliency generate.
- -t, --target-label <target_label>#
Required The target label that will be analysed. It must match the target label that was used to generate the saliency maps provided with option
--input-folder. Samples with all other labels are ignored.
- -o, --output-json <output_json>#
Required Path to the .json file in which all measures will be saved.
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
Arguments
- CONFIG#
Optional argument(s)
Examples:
Evaluate the generated saliency maps for their localization performance:
mednet saliency interpretability -vv pasa tbx11k-v1-healthy-vs-atb --input-folder=parent-folder/saliencies/ --output-json=path/to/interpretability-scores.json
view#
Generate heatmaps for input CXRs based on existing saliency maps.
It is possible to pass one or several Python
files (or names of mednet.config entry points or module names) as
CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below)
will override the values of configuration files. You can run this command with
<COMMAND> -H example_config.py to create a template config file.
mednet saliency view [OPTIONS] [CONFIG]...
Options
- -m, --model <model>#
Required A lightning module instance implementing the network to be used for applying the necessary data transformations.
- -d, --datamodule <datamodule>#
Required A lightning DataModule containing the training, validation and test sets.
- -i, --input-folder <input_folder>#
Required Path to the directory containing the saliency maps for a specific visualization type.
- -o, --output-folder <output_folder>#
Required Directory in which to store the visualizations (created if does not exist)
- -G, --show-groundtruth, -g, --no-show-groundtruth#
If set, visualizations for ground truth labels will be generated. Only works for datasets with bounding boxes.
- -t, --threshold <threshold>#
Required Pixel values above threshold% of the maximum value are kept in the original saliency map; everything else is set to zero. The value proposed in [SCORECAM-2020] is 0.2. Use this value if unsure.
- Default:
0.2
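The thresholding rule can be expressed directly in NumPy; values at or below threshold × max are zeroed (a sketch of the rule, not the package's code):

```python
import numpy as np

def threshold_saliency(saliency, threshold=0.2):
    """Keep pixels above `threshold` * max(saliency); zero the rest."""
    saliency = np.asarray(saliency, dtype=float)
    cutoff = threshold * saliency.max()
    return np.where(saliency > cutoff, saliency, 0.0)

sal = np.array([[0.05, 0.5], [0.15, 1.0]])
kept = threshold_saliency(sal)  # cutoff = 0.2: 0.05 and 0.15 are zeroed
```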
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by repeating the --verbose option as often as desired (e.g. '-vvv' for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
Arguments
- CONFIG#
Optional argument(s)
Examples:
Generate visualizations in the form of heatmaps from existing saliency maps for a dataset configuration:
mednet saliency view -vv pasa tbx11k-v1-healthy-vs-atb --input-folder=parent_folder/gradcam/ --output-folder=path/to/visualizations
train#
Train a CNN to perform image classification.
Training is performed for a configurable number of epochs, and generates checkpoints. Checkpoints are model files with a .ckpt extension that are used in subsequent tasks or from which training can be resumed.
It is possible to pass one or several Python
files (or names of mednet.config entry points or module names) as
CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below)
will override the values of configuration files. You can run this command with
<COMMAND> -H example_config.py to create a template config file.
mednet train [OPTIONS] [CONFIG]...
Options
- -o, --output-folder <output_folder>#
Required Directory in which to store results (created if does not exist)
- -m, --model <model>#
Required A lightning module instance implementing the network to be trained
- -d, --datamodule <datamodule>#
Required A lightning DataModule containing the training and validation sets.
- -b, --batch-size <batch_size>#
Required Number of samples in every batch (this parameter affects memory requirements for the network). If the number of samples in the batch is larger than the total number of samples available for training, this value is truncated. If this number is smaller, then batches of the specified size are created and fed to the network until there are no more new samples to feed (epoch is finished). If the total number of training samples is not a multiple of the batch-size, the last batch will be smaller than the first, unless --drop-incomplete-batch is set, in which case this batch is not used.
- Default:
1
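To illustrate the last-batch behaviour described above, here is a small sketch (the sample counts are made up, and the helper is hypothetical, not mednet code):

```python
def batch_sizes(n_samples: int, batch_size: int, drop_incomplete: bool) -> list[int]:
    """Return the sizes of the batches fed to the network in one epoch."""
    full, remainder = divmod(n_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_incomplete:
        sizes.append(remainder)  # last batch is smaller than the others
    return sizes

# 10 training samples with --batch-size=4:
print(batch_sizes(10, 4, drop_incomplete=False))  # [4, 4, 2]
print(batch_sizes(10, 4, drop_incomplete=True))   # [4, 4]
```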
- -c, --batch-chunk-count <batch_chunk_count>#
Required Number of chunks in every batch (this parameter affects memory requirements for the network). The number of samples loaded for every iteration will be batch-size/batch-chunk-count. batch-size needs to be divisible by batch-chunk-count, otherwise an error will be raised. This parameter is used to reduce the number of samples loaded in each iteration, in order to reduce the memory usage in exchange for processing time (more iterations). This is especially interesting when one is training on GPUs with limited RAM. The default of 1 forces the whole batch to be processed at once. Otherwise the batch is broken into batch-chunk-count pieces, and gradients are accumulated to complete each batch.
- Default:
1
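The chunking arithmetic can be sketched as follows (a hypothetical helper mirroring the documented behaviour, not mednet's implementation):

```python
def samples_per_iteration(batch_size: int, batch_chunk_count: int) -> int:
    """Number of samples loaded per iteration when a batch is split into chunks."""
    if batch_size % batch_chunk_count != 0:
        raise ValueError("batch-size must be divisible by batch-chunk-count")
    return batch_size // batch_chunk_count

# A batch of 16 split into 4 chunks loads 4 samples per iteration;
# gradients are accumulated over the 4 iterations to complete the batch.
print(samples_per_iteration(16, 4))  # 4
```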
- -D, --drop-incomplete-batch, --no-drop-incomplete-batch#
Required If set, the last batch in an epoch will be dropped if incomplete. If you set this option, you should also consider increasing the total number of epochs of training, as the total number of training steps may be reduced.
- Default:
False
- -e, --epochs <epochs>#
Required Number of epochs (complete training set passes) to train for. If continuing from a saved checkpoint, ensure to provide a greater number of epochs than was saved in the checkpoint to be loaded.
- Default:
1000
- -p, --validation-period <validation_period>#
Required Number of epochs after which validation happens. By default, we run validation after every training epoch (period=1). You can make validation sparser by increasing the validation period. Note that this affects checkpoint saving: while checkpoints are created after every training step (the last training step always overrides the latest checkpoint), independently of validation runs, the selection of the ‘best’ model obtained so far is based on validation results and is therefore influenced by this setting.
- Default:
1
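A quick sketch of which epochs trigger a validation run for a given period (an illustrative helper, not mednet code):

```python
def validation_epochs(total_epochs: int, period: int) -> list[int]:
    """Epochs (1-based) at which validation runs, for a given period."""
    return [e for e in range(1, total_epochs + 1) if e % period == 0]

# With 10 epochs and --validation-period=3, validation runs three times:
print(validation_epochs(10, 3))  # [3, 6, 9]
```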
- -x, --device <device>#
Required A string indicating the device to use (e.g. “cpu” or “cuda:0”)
- Default:
cpu
- --cache-samples, --no-cache-samples#
Required If set, loads samples into memory; otherwise loads them at runtime.
- Default:
False
- -s, --seed <seed>#
Seed to use for the random number generator
- Default:
42
- -P, --parallel <parallel>#
Required Use multiprocessing for data loading: if set to -1 (default), disables multiprocessing data loading. Set to 0 to enable as many data loading instances as processing cores available in the system. Set to >= 1 to enable that many multiprocessing instances for data loading.
- Default:
-1
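The documented semantics of this option can be summarized with a small sketch (a hypothetical mapping, not mednet's actual code):

```python
import os

def data_loading_instances(parallel: int) -> int:
    """Map the --parallel option to a number of data-loading workers.

    -1 disables multiprocessing (0 workers), 0 uses as many instances
    as there are processing cores, and any value >= 1 is used as-is.
    """
    if parallel < 0:
        return 0
    if parallel == 0:
        return os.cpu_count() or 1
    return parallel
```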
- -I, --monitoring-interval <monitoring_interval>#
Required Time between checks for the use of resources during each training epoch, in seconds. An interval of 5 seconds, for example, will lead to CPU and GPU resources being probed every 5 seconds during each training epoch. Values registered in the training logs correspond to averages (or maxima) observed through possibly many probes in each epoch. Notice that setting a very small value may cause the probing process to become extremely busy, potentially biasing the overall perception of resource usage.
- Default:
5.0
- -B, --balance-classes, -N, --no-balance-classes#
Required If set, balances weights of the random sampler during training so that samples from all classes are picked equitably.
- Default:
True
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages will be displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
Arguments
- CONFIG#
Optional argument(s)
Examples:
Train a pasa model with the montgomery dataset, on a GPU (cuda:0):
mednet train -vv pasa montgomery --batch-size=4 --device="cuda:0"
train-analysis#
Create a plot for each metric in the training logs and save them in a .pdf file.
mednet train-analysis [OPTIONS]
Options
- -l, --logdir <logdir>#
Required Path to the directory containing the TensorBoard training logs
- -o, --output <output>#
Required Name of the output file to create (multi-page .pdf)
- Default:
trainlog.pdf
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages will be displayed), to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages), by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
Examples:
mednet train-analysis -vv results/logs