Command-line Interface#
This section contains an overview of command-line applications shipped with this package.
binseg#
Binary Segmentation Benchmark.
binseg [OPTIONS] COMMAND [ARGS]...
analyze#
Runs a complete evaluation from prediction to comparison.
This script is just a wrapper around the individual scripts for running prediction and evaluating FCN models. It organises the output in a preset way:
└─ <output-folder>/
   ├── predictions/  #the prediction outputs for the train/test set
   ├── overlayed/  #the overlayed outputs for the train/test set
      ├── predictions/  #predictions overlayed on the input images
      ├── analysis/  #predictions overlayed on the input images
      ├              #including analysis of false positives, negatives
      ├              #and true positives
      └── second-annotator/  #if set, store overlayed images for the
                             #second annotator here
   └── analysis/  #the outputs of the analysis of both train/test sets
                  #includes second-annotator "measures" as well, if
                  #configured
N.B.: The tool is designed to prevent analysis bias and allows one to provide separate subsets for training and evaluation. Instead of using simple datasets, datasets for full experiment running should be dictionaries with specific subset names:
- __train__: dataset used preferentially for training. It is typically the dataset containing data augmentation pipelines.
- train (optional): a copy of the __train__ dataset, without data augmentation, that will be evaluated alongside the other sets available.
- *: any other name, not starting with an underscore character (_), will be considered a test set for evaluation.
N.B.2: The threshold used for calculating the F1-score on the test set, or overlay analysis (false positives, negatives and true positives overprinted on the original image) also follows the logic above.
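The naming convention above can be illustrated with a short Python sketch. It is not deepdraw code: `split_subsets` is a hypothetical helper, and the lists merely stand in for real dataset objects.

```python
def split_subsets(datasets):
    """Separate internal subsets (leading underscore) from evaluation sets."""
    internal = {k: v for k, v in datasets.items() if k.startswith("_")}
    evaluated = {k: v for k, v in datasets.items() if not k.startswith("_")}
    return internal, evaluated


# Placeholder lists stand in for real dataset objects.
datasets = {
    "__train__": ["augmented samples"],
    "train": ["plain training samples"],
    "test": ["test samples"],
}
internal, evaluated = split_subsets(datasets)
print(sorted(internal))   # ['__train__']
print(sorted(evaluated))  # ['test', 'train']
```

Under this reading, only `train` and `test` would be predicted on and evaluated, while `__train__` stays reserved for training.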
It is possible to pass one or several Python files (or names of deepdraw.config entry points or module names) as CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below) will override the values of configuration files. You can run this command with <COMMAND> -H example_config.py to create a template config file.
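The precedence rule described above (command-line options override configuration-file values) can be pictured as a dictionary merge. This is only a sketch: `resolve_parameters` is a hypothetical helper, and treating an unset option as `None` is an assumption made for the example.

```python
def resolve_parameters(config_values, cli_options):
    """Merge parameters; explicitly set command-line options win."""
    merged = dict(config_values)
    # Assumption: an option left unset on the command line is None.
    merged.update({k: v for k, v in cli_options.items() if v is not None})
    return merged


# Values as they might be defined as variables in a configuration file.
config_values = {"batch_size": 4, "device": "cpu"}
# Values parsed from the command line; batch_size was not given.
cli_options = {"batch_size": None, "device": "cuda:0"}

params = resolve_parameters(config_values, cli_options)
print(params)  # {'batch_size': 4, 'device': 'cuda:0'}
```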
binseg analyze [OPTIONS] [CONFIG]...
Options
- -o, --output-folder <output_folder>#
Required Path where to store experiment outputs (created if it does not exist)
- -m, --model <model>#
Required A torch.nn.Module instance implementing the network to be trained, and then evaluated
- -d, --dataset <dataset>#
Required A dictionary mapping string keys to deepdraw.data.utils.SampleList2TorchDataset instances. At least one key named ‘train’ must be available. This dataset will be used for training the network model. All other datasets will be used for prediction and evaluation. Dataset descriptions include all required pre-processing, including any data augmentation, which may be excluded for prediction and evaluation purposes
- -S, --second-annotator <second_annotator>#
A dataset or dictionary, like in --dataset, with the same sample keys, but with annotations from a different annotator that is going to be compared to the one in --dataset
- -b, --batch-size <batch_size>#
Required Number of samples in every batch (this parameter affects memory requirements for the network). If the number of samples in the batch is larger than the total number of samples available for training, this value is truncated. If this number is smaller, then batches of the specified size are created and fed to the network until there are no more new samples to feed (epoch is finished). If the total number of training samples is not a multiple of the batch-size, the last batch will be smaller than the first.
- Default:
1
- -d, --device <device>#
Required A string indicating the device to use (e.g. “cpu” or “cuda:0”)
- Default:
cpu
- -O, --overlayed, --no-overlayed#
Creates overlayed representations of the output probability maps, similar to --overlayed in prediction-mode, except it includes distinctive colours for true and false positives and false negatives. If not set, or empty, then do NOT output overlayed images.
- Default:
False
- -w, --weight <weight>#
Required Path or URL to pretrained model file (.pth extension)
- -S, --steps <steps>#
Required This number is used to define the number of threshold steps to consider when evaluating the highest possible F1-score on test data.
- Default:
1000
- -P, --parallel <parallel>#
Required Use multiprocessing for data processing: if set to -1 (default), disables multiprocessing. Set to 0 to enable as many data loading instances as there are processing cores available in the system. Set to >= 1 to enable that many multiprocessing instances for data processing.
- Default:
-1
- -L, --plot-limits <plot_limits>#
If set, this option affects the performance comparison plots. It must be a 4-tuple containing the bounds of the plot for the x and y axis respectively (format: [x_low, x_high, y_low, y_high]). If not set, use normal bounds ([0, 1, 0, 1]) for the performance curve.
- Default:
0.0, 1.0, 0.0, 1.0
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
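The batching rules given for --batch-size above reduce to simple arithmetic, sketched below. `batch_sizes` is a hypothetical helper written for this illustration, not a deepdraw function.

```python
def batch_sizes(n_samples, batch_size):
    """Return the size of each batch in one epoch."""
    if n_samples <= 0:
        return []
    # A batch size larger than the number of samples is truncated.
    batch_size = min(batch_size, n_samples)
    full_batches, remainder = divmod(n_samples, batch_size)
    # A non-zero remainder yields one final, smaller batch.
    return [batch_size] * full_batches + ([remainder] if remainder else [])


print(batch_sizes(10, 4))  # [4, 4, 2]
print(batch_sizes(3, 8))   # [3]
```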
Arguments
- CONFIG#
Optional argument(s)
Examples:
$ deepdraw analyze -vv m2unet drive --weight=model.path
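To give an idea of what the --steps option controls, the sketch below scans a fixed number of thresholds and keeps the one with the highest F1-score. It assumes evenly spaced thresholds in [0, 1] and works on flat lists of probabilities and binary labels; `best_f1_threshold` is a hypothetical name, and the actual implementation may differ.

```python
def best_f1_threshold(probabilities, labels, steps=1000):
    """Scan `steps` evenly spaced thresholds in [0, 1]; return (threshold, f1)."""
    best = (0.0, 0.0)
    for i in range(steps):
        threshold = i / (steps - 1) if steps > 1 else 0.5
        pairs = list(zip(probabilities, labels))
        tp = sum(1 for p, y in pairs if p >= threshold and y)
        fp = sum(1 for p, y in pairs if p >= threshold and not y)
        fn = sum(1 for p, y in pairs if p < threshold and y)
        denominator = 2 * tp + fp + fn
        f1 = 2 * tp / denominator if denominator else 0.0
        if f1 > best[1]:  # keep the first threshold reaching the maximum
            best = (threshold, f1)
    return best


probabilities = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
threshold, f1 = best_f1_threshold(probabilities, labels, steps=101)
print(round(f1, 2))  # 0.8
```

A larger number of steps gives a finer threshold grid at the cost of more evaluations.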
compare#
binseg compare [OPTIONS] [LABEL_PATH]...
Options
- -f, --output-figure <output_figure>#
Path where to write the output figure (any extension supported by matplotlib is possible). If not provided, does not produce a figure.
- -T, --table-format <table_format>#
Required The format to use for the comparison table
- Default:
rst
- Options:
asciidoc | double_grid | double_outline | fancy_grid | fancy_outline | github | grid | heavy_grid | heavy_outline | html | jira | latex | latex_booktabs | latex_longtable | latex_raw | mediawiki | mixed_grid | mixed_outline | moinmoin | orgtbl | outline | pipe | plain | presto | pretty | psql | rounded_grid | rounded_outline | rst | simple | simple_grid | simple_outline | textile | tsv | unsafehtml | youtrack
- -u, --output-table <output_table>#
Path where to write the output table. If not provided, does not write a table to file, only to stdout.
- -t, --threshold <threshold>#
This number is used to select which F1-score to use for representing a system's performance. If not set, we report the maximum F1-score in the set, which is equivalent to threshold selection a posteriori (biased estimator), unless the performance file being considered was already pre-tuned and contains a ‘threshold_a_priori’ column, which we then use to pick a threshold for the dataset. You can override this behaviour by either setting this value to a floating-point number in the range [0.0, 1.0], or to a string naming one of the systems, which will be used to calculate the threshold leading to the maximum F1-score, which is then applied to all other sets.
- -L, --plot-limits <plot_limits>#
If set, must be a 4-tuple containing the bounds of the plot for the x and y axis respectively (format: [x_low, x_high, y_low, y_high]). If not set, use normal bounds ([0, 1, 0, 1]) for the performance curve.
- Default:
0.0, 1.0, 0.0, 1.0
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
Arguments
- LABEL_PATH#
Optional argument(s)
Examples:
$ deepdraw compare -vv A path/to/A/train.csv B path/to/B/test.csv
config#
Commands for listing, describing and copying configuration resources.
binseg config [OPTIONS] COMMAND [ARGS]...
copy#
Copy a specific configuration resource so it can be modified locally.
binseg config copy [OPTIONS] SOURCE DESTINATION
Options
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
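The verbosity levels listed for --verbose line up naturally with the standard Python `logging` levels. The mapping below is a sketch of that correspondence under this assumption; `verbosity_to_level` is a hypothetical helper, not the function the tool itself uses.

```python
import logging


def verbosity_to_level(count):
    """Map the number of -v flags to a standard logging level."""
    levels = [logging.ERROR, logging.WARNING, logging.INFO, logging.DEBUG]
    # Clamp so that extra -v flags beyond three still mean DEBUG.
    return levels[max(0, min(count, len(levels) - 1))]


print(logging.getLevelName(verbosity_to_level(0)))  # ERROR
print(logging.getLevelName(verbosity_to_level(3)))  # DEBUG
```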
Arguments
- SOURCE#
Required argument
- DESTINATION#
Required argument
Examples:
$ deepdraw config copy montgomery -vvv newdataset.py
describe#
Describes a specific configuration file.
binseg config describe [OPTIONS] NAME...
Options
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
Arguments
- NAME#
Required argument(s)
Examples:
deepdraw config describe montgomery
deepdraw config describe montgomery -v
list#
Lists configuration files installed.
binseg config list [OPTIONS]
Options
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
Examples:
deepdraw config list
deepdraw config list -v
dataset#
Commands for listing and verifying datasets.
binseg dataset [OPTIONS] COMMAND [ARGS]...
check#
Checks file access on one or more datasets.
binseg dataset check [OPTIONS] [DATASET]...
Options
- -l, --limit <limit>#
Required Limit check to the first N samples in each dataset, making the check noticeably faster. Set it to zero to check everything.
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
Arguments
- DATASET#
Optional argument(s)
Examples:
deepdraw dataset check -vv montgomery
deepdraw dataset check -vv montgomery shenzhen
deepdraw dataset check
list#
Lists all supported and configured datasets.
binseg dataset list [OPTIONS]
Options
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
Examples:
To configure the location of a dataset, edit the configuration file ($HOME/.config/deepdraw.toml), and add a line like the following:
[datadir]
montgomery = "/path/to/montgomery/files"
Note: This setting is case-sensitive.
$ deepdraw dataset list
evaluate#
It is possible to pass one or several Python files (or names of deepdraw.config entry points or module names) as CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below) will override the values of configuration files. You can run this command with <COMMAND> -H example_config.py to create a template config file.
binseg evaluate [OPTIONS] [CONFIG]...
Options
- -o, --output-folder <output_folder>#
Required Path where to store the analysis result (created if it does not exist)
- -p, --predictions-folder <predictions_folder>#
Required Path where predictions are currently stored
- -d, --dataset <dataset>#
Required A torch.utils.data.dataset.Dataset instance implementing a dataset to be used for evaluation purposes, possibly including all pre-processing pipelines required or, optionally, a dictionary mapping string keys to torch.utils.data.dataset.Dataset instances. All keys that do not start with an underscore (_) will be processed.
- -S, --second-annotator <second_annotator>#
A dataset or dictionary, like in --dataset, with the same sample keys, but with annotations from a different annotator that is going to be compared to the one in --dataset. The same rules regarding dataset naming conventions apply
- -O, --overlayed <overlayed>#
Creates overlayed representations of the output probability maps, similar to --overlayed in prediction-mode, except it includes distinctive colours for true and false positives and false negatives. If not set, or empty, then do NOT output overlayed images. Otherwise, the parameter represents the name of a folder in which to store them
- -t, --threshold <threshold>#
This number is used to define positives and negatives from probability maps, and report F1-scores (a priori). It should either come from the training set or a separate validation set to avoid biasing the analysis. Optionally, if you provide a multi-set dataset as input, this may also be the name of an existing set from which the threshold will be estimated (highest F1-score) and then applied to the subsequent sets. This number is also used to print the test set F1-score a priori performance
- -S, --steps <steps>#
Required This number is used to define the number of threshold steps to consider when evaluating the highest possible F1-score on test data.
- Default:
1000
- -P, --parallel <parallel>#
Required Use multiprocessing for data processing: if set to -1 (default), disables multiprocessing. Set to 0 to enable as many data loading instances as there are processing cores available in the system. Set to >= 1 to enable that many multiprocessing instances for data processing.
- Default:
-1
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
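The --threshold option above can name a set from which the threshold is estimated before being applied to the other sets. The sketch below illustrates that idea on toy data; the helper names, the evenly spaced threshold grid and the flat score lists are all assumptions made for the example.

```python
def f1_score(probabilities, labels, threshold):
    """F1-score obtained when binarizing probabilities at `threshold`."""
    pairs = list(zip(probabilities, labels))
    tp = sum(p >= threshold and y == 1 for p, y in pairs)
    fp = sum(p >= threshold and y == 0 for p, y in pairs)
    fn = sum(p < threshold and y == 1 for p, y in pairs)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def tune_threshold(probabilities, labels, steps=1000):
    """Return the first of `steps` (> 1) thresholds with the highest F1-score."""
    candidates = [i / (steps - 1) for i in range(steps)]
    return max(candidates, key=lambda t: f1_score(probabilities, labels, t))


# Toy (probabilities, labels) pairs for two subsets.
sets = {
    "validation": ([0.2, 0.6, 0.7, 0.9], [0, 0, 1, 1]),
    "test": ([0.1, 0.65, 0.8, 0.3], [0, 1, 1, 0]),
}
# Estimate the threshold on "validation" only, then apply it to "test".
threshold = tune_threshold(*sets["validation"], steps=101)
test_f1 = f1_score(*sets["test"], threshold)
print(test_f1)  # 1.0
```

Because the test set plays no part in choosing the threshold, the reported test F1-score is an a priori figure.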
Arguments
- CONFIG#
Optional argument(s)
Examples:
$ deepdraw evaluate -vv drive --predictions-folder=path/to/predictions --output-folder=path/to/results
$ deepdraw config copy csv-dataset-example mydataset.py
# modify "mydataset.py" to your liking
$ deepdraw evaluate -vv mydataset.py --predictions-folder=path/to/predictions --output-folder=path/to/results
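The three cases of the --parallel option described above can be summarized in a few lines. `worker_count` is a hypothetical helper used only to restate the rule; how the resulting number is consumed internally is not shown here.

```python
import multiprocessing


def worker_count(parallel):
    """Translate a --parallel value into a number of worker processes."""
    if parallel < 0:
        return 0  # multiprocessing disabled
    if parallel == 0:
        return multiprocessing.cpu_count()  # one worker per available core
    return parallel  # exactly as many workers as requested


print(worker_count(-1))  # 0
print(worker_count(4))   # 4
```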
experiment#
Runs a complete experiment, from training, to prediction and evaluation.
This script is just a wrapper around the individual scripts for training, running prediction, evaluating and comparing FCN model performance. It organises the output in a preset way:
└─ <output-folder>/
   ├── model/  #the generated model will be here
   ├── predictions/  #the prediction outputs for the train/test set
   ├── overlayed/  #the overlayed outputs for the train/test set
      ├── predictions/  #predictions overlayed on the input images
      ├── analysis/  #predictions overlayed on the input images
      ├              #including analysis of false positives, negatives
      ├              #and true positives
      └── second-annotator/  #if set, store overlayed images for the
                             #second annotator here
   └── analysis/  #the outputs of the analysis of both train/test sets
                  #includes second-annotator "measures" as well, if
                  #configured
Training is performed for a configurable number of epochs, and generates at least a final_model.pth. It may also generate a number of intermediate checkpoints. Checkpoints are model files (.pth files) that are stored during the training and useful to resume the procedure in case it stops abruptly.
N.B.: The tool is designed to prevent analysis bias and allows one to provide (potentially multiple) separate subsets for training, validation, and evaluation. Instead of using simple datasets, datasets for full experiment running should be dictionaries with specific subset names:
- __train__: dataset used preferentially for training. It is typically the dataset containing data augmentation pipelines.
- __valid__: dataset used for validation. It is typically disjoint from the training and test sets. In such a case, we also checkpoint the model with the lowest loss on the validation set throughout training, besides the model at the end of training.
- train (optional): a copy of the __train__ dataset, without data augmentation, that will be evaluated alongside the other sets available.
- __valid_extra__: a list of datasets that are tracked during validation, but do not affect checkpointing. If present, an extra column with an array containing the loss of each set is kept on the training log.
- *: any other name, not starting with an underscore character (_), will be considered a test set for evaluation.
N.B.2: The threshold used for calculating the F1-score on the test set, or overlay analysis (false positives, negatives and true positives overprinted on the original image) also follows the logic above.
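As a rough illustration of how the special subset names could be resolved into training roles, consider the sketch below. `training_roles` is a hypothetical helper, the strings stand in for real dataset objects, and the fallback from `__train__` to `train` follows the description of the --dataset option for this command.

```python
def training_roles(datasets):
    """Pick the subsets used for training and validation."""
    # Prefer "__train__"; fall back to "train" when it is absent.
    training = datasets.get("__train__", datasets.get("train"))
    if training is None:
        raise KeyError("expected a 'train' or '__train__' subset")
    return {
        "training": training,
        "validation": datasets.get("__valid__"),  # None if absent
        "extra_validation": datasets.get("__valid_extra__", []),
    }


# Placeholder strings stand in for real dataset objects.
datasets = {
    "__train__": "augmented training set",
    "train": "plain training set",
    "__valid__": "validation set",
    "test": "test set",
}
roles = training_roles(datasets)
print(roles["training"])  # augmented training set
```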
It is possible to pass one or several Python files (or names of deepdraw.config entry points or module names) as CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below) will override the values of configuration files. You can run this command with <COMMAND> -H example_config.py to create a template config file.
binseg experiment [OPTIONS] [CONFIG]...
Options
- -o, --output-folder <output_folder>#
Required Path where to store experiment outputs (created if it does not exist)
- -m, --model <model>#
Required A torch.nn.Module instance implementing the network to be trained, and then evaluated
- -d, --dataset <dataset>#
Required A dictionary mapping string keys to torch.utils.data.dataset.Dataset instances implementing datasets to be used for training and validating the model, possibly including all pre-processing pipelines required. At least one key named train must be available. This dataset will be used for training the network model. The dataset description must include all required pre-processing, including any data augmentation. If a dataset named __train__ is available, it is used preferentially for training instead of train. If a dataset named __valid__ is available, it is used for model validation (and automatic check-pointing) at each epoch. If a dataset list named __valid_extra__ is available, then it will be tracked during the validation process and its loss output to the training log as well, in the format of an array occupying a single column. All other keys are considered test datasets and only used during analysis, to report the final system performance
- -S, --second-annotator <second_annotator>#
A dataset or dictionary, like in --dataset, with the same sample keys, but with annotations from a different annotator that is going to be compared to the one in --dataset
- --optimizer <optimizer>#
Required A torch.optim.Optimizer that will be used to train the network
- --criterion <criterion>#
Required A loss function to compute the FCN error for every sample respecting the PyTorch API for loss functions (see torch.nn.modules.loss)
- --scheduler <scheduler>#
Required A learning rate scheduler that drives changes in the learning rate depending on the FCN state (see torch.optim.lr_scheduler)
- -b, --batch-size <batch_size>#
Required Number of samples in every batch (this parameter affects memory requirements for the network). If the number of samples in the batch is larger than the total number of samples available for training, this value is truncated. If this number is smaller, then batches of the specified size are created and fed to the network until there are no more new samples to feed (epoch is finished). If the total number of training samples is not a multiple of the batch-size, the last batch will be smaller than the first, unless --drop-incomplete-batch is set, in which case this batch is not used.
- Default:
2
- -c, --batch-chunk-count <batch_chunk_count>#
Required Number of chunks in every batch (this parameter affects memory requirements for the network). The number of samples loaded for every iteration will be batch-size/batch-chunk-count. batch-size needs to be divisible by batch-chunk-count, otherwise an error will be raised. This parameter is used to reduce the number of samples loaded in each iteration, in order to reduce the memory usage in exchange for processing time (more iterations). This is especially interesting when one is running with GPUs with limited RAM. The default of 1 forces the whole batch to be processed at once. Otherwise the batch is broken into batch-chunk-count pieces, and gradients are accumulated to complete each batch.
- Default:
1
- -D, --drop-incomplete-batch, --no-drop-incomplete-batch#
Required If set, then may drop the last batch in an epoch, in case it is incomplete. If you set this option, you should also consider increasing the total number of epochs of training, as the total number of training steps may be reduced
- Default:
False
- -e, --epochs <epochs>#
Required Number of epochs (complete training set passes) to train for. If continuing from a saved checkpoint, make sure to provide a greater number of epochs than that saved on the checkpoint to be loaded.
- Default:
1000
- -p, --checkpoint-period <checkpoint_period>#
Required Number of epochs after which a checkpoint is saved. A value of zero will disable check-pointing. If checkpointing is enabled and training stops, it is automatically resumed from the last saved checkpoint if training is restarted with the same configuration.
- Default:
0
- -d, --device <device>#
Required A string indicating the device to use (e.g. “cpu” or “cuda:0”)
- Default:
cpu
- -s, --seed <seed>#
Seed to use for the random number generator
- Default:
42
- -P, --parallel <parallel>#
Required Use multiprocessing for data loading and processing: if set to -1 (default), disables multiprocessing altogether. Set to 0 to enable as many data loading instances as there are processing cores available in the system. Set to >= 1 to enable that many multiprocessing instances for data processing.
- Default:
-1
- -I, --monitoring-interval <monitoring_interval>#
Required Time between checks for the use of resources during each training epoch. An interval of 5 seconds, for example, will lead to CPU and GPU resources being probed every 5 seconds during each training epoch. Values registered in the training logs correspond to averages (or maxima) observed through possibly many probes in each epoch. Notice that setting a very small value may cause the probing process to become extremely busy, potentially biasing the overall perception of resource usage.
- Default:
5.0
- -O, --overlayed, --no-overlayed#
Creates overlayed representations of the output probability maps, similar to --overlayed in prediction-mode, except it includes distinctive colours for true and false positives and false negatives. If not set, or empty, then do NOT output overlayed images.
- Default:
False
- -S, --steps <steps>#
Required This number is used to define the number of threshold steps to consider when evaluating the highest possible F1-score on test data.
- Default:
1000
- -L, --plot-limits <plot_limits>#
If set, this option affects the performance comparison plots. It must be a 4-tuple containing the bounds of the plot for the x and y axis respectively (format: [x_low, x_high, y_low, y_high]). If not set, use normal bounds ([0, 1, 0, 1]) for the performance curve.
- Default:
0.0, 1.0, 0.0, 1.0
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
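The idea behind --batch-chunk-count (splitting a batch into chunks and accumulating gradients) can be checked on a toy problem. The sketch below uses a one-parameter model y = weight * x with a mean squared error loss; all names are hypothetical, and a real training loop would run the framework's backward pass on each chunk instead of the hand-written gradient used here.

```python
def split_into_chunks(batch, chunk_count):
    """Split a batch into `chunk_count` equally sized chunks."""
    if len(batch) % chunk_count != 0:
        raise ValueError("batch-size must be divisible by batch-chunk-count")
    size = len(batch) // chunk_count
    return [batch[i:i + size] for i in range(0, len(batch), size)]


def accumulated_gradient(batch, chunk_count, weight):
    """Gradient of the batch-mean squared error, accumulated chunk by chunk."""
    total = 0.0
    for chunk in split_into_chunks(batch, chunk_count):
        # Each chunk adds its summed gradient, scaled by the full batch size.
        total += sum(2 * (weight * x - y) * x for x, y in chunk) / len(batch)
    return total


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
whole = accumulated_gradient(batch, 1, weight=1.0)
chunked = accumulated_gradient(batch, 2, weight=1.0)
print(whole, chunked)  # -15.0 -15.0
```

Both calls give the same gradient, while the chunked one only ever holds half of the batch at a time.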
Arguments
- CONFIG#
Optional argument(s)
Examples:
$ deepdraw experiment -vv m2unet drive --epochs=2
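The effect of --checkpoint-period described above can be pictured with a short sketch. It assumes epochs are counted from 1 and that a checkpoint is written whenever the epoch number is a multiple of the period; both the numbering and the `checkpoint_epochs` helper are assumptions made for illustration.

```python
def checkpoint_epochs(epochs, checkpoint_period):
    """List the epochs after which a periodic checkpoint would be saved."""
    if checkpoint_period <= 0:
        return []  # a period of zero disables check-pointing
    return [e for e in range(1, epochs + 1) if e % checkpoint_period == 0]


print(checkpoint_epochs(10, 3))  # [3, 6, 9]
print(checkpoint_epochs(10, 0))  # []
```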
mkmask#
Commands for generating masks for images in a dataset.
It is possible to pass one or several Python files (or names of entry points or module names) as CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below) will override the values of configuration files. You can run this command with <COMMAND> -H example_config.py to create a template config file.
binseg mkmask [OPTIONS] [CONFIG]...
Options
- -o, --output-folder <output_folder>#
Required Path where to store the generated model (created if does not exist)
- -d, --dataset <dataset>#
Required The base path to the dataset to which we want to generate the masks. In case you have already configured the path for the datasets supported by deepdraw, you can just use the name of the dataset as written in the config.
- -g, --globs <globs>#
Required Glob pattern matching the images in the dataset for which we want to generate the masks, e.g. --globs="images/*.jpg". This option can be used multiple times.
- -t, --threshold <threshold>#
Required Generating a mask needs a threshold to be fixed in order to transform the image to binary
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
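The --threshold option above turns an image into a binary mask. A minimal sketch of such a binarization is shown below; it assumes a grayscale image given as nested lists and that pixels strictly above the threshold belong to the mask, neither of which is confirmed by this page. `binarize` is a hypothetical name.

```python
def binarize(image, threshold):
    """Return a 0/1 mask: 1 where the pixel intensity exceeds `threshold`."""
    # Assumption: pixels strictly above the threshold are part of the mask.
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


image = [
    [0, 2, 9],
    [4, 5, 6],
    [7, 1, 3],
]
mask = binarize(image, threshold=5)
print(mask)  # [[0, 0, 1], [0, 0, 1], [1, 0, 0]]
```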
Arguments
- CONFIG#
Optional argument(s)
Examples:
$ deepdraw mkmask --dataset="refuge" --globs="Training400/*Glaucoma/*.jpg" --globs="Training400/*AMD/*.jpg" --threshold=5
Or you can generate the same results with this command:
$ deepdraw mkmask -d "refuge" -g "Training400/*Glaucoma/*.jpg" -g "Training400/*AMD/*.jpg" -t 5
$ deepdraw mkmask -d "Path/to/dataset" -g "glob1" -g "glob2" -g glob3 -t 4
predict#
Predicts vessel map (probabilities) on input images.
It is possible to pass one or several Python files (or names of deepdraw.config entry points or module names) as CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below) will override the values of configuration files. You can run this command with <COMMAND> -H example_config.py to create a template config file.
binseg predict [OPTIONS] [CONFIG]...
Options
- -o, --output-folder <output_folder>#
Required Path where to store the predictions (created if it does not exist)
- -m, --model <model>#
Required A torch.nn.Module instance implementing the network to be evaluated
- -d, --dataset <dataset>#
Required A torch.utils.data.dataset.Dataset instance implementing a dataset to be used for running prediction, possibly including all pre-processing pipelines required or, optionally, a dictionary mapping string keys to torch.utils.data.dataset.Dataset instances. All keys that do not start with an underscore (_) will be processed.
- -b, --batch-size <batch_size>#
Required Number of samples in every batch (this parameter affects memory requirements for the network)
- Default:
1
- -d, --device <device>#
Required A string indicating the device to use (e.g. “cpu” or “cuda:0”)
- Default:
cpu
- -w, --weight <weight>#
Required Path or URL to pretrained model file (.pth extension)
- -O, --overlayed <overlayed>#
Creates overlayed representations of the output probability maps on top of input images (store results as PNG files). If not set, or empty then do NOT output overlayed images. Otherwise, the parameter represents the name of a folder where to store those
- -P, --parallel <parallel>#
Required Use multiprocessing for data loading: if set to -1 (default), disables multiprocessing data loading. Set to 0 to enable as many data loading instances as there are processing cores available in the system. Set to >= 1 to enable that many multiprocessing instances for data loading.
- Default:
-1
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
Arguments
- CONFIG#
Optional argument(s)
Examples:
$ deepdraw predict -vv m2unet drive --weight=path/to/model_final_epoch.pth --output-folder=path/to/predictions
$ deepdraw config copy csv-dataset-example mydataset.py
# modify "mydataset.py" to include the base path and required transforms
$ deepdraw predict -vv m2unet mydataset.py --weight=path/to/model_final_epoch.pth --output-folder=path/to/predictions
significance#
Evaluates how significantly different two models are on the same dataset.
This application calculates the significance of results of two models operating on the same dataset, and subject to a priori threshold tuning.
It is possible to pass one or several Python files (or names of deepdraw.config entry points or module names) as CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below) will override the values of configuration files. You can run this command with <COMMAND> -H example_config.py to create a template config file.
binseg significance [OPTIONS] [CONFIG]...
Options
- -n, --names <names>#
Required Names of the two systems to compare
- -p, --predictions <predictions>#
Required Path where predictions of system 2 are currently stored. You may also input predictions from a second-annotator. This application will adequately handle it.
- -d, --dataset <dataset>#
Required A dictionary mapping string keys to torch.utils.data.dataset.Dataset instances
- -t, --threshold <threshold>#
Required This number is used to define positives and negatives from probability maps, and report F1-scores (a priori). By default, we expect a set named ‘validation’ to be available at the input data. If that is not the case, we use ‘train’, if available. You may provide the name of another dataset to be used for threshold tuning otherwise. If not set, or a string is input, threshold tuning is done per system, individually. Optionally, you may also provide a floating-point number in the range [0.0, 1.0] as the threshold to use for both systems.
- Default:
validation
- -e, --evaluate <evaluate>#
Required Name of the dataset to evaluate
- Default:
test
- -S, --steps <steps>#
Required This number is used to define the number of threshold steps to consider when evaluating the highest possible F1-score on train/test data.
- Default:
1000
- -s, --size <size>#
Required This is a tuple with two values indicating the size of windows to be used for sliding window analysis. The values represent height and width respectively.
- Default:
128, 128
- -t, --stride <stride>#
Required This is a tuple with two values indicating the stride of windows to be used for sliding window analysis. The values represent height and width respectively.
- Default:
32, 32
- -f, --figure <figure>#
Required The name of a performance figure (e.g. f1_score, or jaccard) to use when comparing performances
- Default:
accuracy
- -o, --output-folder <output_folder>#
Path where to store visualizations
- -R, --remove-outliers, --no-remove-outliers#
Required If set, removes outliers from both score distributions before running statistical analysis. Outlier removal follows a 1.5 IQR range check from the difference in figures between both systems and assumes most of the distribution is contained within that range (like in a normal distribution)
- Default:
False
- -R, --remove-zeros, --no-remove-zeros#
Required If set, removes instances from the statistical analysis in which both systems had a performance equal to zero.
- Default:
False
- -x, --parallel <parallel>#
Required Set the number of parallel processes to use when running using multiprocessing. A value of zero uses all reported cores.
- Default:
1
- -k, --checkpoint-folder <checkpoint_folder>#
Path where to store checkpointed versions of sliding window performances
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
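To make the interaction between --size and --stride concrete, the following sketch enumerates the top-left corners of the windows that would cover an image. It assumes windows are placed on a regular grid and that windows extending past the image border are skipped; the border handling actually used by the tool is not described here and may differ. The function name window_origins is hypothetical.

```python
def window_origins(image_shape, size=(128, 128), stride=(32, 32)):
    """Yield (top, left) corners of sliding windows over an image.

    All tuples are (height, width), matching the order documented for
    --size and --stride. Skipping partial windows is an assumption.
    """
    height, width = image_shape
    win_h, win_w = size
    step_h, step_w = stride
    # range() is empty when the image is smaller than the window
    for top in range(0, height - win_h + 1, step_h):
        for left in range(0, width - win_w + 1, step_w):
            yield top, left


origins = list(window_origins((192, 160)))
print(len(origins), origins[0], origins[-1])
```

With the default size and stride, a smaller stride increases the overlap between neighbouring windows and therefore the number of windows analysed.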
Arguments
- CONFIG#
Optional argument(s)
Examples:
$ deepdraw significance -vv drive --names system1 system2 --predictions=path/to/predictions/system-1 path/to/predictions/system-2
$ deepdraw significance -vv drive --names system1 system2 --predictions=path/to/predictions/system-1 path/to/predictions/system-2 --threshold=train --evaluate=alternate-test
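The 1.5 IQR range check described for --remove-outliers can be sketched as follows. This is only an illustration under stated assumptions: the function name remove_outliers is hypothetical, the quartiles are computed with the "inclusive" method of Python's statistics module (the tool may use a different quantile definition), and at least two paired samples are required.

```python
import statistics


def remove_outliers(figures_a, figures_b):
    """Drop paired samples whose difference falls outside the 1.5 IQR fences."""
    diffs = [a - b for a, b in zip(figures_a, figures_b)]
    # Quartiles of the differences; the "inclusive" method is an assumption
    q1, _, q3 = statistics.quantiles(diffs, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [
        (a, b)
        for a, b, d in zip(figures_a, figures_b, diffs)
        if lower <= d <= upper
    ]
    return [a for a, _ in kept], [b for _, b in kept]


system1 = [0.80, 0.82, 0.79, 0.81, 0.80, 0.10]
system2 = [0.70, 0.71, 0.70, 0.72, 0.69, 0.90]
kept1, kept2 = remove_outliers(system1, system2)
print(len(kept1), "of", len(system1), "pairs kept")
```

Samples are removed in pairs so that both score distributions stay aligned for the statistical comparison.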
train#
Trains an FCN to perform binary segmentation.
Training is performed for a configurable number of epochs, and generates at least a final_model.pth. It may also generate a number of intermediate checkpoints. Checkpoints are model files (.pth files) that are stored during the training and useful to resume the procedure in case it stops abruptly.
Tip: In case the model has been trained over a number of epochs, it is possible to continue training, by simply relaunching the same command, and changing the number of epochs to a number greater than the number where the original training session stopped (or the last checkpoint was saved).
It is possible to pass one or several Python files (or names of deepdraw.config entry points or module names) as CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below) will override the values of configuration files. You can run this command with <COMMAND> -H example_config.py to create a template config file.
binseg train [OPTIONS] [CONFIG]...
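As an illustration of the CONFIG mechanism, a configuration file might look like the sketch below. It assumes that variable names mirror the option names with dashes replaced by underscores, as suggested by the option placeholders (e.g. <batch_size>); the file name and all values are hypothetical. Generating a template with -H remains the reliable way to obtain the exact variable names.

```python
# my_train_config.py (hypothetical file name)
# Each variable stands in for the command-line option of the same name;
# the names are assumed from the option placeholders, not confirmed.
output_folder = "results/my-experiment"  # --output-folder
batch_size = 4                           # --batch-size
epochs = 100                             # --epochs
checkpoint_period = 10                   # --checkpoint-period
device = "cpu"                           # --device
seed = 42                                # --seed
```

Any of these values would be overridden by the corresponding option if it is also given on the command line.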
Options
- -o, --output-folder <output_folder>#
Required Path where to store the generated model (created if it does not exist)
- -m, --model <model>#
Required A torch.nn.Module instance implementing the network to be trained
- -d, --dataset <dataset>#
Required A dictionary mapping string keys to torch.utils.data.dataset.Dataset instances implementing datasets to be used for training and validating the model, possibly including all pre-processing pipelines required. At least one key named train must be available. This dataset will be used for training the network model. The dataset description must include all required pre-processing, including any data augmentation. If a dataset named __train__ is available, it is used for training in preference to train. If a dataset named __valid__ is available, it is used for model validation (and automatic check-pointing) at each epoch. If a dataset list named __extra_valid__ is available, it is tracked during the validation process and its loss is also written to the training log, as an array occupying a single column. All other keys are considered test datasets and are ignored during training.
- --optimizer <optimizer>#
Required A torch.optim.Optimizer that will be used to train the network
- --criterion <criterion>#
Required A loss function to compute the FCN error for every sample respecting the PyTorch API for loss functions (see torch.nn.modules.loss)
- --scheduler <scheduler>#
Required A learning rate scheduler that drives changes in the learning rate depending on the FCN state (see torch.optim.lr_scheduler)
- -b, --batch-size <batch_size>#
Required Number of samples in every batch (this parameter affects memory requirements for the network). If the number of samples in the batch is larger than the total number of samples available for training, this value is truncated. If this number is smaller, then batches of the specified size are created and fed to the network until there are no more new samples to feed (epoch is finished). If the total number of training samples is not a multiple of the batch-size, the last batch will be smaller than the first, unless --drop-incomplete-batch is set, in which case this batch is not used.
- Default:
2
- -c, --batch-chunk-count <batch_chunk_count>#
Required Number of chunks in every batch (this parameter affects memory requirements for the network). The number of samples loaded for every iteration will be batch-size/batch-chunk-count. batch-size needs to be divisible by batch-chunk-count, otherwise an error will be raised. This parameter is used to reduce the number of samples loaded in each iteration, in order to reduce memory usage in exchange for processing time (more iterations). This is especially useful when running on GPUs with limited RAM. The default of 1 forces the whole batch to be processed at once. Otherwise the batch is broken into batch-chunk-count pieces, and gradients are accumulated to complete each batch.
- Default:
1
- -D, --drop-incomplete-batch, --no-drop-incomplete-batch#
Required If set, the last batch in an epoch may be dropped if it is incomplete. If you set this option, you should also consider increasing the total number of training epochs, as the total number of training steps may be reduced
- Default:
False
- -e, --epochs <epochs>#
Required Number of epochs (complete training set passes) to train for. If continuing from a saved checkpoint, make sure to provide a greater number of epochs than that saved in the checkpoint to be loaded.
- Default:
1000
- -p, --checkpoint-period <checkpoint_period>#
Required Number of epochs after which a checkpoint is saved. A value of zero will disable check-pointing. If checkpointing is enabled and training stops, it is automatically resumed from the last saved checkpoint if training is restarted with the same configuration.
- Default:
0
- -d, --device <device>#
Required A string indicating the device to use (e.g. “cpu” or “cuda:0”)
- Default:
cpu
- -s, --seed <seed>#
Seed to use for the random number generator
- Default:
42
- -P, --parallel <parallel>#
Required Use multiprocessing for data loading: if set to -1 (default), disables multiprocessing data loading. Set to 0 to enable as many data loading instances as there are processing cores available in the system. Set to >= 1 to enable that many multiprocessing instances for data loading.
- Default:
-1
- -I, --monitoring-interval <monitoring_interval>#
Required Time between checks for the use of resources during each training epoch. An interval of 5 seconds, for example, will lead to CPU and GPU resources being probed every 5 seconds during each training epoch. Values registered in the training logs correspond to averages (or maxima) observed through possibly many probes in each epoch. Notice that setting a very small value may cause the probing process to become extremely busy, potentially biasing the overall perception of resource usage.
- Default:
5.0
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
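The arithmetic behind --batch-size, --batch-chunk-count and --drop-incomplete-batch can be sketched as below. The function name batch_plan and the returned keys are hypothetical, and checking divisibility against the requested (not the truncated) batch size is an assumption; the sketch only mirrors the rules stated in the option descriptions.

```python
def batch_plan(n_samples, batch_size=2, batch_chunk_count=1,
               drop_incomplete_batch=False):
    """Summarise how one epoch is split into batches and chunks."""
    if n_samples < 1:
        raise ValueError("at least one training sample is required")
    if batch_size % batch_chunk_count != 0:
        raise ValueError("batch-size must be divisible by batch-chunk-count")
    # A batch larger than the training set is truncated to its size
    effective = min(batch_size, n_samples)
    full, remainder = divmod(n_samples, effective)
    # The trailing, smaller batch is used unless explicitly dropped
    batches = full if (drop_incomplete_batch or remainder == 0) else full + 1
    return {
        "batch_size": effective,
        "samples_per_load": batch_size // batch_chunk_count,
        "batches_per_epoch": batches,
    }


print(batch_plan(20, batch_size=8, batch_chunk_count=2))
print(batch_plan(20, batch_size=8, drop_incomplete_batch=True))
```

In the first call, 20 samples with a batch size of 8 give two full batches plus one batch of 4, and each batch is loaded in two chunks of 4 samples whose gradients are accumulated.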
Arguments
- CONFIG#
Optional argument(s)
Examples:
$ deepdraw train -vv unet drive --batch-size=4 --device="cuda:0"
$ deepdraw train -vv hed hrf --batch-size=8 --device="cuda:0"
$ deepdraw train -vv m2unet covd-drive --batch-size=8
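The key conventions described for --dataset can be summarised with a small sketch that resolves which entry plays which role during training. The function name dataset_roles and the returned keys are hypothetical, and plain strings stand in for torch.utils.data.dataset.Dataset instances so the example runs without PyTorch.

```python
def dataset_roles(datasets):
    """Resolve the role of each key in a training dataset dictionary."""
    if "__train__" in datasets:
        training_key = "__train__"  # preferred over "train" when present
    elif "train" in datasets:
        training_key = "train"
    else:
        raise KeyError("a dataset named 'train' (or '__train__') is required")
    special = {training_key, "__valid__", "__extra_valid__"}
    return {
        "training": datasets[training_key],
        "validation": datasets.get("__valid__"),
        "extra_validation": datasets.get("__extra_valid__", []),
        # Every remaining key is a test set, ignored while training
        "ignored": sorted(k for k in datasets if k not in special),
    }


roles = dataset_roles({
    "__train__": "augmented training set",
    "train": "plain training set",
    "__valid__": "validation set",
    "test": "test set",
})
print(roles["training"], "|", roles["ignored"])
```

Note that when __train__ is present, the plain train entry is not used for training and is treated like any other evaluation set.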
train-analysis#
Analyze the training logs for loss evolution and resource utilisation.
It is possible to pass one or several Python files (or names of deepdraw.config entry points or module names) as CONFIG arguments to the command line which contain the parameters listed below as Python variables. The options through the command-line (see below) will override the values of configuration files. You can run this command with <COMMAND> -H example_config.py to create a template config file.
binseg train-analysis [OPTIONS] LOG CONSTANTS [CONFIG]...
Options
- -o, --output-pdf <output_pdf>#
Required Name of the output file to dump
- Default:
trainlog.pdf
- -v, --verbose#
Increase the verbosity level from 0 (only error and critical messages are displayed) to 1 (like 0, but adds warnings), 2 (like 1, but adds info messages), and 3 (like 2, but also adds debugging messages) by adding the --verbose option as often as desired (e.g. ‘-vvv’ for debug).
- Default:
0
- -H, --dump-config <dump_config>#
Name of the config file to be generated
Arguments
- LOG#
Required argument
- CONSTANTS#
Required argument
- CONFIG#
Optional argument(s)
Examples:
$ deepdraw train-analysis -vv log.csv constants.csv
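The verbosity levels documented for -v, --verbose correspond naturally to the levels of Python's standard logging module. The sketch below shows one plausible mapping; the function name verbosity_to_level is hypothetical and the way the tool configures its loggers internally is not shown in this section.

```python
import logging


def verbosity_to_level(count):
    """Map the number of -v flags to a logging level."""
    levels = [
        logging.ERROR,    # 0: only error and critical messages
        logging.WARNING,  # 1: adds warnings
        logging.INFO,     # 2: adds info messages
        logging.DEBUG,    # 3: adds debugging messages
    ]
    # Counts beyond 3 are clamped to the most verbose level (an assumption)
    return levels[min(max(count, 0), len(levels) - 1)]


logging.basicConfig(level=verbosity_to_level(2))
logging.getLogger(__name__).info("running with -vv")
```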