deepdraw.engine.evaluator

Defines functionality for the evaluation of predictions.

Functions

compare_annotators(baseline, other, name, ...)

Compares annotations on the same dataset.

run(dataset, name, predictions_folder[, ...])

Runs inference and calculates measures.

sample_measures_for_threshold(pred, gt, ...)

Calculates counts on one single sample, for a specific threshold.

deepdraw.engine.evaluator.sample_measures_for_threshold(pred, gt, mask, threshold)

Calculates counts on one single sample, for a specific threshold.

Parameters:
  • pred (torch.Tensor) – pixel-wise predictions

  • gt (torch.Tensor) – ground-truth (annotations)

  • mask (torch.Tensor) – region mask (used only if available). May be set to None.

  • threshold (float) – a particular threshold at which to calculate the performance measures

Returns:

  • tp (int)

  • fp (int)

  • tn (int)

  • fn (int)
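
A minimal usage sketch follows. The tensor shapes and value ranges are assumptions made for illustration (probabilities in [0, 1] for the prediction, binary values for the ground-truth), not requirements stated by this API:

```python
import torch

from deepdraw.engine.evaluator import sample_measures_for_threshold

# Hypothetical tensors standing in for one sample: probabilistic predictions
# and a binary ground-truth annotation of the same shape.
pred = torch.rand(1, 544, 544)
gt = (torch.rand(1, 544, 544) > 0.5).float()

# No region mask is used in this sketch (the mask parameter may be None).
tp, fp, tn, fn = sample_measures_for_threshold(pred, gt, None, 0.5)

# Derived measures from the raw counts (guarding against empty denominators).
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
print(f"precision={precision:.3f}  recall={recall:.3f}")
```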

deepdraw.engine.evaluator.run(dataset, name, predictions_folder, output_folder=None, overlayed_folder=None, threshold=None, steps=1000, parallel=-1)

Runs inference and calculates measures.

Parameters:
  • dataset (torch.utils.data.Dataset) – a dataset to iterate on

  • name (str) – the local name of this dataset (e.g. train, or test), to be used when saving measures files.

  • predictions_folder (str) – folder where predictions for the dataset images have been previously stored

  • output_folder (str, Optional) – folder where to store results. If not provided, no analysis is stored (useful for quickly calculating overlay thresholds)

  • overlayed_folder (str, Optional) – if not None, then it should be the name of a folder where to store overlayed versions of the images and ground-truths

  • threshold (float, Optional) – if overlayed_folder is set, this should be the threshold (floating point) to apply to prediction maps to decide on positives and negatives for the overlaying analysis (graphical output). This number should come from the training set or a separate validation set; using a test-set value may bias your analysis. It is also used to print the a priori F1-score on the evaluated set.

  • steps (int, Optional) – number of threshold steps to consider when evaluating thresholds.

  • parallel (int, Optional) – If set to a value >= 0, uses multiprocessing for estimating thresholds for each sample through a processing pool. A value of zero creates as many processes in the pool as there are cores in the machine; a value greater than zero spawns as many processes as requested. A negative value disables multiprocessing altogether.

Returns:

  • threshold (float) – Threshold to achieve the highest possible F1-score for this dataset
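A typical two-step sketch: sweep thresholds on a validation split, then re-use the best threshold when evaluating the test split so the overlays and the printed a priori F1-score are not biased by test data. The dataset objects and folder paths below are hypothetical placeholders:

```python
from deepdraw.engine.evaluator import run

# `validation_set` is a hypothetical torch.utils.data.Dataset whose predictions
# were previously written to "predictions/validation" (both names are assumed).
best_threshold = run(
    validation_set,
    "validation",
    "predictions/validation",
    output_folder="analysis/validation",  # measures files go here
    steps=1000,                           # number of thresholds to sweep
    parallel=0,                           # one worker per CPU core
)

# Evaluate the test split with the validation-derived threshold and store
# overlayed images for visual inspection.
run(
    test_set,
    "test",
    "predictions/test",
    output_folder="analysis/test",
    overlayed_folder="overlays/test",
    threshold=best_threshold,
)
```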

deepdraw.engine.evaluator.compare_annotators(baseline, other, name, output_folder, overlayed_folder=None, parallel=-1)

Compares annotations on the same dataset.

Parameters:
  • baseline (torch.utils.data.Dataset) – a dataset to iterate on, containing the baseline annotations

  • other (torch.utils.data.Dataset) – a second dataset, with the same samples as baseline, but annotated by a different annotator than in the first dataset. The key values must match between baseline and this dataset.

  • name (str) – the local name of this dataset (e.g. train-second-annotator, or test-second-annotator), to be used when saving measures files.

  • output_folder (str) – folder where to store results

  • overlayed_folder (str, Optional) – if not None, then it should be the name of a folder where to store overlayed versions of the images and ground-truths

  • parallel (int, Optional) – If set to a value >= 0, uses multiprocessing for estimating thresholds for each sample through a processing pool. A value of zero creates as many processes in the pool as there are cores in the machine; a value greater than zero spawns as many processes as requested. A negative value disables multiprocessing altogether.
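
A minimal usage sketch, assuming two hypothetical datasets with matching sample keys annotated by different people (the dataset objects and folder names below are placeholders):

```python
from deepdraw.engine.evaluator import compare_annotators

# `first_annotator` and `second_annotator` are hypothetical
# torch.utils.data.Dataset objects holding the same samples, each annotated
# by a different person.
compare_annotators(
    first_annotator,
    second_annotator,
    "test-second-annotator",
    "analysis/second-annotator",                  # measures files go here
    overlayed_folder="overlays/second-annotator", # optional graphical output
    parallel=-1,                                  # no multiprocessing
)
```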