Running complete experiments

We provide an aggregator command called “experiment” that runs training, followed by prediction, evaluation and comparison. Once it finishes, you will find the results of all four steps (model fitting, prediction, evaluation and comparison) under a single output directory.

For example, to train a Mobile V2 U-Net architecture on the STARE dataset (retinal vessel segmentation), evaluate performance on both the train and test sets, and output prediction maps, overlay analysis and a performance curve, run the following:

$ bob binseg experiment -vv m2unet stare --batch-size=16 --overlayed
# check results in the "results" folder

You may run the system on a GPU by using the --device=cuda:0 option.
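For instance, to send the same STARE experiment above to the first CUDA device (all other options unchanged):

$ bob binseg experiment -vv m2unet stare --batch-size=16 --overlayed --device="cuda:0"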

Using your own dataset

To use your own dataset, we recommend you read our instructions at bob.ip.binseg.configs.datasets.csv, and set up one or more CSV files describing input data and ground-truth (segmentation maps), and, optionally, test data. Then, prepare a configuration file by copying our configuration example and editing it to apply the required transforms to your input data. Once you are happy with the result, use it in place of one of our dataset configurations:

$ bob binseg config copy csv-dataset-example mydataset.py
# edit mydataset following instructions
$ bob binseg experiment ... mydataset.py ...
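The authoritative CSV layout is described in bob.ip.binseg.configs.datasets.csv. Purely as an illustration (the two-column layout and the file names below are assumptions, not the documented format), such a file could be created like this:

# hypothetical layout: one sample per line, image path then ground-truth path
$ cat > mydataset-train.csv <<EOF
data/img-001.png,data/gt-001.png
data/img-002.png,data/gt-002.png
EOF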

Changing defaults

We provide a large set of preset configurations to build models from known datasets. You can copy any of the existing configuration resources and edit it to build your own customized version. Once you’re happy, you may use the newly created file directly on your command line. For example, suppose you wanted to slightly change the DRIVE pre-processing pipeline. You could do the following:

$ bob binseg config copy drive my_drive_remix.py
# edit my_drive_remix.py to your needs
$ bob binseg train -vv <model> ./my_drive_remix.py

Running on Idiap’s SGE grid

If you are at Idiap, you may install the package gridtk (conda install gridtk) in your environment, and submit the job like this:

$ jman submit --queue=gpu --memory=24G --name=myjob -- bob binseg train --device='cuda:0' ... #paste the rest of the command-line
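Once submitted, you can monitor the job with gridtk’s standard tools, for example:

$ jman list    # list submitted jobs and their status
$ jman report  # print logs of finished or failed jobs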

The following bash function can help when switching between local and SGE-based execution. Copy it to a file, source that file, and then call run as many times as required to benchmark your task.

#!/usr/bin/env bash

# set output directory and location of "bob" executable
OUTDIR=/path/where/to/dump/results
BOB=/path/to/bob

# this function just makes running/submitting jobs a bit easier for extensive
# benchmark jobs.
# usage: run <modelconfig> <dbconfig> <batchsize> [<device> [<queue>]]
function run() {
    local device="cpu"
    [ $# -gt 3 ] && device="${4}"

    local cmd=("${BOB}" binseg experiment)
    cmd+=("-vv" "--device=${device}" "${1}" "${2}")
    cmd+=("--batch-size=${3}" "--output-folder=${OUTDIR}/${1}/${2}")
    # add --multiproc-data-loading=0 to increase data loading/transform speeds,
    # at the cost of making your results harder to reproduce (OS-random data
    # loading)
    #cmd+=("--multiproc-data-loading=0")

    mkdir -pv "${OUTDIR}/${1}/${2}"

    [ $# -gt 4 ] && cmd=(jman submit "--log-dir=${OUTDIR}/${1}/${2}" "--name=$(basename ${OUTDIR})-${1}-${2}" "--memory=24G" "--queue=${5}" -- "${cmd[@]}")

    if [ $# -le 4 ]; then
        # executing locally; the 3>&1 1>&2 2>&3 swap lets us tee stdout and
        # stderr into separate log files
        ("${cmd[@]}" | tee "${OUTDIR}/${1}/${2}/stdout.log") 3>&1 1>&2 2>&3 | tee "${OUTDIR}/${1}/${2}/stderr.log"
    else
        "${cmd[@]}"
    fi
}
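
For example, assuming the function above was saved to a file named benchmark.sh (the file name, and the reuse of the m2unet/stare configurations from earlier, are just illustrations):

$ source benchmark.sh
# run locally, on the first CUDA device
$ run m2unet stare 16 cuda:0
# or submit to the SGE "gpu" queue through jman
$ run m2unet stare 16 cuda:0 gpu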