.. vim: set fileencoding=utf-8 :

.. Copyright (c) 2016 Idiap Research Institute, http://www.idiap.ch/          ..
.. Contact: beat.support@idiap.ch                                             ..
..                                                                            ..
.. This file is part of the beat.docs module of the BEAT platform.            ..
..                                                                            ..
.. Commercial License Usage                                                   ..
.. Licensees holding valid commercial BEAT licenses may use this file in      ..
.. accordance with the terms contained in a written agreement between you     ..
.. and Idiap. For further information contact tto@idiap.ch                    ..
..                                                                            ..
.. Alternatively, this file may be used under the terms of the GNU Affero     ..
.. Public License version 3 as published by the Free Software and appearing   ..
.. in the file LICENSE.AGPL included in the packaging of this file.           ..
.. The BEAT platform is distributed in the hope that it will be useful, but   ..
.. WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY ..
.. or FITNESS FOR A PARTICULAR PURPOSE.                                       ..
..                                                                            ..
.. You should have received a copy of the GNU Affero Public License along     ..
.. with the BEAT platform. If not, see http://www.gnu.org/licenses/.          ..


.. _beat-system-experiments:

=============
 Experiments
=============

An experiment combines algorithms, datasets, a toolchain and parameters so
that the system can schedule and run the prescribed recipe and produce
displayable results. Defining a BEAT experiment amounts to configuring the
processing blocks of a toolchain: selecting which database, algorithms and
algorithm parameters to use.

The graphical interface of BEAT provides user-friendly editors for the main
components of the system (for example, experiments and data formats), which
simplifies writing their `JSON`_ declarations. You only need to write an
experiment declaration by hand, following the specification described here,
when not using this graphical interface.

.. note:: **Naming Convention**

   Experiments are named using five values joined by a ``/`` (slash)
   separator:

     * **username**: indicates the author of the experiment
     * **toolchain username**: indicates the author of the toolchain used for
       that experiment
     * **toolchain name**: indicates the name of the toolchain used for that
       experiment
     * **toolchain version**: indicates the version (integer starting from
       ``1``) of the toolchain used for the experiment
     * **name**: an identifier for the object

   Each tuple of these five components defines a *unique* experiment name.
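
The naming convention above can be sketched in Python as follows. This is a
minimal illustration and not part of the BEAT API: the ``ExperimentName`` type,
its field names, the ``parse_experiment_name`` function and the example name
are all hypothetical.

.. code-block:: python

    from typing import NamedTuple


    class ExperimentName(NamedTuple):
        """The five components of an experiment name (illustrative names)."""

        username: str
        toolchain_username: str
        toolchain_name: str
        toolchain_version: int
        name: str


    def parse_experiment_name(full_name: str) -> ExperimentName:
        """Split 'user/tc_user/tc_name/tc_version/name' into its components."""
        parts = full_name.split("/")
        if len(parts) != 5 or not all(parts):
            raise ValueError(f"expected five non-empty components: {full_name!r}")
        version = int(parts[3])  # raises ValueError if not an integer
        if version < 1:
            raise ValueError("the toolchain version is an integer starting from 1")
        return ExperimentName(parts[0], parts[1], parts[2], version, parts[4])


    # Hypothetical experiment name, used only to exercise the function.
    parsed = parse_experiment_name("alice/bob/eigenfaces/1/my-experiment")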


.. _beat-system-experiments-declaration:

Declaration of an experiment
----------------------------
An experiment is declared in a JSON file, and must contain at least the following
fields:

.. code-block:: javascript

    {
        "datasets": {
        },
        "blocks": {
        },
        "analyzers": {
        },
        "globals": {
        }
    }
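
As a quick sanity check when writing a declaration by hand, the presence of
these top-level fields can be verified with the standard ``json`` module. The
sketch below is illustrative only; ``missing_fields`` is a hypothetical helper,
not a function provided by BEAT.

.. code-block:: python

    import json

    # Top-level fields every experiment declaration must contain.
    REQUIRED_FIELDS = ("datasets", "blocks", "analyzers", "globals")


    def missing_fields(declaration_text):
        """Return the required top-level fields absent from a JSON declaration."""
        declaration = json.loads(declaration_text)
        return [field for field in REQUIRED_FIELDS if field not in declaration]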


.. _beat-system-experiments-datasets:

Declaration of the dataset(s)
-----------------------------

The dataset inputs are defined by the toolchain. However, the toolchain does
not describe which data to plug in each dataset input.

This is the role of the ``datasets`` field of an experiment.
For each dataset, an experiment must specify three attributes as follows:

.. code-block:: javascript

    {
        "datasets": {
            "templates": {
                "set": "templates",
                "protocol": "idiap",
                "database": "atnt/1"
            },
            ...
        },
        ...
    }


The key of an experiment dataset must correspond to the desired dataset name
from the toolchain. Then, three fields must be given:

* ``database``: the database name and version
* ``protocol``: the protocol name
* ``set``: the name of the set of this database to associate with this
  toolchain dataset
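
The rules above can be checked with a few lines of Python. This is a sketch
under stated assumptions: ``check_datasets`` is a hypothetical helper, the
toolchain is reduced to a plain list of dataset names, and the sketch assumes
that every toolchain dataset must receive an entry.

.. code-block:: python

    # The three fields each experiment dataset entry must provide.
    DATASET_FIELDS = {"database", "protocol", "set"}


    def check_datasets(toolchain_datasets, experiment_datasets):
        """Return a list of problems found in an experiment's ``datasets`` field.

        ``toolchain_datasets`` is a hypothetical list of the dataset names
        declared by the toolchain.
        """
        problems = []
        for name in toolchain_datasets:
            if name not in experiment_datasets:
                problems.append(f"no data assigned to toolchain dataset {name!r}")
        for name, entry in experiment_datasets.items():
            if name not in toolchain_datasets:
                problems.append(f"{name!r} is not a dataset of the toolchain")
            missing = DATASET_FIELDS - set(entry)
            if missing:
                problems.append(f"{name!r} lacks {sorted(missing)}")
        return problems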


.. _beat-system-experiments-blocks:

Declaration of the block(s)
---------------------------

The blocks are defined by the toolchain. However, the toolchain does not
describe which algorithm to run in each processing block, nor how each of
these algorithms is parametrized.

This is the role of the ``blocks`` field of an experiment.
For each block, an experiment must specify four attributes as follows:

.. code-block:: javascript

    {
        "blocks": {
            "linear_machine_training": {
                "inputs": {
                    "image": "image"
                },
                "parameters": {},
                "algorithm": "tutorial/pca/1",
                "outputs": {
                    "subspace": "subspace"
                }
            },
            ...
        },
        ...
    }

The key of an experiment block must correspond to the desired block from the
toolchain. Then, four fields must be given:

* ``algorithm``: the algorithm to use (``author_name/algorithm_name/version``)
* ``inputs``: the mapping of inputs. The key is the algorithm input, while the
  value is the corresponding toolchain input.
* ``outputs``: the mapping of outputs. The key is the algorithm output, while
  the value is the corresponding toolchain output.
* ``parameters``: the algorithm parameters to use for this processing block
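
To make the structure of a block entry concrete, the following sketch checks
the four fields and summarizes how the algorithm inputs are wired to the
toolchain. The helper names ``split_algorithm`` and ``describe_block`` are
hypothetical, and the sketch assumes the algorithm version is an integer.

.. code-block:: python

    # The four fields each experiment block entry must provide.
    BLOCK_FIELDS = ("algorithm", "inputs", "outputs", "parameters")


    def split_algorithm(reference):
        """Split 'author_name/algorithm_name/version' into a 3-tuple."""
        author, name, version = reference.split("/")
        return author, name, int(version)  # assumption: integer version


    def describe_block(block_name, block):
        """Return a one-line summary of a block entry.

        Raises ``KeyError`` if one of the four required fields is missing.
        """
        missing = [field for field in BLOCK_FIELDS if field not in block]
        if missing:
            raise KeyError(f"block {block_name!r} lacks {missing}")
        author, name, version = split_algorithm(block["algorithm"])
        # Keys are algorithm inputs, values are toolchain inputs.
        wiring = ", ".join(
            f"{algorithm_input} -> {toolchain_input}"
            for algorithm_input, toolchain_input in sorted(block["inputs"].items())
        )
        return f"{block_name}: {author}/{name} v{version} (inputs: {wiring})"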


.. note:: **Algorithms, Datasets and Blocks**

   While configuring the experiment, your objective is to fill in all
   containers defined by the toolchain with valid datasets and algorithms or
   analyzers. **BEAT checks that connected datasets, algorithms and
   analyzers produce or consume data in the right format**. It only presents
   options which are *compatible* with adjacent blocks.

   For example, if you choose dataset ``A`` for block ``train`` of your
   experiment, and it outputs objects in the format ``user/format/1``, then
   the algorithm running on the block following ``train`` **must** consume
   ``user/format/1`` on its input. The choices of algorithms that can run
   after ``train`` are therefore restricted from the moment you choose
   dataset ``A``. The configuration system updates *dynamically* to take those
   constraints into account every time you make a selection, which
   progressively tightens the global constraints for the experiment.
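
The compatibility rule described in the note can be pictured as a simple
filter on data formats. The sketch below is only an illustration of the idea
and does not reflect how BEAT implements the check; the function
``compatible_algorithms``, the catalogue and the algorithm names are
hypothetical.

.. code-block:: python

    def compatible_algorithms(produced_format, candidates):
        """Return the algorithms whose input format equals ``produced_format``.

        ``candidates`` maps a hypothetical algorithm name to the data format
        it consumes on its input.
        """
        return sorted(
            name
            for name, consumed_format in candidates.items()
            if consumed_format == produced_format
        )


    # Hypothetical catalogue: algorithm name -> format consumed on its input.
    candidates = {
        "user/algo_a/1": "user/format/1",
        "user/algo_b/1": "user/other_format/1",
        "user/algo_c/2": "user/format/1",
    }

    # Choosing a dataset that outputs ``user/format/1`` narrows the options.
    options = compatible_algorithms("user/format/1", candidates)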

.. _beat-system-experiments-analyzers:

Declaration of the analyzer(s)
------------------------------

Analyzers are similar to algorithms, except that they run on toolchain
endpoints. Their configuration is very similar to that of regular blocks,
except that they have no ``outputs``:

.. code-block:: javascript

    {
        "analyzers": {
            "analysis": {
                "inputs": {
                    "scores": "scores"
                },
                "algorithm": "tutorial/postperf/1"
            }
        }
    }


Global parameters
-----------------

Each block and analyzer may rely on its own local parameters. However, several
blocks may rely on the exact same parameters. In this case, it is more
convenient to define those globally.

For an experiment, this is achieved using the ``globals`` field of its JSON
declaration. For instance:

.. code-block:: javascript

    {
        "globals": {
            "queue": "Default",
            "environment": {
                "version": "1.1.0",
                "name": "Scientific Python 2.7"
            },
            "tutorial/pca/1": {
                "number-of-components": "5"
            }
        },
        ...
    }
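
One way to picture how global and local parameters combine is the sketch
below. It assumes that a value set locally on a block takes precedence over a
global value of the same name; this precedence is an assumption made for
illustration and is not stated in this section. The helper
``effective_parameters`` is hypothetical.

.. code-block:: python

    def effective_parameters(block, global_settings):
        """Merge the global parameters of a block's algorithm with its local ones.

        Assumption: a local value overrides a global value of the same name.
        """
        # Global parameters are keyed by the algorithm name, e.g. "tutorial/pca/1".
        merged = dict(global_settings.get(block["algorithm"], {}))
        merged.update(block.get("parameters", {}))
        return merged


    global_settings = {
        "queue": "Default",
        "tutorial/pca/1": {"number-of-components": "5"},
    }
    block = {"algorithm": "tutorial/pca/1", "parameters": {}}
    parameters = effective_parameters(block, global_settings)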

.. include:: links.rst