1. Overall Architecture

The BEAT platform consists of several networked applications. This allows it to be deployed on multiple machines, with as many processing nodes as necessary.


Fig. 1.1 Software architecture of the platform

Fig. 1.1 represents the software architecture of the platform and shows the interactions between the following modules:

  • The Web Server allows users to interact with the platform online.

  • The Experiments describe fully parametrized scientific workflows as sets of organized transformations, from reading raw data (such as images) from databases to generating results such as ROC plots. Each experiment can hence be decomposed into a set of execution jobs.

  • The Scheduler assigns jobs to the Worker Nodes, also called processing nodes.

  • The Experiment, State and Cache Database contains all the objects required to define scientific experiments, such as algorithm implementations and experiment parameters, as well as the current load state of the processing farm and the intermediary data produced by the experiments (cache). The backend communicates with the frontend via this database.
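The relationship between experiments, jobs, the Scheduler and the Worker Nodes can be illustrated with a small sketch. It is not the BEAT scheduler: the `Job` and `WorkerNode` classes, the `decompose` and `assign` functions, the block names and the "most free slots first" policy are all hypothetical, chosen only to show how an experiment might be split into jobs that are then placed on processing nodes. Dependencies between jobs are deliberately ignored.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Job:
    """One execution step of an experiment (hypothetical structure)."""
    experiment: str
    block: str


@dataclass
class WorkerNode:
    """A processing node with a fixed number of job slots (assumed model)."""
    name: str
    slots: int
    running: list = field(default_factory=list)

    def free_slots(self) -> int:
        return self.slots - len(self.running)


def decompose(experiment: str, blocks: list[str]) -> list[Job]:
    """Split an experiment into one job per workflow block."""
    return [Job(experiment, block) for block in blocks]


def assign(jobs: list[Job], workers: list[WorkerNode]) -> list[Job]:
    """Place each job on the worker with the most free slots.

    Returns the jobs that could not be placed because every node is full.
    """
    pending = []
    for job in jobs:
        best = max(workers, key=WorkerNode.free_slots)
        if best.free_slots() > 0:
            best.running.append(job)
        else:
            pending.append(job)
    return pending


# Illustrative data only: experiment, block and node names are made up.
jobs = decompose("face-verification", ["preprocess", "extract", "score", "analyze"])
workers = [WorkerNode("node-1", slots=2), WorkerNode("node-2", slots=1)]
pending = assign(jobs, workers)

for worker in workers:
    print(worker.name, [job.block for job in worker.running])
print("pending:", [job.block for job in pending])
```

With three slots available and four jobs, one job stays pending until a slot is released. In the platform itself, such state would presumably be kept in the central database rather than in memory, since that database is how the backend and frontend communicate.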


Fig. 1.2 Hardware architecture of the platform

Fig. 1.2 represents the matching hardware architecture of the platform. Each component in this figure could be deployed on a different computer, as long as it can establish a connection to the central database server and storage. This makes it possible to distribute the load across several machines. It should equally be possible to accommodate all software components on a single (multi-core) computer for tests or demonstrations. To benefit from commodity computing, plain Intel-compatible PCs, in either desktop or rack-mountable format, are recommended for deployment.

1.1. Wedding List

Because commodity computing changes rapidly, it is difficult to define a precise list of items (a Wedding List) for assembling a BEAT platform instance. Furthermore, each instance may be deployed with different baseline requirements, depending on the modality or modalities one wishes to explore. We therefore list the points that should be taken into consideration when dimensioning the hardware for an installation:

  • The size of the storage depends on the amount of data generated by each experiment, and the number of experiments that can be performed at the same time;

  • The bandwidth between each computer and the storage must be high enough not to slow down experiments;

  • The number of processing nodes depends on the number of algorithms that can be executed at the same time;

  • The processing power and memory size of the processing nodes depend on the kind of algorithms and the amount of data that will be processed by the platform.
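As a rough aid for the first and third points, the sketch below turns them into back-of-the-envelope estimates. The formulas are assumptions, not part of BEAT: storage is taken as the data per experiment times the number of concurrent experiments times a safety margin, and the node count as the number of concurrent algorithms divided by the job slots per node, rounded up. All function names, parameters and figures are hypothetical.

```python
import math


def estimate_storage_gb(data_per_experiment_gb: float,
                        concurrent_experiments: int,
                        safety_factor: float = 1.5) -> float:
    """Assumed model: storage grows linearly with concurrent experiments."""
    return data_per_experiment_gb * concurrent_experiments * safety_factor


def estimate_nodes(concurrent_algorithms: int, slots_per_node: int) -> int:
    """Assumed model: each node runs at most `slots_per_node` algorithms."""
    return math.ceil(concurrent_algorithms / slots_per_node)


# Illustrative figures only; measure real workloads before buying hardware.
storage = estimate_storage_gb(data_per_experiment_gb=20.0, concurrent_experiments=10)
nodes = estimate_nodes(concurrent_algorithms=25, slots_per_node=8)

print(f"storage: {storage:.0f} GB, processing nodes: {nodes}")
```

Such estimates are only a starting point. Bandwidth and per-node memory depend on the algorithms actually run and are better sized from measurements.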

A number of software counters are available in the BEAT software, such as the CPU and memory load on the worker nodes, to help administrators understand and fix performance issues.