.. vim: set fileencoding=utf-8 :

.. Copyright (c) 2016 Idiap Research Institute, http://www.idiap.ch/ ..
.. Contact: beat.support@idiap.ch ..
.. ..
.. This file is part of the beat.web module of the BEAT platform. ..
.. ..
.. Commercial License Usage ..
.. Licensees holding valid commercial BEAT licenses may use this file in ..
.. accordance with the terms contained in a written agreement between you ..
.. and Idiap. For further information contact tto@idiap.ch ..
.. ..
.. Alternatively, this file may be used under the terms of the GNU Affero ..
.. Public License version 3 as published by the Free Software and appearing ..
.. in the file LICENSE.AGPL included in the packaging of this file. ..
.. The BEAT platform is distributed in the hope that it will be useful, but ..
.. WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY ..
.. or FITNESS FOR A PARTICULAR PURPOSE. ..
.. ..
.. You should have received a copy of the GNU Affero Public License along ..
.. with the BEAT platform. If not, see http://www.gnu.org/licenses/. ..

.. _administratorguide-idiap_platform:

Example: The Platform Deployed at Idiap Research Institute
==========================================================

This section gives some insight into the BEAT platform deployed at Idiap
Research Institute, which is now publicly available.

.. _administratorguide-idiap_platform-strategy:

Deployment Strategy
-------------------

BEAT has been carefully designed so that a platform can easily be deployed in
a distributed manner. At Idiap, we have opted for a deployment strategy that
lies somewhere in between
:ref:`administratorguide-hardware_guidelines-distributed-nodes` and
:ref:`administratorguide-hardware_guidelines-distributed-architecture`.
First, a server called `beatweb` hosts both the web server and a dedicated
PostgreSQL database server.
Second, a server called `beatsched` hosts the scheduler, which is in charge of
splitting jobs across several worker nodes (named according to the pattern
`beatproc*`). Finally, for administration purposes, a dedicated server called
`beatadm` is employed. Cache data are stored on an NFS infrastructure.

.. _administratorguide-idiap_platform-hardware:

Hardware Specifications
-----------------------

Computing Hardware
..................

Following a thorough comparison and evaluation of all aspects (features,
operations, maintenance, warranty, cost, etc.) of several IT solutions, Idiap
chose in 2011 to rely on IBM BladeCenter (H) solutions as its processing
resources. In retrospect, experience has shown that although such solutions
come with some caveats (IBM hardware undoubtedly requires greater knowledge,
and patience, to reach configuration objectives), they do ultimately lower the
overall operational burden (and cost) and provide the means to increase the
global system performance significantly and easily. Based on that experience,
Idiap chose in early 2014 to extend its processing hardware base with IBM
FlexSystem solutions for the BEAT platform. Overall, there are TODO nodes, each
node consisting of two Intel Xeon E5-2690v2 processors (20 cores in total) with
256GB of DDR3 RAM.

Storage
.......

Historically, Idiap has relied on NetApp filers as its main storage resource.
Even though competitors' alternatives were analyzed whenever major new
investments were considered, Idiap has stuck to this original choice for the
BEAT platform and chose a NetApp 3220 dual-head network filer with 20TB of
*mirrored* storage (10 TB usable capacity).

Summary
.......

.. _administratorguide-idiapplatform-hardware-physical:
.. figure:: img/physical-platform.*
   :width: 80%

   Physical hardware of the platform deployed at Idiap

The resulting hardware infrastructure is summarized in
:numref:`administratorguide-idiapplatform-hardware-physical`.
Communication between each machine and the storage goes through a 10 Gbit/s
HP Procurve E8212zl switch.

.. _administratorguide-idiap_platform-virtualization:

Virtualization
--------------

Virtualizing resources (servers, storage, networks) is now part of every IT
department's life. It consists of creating a virtual (rather than actual)
version of something. For instance, a virtual machine (VM) is an abstraction of
the computer hardware that allows a single machine to behave as if it were many
machines. While virtualization is not a strict requirement for the deployment
of a BEAT platform, it provides significant benefits such as:

* Dynamic load balancing, by moving virtual machines to underutilized servers
  and/or by reallocating and instantiating resources whenever required.
* Improving flexibility by allowing several applications requiring different
  environments to run on the same physical machine.
* Enabling a virtual image on a machine to be instantly moved to another
  server, e.g. if a machine failure occurs.
* Improving system reliability and availability, since virtualization may
  prevent system crashes due to memory corruption caused by software like
  device drivers, and it helps to avoid service interruption in case of
  physical maintenance.

For the platform deployed at Idiap, virtualization is a versatile tool to
perform dynamic load balancing. All the previously described servers
(`beatweb`, `beatsched` and `beatadm`) as well as the workers are virtual
machines. This makes it possible to create several single-core workers with
different computing environments from a single powerful multi-core machine, and
to adapt the worker specifications without too much effort whenever required.
Similarly, the capacity (RAM or number of cores) of a server such as `beatweb`
can be increased as the platform becomes more mature and website traffic grows.

.. _administratorguide-idiap_platform-software:

Software Specifications
-----------------------

Though its Unix history had it venture on the soil of various Unix-like
operating systems, Idiap nowadays relies solely on the Debian Linux (64-bit)
distribution to power its server infrastructure. Favoring stability and
security over the leading edge of open source software, as far as servers are
concerned, Idiap relies in particular on the Debian/Stable branch, also known
as *Debian/Wheezy*.

Since 2011, Idiap has been virtualizing its server resources using the open
source virtualization and high-availability software listed below, all readily
available as (appropriately bundled and pre-configured) Debian packages:

* KVM (hardware-accelerated virtualization)
* QEMU (x86 hardware emulation/virtualization)
* libvirt (virtualization (abstraction) API)
* Corosync (group (cluster) communication system)
* Pacemaker (high-availability resource manager)

.. _administratorguide-idiap_platform-storage:

Storage Organization
--------------------

An NFS infrastructure is employed to store data in a distributed manner. In
particular, data are organized into several partitions as follows:

* **/remote/dataset** contains the raw data from the scientific databases on
  which experiments are conducted.
* **/remote/cache** contains the data generated by the scientific experiments,
  outputs of all intermediate blocks included.
* **/remote/sw** contains the full stack of BEAT software. Installing the
  software centrally rather than locally on each server/machine reduces
  maintenance efforts, since a software update is performed once centrally
  rather than on several nodes.
* **/remote/environment** contains environments to run scientific experiments.
* **/remote/prefix** contains the definition of user-defined objects such as
  algorithms, toolchains and experiments.

The BEAT machines have different access levels to each of these partitions.
This is useful for security purposes, since BEAT servers only have the file
permissions they need to do their work. The administrative server `beatadm` has
read and write access to all partitions. In contrast, the other BEAT servers
have limited access to these partitions. `beatsched` has read and write access
to a single partition, `/remote/cache/`, so that cache files can be moved to
their final destinations when a worker successfully completes a job. The
partitions `/remote/dataset`, `/remote/environment`, `/remote/prefix` and
`/remote/sw` are accessible to it in read mode only. Similarly, the workers
`beatproc*` have read-only access to the partitions `/remote/dataset`,
`/remote/environment`, `/remote/prefix` and `/remote/sw`. In addition, they
have full read and write access to `/remote/cache/`, so that they can read the
inputs and write the outputs of the jobs they are assigned. Finally, `beatweb`
has read and write access to `/remote/prefix`, allowing users to add and remove
content through the BEAT website. It has read-only access to `/remote/cache`,
`/remote/sw` and `/remote/environment`, and no access to `/remote/dataset`,
since it does not need it.
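
The access levels described above can be summarized as a small lookup table.
The following Python sketch is only an illustration and is not part of the BEAT
software: the names `ACCESS`, `can_read` and `can_write` are hypothetical, and
all worker nodes are grouped under the single key `beatproc`. How these levels
are actually enforced on the NFS infrastructure is not described here.

```python
# Hypothetical summary of the access levels described in this section.
# "rw" = read and write, "ro" = read only, None = no access.
PARTITIONS = ("dataset", "cache", "sw", "environment", "prefix")

ACCESS = {
    # The administrative server can read and write everywhere.
    "beatadm": {partition: "rw" for partition in PARTITIONS},
    # The scheduler only writes to the cache.
    "beatsched": {
        "dataset": "ro", "cache": "rw", "sw": "ro",
        "environment": "ro", "prefix": "ro",
    },
    # Workers (beatproc*) read inputs from and write outputs to the cache.
    "beatproc": {
        "dataset": "ro", "cache": "rw", "sw": "ro",
        "environment": "ro", "prefix": "ro",
    },
    # The web server writes user-defined objects and never reads datasets.
    "beatweb": {
        "dataset": None, "cache": "ro", "sw": "ro",
        "environment": "ro", "prefix": "rw",
    },
}


def can_read(machine, partition):
    """Return True if `machine` may read from /remote/<partition>."""
    return ACCESS[machine][partition] in ("ro", "rw")


def can_write(machine, partition):
    """Return True if `machine` may write to /remote/<partition>."""
    return ACCESS[machine][partition] == "rw"
```

For instance, `can_write("beatweb", "prefix")` is true, while
`can_read("beatweb", "dataset")` is false. A table of this kind can serve as a
checklist when reviewing the export options of each partition.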