aberration_correction Sorting and scanning aberration correction of periodic image time-series.
acoustic-simulator Implementation of audio degradation processes
ACT ACT (Accuracy of Connective Translation) is a reference-based metric to measure the accuracy of discourse connective translation, mainly for statistical machine translation systems.
APT The APT software is a reference-based metric to evaluate the accuracy of pronoun translation.
asrt A Python library that facilitates the extraction of text sentences from multilingual PDF documents
Attentive Residual Connections NMT Implementation and output data of "Global-Context Neural Machine Translation through Target-Side Attentive Residual Connections"
Attention Sampling Python library to accelerate the training and inference of neural networks on large data. This code is the reference implementation of the methods described in our ICML 2019 publication "Processing Megapixel Images with Deep Attention-Sampling Models".
BEAT platform The BEAT platform is a European computing e-infrastructure for Open Science proposing a solution for open access, scientific information sharing and re-use including data and source code while protecting privacy and confidentiality. It allows easy online access to experimentation and testing in computational science.
bioformats_io Takes as input NPY files and saves them to OME or OME-TIFF, and conversely, takes as input microscopy-format files and saves them as NPY.
BOB Bob is a free signal-processing and machine learning toolbox developed by the Biometrics group at Idiap Research Institute, Switzerland. The toolbox is written in a mix of Python and C++ and is designed to be both efficient and to reduce development time.
bob bio spear Implements speaker recognition algorithms
bob bio vein Vein biometrics recognition baselines
bob ip binseg Binary Segmentation Benchmark Package for Bob
Bob's library of image-quality feature-extractors This package is part of the signal-processing and machine learning toolbox Bob. It provides functions for extracting image-quality features proposed for PAD experiments by different research groups. Image quality measures proposed by Galbally et al. (IEEE TIP 2014) and by Wen et al. (IEEE TIFS 2015) are implemented in this package.


This package contains python code to reproduce experiments and results described in the IEEE ICIP paper: "CNN Patch Pooling for Detecting 3D Mask Presentation Attacks in NIR"

bob paper eusipco 2018 Speaker Inconsistency Detection in Tampered Video. Source code for reproducing the speaker inconsistency detection experiments of the paper "Speaker Inconsistency Detection in Tampered Video", presented at the EUSIPCO 2018 conference
bob paper icassp 2020 domain guided pruning Code to reproduce "Domain Adaptation for Generalization of Face presentation Attack Detection in Mobile Settings with Minimal Information" ICASSP 2020 paper.
bob paper icassp 2020 facepad generalization infovae Code to reproduce "Improving Cross-dataset Performance Of Face Presentation Attack Detection Systems Using Face Recognition Datasets" ICASSP 2020 paper.
bob paper makeup aim This package contains python code to reproduce experiments and results described in the IEEE T-BIOM paper: "Detection of Age-Induced Makeup Attacks on Face Recognition Systems Using Multi-Layer Deep Features"
bob paper mcae icb 2019 Face PAD using multi-channel autoencoders
bob paper mccnn tifs 2018 Face PAD using Multi-Channel CNN
bob paper xcsmad facepad Face PAD for Silicone mask-based attack detection
BuSLR Build System for Speech and Language Research
CNN-based Models CNN-based Models for ALS and stressed MNs cultures classification
CNN_QbE_STD Implementation of the work presented in "CNN based Query by Example Spoken Term Detection"
CNN-voice-PAD The purpose of this software is to train Convolutional Neural Networks on raw speech signals in order to detect voice presentation attacks.
Content-Based Recommendation Generator (CBRec v1.0) A Python library which generates content-based recommendations for a set of items described by textual metadata using four possible vector space methods, namely TF-IDF, LSI, RP and LDA.
Data Cryptographer Bundle The Data Cryptographer Bundle is a PHP/Symfony bundle which provides a cryptographer resource/service for common cryptographic operations

A Deep Learning Optimizer Benchmark Suite

Deep Pixel-wise Binary Supervision for Face PAD

This package is part of the signal-processing and machine learning toolbox Bob.
This package contains source code to replicate the experimental results published in the following paper:

Deep Pixel-wise Binary Supervision for Face Presentation Attack Detection

DiscoConn Classifier Classifier models and feature extractors for discourse relations
drill Deep residual output layers for neural language generation
DocRec - Keyword Extraction and Document Recommendation in Conversations The package contains several pieces of Matlab code. Taken together, they extract keywords from a conversation, then use them to build implicit queries, and then consolidate the sets of retrieved documents to recommend to the conversation participants.
eakmeans - Implementation of fast exact k-means algorithms Implementation of fast exact k-means algorithms
Eigenposterior Eigenposterior (Senone Class Principal Components) based approach for purifying DNN posterior estimates
Emotion-Based Recommendation Generator (EMORec v1.0) A Python library which performs emotion-based analysis and recommendation using a multiple-instance regression algorithm for a set of multimedia items described by transcripts
Exact Acceleration of Linear Object Detectors We describe a general and exact method to considerably speed up linear object detection systems operating in a sliding, multi-scale window fashion, such as the individual part detectors of part-based models.
Face Color Model This page contains the source code and data needed to train and use a model for skin, hair, clothing and background color modelling and segmentation.
facereclib - The Face Recognition Library This library is designed to perform a fair comparison of face recognition algorithms. It contains scripts to execute various kinds of face recognition experiments on a variety of facial image databases
fast pose machines Efficient Pose Machines for Multi-Person Pose Estimation
fluoMNs_models CNN-based Models for ALS and stressed MNs cultures classification
fullgrad saliency This code is the reference implementation of the methods described in our NeurIPS 2019 publication "Full-Gradient Representation for Neural Network Visualization".

This repository implements two methods: the reference FullGrad algorithm, and a variant called "simple FullGrad", which omits computation of bias parameters for bias-gradients.
GC.MI The gc_MI.cpp file includes C++ code implementing the GC.MI algorithm presented in the paper:
HAN_NMT Document-Level Neural Machine Translation with Hierarchical Attention Networks
HEAT Image Retrieval System HEAT is an image retrieval web-application that is intended for large unstructured collections of images without semantic annotations. The system implements a novel searching paradigm that does not require any explicit query. At each iteration, the system displays a small set of images and the user chooses the image that best matches what she is looking for. After a few iterations, the sets of displayed images are gradually concentrated on images that satisfy the user.



Temporal Super-Resolution Microscopy Using a Hue-Encoded Shutter
HG3D - A module for 3D head pose and gaze tracking from RGB-D sensors This software contains the implementation of algorithms related to 3D head pose and gaze tracking tasks based on RGB-D cameras (standard vision and depth).
HOOSC Histogram of Orientation Shape Context
hpca hpca is a C++ toolkit providing an efficient implementation of the Hellinger PCA for computing word embeddings
human-detection Background subtraction and Human Detection
HTS-VTLN This software is a patch to HMM based statistical parametric speech synthesis toolkit (HTS 2.2).
IdiapTTS Idiap Text-to-Speech system developed at the Idiap Research Institute
IHPER (Idiap human perception system) An audio-visual system for human perception and human-robot interaction. This ROS-compatible system detects and tracks faces, re-identifies people, and detects speaking people and non-verbal cues (nod, visual focus of attention).
Importance Sampling This python package provides a library that accelerates the training of arbitrary neural networks created with Keras using importance sampling.
inv-tn Inverse Text Normalization using NMT models
ISS The Idiap Speech Scripts (ISS) is a collection of scripts for speech databases and dictionaries, and for training and testing of models for ASR. The scripts in turn are reliant on many other packages including HTK/HTS, Juicer and the ICSI speech tools.
joint-embedding-nmt PyTorch implementation of the structure-aware output layer for neural machine translation, which was presented at WMT 2018
kaldi-ivector The code is an implementation of the standard i-vector extraction algorithm for the Kaldi toolkit.
KiSC K.I.S.S. Cluster (KiSC) - with K.I.S.S. as in "Keep It Stupid Simple" - is a utility that aims to simplify the life of administrators managing resources across a cluster of hosts
libssp Library for speech signal processing
Fast Transformers This library aims to facilitate research on efficient transformer models and provides PyTorch implementations for several efficient transformers.
LR-CNN Trains low-rank CNNs from raw speech using Keras/Tensorflow, with inputs from Kaldi directories.
MASH Framework Back-end of the MASH computation farm
mash-simulator mash-simulator is a 3D simulator for Linux and MacOS where a robot must complete a certain number of tasks in different randomized environments.
mash-web Front-end of the MASH computation farm
ML3 ML3 is an open source implementation of the Multiclass Latent Locally Linear Support Vector Machine algorithm, a multi-class local classifier based on a latent SVM formulation.
mhan Multilingual hierarchical attention networks toolkit
MSER Linear time Maximally Stable Extremal Regions (MSER) implementation as described in D. Nistér and H. Stewénius, "Linear Time Maximally Stable Extremal Regions"
Multi Camera Calibration Suite This toolset provides the basics for calibrating a multi-camera scene. It contains six utilities for different purposes. In this README I will walk the user through the calibration of a multi-camera scene using this toolset.
nnsslm Neural Network based Sound Source Localization Models
pbdlib-matlab PbDlib is a set of tools combining statistical learning, dynamical systems and optimal control approaches for programming-by-demonstration applications
phonvoc: Phonetic and phonological vocoding platform Phonvoc is a cascaded deep neural network composed of a speech analyser and a synthesizer that use a shared phonological speech representation.

This is (yet another!) Python wrapper for Kaldi. The main goal is to be able to train acoustic models in PyTorch so that we can

  • use MMI cost function during training
  • use NG-SGD for affine transformations, which enables multi-GPU training with SGE
Probabilistic Models: temporal topic models and more Topic models such as Latent Dirichlet Allocation (LDA) have been used successfully in many domains for data mining. Originally designed for text documents, these methods find some hidden “topics” considering that each document is a weighted mixture of topics. Each topic expresses itself in a document by generating some specific words with more probability than others.
PSF Estimation Code for the PyTorch implementation of "Spatially-Variant CNN-based Point Spread Function Estimation for Blind Deconvolution and Depth Estimation in Optical Microscopy", IEEE Transactions on Image Processing, 2020.
Raw Speech Classification Trains CNN (or any neural network based) classifiers from raw speech using Keras and tests them. The inputs are lists of wav files, where each file is labelled. It then creates fixed length signals and processes them. During testing, it computes scores at the utterance or speaker levels by averaging the corresponding frame-level scores from the fixed length signals.
Remote heart rate measurement from face video sequences This package provides three baseline algorithms to perform remote photoplethysmography (rPPG), which consists in measuring the heart rate from a face video sequence. The software package implements three different algorithms to retrieve the pulse signal from skin color variations: an approach based on colorspace transformation, another approach solely based on signal processing, and a more recent approach, which analyzes the subspace spanned by skin-colored pixels in the RGB colorspace.
Residual pose Residual Pose: A Decoupled Approach for Depth-based 3D Human Pose Estimation
RGBD: A Python based RGB-D data processing module This python module implements the streaming, calibration and visualization of RGB-D data, that is, combined color and depth images.
sae_lang_detect: Supervised Autoencoder for Language Detection The Supervised Autoencoder (SAE) with Bayesian Optimization (BO) has been found effective for the language detection task, in particular for discriminating between very close languages or dialects. This library contains the PyTorch implementation of SAE with sample code for using it for the language detection task. The library can easily be used for other NLP classification tasks (e.g. Fake News Detection, Operant Motive Detection). It supports both CPU and GPU execution, selected by setting the GPU flag ("is_gpu = True or False").
Semi-Blind Spatially-Variant Deconvolution Code for "Semi-Blind Spatially-Variant Deconvolution in Optical Microscopy with Local Point Spread Function Estimation By Use Of Convolutional Neural Networks" ICIP 2018
Simple Imager Simple Imager (Linux Imaging and Deployment Made Easy) is a set of tools allowing an imaging server to retrieve a copy of Linux reference hosts (sources) and allowing those images to be deployed to other target hosts by means of RSync or BitTorrent file downloads.
SLOG - Similarity Learning on Graph SLOG contains implementations of similarity learning methods over relational data, where the relations between data points are given explicitly
Speaker Diarization Toolkit The toolkit is intended to facilitate research in multistream speaker diarization, providing a platform for research in novel audio, video or location features. It is based on the Information Bottleneck principle and is explicitly designed to make use of several heterogeneous feature streams.
SSP SSP stands for Speech Signal Processing. It is a fairly small package written in python. Its functionality is similar to tracter, with some overlap and some additional capabilities. In particular, SSP contains a parametric vocoder, a pitch extractor and feature extraction for ASR.
symfony bundle datajukebox The Data Jukebox Bundle is a PHP/Symfony bundle which aims to provide - for common CRUD (Create-Read-Update-Delete) operations - the same level of abstraction that Symfony does for forms.
Tasting Families of Features for Image Classification Please find below the code necessary to reproduce the experiments of the paper "Tasting Families of Features for Image Classification" under the GPL v2 license.
tf robot learning Tensorflow robot learning library
The Multi-Tracked Paths This is an implementation of the variant of KSP for tracking presented in (Berclaz et al. 2011). You can get more information and the reference implementation from the CVLab's web page about multi-camera tracking.
Torch Statistical machine learning library containing most of the state-of-the-art algorithms. Written in Lua and C, the library is distributed under a BSD license.
Torgo ASR This is a Kaldi recipe to build automatic speech recognition systems on the Torgo corpus of dysarthric speech.
trimed The trimed algorithm for obtaining the medoid of a set
t-softmax t-softmax pytorch reproducibility code. The repository contains the code to reproduce the results of the paper: Niccolò Antonello, Philip N. Garner "A t-distribution based operator for enhancing out of distribution robustness of neural network classifiers," IEEE Signal Processing Letters, 2020, to appear.
unet interspeech 2019 U-NET based feature extractor for text-independent speaker verification
warca WARCA is a simple and fast algorithm for metric learning.
Webvalidation This software is a multi-user, multi-project web annotation tool that helps to organize the process of validating automatically generated transcriptions.
wmil-sgd A weighted multiple-instance learning algorithm based on stochastic gradient descent
xbob thesis elshafey 2014 This package contains scripts to reproduce the experiments of Laurent El Shafey's Ph.D. thesis at Ecole Polytechnique Fédérale de Lausanne (EPFL).
zentas Software for doing k-medoids using an accelerated CLARANS algorithm

Obsoleted Software

Juicer Juicer is a Weighted Finite State Transducer (WFST) based decoder for Automatic Speech Recognition (ASR).
Tracter Tracter is a data flow framework.
Torch3vision Common software library for computer vision with machine learning algorithms. Written in simple C++, this library is based on Torch and distributed under a BSD license.