
Idiap datasets listing



Fields: Name, Short description, Size, Number of files, License (yes/no, shown as 1/0), Distribution type
Name:
+ 3DMAD
Short Description:
The 3D Mask Attack Database (3DMAD) currently contains 76500 frames of 17 persons, recorded using Kinect
Size:
39G
Number of files:
5
License:
1
Distribution type:
WEB
Name:
+ AMI
Short Description:
AMI Meeting Corpus is a multi-modal data set consisting of 100 hours of meeting recordings.
Size:
875G
Number of files:
174
License:
1
Distribution type:
WEB
Name:
+ AREX
Short Description:
AMI Requests for Explanations and Relevance Judgments for their Answers
Size:
1M
Number of files:
1
License:
0
Distribution type:
WEB
Name:
+ AV16-3
Short Description:
Audio-Visual Corpus for Speaker Localization and Tracking
Size:
7G
Number of files:
6
License:
1
Distribution type:
WEB
Name:
+ avspoof
Short Description:
Database including 10 types of voice recognition attacks
Size:
29G
Number of files:
3
License:
1
Distribution type:
WEB
Name:
+ bioscote
Short Description:
This dataset contains raw scores, in plain text format, of several biometric (face and speaker) recognition systems applied to several databases.
Size:
80GB
Number of files:
9
License:
0
Distribution type:
WEB
Name:
+ CCC
Short Description:
Cursive Character Challenge
Size:
215M
Number of files:
4
License:
0
Distribution type:
WEB
Name:
+ COHFACE
Short Description:
The COHFACE dataset contains RGB video sequences of faces, synchronized with heart-rate and breathing-rate of the recorded subjects.
Size:
310M
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ csmad
Short Description:
Custom Silicone Mask Attack Dataset
Size:
182G
Number of files:
3
License:
1
Distribution type:
WEB
Name:
+ DeepfakeTIMIT
Short Description:
DeepfakeTIMIT is a database of videos in which faces are swapped using an open-source GAN-based approach that was, in turn, developed from the original autoencoder-based Deepfake algorithm.
Size:
217M
Number of files:
1
License:
0
Distribution type:
WEB
Name:
+ DIH
Short Description:
DIH (Depth Images with Humans) is a dataset of depth images of people for the tasks of body pose estimation and body landmark detection.
Size:
38G
Number of files:
7
License:
1
Distribution type:
WEB
Name:
+ Disco-Annotation
Short Description:
Disco-Annotation is a collection of training and test sets with manually annotated discourse relations for 8 English discourse connectives in Europarl texts.
Size:
204K
Number of files:
1
License:
0
Distribution type:
WEB
Name:
+ ELEA
Short Description:
The corpus was gathered with the aim of analyzing emergent leadership as a social phenomenon that occurs in newly formed groups.
Size:
4.1G
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ ERPA
Short Description:
This is a small dataset representing face-image data from 5 subjects (‘subject1’ – ‘subject5’). For each subject, images have been captured with two cameras – the Intel Realsense SR300, and the Xenics Gobi thermal (LWIR) camera.
Size:
2.4G
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ Europarl-direct
Short Description:
Europarl-direct: these files provide statement pairs extracted from the Europarl corpus, in which statements from a known source language are directly translated to the target languages.
Size:
149M
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ eyediap
Short Description:
The EYEDIAP dataset was designed to train and evaluate gaze estimation algorithms from RGB and RGB-D data. It contains a diversity of participants, head poses, gaze targets and sensing conditions.
Size:
54G
Number of files:
17
License:
1
Distribution type:
WEB
Name:
+ fvspoofingattack
Short Description:
The Spoofing-Attack Database for finger vein spoofing consists of 440 real and fake index-finger image attempts to 110 clients.
Size:
54M
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ HeadPose
Short Description:
The objective was to construct a video database for the quantitative evaluation of algorithms that extract information related to people's head pose, such as head tracking and pose estimation algorithms, or focus-of-attention analysis.
Size:
2.6GB
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ idiap-poster-data
Short Description:
The Idiap Poster Data consists of images extracted from 6 hours of videos shot during a poster session.
Size:
43 GB
Number of files:
6
License:
1
Distribution type:
WEB
Name:
+ maya-codex
Short Description:
The Maya Codex Dataset contains high-quality representations of ancient Maya hieroglyphs, along with statistical glyph co-occurrence information extracted from the Thompson catalog [1].
Size:
61M
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ MDC
Short Description:
MDC consists of large quantities of continuous data pertaining to the behaviour of individuals and social networks, recorded via mobile phones from 2009 to 2011 in the Lausanne/Geneva area. About 200 persons participated in the data collecting campaign.
Size:
50 GB
Number of files:
1
License:
1
Distribution type:
HDD
Name:
+ Mediaparl
Short Description:
Mediaparl is a Swiss-accented bilingual database containing recordings in both French and German as they are spoken in Switzerland.
Size:
4.8GB
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ MOBIO
Short Description:
The MOBIO database currently consists of 152 people (audio and video samples) with 12 sessions each
Size:
~135GB
Number of files:
18
License:
1
Distribution type:
WEB
Name:
+ msspoof
Short Description:
Multispectral-Spoof contains face images and printed spoofing attacks recorded in Visible (VIS) and Near-Infrared (NIR) spectra for 22 identities.
Size:
1.9G
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ PrintAttack
Short Description:
The Print-Attack Replay Database consists of 200 video clips of printed-photo attack attempts to 50 clients, under different lighting conditions. It also contains 200 real-access attempt videos from the same clients.
Size:
1.1GB
Number of files:
7
License:
1
Distribution type:
WEB
Name:
+ Replay-Mobile
Short Description:
The Replay-Mobile database for face anti-spoofing on mobile devices consists of 1190 videos of 40 subjects, including real-access videos and attack videos. The database was produced at Idiap, Switzerland, in collaboration with Gradiant, Spain.
Size:
15G
Number of files:
2
License:
1
Distribution type:
WEB
Name:
+ ReplayAttack
Short Description:
The Replay-Attack Database for face spoofing consists of 1300 video clips of photo and video attack attempts to 50 clients, under different lighting conditions. This Database was produced at the Idiap Research Institute, in Switzerland.
Size:
~3 GB (compressed)
Number of files:
7
License:
1
Distribution type:
WEB
Name:
+ sslr
Short Description:
Pepper Sound Localization Dataset
Size:
~150G
Number of files:
0
License:
1
Distribution type:
WEB
Name:
+ TA2
Short Description:
The TA2 database consists of high-definition, simultaneous A/V recordings and annotations from two separate rooms, where the participants play games and communicate with each other over a video-conferencing system.
Size:
~50GB
Number of files:
2
License:
1
Distribution type:
WEB
Name:
+ TED
Short Description:
A dataset for recommendations collected from ted.com which contains metadata fields for TED talks and user profiles with rating and commenting transactions.
Size:
100.77 MB
Number of files:
1
License:
0
Distribution type:
WEB
Name:
+ Tense-Annotation
Short Description:
This dataset provides parallel texts in English/French from Europarl, along with an alignment of the verbs in the sentences with information on their position, tense and voice.
Size:
300M
Number of files:
2
License:
0
Distribution type:
WEB
Name:
+ UBIPose
Short Description:
The UBIPose dataset is a subset of the UBImpressed dataset. It is intended for the evaluation of head pose estimation algorithms in natural and challenging scenarios. This dataset provides the annotation of the positions of 6 facial landmarks (two corner
Size:
97G
Number of files:
5
License:
1
Distribution type:
WEB
Name:
+ unicity
Short Description:
UNICITY contains top-view depth images of people entering a security airlock.
Size:
Number of files:
58000
License:
1
Distribution type:
WEB
Name:
+ vera-fingervein
Short Description:
The VERA Fingervein Database for fingervein recognition consists of 440 images from 110 clients.
Size:
33M
Number of files:
2
License:
1
Distribution type:
WEB
Name:
+ vera-palmvein
Short Description:
The VERA Palmvein Database for palmvein recognition consists of 2200 images from 110 clients. This Database was produced at the Idiap Research Institute in Martigny and at Haute Ecole Spécialisée de Suisse Occidentale in Sion, in Switzerland.
Size:
209M
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ vera-spoofingfingervein
Short Description:
The VERA Spoofing Fingervein Database for direct attacks on fingervein recognition consists of 200 image attempts to the first 50 clients from the Idiap Research Institute VERA Fingervein Database. This Database was produced at the Idiap Research Institute
Size:
15M
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ vera-spoofingpalmvein
Short Description:
The VERA Spoofing Palmvein Database for direct attacks on palmvein recognition consists of 1000 image attempts to the first 50 clients from the Idiap Research Institute VERA Palmvein Database. This Database was produced at the Idiap Research Institute in Ma
Size:
218M
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ voicePA
Short Description:
A database with speech data from 44 speakers and 28 presentation attacks, including synthetic and replay attacks, recorded in different environments using different speakers and microphones (mobile phones and laptop).
Size:
39G
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ walliserdeutsch
Short Description:
News bulletins in the Upper Valaisan German dialect, broadcast by RRO (Radio Rottu Oberwallis), taken from their website and annotated at Idiap.
Size:
~4G
Number of files:
1
License:
1
Distribution type:
WEB
Name:
+ WOLF
Short Description:
The WOLF corpus is an audio-visual data set containing around 81 hours of conversational data among groups of 8-12 people playing a role-playing game.
Size:
~100GB
Number of files:
15
License:
1
Distribution type:
WEB
Name:
+ youtube-personality
Short Description:
The YouTube personality dataset consists of a collection of behavioral features, speech transcriptions, and personality impression scores for a set of 404 YouTube vloggers that explicitly show themselves in front of a webcam talking about a variety of
Size:
496KB
Number of files:
1
License:
0
Distribution type:
WEB