PAMIR

PAMIR is a machine learning algorithm for learning a ranking function, i.e. a function that orders documents given a query. It was primarily designed for multimodal retrieval, such as the retrieval of images from text queries. Its main advantages are scalability (it relies on online learning, which allows training on large datasets) and discriminative training (its training procedure optimizes a loss related to the final retrieval quality). Pamir is also a mountain range in Central Asia, but that's a different story...

Introduction

PAMIR is described in the following papers:

A Discriminative Kernel-based Model to Rank Images from Text Queries,
D. Grangier and S. Bengio, IEEE Transactions on Pattern Analysis and Machine Intelligence (in press), 2008.
A Discriminative Approach for the Retrieval of Images from Text Queries,
D. Grangier, F. Monay and S. Bengio, European Conference on Machine Learning (ECML), 2006.
Learning to Retrieve Images from Text Queries,
D. Grangier, F. Monay and S. Bengio, Workshop on Adaptive Multimedia Retrieval (AMR), 2006.

Code

The source code of PAMIR is free and distributed under the BSD license. It is plain C++, built upon the Torch machine learning library. Hence, the first step to use it is to install Torch3, as instructed on the Torch3 website. Then simply add the PAMIR package to Torch, and that's it! The package comes with a README file that describes the class hierarchy. The two main example files, trainImg2.cc and testImg2.cc, can be compiled in the same way as the examples provided with Torch.

Documentation

Data format

Training and Testing

Two main example programs are provided with the package: trainImg2 and testImg2.

trainImg2 trains a model. It takes the following arguments:

train_query_f is the file describing the training queries,
this matrix has dimension (number of queries) x (textual vocabulary size)
train_image_f is the training file for pictures,
this matrix has dimension (number of pictures) x (visual vocabulary size)
train_relevance_f is the relevance matrix,
this matrix contains only 0/1 values and has dimension (number of queries) x (number of pictures)
C is a hyper-parameter setting the trade-off between maximizing the margin and minimizing errors
n_iter is the number of training iterations
model_file is the file in which the trained model is saved
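
To make the roles of the relevance matrix, C and n_iter more concrete, here is a minimal sketch of an online, margin-based ranking update. It is not the code of the package: it assumes a plain linear model score(q, p) = q^T W p held in dense in-memory matrices and a passive-aggressive style step, whereas the package is kernel-based, built on Torch3, and reads its inputs from files. All names (Vec, Mat, score, paStep, train) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense row-major matrix (assumption)

// Assumed linear scoring function: score(q, p) = q^T W p.
double score(const Mat& W, const Vec& q, const Vec& p) {
    double s = 0.0;
    for (std::size_t i = 0; i < q.size(); ++i)
        for (std::size_t j = 0; j < p.size(); ++j)
            s += q[i] * W[i][j] * p[j];
    return s;
}

// One passive-aggressive style step on a (query, relevant, irrelevant) triplet.
// Returns the hinge loss measured before the update; C caps the step size.
double paStep(Mat& W, const Vec& q, const Vec& pos, const Vec& neg, double C) {
    assert(W.size() == q.size() && pos.size() == neg.size());
    const double loss =
        std::max(0.0, 1.0 - (score(W, q, pos) - score(W, q, neg)));
    double qNorm2 = 0.0, dNorm2 = 0.0;
    for (double x : q) qNorm2 += x * x;
    for (std::size_t j = 0; j < pos.size(); ++j)
        dNorm2 += (pos[j] - neg[j]) * (pos[j] - neg[j]);
    const double norm2 = qNorm2 * dNorm2;  // squared norm of the update direction
    if (loss == 0.0 || norm2 == 0.0) return loss;  // margin satisfied: no change
    const double tau = std::min(C, loss / norm2);
    for (std::size_t i = 0; i < q.size(); ++i)
        for (std::size_t j = 0; j < pos.size(); ++j)
            W[i][j] += tau * q[i] * (pos[j] - neg[j]);
    return loss;
}

// Hypothetical training loop: nIter random triplets drawn from the 0/1
// relevance matrix, whose shape must be (number of queries) x (number of pictures).
Mat train(const Mat& queries, const Mat& pictures,
          const std::vector<std::vector<int>>& relevance,
          double C, int nIter, unsigned seed = 0) {
    assert(!queries.empty() && !pictures.empty());
    assert(relevance.size() == queries.size());
    for (const auto& row : relevance) assert(row.size() == pictures.size());

    Mat W(queries[0].size(), Vec(pictures[0].size(), 0.0));
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pickQuery(0, queries.size() - 1);
    for (int it = 0; it < nIter; ++it) {
        const std::size_t k = pickQuery(rng);
        std::vector<std::size_t> rel, irr;
        for (std::size_t j = 0; j < pictures.size(); ++j)
            (relevance[k][j] ? rel : irr).push_back(j);
        if (rel.empty() || irr.empty()) continue;  // no usable triplet for this query
        std::uniform_int_distribution<std::size_t> pickRel(0, rel.size() - 1);
        std::uniform_int_distribution<std::size_t> pickIrr(0, irr.size() - 1);
        paStep(W, queries[k], pictures[rel[pickRel(rng)]],
               pictures[irr[pickIrr(rng)]], C);
    }
    return W;
}
```

In this sketch a small C keeps each update conservative, while a large C lets a single violated triplet move the model as far as needed to restore the margin.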

The following options can be provided to measure performance during training:

valid_query_f is a file describing a second set of queries, for validation purposes
valid_image_f is the validation file for pictures
valid_relevance_f is the validation file for the relevance
measure_file is a file containing various measurements on the validation set
measure_freq sets the frequency (in # of iterations) of measures over the validation set

testImg2 tests a model. It takes the following arguments:

test_query_f is a file describing the test queries
test_image_f is the test file for pictures
test_relevance_f is the test file for the relevance
measure_file is a file containing various measurements on the test set
model_file is the file from which the model is loaded
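
This page does not list which measurements are written to measure_file. As an illustration only, the sketch below ranks the pictures for one query by score and computes average precision, a common retrieval measure, from one row of the 0/1 relevance matrix. The function names rankByScore and averagePrecision are hypothetical and are not part of the package.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Order picture indices by decreasing score (ties keep their original order).
std::vector<std::size_t> rankByScore(const std::vector<double>& scores) {
    std::vector<std::size_t> order(scores.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&scores](std::size_t a, std::size_t b) {
                         return scores[a] > scores[b];
                     });
    return order;
}

// Average precision for a single query.
// relevant[j] is 1 if picture j is relevant to the query and 0 otherwise.
// Returns 0 when the query has no relevant picture (a convention chosen here).
double averagePrecision(const std::vector<double>& scores,
                        const std::vector<int>& relevant) {
    assert(scores.size() == relevant.size());
    const std::vector<std::size_t> order = rankByScore(scores);
    double hits = 0.0, sum = 0.0;
    for (std::size_t rank = 0; rank < order.size(); ++rank) {
        if (relevant[order[rank]]) {
            hits += 1.0;
            sum += hits / static_cast<double>(rank + 1);  // precision at this rank
        }
    }
    return hits > 0.0 ? sum / hits : 0.0;
}
```

Averaging this value over all test queries gives a single number per model, which is one way to compare two saved models on the same test set.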

Example Rankings

We provide example rankings comparing PAMIR to alternative solutions, such as Support Vector Machines (SVM) and Probabilistic Latent Semantic Analysis (PLSA), on the Corel dataset.

Details on these experiments can be found in [Grangier and Bengio, 2008] (see above).

Acknowledgments

This work was supported by the Swiss NSF through the MULTI project and by the Swiss OFES through the PASCAL European Network of Excellence. Part of this research was performed while Samy Bengio was at the IDIAP Research Institute.

David Grangier	Samy Bengio
	Google Inc.	Idiap Research Institute
	lastname@google.com	info@idiap.ch