hpca is a C++ toolkit that provides an efficient implementation of Hellinger PCA for computing word embeddings.

Word embeddings resulting from neural language models have been shown to be a great asset for a large variety of NLP tasks. However, such architectures can be difficult and time-consuming to train. Instead, we propose to drastically simplify the computation of word embeddings through a Hellinger PCA of the word co-occurrence matrix. We compare these new word embeddings with some well-known embeddings on named entity recognition and movie review tasks and show that we can reach similar or even better performance. Although deep learning is not really necessary for generating good word embeddings, we show that it can provide an easy way to adapt embeddings to specific tasks.

See the EACL 2014 paper for more details.
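
To make the idea concrete, here is a minimal sketch of a Hellinger PCA on a toy corpus, written in plain Python rather than with the hpca toolkit itself. It is an illustration under stated assumptions, not the toolkit's API or the paper's exact procedure: the function names (cooccurrence_rows, hellinger_rows, top_components, embed) are hypothetical, the context is assumed to be a symmetric window of one word over the full vocabulary, and the PCA is done by mean-centering followed by power iteration with Gram-Schmidt orthogonalization. The only step taken from the method's definition is the square-root transform: the Hellinger distance between two discrete distributions is proportional to the Euclidean distance between their element-wise square roots, so an ordinary PCA on square-rooted co-occurrence probabilities works in Hellinger geometry.

```python
import math
import random
from collections import Counter, defaultdict


def cooccurrence_rows(sentences, window=1):
    """Count context words within `window` positions of each word (assumed context)."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts


def hellinger_rows(counts, vocab):
    """Normalize each row to a probability distribution, then take square roots."""
    rows = []
    for word in vocab:
        total = sum(counts[word].values()) or 1  # avoid division by zero
        rows.append([math.sqrt(counts[word][c] / total) for c in vocab])
    return rows


def top_components(rows, k, iters=200, seed=0):
    """Top-k principal directions of the centered rows, by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    X = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Dense d x d covariance matrix; only reasonable for a toy vocabulary.
    C = [[sum(x[a] * x[b] for x in X) / n for b in range(d)] for a in range(d)]
    rng = random.Random(seed)
    comps = []
    for _ in range(k):
        v = [rng.random() for _ in range(d)]
        for _ in range(iters):
            w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
            # Gram-Schmidt: remove the directions already found.
            for c in comps:
                dot = sum(wi * ci for wi, ci in zip(w, c))
                w = [wi - dot * ci for wi, ci in zip(w, c)]
            norm = math.sqrt(sum(wi * wi for wi in w))
            if norm == 0.0:
                break
            v = [wi / norm for wi in w]
        comps.append(v)
    return means, comps


def embed(rows, means, comps):
    """Project each centered row onto the principal directions."""
    return [[sum((r[j] - means[j]) * c[j] for j in range(len(c))) for c in comps]
            for r in rows]


# Toy corpus; a real run would use counts from a large text collection.
sentences = [s.split() for s in (
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat chased a dog",
)]
counts = cooccurrence_rows(sentences)
vocab = sorted(counts)
rows = hellinger_rows(counts, vocab)
means, comps = top_components(rows, k=2)
embeddings = dict(zip(vocab, embed(rows, means, comps)))
```

Each entry of embeddings maps a word to a 2-dimensional vector. The dense covariance matrix and the pure-Python loops keep the sketch dependency-free but scale poorly; a realistic vocabulary needs a sparse co-occurrence matrix and an optimized eigensolver, which is the kind of efficiency a dedicated C++ implementation is meant to provide.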

Resource Information
Resource type: software
Date: Sep 17, 2015
Size: 1.2MB
Ownership: Idiap Research Institute
Distribution: Web