Introduction to Temporal Data Mining

Probabilistic Models: temporal topic models and more
PLSM: introduction

PLSM stands for Probabilistic Latent Sequential Motif. It can be seen as a time-sensitive evolution of PLSA (Probabilistic Latent Semantic Analysis), which is the original probabilistic topic model. Like PLSA, PLSM is defined by a probabilistic generative model, and its parameters can be learned with an EM (Expectation-Maximization) algorithm.

PLSM: understanding the model

PLSM can be represented as a graphical model, in which nodes represent random variables and the absence of a link between two nodes represents conditional independence. Here, we provide three equivalent views of the PLSM model.

The PLSM model explains how the set of all observations is supposed to be generated. Each observation is a triple (d, w, t_a), meaning that a word w occurred once at time t_a in the document d. PLSM supposes that there exists a set of K motifs named φ (represented only in the third representation). The generative process of each observation goes as follows:

1. draw the document d from a distribution p(d);
2. draw a pair (z, t_s), made of a motif index and a starting time, from a per-document starting distribution p(z,t_s|d);
3. given this z, draw a pair (w, t_r), made of a word and a relative time, from the corresponding motif, defined as a distribution p(w,t_r|z) (or φ_z(w,t_r));
4. set the absolute time of the observation to the sum of the motif starting time and the drawn relative time: t_a = t_s + t_r.

Given a set of observations, an Expectation-Maximization algorithm can be used to estimate the most likely parameters. The set of parameters is made of the p(z,t_s|d) distribution and the p(w,t_r|z) distributions (φ in the third representation).
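The generative process described above can be illustrated with a minimal Python sketch. It is not the PLSM implementation distributed on this site: it assumes discrete time steps, stores the distributions as dense NumPy arrays, and uses hypothetical names (`sample_observations`, `observation_probability`, the array shapes, and the toy sizes) chosen only for illustration. The second function marginalizes over (z, t_s) to obtain the probability of one observation, which is the quantity a maximum-likelihood procedure such as EM works with.

```python
import numpy as np


def sample_observations(p_d, p_zts_given_d, phi, n_obs, seed=0):
    """Draw n_obs triples (d, w, t_a) with the four generative steps.

    Assumed (hypothetical) array layout:
      p_d            shape (D,)        : p(d)
      p_zts_given_d  shape (D, K, Ts)  : p(z, t_s | d), each slice [d] sums to 1
      phi            shape (K, W, Tr)  : p(w, t_r | z), each slice [z] sums to 1
    """
    rng = np.random.default_rng(seed)
    D, K, Ts = p_zts_given_d.shape
    _, W, Tr = phi.shape
    observations = []
    for _ in range(n_obs):
        # Step 1: draw the document d from p(d).
        d = rng.choice(D, p=p_d)
        # Step 2: draw (z, t_s) jointly from p(z, t_s | d).
        z, t_s = divmod(rng.choice(K * Ts, p=p_zts_given_d[d].ravel()), Ts)
        # Step 3: draw (w, t_r) jointly from the motif phi_z.
        w, t_r = divmod(rng.choice(W * Tr, p=phi[z].ravel()), Tr)
        # Step 4: absolute time is starting time plus relative time.
        observations.append((int(d), int(w), int(t_s + t_r)))
    return observations


def observation_probability(d, w, t_a, p_d, p_zts_given_d, phi):
    """p(d, w, t_a), summing over every (z, t_s) such that t_a = t_s + t_r."""
    _, K, Ts = p_zts_given_d.shape
    Tr = phi.shape[2]
    total = 0.0
    for z in range(K):
        for t_s in range(Ts):
            t_r = t_a - t_s
            if 0 <= t_r < Tr:
                total += p_zts_given_d[d, z, t_s] * phi[z, w, t_r]
    return p_d[d] * total


# Toy parameters, drawn at random purely for the example.
toy_rng = np.random.default_rng(1)
D, K, Ts, W, Tr = 2, 3, 5, 4, 3
p_d = np.full(D, 1.0 / D)
p_zts_given_d = toy_rng.dirichlet(np.ones(K * Ts), size=D).reshape(D, K, Ts)
phi = toy_rng.dirichlet(np.ones(W * Tr), size=K).reshape(K, W, Tr)

obs = sample_observations(p_d, p_zts_given_d, phi, n_obs=10)
```

Because t_s ranges over Ts values and t_r over Tr values, absolute times fall in the range 0 to Ts + Tr - 2 in this sketch. Drawing (z, t_s) and (w, t_r) as joint pairs, rather than one variable at a time, mirrors the way the model defines p(z,t_s|d) and p(w,t_r|z) as joint distributions.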

Copyright © 2012-2014 by Idiap research institute, All Rights Reserved.
