Gaussian Mixture Models (GMM)

gmm.h is an implementation of Gaussian Mixture Models (GMMs). The main functionality provided by this module is learning GMMs from data by maximum likelihood. Model optimization uses the Expectation Maximization (EM) algorithm [6]. The implementation supports float and double data types, is parallelized, and is tuned to work reliably and effectively on datasets of visual features. Stability is obtained in part by regularizing and restricting the parameters of the GMM.

Getting started demonstrates how to use the C API to learn a GMM from training data.

Getting started

In order to use gmm.h to learn a GMM from training data, create a new VlGMM object instance, set the parameters as desired, and run the training code. The following example learns numClusters Gaussian components from numData vectors of dimension dimension and storage class float using at most 100 EM iterations:

float * means ;
float * covariances ;
float * priors ;
float * posteriors ;
double loglikelihood ;
VlGMM * gmm ;
// create a new instance of a GMM object for float data
gmm = vl_gmm_new (VL_TYPE_FLOAT, dimension, numClusters) ;
// set the maximum number of EM iterations to 100
vl_gmm_set_max_num_iterations (gmm, 100) ;
// set the initialization to random selection
vl_gmm_set_initialization (gmm, VlGMMRand) ;
// cluster the data, i.e. learn the GMM
vl_gmm_cluster (gmm, data, numData) ;
// get the means, covariances, and priors of the GMM
means = vl_gmm_get_means (gmm) ;
covariances = vl_gmm_get_covariances (gmm) ;
priors = vl_gmm_get_priors (gmm) ;
// get loglikelihood of the estimated GMM
loglikelihood = vl_gmm_get_loglikelihood (gmm) ;
// get the soft assignments of the data points to each cluster
posteriors = vl_gmm_get_posteriors (gmm) ;
Note
VlGMM assumes that the covariance matrices of the GMM are diagonal. This significantly reduces the number of parameters to learn and is usually an acceptable compromise in vision applications. If the data is significantly correlated, it can be beneficial to de-correlate it by PCA rotation or projection during pre-processing.

vl_gmm_get_loglikelihood returns the final loglikelihood of the estimated mixture; vl_gmm_get_means and vl_gmm_get_covariances return the means and the diagonals of the covariance matrices of the estimated Gaussian modes; vl_gmm_get_priors returns their prior probabilities; and vl_gmm_get_posteriors returns the posterior probabilities that a given point is associated with each of the modes (soft assignments).

The learning algorithm, which uses EM, finds a local optimum of the objective function. Therefore the initialization is crucial in obtaining a good model, measured in terms of the final loglikelihood. VlGMM supports a few methods (use vl_gmm_set_initialization to choose one) as follows:

Method                  VlGMMInitialization enumeration   Description
Random initialization   VlGMMRand                         Random initialization of the mixture parameters
KMeans                  VlGMMKMeans                       Initialization of the mixture parameters using VlKMeans
Custom                  VlGMMCustom                       User-specified initialization

Note that in the case of VlGMMKMeans initialization, an object of type VlKMeans must be created and passed to the VlGMM instance (see K-means clustering for how to set up this object correctly).

When a user wants to use the VlGMMCustom method, the initial means, covariances and priors have to be specified using the vl_gmm_set_means, vl_gmm_set_covariances and vl_gmm_set_priors methods.
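One simple way to prepare such initial parameters is sketched below: the first numClusters data points serve as means, the per-dimension variance of the whole dataset serves as every component's diagonal covariance, and the priors are uniform. This is only one possible choice. The function make_custom_init is hypothetical, and the array layout (each point and each component stored contiguously, dimension values at a time) is an assumption that should be checked against gmm.h before the arrays are handed to the setters.

```c
#include <assert.h>
#include <stddef.h>

/* Fill initial GMM parameters from the data (illustrative heuristic).
   Assumed layout: data is numData x dimension, means and covariances
   are numClusters x dimension, each row stored contiguously.
   Hypothetical helper, not a VLFeat function. */
void make_custom_init (float const * data, size_t numData, size_t dimension,
                       size_t numClusters,
                       float * means, float * covariances, float * priors)
{
  size_t i, k, d ;
  assert (dimension > 0 && numClusters > 0 && numData >= numClusters) ;

  for (d = 0 ; d < dimension ; ++d) {
    /* variance of dimension d over the whole dataset */
    double mean = 0.0, var = 0.0 ;
    for (i = 0 ; i < numData ; ++i) mean += data[i * dimension + d] ;
    mean /= (double)numData ;
    for (i = 0 ; i < numData ; ++i) {
      double diff = data[i * dimension + d] - mean ;
      var += diff * diff ;
    }
    var /= (double)numData ;

    for (k = 0 ; k < numClusters ; ++k) {
      /* mean of component k: the k-th data point (arbitrary choice) */
      means[k * dimension + d] = data[k * dimension + d] ;
      /* diagonal covariance of component k: the global variance */
      covariances[k * dimension + d] = (float)var ;
    }
  }

  /* uniform priors that sum to one */
  for (k = 0 ; k < numClusters ; ++k) priors[k] = 1.0f / (float)numClusters ;
}
```

After selecting VlGMMCustom with vl_gmm_set_initialization, the three arrays would be passed to vl_gmm_set_means, vl_gmm_set_covariances, and vl_gmm_set_priors before the call to vl_gmm_cluster.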