Tutorials - Pegasos

VLFeat includes a fast SVM solver, called vl_svmpegasos. The function implements the Pegasos SVM algorithm [1], with a few additions such as online homogeneous kernel map expansion and online SVM statistics.

Pegasos SVM

A simple example of how to use vl_svmpegasos is presented below. Let's first build the training data:

% Set up training data
Np = 200 ;
Nn = 200 ;

Xp = diag([1 3])*randn(2, Np) ;
Xn = diag([1 3])*randn(2, Nn) ;
Xp(1,:) = Xp(1,:) + 2  ;
Xn(1,:) = Xn(1,:) - 2  ;

X = [Xp Xn] ;
y = [ones(1,Np) -ones(1,Nn)] ;

Plotting X and y we have

Training Data.
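The training data plot can be reproduced with standard MATLAB plotting commands; the styling below is illustrative, not taken from the original tutorial:

```matlab
% Plot the positive and negative training points (illustrative styling)
figure(1) ; clf ; hold on ;
plot(Xp(1,:), Xp(2,:), 'g.') ;   % positive class
plot(Xn(1,:), Xn(2,:), 'r.') ;   % negative class
axis equal ;
title('Training data') ;
```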

Learning a linear classifier can then be done with the following two lines of code:

dataset = vl_maketrainingset(X, int8(y)) ;
[w b info] = vl_svmpegasos(dataset, 0.01, ...
                           'MaxIterations',5000) ;

where we first create a struct containing the training data using vl_maketrainingset and then call the SVM solver. The output model w is plotted over the training data in the following figure.

Learned model.

The output b is equal to 0, since the training data admit an SVM model passing through the origin.
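For data that are not centered, a bias can be learned by letting the solver append a scaled constant component to each data point. The option name below is an assumption, inferred from the biasMultiplier field reported in the info struct:

```matlab
% Sketch: learn a bias term as well. 'BiasMultiplier' is assumed
% to correspond to the biasMultiplier statistic shown in info.
[w b info] = vl_svmpegasos(dataset, 0.01, ...
                           'MaxIterations', 5000, ...
                           'BiasMultiplier', 1) ;
```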

The output info is a struct containing some statistics on the learned SVM:

info = 

           dimension: 2
          iterations: 5000
       maxIterations: 5000
             epsilon: -1
              lambda: 0.0100
      biasMultiplier: 0
    biasLearningRate: 1
     energyFrequency: 100
         elapsedTime: 0.0022
              energy: 0.1727
     regularizerTerm: 0.0168
             lossPos: 0.1003
             lossNeg: 0.0556
         hardLossPos: 0.1050
         hardLossNeg: 0.0750
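Once trained, the model can be used to score and classify points. For instance, the training accuracy can be checked as follows (this snippet is illustrative and not part of the original tutorial):

```matlab
% Score each point with the learned linear model and compute
% the fraction of correctly classified training examples.
scores = w' * X + b ;
accuracy = mean(sign(scores) == y) ;
```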

It is also possible, under some assumptions [2], to use a homogeneous kernel map expanded online inside the solver. This can be done with the following command:

dataset = vl_maketrainingset(X, int8(y), 'homkermap', 2, 'KChi2') ;

The above code creates a training set without explicitly applying any homogeneous kernel map to the data. When the solver is called, it will expand each data point online with a Chi Squared kernel map of order 2.
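The same expansion can also be applied explicitly with vl_homkermap before building the training set; the call below is a sketch, assuming the vl_homkermap signature from the same VLFeat release (note that the Chi Squared kernel expects non-negative data):

```matlab
% Sketch: explicit Chi Squared homogeneous kernel map of order 2,
% equivalent to the online expansion performed by the solver.
Xhom = vl_homkermap(X, 2, 'KChi2') ;
dataset = vl_maketrainingset(Xhom, int8(y)) ;
```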

Real-time diagnostics

VLFeat allows obtaining statistics during the training process. It suffices to pass a function handle to the solver; the function is then called every energyFrequency iterations.

The following function simply plots the SVM energy values for all the past iterations:

function energy = diagnostics(svm, energy)
% Append the current SVM energy and update the plot.
  figure(2) ;
  energy = [energy svm.energy] ;
  plot(energy) ;
  drawnow ;

The energy values for the past iterations are kept in the row vector energy. The following code produces a real-time plot of the energy during the learning process:

energy = [] ;
lambda = 0.01 ;
dataset = vl_maketrainingset(X, int8(y)) ;
[w b info] = vl_svmpegasos(dataset, lambda, ...
                           'MaxIterations',5000,...
                           'DiagnosticFunction',@diagnostics,...
                           'DiagnosticCallRef',energy) ;
SVM real-time energy values plot.

References