W = VL_SVMPEGASOS(X, Y, LAMBDA) learns a linear SVM W given training vectors X, their labels Y, and the regularization parameter LAMBDA using the PEGASOS [1] solver. The algorithm finds a minimizer W of the objective function

  LAMBDA/2 |W|^2 + 1/N SUM_i LOSS(W, X(:,i), Y(i))

where LOSS(W,X,Y) = MAX(0, 1 - Y W'X) is the hinge loss and N is the number of training vectors in X.
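For reference, the objective can be evaluated directly; here is a minimal Python/NumPy sketch (not the library implementation), with X holding one training vector per column as in the MATLAB interface:

```python
import numpy as np

def svm_objective(w, X, y, lam):
    """LAMBDA/2 |W|^2 + 1/N SUM_i MAX(0, 1 - Y(i) W'X(:,i))."""
    margins = y * (w @ X)                   # signed margins Y W'X
    hinge = np.maximum(0.0, 1.0 - margins)  # hinge loss per sample
    return 0.5 * lam * np.dot(w, w) + hinge.mean()
```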

[W B INFO] = VL_SVMPEGASOS(X, Y, LAMBDA) learns a linear SVM W and a bias B given training vectors X, their labels Y, and the regularization parameter LAMBDA using the PEGASOS [1] solver. INFO is a struct containing the input parameters plus diagnostic information:

energy

SVM energy value.

iterations

Number of iterations performed.

elapseTime

Elapsed time since the start of the SVM learning.

regulizerTerm

Value of the SVM regularizer term.

lossPos

Value of the loss function computed only over data points labeled as positive.

lossNeg

Value of the loss function computed only over data points labeled as negative.

hardLossPos

Number of mislabeled positive points.

hardLossNeg

Number of mislabeled negative points.

ALGORITHM. PEGASOS is an implementation of stochastic subgradient descent. At each iteration a data point is selected at random, the subgradient of the cost function relative to that data point is computed, and a step is taken in that direction. The step size is inversely proportional to the iteration number. See [1] for details.
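The iteration just described can be sketched as follows (a minimal Python/NumPy illustration of the stochastic subgradient step with its 1/(LAMBDA*T) step size, not the library implementation; X again holds one sample per column):

```python
import numpy as np

def pegasos(X, y, lam, num_iterations, seed=0):
    """Minimal Pegasos sketch: X is (d, n) with one sample per
    column, y contains labels in {-1, +1}."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    w = np.zeros(d)
    for t in range(1, num_iterations + 1):
        i = rng.integers(n)               # pick a data point at random
        eta = 1.0 / (lam * t)             # step size ~ 1/t
        violates = y[i] * w.dot(X[:, i]) < 1
        w *= 1.0 - eta * lam              # regularizer subgradient step
        if violates:                      # hinge loss subgradient step
            w += eta * y[i] * X[:, i]
    return w
```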

VL_SVMPEGASOS() accepts the following options:

Epsilon [empty]

Specify the SVM stopping criterion threshold. If not specified, VL_SVMPEGASOS stops only when the maximum number of iterations is reached. The stopping criterion is tested every ENERGYFREQ iterations.

MaxIterations [10 / LAMBDA]

Set the maximum number of iterations.

BiasMultiplier [0]

Appends the specified scalar value B as an extra constant component to each data vector in X. This approximates the training of a linear SVM with bias.
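Concretely, the augmentation amounts to the following (a hypothetical NumPy sketch; append_bias is not part of the library):

```python
import numpy as np

def append_bias(X, B):
    # Append a constant component equal to B to every data vector
    # (column) of X; the last component of the learned W, scaled
    # by B, then plays the role of the bias.
    return np.vstack([X, B * np.ones((1, X.shape[1]))])
```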

StartingModel [null vector]

Specify the initial value for the weight vector W.

StartingIteration [1]

Specify the iteration number to start from. The only effect is to change the step size, as this is inversely proportional to the iteration number.

StartingBias [0]

Specify the initial bias value.

BiasLearningRate [1]

Specify the rate at which the bias is learned. With the default setting, the bias is updated at every iteration.

Permutation [empty]

Specify a permutation PERM to be used to sample the data (this disables random sampling). Specifically, at the T-th iteration the algorithm takes a step w.r.t. the PERM[T']-th data point, where T' = MOD(T-1, NUMSAMPLES) + 1 is T wrapped around the number of data samples. PERM need not be bijective: this makes it possible to sample certain data points more or less frequently, implicitly increasing or decreasing their relative weight in the error term. A common application is balancing an unbalanced dataset.
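The sampling rule can be sketched as follows (Python, using 0-based indexing instead of MATLAB's 1-based; illustrative only):

```python
def sample_index(perm, t):
    # Index of the data point used at (1-based) iteration t:
    # the permutation is scanned cyclically.
    return perm[(t - 1) % len(perm)]

# Oversampling example: point 0 appears twice as often as 1 and 2,
# doubling its relative weight in the error term.
perm = [0, 1, 0, 2]
```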

DiagnosticFunction [empty]

Specify a function handle to be called every ENERGYFREQ iterations.

DiagnosticCallRef [empty]

Specify a parameter to be passed to the DIAGNOSTICFUNCTION handle.

EnergyFreq [100]

Specify how often the SVM energy is computed.

HOMKERMAP [empty]

Specify the use of a homogeneous kernel map for the training data (see [2], [3]). The passed value N determines the dimension of the approximated kernel map: each training data point is expanded online into a vector of dimension 2*N+1.

KChi2

Compute the map for the Chi2 kernel.

KINTERS

Compute the map for the intersection kernel.

KL1

Same as KINTERS, but deprecated as the name is not fully accurate.

KJS

Compute the map for the JS (Jensen-Shannon) kernel.

Period [automatically tuned]

Set the period of the kernel spectrum. The approximation is based on periodizing the kernel spectrum. If not specified, the period is automatically set based on the heuristic described in [2].

Window [RECTANGULAR]

Set the window used to truncate the kernel spectrum. The window can be either RECTANGULAR or UNIFORM. See [2] and the API documentation for details.

Gamma [1]

Set the homogeneity degree of the kernel. The standard kernels are 1-homogeneous, but sometimes smaller values perform better in applications. See [2] for details.
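For intuition, here is a sketch (Python/NumPy) of the 2*N+1 dimensional approximate feature map for the Chi2 kernel, following the construction of [2]: the kernel spectrum is sampled with a step L and truncated after N samples. The spectrum sech(pi*lambda), the step value, and the function name are assumptions of this sketch, not the library API:

```python
import numpy as np

def chi2_map(x, n=3, L=0.5):
    """Approximate 2*n+1 dimensional feature map for the Chi2 kernel
    k(x, y) = 2*x*y/(x + y), x > 0, built by sampling the kernel
    spectrum kappa(lam) = sech(pi*lam) with step L (see [2])."""
    kappa = lambda lam: 1.0 / np.cosh(np.pi * lam)
    psi = np.empty(2 * n + 1)
    psi[0] = np.sqrt(x * L * kappa(0.0))
    for j in range(1, n + 1):
        c = np.sqrt(2.0 * x * L * kappa(j * L))
        psi[2*j - 1] = c * np.cos(j * L * np.log(x))
        psi[2*j]     = c * np.sin(j * L * np.log(x))
    return psi

# The inner product of two maps approximates the kernel:
# chi2_map(x) . chi2_map(y) ~ 2*x*y/(x + y)
```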

Verbose

Be verbose.

Example

The options StartingModel and StartingIteration can be used to continue training. For example, the command

  vl_twister('state',0) ;
  w = vl_svmpegasos(x,y,lambda,'MaxIterations',1000) ;

produces the same result as the sequence

  vl_twister('state',0) ;
  w = vl_svmpegasos(x,y,lambda,'MaxIterations',500) ;
  w = vl_svmpegasos(x,y,lambda,'MaxIterations',500, ...
                 'StartingIteration', 501, ...
                 'StartingModel', w) ;

REFERENCES

[1] S. Shalev-Shwartz, Y. Singer, N. Srebro, and A. Cotter. Pegasos: Primal Estimated sub-GrAdient SOlver for SVM. Mathematical Programming, 2010.

[2] A. Vedaldi and A. Zisserman `Efficient Additive Kernels via Explicit Feature Maps', Proc. CVPR, 2010.

[3] A. Vedaldi and A. Zisserman `Efficient Additive Kernels via Explicit Feature Maps', PAMI, 2011 (submitted).

See also: VL_HOMKERMAP(), VL_HELP().