HOG features

HOG features are widely used for object detection. HOG decomposes an image into small square cells, computes a histogram of oriented gradients in each cell, normalizes the result using a block-wise pattern, and returns a descriptor for each cell.

Stacking the cells in a square image region yields an image window descriptor that can be used for object detection, for example by means of an SVM.

This tutorial shows how to use the VLFeat function vl_hog to compute HOG features of various kinds and how to manipulate them.

Basic HOG computation

We start by considering an example input image:

An example image.

HOG is computed by calling the vl_hog function:

cellSize = 8 ;
hog = vl_hog(im, cellSize, 'verbose') ;
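The call above assumes that im is already in memory. According to the vl_hog documentation, the image may be grayscale or colour but must be stored in SINGLE precision. A minimal sketch of preparing the input before the call, where the file name example.jpg is a placeholder chosen here for illustration and is not shipped with VLFeat:

```matlab
% Hypothetical file name; replace it with the path to your own image.
imageFile = 'example.jpg' ;

if exist(imageFile, 'file')
  % Read the image and convert it to single precision in the range [0,1].
  im = im2single(imread(imageFile)) ;
else
  % Synthetic stand-in so that the sketch still runs without the file.
  im = rand(128, 128, 'single') ;
end

% vl_hog expects SINGLE storage class.
assert(isa(im, 'single')) ;
```

Passing a uint8 or double image directly is a common source of errors, so the conversion is worth doing explicitly.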

The same function can also be used to generate a pictorial rendition of the features, although this unavoidably destroys some of the information contained in the feature itself. To this end, use the render command:

imhog = vl_hog('render', hog, 'verbose') ;
clf ; imagesc(imhog) ; colormap gray ;

This should produce the following image:

Standard HOG features with a cell size of eight pixels.

HOG is an array of cells, with the third dimension spanning feature components:

> size(hog)

ans =

    16    16    31
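Individual cells and windows of cells can be read off by plain array indexing. The sketch below uses a random stand-in array of the same size as the output above, so that it runs without VLFeat; the cell position and the 8×8 window size are arbitrary values chosen for illustration.

```matlab
% Stand-in with the same size as the output above; in practice use the
% array returned by vl_hog.
hog = rand(16, 16, 31, 'single') ;

% Descriptor of the cell in row 3, column 5: one value per feature component.
cellDescriptor = squeeze(hog(3, 5, :)) ;

% Descriptor of an 8 x 8 window of cells whose top-left cell is (1,1),
% stacked into one column vector (for example as input to a linear SVM).
windowSize = 8 ;
window = hog(1:windowSize, 1:windowSize, :) ;
x = window(:) ;

fprintf('cell: %d values, window: %d values\n', numel(cellDescriptor), numel(x)) ;
```

Here the window descriptor has 8 × 8 × 31 = 1984 components.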

In this case the feature has 31 dimensions. HOG exists in many variants. VLFeat supports two: the UoCTTI variant (used by default) and the original Dalal-Triggs variant (with 2×2 square HOG blocks for normalization). The main difference is that the UoCTTI variant computes both directed and undirected gradients as well as a four-dimensional texture-energy feature, but projects the result down to 31 dimensions. Dalal-Triggs instead works with undirected gradients only and does not apply any compression, for a total of 36 dimensions. The Dalal-Triggs variant can be computed as:
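The two dimension counts can be checked by hand for the default of nine orientations. The formulas below are one reading of the description above (three histograms' worth of bins plus four texture components for UoCTTI, and one undirected histogram per normalization block for Dalal-Triggs); treat them as an assumption about how the totals arise rather than as a statement of the implementation.

```matlab
numOrientations = 9 ;   % default number of orientation bins

% UoCTTI: directed (2 x 9) + undirected (9) + texture energy (4)
dimUoctti = 2 * numOrientations + numOrientations + 4 ;

% Dalal-Triggs: the undirected histogram (9 bins) under each of the 4
% normalizations given by the 2 x 2 blocks that contain the cell
dimDalalTriggs = 4 * numOrientations ;

fprintf('UoCTTI: %d, Dalal-Triggs: %d\n', dimUoctti, dimDalalTriggs) ;
```

This prints 31 and 36, matching the sizes quoted above.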

% Dalal-Triggs variant
cellSize = 8 ;
hog = vl_hog(im, cellSize, 'verbose', 'variant', 'dalaltriggs') ;
imhog = vl_hog('render', hog, 'verbose', 'variant', 'dalaltriggs') ;

The result is visually very similar:

Dalal-Triggs variant. Differences from the standard version are difficult to appreciate in the rendition.

Flipping HOG from left to right

Often it is necessary to flip HOG features from left to right (for example in order to model an axis symmetric object). This can be obtained analytically from the feature itself by permuting the histogram dimensions appropriately. The permutation is obtained as follows:

% Get permutation to flip a HOG cell from left to right
perm = vl_hog('permutation') ;
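Because the permutation reorders the feature components, it has to match the layout of the features it is applied to. The sketch below assumes that the 'permutation' command accepts the same 'variant' option as the feature computation; this is an assumption worth checking against the vl_hog help text. The random image is a stand-in.

```matlab
cellSize = 8 ;
im = rand(64, 96, 'single') ;   % synthetic stand-in; sides are multiples of cellSize

% Assumption: the permutation is requested with the same options that were
% used to compute the features.
hogDT  = vl_hog(im, cellSize, 'variant', 'dalaltriggs') ;
permDT = vl_hog('permutation', 'variant', 'dalaltriggs') ;

% The permutation acts on the third dimension, so the lengths should agree.
assert(numel(permDT) == size(hogDT, 3)) ;

% Flip the grid of cells horizontally and permute the histogram bins.
flippedHogDT = hogDT(:, end:-1:1, permDT) ;
```

Applying a permutation obtained for one variant to features computed with another would either fail or silently mix up the components.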

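Then the following two approaches produce identical results (provided that the image contains an exact number of cells):

hog = vl_hog(im, cellSize) ;                                % default (UoCTTI) features
hogFromFlippedImage = vl_hog(im(:,end:-1:1,:), cellSize) ;  % flip the image, then compute HOG
flippedHog = hog(:,end:-1:1,perm) ;                         % flip the HOG features directly
imHog = vl_hog('render', hog) ;
imHogFromFlippedImage = vl_hog('render', hogFromFlippedImage) ;
imFlippedHog = vl_hog('render', flippedHog) ;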

This is shown in the figure:

Flipping HOG features from left to right either by flipping the input image or the features directly.

Other HOG parameters

vl_hog supports other parameters as well. For example, one can specify the number of orientations in the histograms by the numOrientations option:

% Specify the number of orientations
hog = vl_hog(im, cellSize, 'verbose', 'numOrientations', o) ;
imhog = vl_hog('render', hog, 'verbose', 'numOrientations', o) ;
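In the snippet above, o stands for the chosen number of orientations. A minimal sketch of how several values might be compared side by side; the list of values, the random stand-in image, and the subplot layout are choices made here for illustration.

```matlab
cellSize = 8 ;
im = rand(128, 128, 'single') ;   % synthetic stand-in; use your own image in practice
orientations = [3 4 5 9 21] ;     % values to compare

clf ;
for k = 1:numel(orientations)
  o = orientations(k) ;

  % The same numOrientations value is passed to both the computation and
  % the rendering, so that the renderer interprets the bins correctly.
  hog = vl_hog(im, cellSize, 'numOrientations', o) ;
  imhog = vl_hog('render', hog, 'numOrientations', o) ;

  subplot(1, numel(orientations), k) ;
  imagesc(imhog) ; axis image off ; colormap gray ;
  title(sprintf('numOrientations = %d', o)) ;
end
```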

Changing the number of orientations changes the features quite significantly:

HOG features for numOrientations equal to 3, 4, 5, 9, and 21 respectively.

Another useful option is BilinearOrientations, which switches on bilinear (soft) orientation assignment of the gradient (this is not used in certain implementations, such as UoCTTI).

% Switch on bilinear (soft) orientation assignment
hog = vl_hog(im, cellSize, 'numOrientations', 4, 'bilinearOrientations') ;
imhog = vl_hog('render', hog, 'numOrientations', 4) ;

resulting in

From left to right: input image, hard orientation assignments for numOrientations equal to four, and soft orientation assignments.