dhog.h File Reference


Detailed Description

This module implements the Dense Histogram of Gradient descriptor (DHOG). The DHOG descriptor is equivalent or similar to a SIFT descriptor computed at each pixel of an image, for a fixed scale and orientation.

Each DHOG descriptor is a histogram of the gradient orientations inside a square image patch, very similar to a SIFT descriptor. The histogram has $N_s$ bins along each spatial direction x and y and $N_o$ bins along the orientation dimension. Each spatial bin corresponds to a portion of the image of size $\Delta$ pixels (along x and y), where $\Delta$ is an even number. In addition, bins are bilinearly interpolated and partially overlap. The following figure illustrates which pixels (tick marks along the axis) are associated with each of two bins, and with which weights (triangular signals).

dhog-bins.png

The top row shows bins of extent $\Delta=2$ whose centers are aligned to a pixel. The bottom row shows the same bins, but with centers that sit between two adjacent pixels. Notice that while the bin extent is always the same, in the first case the number of pixels that actually contribute to a bin (i.e. the ones with weight greater than zero) is $2\Delta - 1$, while in the second case it is $2\Delta$.

While both arrangements could be used, this implementation uses only the first (centers aligned to pixels), which is valid also for the case $\Delta=1$.
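
The pixel counts above can be checked with a small sketch. It assumes the triangular weight has the form $\max(0, 1 - |x - c| / \Delta)$ for a pixel at x and a bin centered at c; this form is inferred from the "triangular signals" in the figure and is not taken from dhog.c. The names bin_weight and count_contributing are hypothetical helpers, not part of the library.

```c
#include <assert.h>

/* Assumed triangular weight of pixel x for a bin centered at c with
   extent delta: max(0, 1 - |x - c| / delta). Hypothetical helper. */
static double bin_weight(double x, double c, int delta)
{
  double d = x - c;
  double w;
  assert(delta >= 1);
  if (d < 0.0) d = -d;              /* absolute distance from the bin center */
  w = 1.0 - d / (double)delta;
  return w > 0.0 ? w : 0.0;
}

/* Count the integer pixels in [lo, hi] whose weight is strictly positive. */
static int count_contributing(double c, int delta, int lo, int hi)
{
  int x, n = 0;
  for (x = lo; x <= hi; ++x) {
    if (bin_weight((double)x, c, delta) > 0.0) ++n;
  }
  return n;
}
```

Under this assumption, count_contributing(10.0, 2, 0, 20) gives 3 (that is, $2\Delta - 1$) for a center aligned to a pixel, and count_contributing(10.5, 2, 0, 20) gives 4 (that is, $2\Delta$) for a center between two pixels. With $\Delta = 1$ and an aligned center, only the center pixel contributes.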

Covering with descriptors and downsampling

DHOG extracts a dense collection of descriptors, one every few pixels. Since most calculations use convolutions, for simplicity we require the centers of all descriptor bins to lie at integer coordinates within the image boundaries.

The center of the top-left bin of the top-left descriptor has coordinates

\[ x_0(0) = 0,\qquad y_0(0) = 0 \]

The center of the top-left bin of any other descriptor is

\[ x_0(i) = x_0(0) + i \delta, \qquad y_0(j) = y_0(0) + j \delta \]

where $\delta$ is the sampling step (an integer not smaller than one). The center of the bottom-right bin is related to the center of the top-left one by

\[ x_{N_s-1}(i) = x_0(i) + \Delta (N_s - 1), \qquad y_{N_s-1}(j) = y_0(j) + \Delta (N_s - 1). \]

Notice that if $N_s = 1$ the two bins coincide. Our conditions translate into $0 \leq x_0(0) \leq x_{N_s-1}(i_{\mathrm{max}}) \leq \mathrm{width} - 1$. Hence

\[ 0 \leq i \leq \lfloor \frac{\mathrm{width} - 1 - \Delta (N_s - 1)}{\delta} \rfloor \]

and similarly for the y coordinate. Notice that the center of each descriptor is then given by

\[ x_{(N_s-1)/2}(i) = \delta i + \frac{\Delta (N_s - 1)}{2}, \qquad y_{(N_s-1)/2}(j) = \delta j + \frac{\Delta (N_s - 1)}{2}. \]
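
As a worked illustration of the bounds above, the following sketch computes the number of descriptors along one axis and the descriptor centers. The function names are hypothetical and are not part of the dhog.h API; the value $N_s = 4$ used in the example is an assumption (the SIFT-like choice), since this page does not state it.

```c
#include <assert.h>

/* Number of descriptors along one axis. The index i runs over 0..i_max with
   i_max = floor((size - 1 - bin_size * (num_bins - 1)) / step).
   Returns 0 when not even one descriptor fits. Hypothetical helper. */
static int num_descriptors(int size, int step, int bin_size, int num_bins)
{
  int span;
  assert(size >= 1 && step >= 1 && bin_size >= 1 && num_bins >= 1);
  span = size - 1 - bin_size * (num_bins - 1);
  if (span < 0) return 0;
  return span / step + 1;  /* span >= 0, so integer division is the floor */
}

/* Center of the first (top-left) bin of descriptor i along one axis. */
static int first_bin_center(int i, int step)
{
  return i * step;
}

/* Center of descriptor i along one axis; this is a half-integer when
   bin_size * (num_bins - 1) is odd. */
static double descriptor_center(int i, int step, int bin_size, int num_bins)
{
  return (double)(i * step) + 0.5 * (double)(bin_size * (num_bins - 1));
}
```

For example, with width = 100, $\delta = 4$, $\Delta = 4$ and the assumed $N_s = 4$, num_descriptors(100, 4, 4, 4) returns 22, so i ranges over 0..21. The last bin of the last descriptor is centered at 21 * 4 + 12 = 96, which is within the image, and the first descriptor is centered at 6.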

Author:
Andrea Vedaldi

Definition in file dhog.h.

#include "generic.h"



Data Structures

struct  VlDhogKeypoint_
 DHOG keypoint.
struct  VlDhogFilter_
 DHOG filter.

Functions

VL_EXPORT VlDhogFilter * vl_dhog_new (int width, int height, int sampling_step, int bin_size)
 Allocate and initialize a new DHOG filter.
VL_EXPORT void vl_dhog_delete (VlDhogFilter *f)
 Delete DHOG filter.
VL_EXPORT void vl_dhog_process (VlDhogFilter *f, float const *im, vl_bool fast)
 Compute Dense Feature Transform.
Retrieving data and parameters
VL_INLINE float * vl_dhog_get_descriptors (VlDhogFilter *f)
 Get descriptors.
VL_INLINE int vl_dhog_get_keypoint_num (VlDhogFilter *f)
 Get number of keypoints.
VL_INLINE VlDhogKeypoint * vl_dhog_get_keypoints (VlDhogFilter *f)
 Get keypoints.
VL_INLINE void vl_dhog_transpose_descriptor (float *dst, float const *src)
 Transpose descriptor.

Function Documentation

VL_EXPORT void vl_dhog_delete ( VlDhogFilter *  f  ) 

Parameters:
f filter to delete.

Definition at line 166 of file dhog.c.

References VlDhogFilter_::descr, VlDhogFilter_::hist, VlDhogFilter_::keys, VlDhogFilter_::tmp, VlDhogFilter_::tmp2, and vl_free().

float * vl_dhog_get_descriptors ( VlDhogFilter *  f  ) 

Parameters:
f DHOG filter.
Returns:
descriptors.

Definition at line 67 of file dhog.h.

References VlDhogFilter_::descr.

int vl_dhog_get_keypoint_num ( VlDhogFilter *  f  ) 

Parameters:
f DHOG filter.

Definition at line 89 of file dhog.h.

References VlDhogFilter_::nkeys.

VlDhogKeypoint * vl_dhog_get_keypoints ( VlDhogFilter *  f  ) 

Parameters:
f DHOG filter.

Definition at line 78 of file dhog.h.

References VlDhogFilter_::keys.

VL_EXPORT VlDhogFilter* vl_dhog_new ( int  width,
int  height,
int  sampling_step,
int  bin_size 
)

Parameters:
width 
height 
sampling_step step used to sample descriptors (must be 1 or a multiple of bin_size).
bin_size 
Returns:
new filter.

Definition at line 135 of file dhog.c.

References VlDhogFilter_::dheight, VlDhogFilter_::dwidth, VlDhogFilter_::height, VlDhogFilter_::hist, VlDhogFilter_::nkeys, vl_malloc(), and VlDhogFilter_::width.

VL_EXPORT void vl_dhog_process ( VlDhogFilter *  f,
float const *  im,
vl_bool  fast 
)

VL_INLINE void vl_dhog_transpose_descriptor ( float *  dst,
float const *  src 
)

Parameters:
dst destination buffer.
src source buffer.
The function writes to dst the transpose of the SIFT descriptor src. Let I be an image. The transpose operator satisfies the equation transpose(dhog(I,x,y)) = dhog(transpose(I),y,x).

Definition at line 108 of file dhog.h.