C API

dsift.h File Reference


Detailed Description

Author:
Andrea Vedaldi

Brian Fulkerson

Dense Scale Invariant Feature Transform

This module implements a dense version of SIFT. This is an object that can quickly compute descriptors for densely sampled keypoints with identical size and orientation. It can be reused for multiple images of the same size.

Overview

See also:
The SIFT module, Technical details
This module implements a fast algorithm for the calculation of a large number of SIFT descriptors of densely sampled keypoints of the same scale and orientation. See the SIFT section for an overview of SIFT.

The keypoints are indirectly specified by the sampling steps (vl_dsift_set_steps). The descriptor geometry (number and size of the spatial bins and number of orientation bins) can be customized (vl_dsift_set_geometry).

Figure (dsift-geom.png): Dense SIFT descriptor geometry
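As a rough illustration of how the geometry parameters determine the descriptor, the sketch below uses a hypothetical Geometry struct whose fields mirror the names this reference lists for VlDsiftDescriptorGeometry_ (numBinT, numBinX, numBinY, binSizeX, binSizeY). It assumes that a descriptor stores one value per (orientation, x, y) bin, so that its length is the product of the three bin counts. The helper names are illustrative and are not part of dsift.h.

```c
#include <assert.h>

/* Hypothetical stand-in for the descriptor geometry (not the library type). */
typedef struct {
    int numBinT;   /* number of orientation bins */
    int numBinX;   /* number of spatial bins along X */
    int numBinY;   /* number of spatial bins along Y */
    int binSizeX;  /* size of a spatial bin along X, in pixels */
    int binSizeY;  /* size of a spatial bin along Y, in pixels */
} Geometry;

/* Number of values in one descriptor: one per (t, i, j) bin. */
static int descriptor_size(const Geometry *g)
{
    assert(g && g->numBinT > 0 && g->numBinX > 0 && g->numBinY > 0);
    return g->numBinT * g->numBinX * g->numBinY;
}

/* Standard SIFT layout: 4 x 4 spatial bins and 8 orientation bins.
   The bin size is left to the caller. */
static Geometry standard_sift_geometry(int binSize)
{
    Geometry g = { 8, 4, 4, binSize, binSize };
    return g;
}
```

With the standard layout, descriptor_size gives 128 regardless of the bin size; the bin size only changes the image area that each descriptor covers.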

By default, SIFT uses a Gaussian windowing function that discounts contributions of gradients further away from the descriptor centers. This function can be changed to a flat window by invoking vl_dsift_set_flat_window, which greatly speeds up the calculation.

Keypoints are sampled in such a way that all bin centers are at integer coordinates within the image boundaries. vl_dsift_set_bounds can be used to further restrict sampling to the keypoints in an image subregion.

Remarks:
This descriptor is not equivalent to the one described by N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection", CVPR 2005. It is instead just a dense version of SIFT.

Usage

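DSIFT is implemented by a filter, i.e. an object which can be reused to sequentially process similar images. To use the DSIFT filter object:

Create a new filter with vl_dsift_new (or vl_dsift_new_basic).
Customize the parameters with vl_dsift_set_steps, vl_dsift_set_geometry, vl_dsift_set_bounds and vl_dsift_set_flat_window.
Process an image with vl_dsift_process.
Retrieve the number of keypoints (vl_dsift_get_keypoint_num), the keypoints (vl_dsift_get_keypoints) and the descriptors (vl_dsift_get_descriptors).
Optionally repeat for more images of the same size.
Delete the filter with vl_dsift_delete.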

Technical details

The calculation of the SIFT descriptor is discussed in the SIFT descriptor section and this section follows that notation.

Dense descriptors

When computing descriptors for many keypoints differing only by their position (and with null rotation), further simplifications are possible. In this case, in fact,

\begin{eqnarray*} \mathbf{x} &=& m \sigma \hat \mathbf{x} + T,\\ h(t,i,j) &=& m \sigma \int g_{\sigma_\mathrm{win}}(\mathbf{x} - T)\, w_\mathrm{ang}(\angle J(\mathbf{x}) - \theta_t)\, w\left(\frac{x - T_x}{m\sigma} - \hat{x}_i\right)\, w\left(\frac{y - T_y}{m\sigma} - \hat{y}_j\right)\, |J(\mathbf{x})|\, d\mathbf{x}. \end{eqnarray*}

Since many different values of T are sampled, this is conveniently expressed as a separable convolution. First, we translate by $ \mathbf{x}_{ij} = m\sigma(\hat x_i,\ \hat y_j)^\top $ and we use the symmetry of the various binning and windowing functions to write

\begin{eqnarray*} h(t,i,j) &=& m \sigma \int g_{\sigma_\mathrm{win}}(T' - \mathbf{x} - \mathbf{x}_{ij})\, w_\mathrm{ang}(\angle J(\mathbf{x}) - \theta_t)\, w\left(\frac{T'_x - x}{m\sigma}\right)\, w\left(\frac{T'_y - y}{m\sigma}\right)\, |J(\mathbf{x})|\, d\mathbf{x}, \\ T' &=& T + m\sigma \left[\begin{array}{cc} x_i \\ y_j \end{array}\right]. \end{eqnarray*}

Then we define kernels

\begin{eqnarray*} k_i(x) &=& \frac{1}{\sqrt{2\pi} \sigma_{\mathrm{win}}} \exp\left( -\frac{1}{2} \frac{(x-x_i)^2}{\sigma_{\mathrm{win}}^2} \right) w\left(\frac{x}{m\sigma}\right), \\ k_j(y) &=& \frac{1}{\sqrt{2\pi} \sigma_{\mathrm{win}}} \exp\left( -\frac{1}{2} \frac{(y-y_j)^2}{\sigma_{\mathrm{win}}^2} \right) w\left(\frac{y}{m\sigma}\right), \end{eqnarray*}

and obtain

\begin{eqnarray*} h(t,i,j) &=& (k_ik_j * \bar J_t)\left( T + m\sigma \left[\begin{array}{cc} x_i \\ y_j \end{array}\right] \right), \\ \bar J_t(\mathbf{x}) &=& w_\mathrm{ang}(\angle J(\mathbf{x}) - \theta_t)\,|J(\mathbf{x})|. \end{eqnarray*}

Furthermore, if we use a flat rather than Gaussian windowing function, the kernels do not depend on the bin, and we have

\begin{eqnarray*} k(z) &=& \frac{1}{\sigma_{\mathrm{win}}} w\left(\frac{z}{m\sigma}\right), \\ h(t,i,j) &=& (k(x)k(y) * \bar J_t)\left( T + m\sigma \left[\begin{array}{cc} x_i \\ y_j \end{array}\right] \right), \end{eqnarray*}

(here $ \sigma_\mathrm{win} $ is the side of the flat window).

Note:
In this case the binning functions $ k(z) $ are triangular, and by using integral signals the convolution can be computed in time independent of the support size of the filter (i.e. of the descriptor bin).
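The following sketch illustrates the idea in one dimension. It assumes the triangular window w(u) = max(0, 1 - |u|), an integer bin size, zero padding outside the signal, and it omits the constant factor 1/sigma_win. A triangular kernel of half-width binSize equals two box filters of width binSize applied in sequence (divided by binSize), and each box filter is a difference of two entries of an integral (prefix-sum) signal, so the cost per sample does not depend on binSize. All function names are hypothetical and are not part of dsift.h.

```c
#include <assert.h>
#include <stdlib.h>

/* prefix[i] = src[0] + ... + src[i-1], so prefix has n + 1 entries. */
static void build_prefix(const double *src, int n, double *prefix)
{
    prefix[0] = 0.0;
    for (int i = 0; i < n; ++i) prefix[i + 1] = prefix[i] + src[i];
}

/* Sum of src[lo..hi] (inclusive), treating samples outside [0, n) as zero. */
static double range_sum(const double *prefix, int n, int lo, int hi)
{
    if (lo < 0) lo = 0;
    if (hi > n - 1) hi = n - 1;
    if (lo > hi) return 0.0;
    return prefix[hi + 1] - prefix[lo];
}

/* Triangular filtering with two box passes over integral signals.
   Returns 0 on success and -1 if memory allocation fails. */
static int tri_filter_integral(const double *src, double *dst, int n, int binSize)
{
    assert(n > 0 && binSize > 0);
    int m = n + binSize - 1;   /* the first pass spills past the right edge */
    double *prefix = malloc((size_t)(m + 1) * sizeof *prefix);
    double *tmp = malloc((size_t)m * sizeof *tmp);
    if (!prefix || !tmp) { free(prefix); free(tmp); return -1; }

    /* Pass 1: causal box, tmp[j] = src[j - binSize + 1] + ... + src[j]. */
    build_prefix(src, n, prefix);
    for (int j = 0; j < m; ++j)
        tmp[j] = range_sum(prefix, n, j - (binSize - 1), j);

    /* Pass 2: anti-causal box, then normalize so that the peak weight is 1. */
    build_prefix(tmp, m, prefix);
    for (int i = 0; i < n; ++i)
        dst[i] = range_sum(prefix, m, i, i + binSize - 1) / binSize;

    free(prefix);
    free(tmp);
    return 0;
}

/* Reference: direct convolution with weights 1 - |d| / binSize. */
static void tri_filter_direct(const double *src, double *dst, int n, int binSize)
{
    for (int i = 0; i < n; ++i) {
        double acc = 0.0;
        for (int d = -(binSize - 1); d <= binSize - 1; ++d) {
            int j = i + d;
            int ad = d < 0 ? -d : d;
            if (j >= 0 && j < n) acc += (1.0 - (double)ad / binSize) * src[j];
        }
        dst[i] = acc;
    }
}
```

The direct version costs on the order of binSize operations per sample, while the integral version performs a fixed number of additions per sample; both should agree up to rounding.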

Sampling

To avoid resampling and dealing with special boundary conditions, we impose some mild restrictions on the geometry of the descriptors that can be computed. In particular, we impose that the bin centers $ T + m\sigma (x_i,\ y_j) $ are always at integer coordinates within the image boundaries. This eliminates the need for costly interpolation. This condition amounts to (expressed in terms of the x coordinate, and equally applicable to y)

\[ \{0,\dots, W-1\} \ni T_x + m\sigma x_i = T_x + m\sigma \left(i - \frac{N_x-1}{2}\right) = \bar T_x + m\sigma i, \qquad i = 0,\dots,N_x-1. \]

Notice that for this condition to be satisfied, the descriptor center $ T_x $ needs to be either fractional or integer depending on $ N_x $ being even or odd. To eliminate this complication, it is simpler to use as a reference not the descriptor center T, but the coordinates of the upper-left bin $ \bar T $. Thus we sample the latter on a regular (integer) grid

\[ \left[\begin{array}{c} 0 \\ 0 \end{array}\right] \leq \bar T = \left[\begin{array}{c} \bar T_x^{\min} + p \Delta_x \\ \bar T_y^{\min} + q \Delta_y \end{array}\right] \leq \left[\begin{array}{c} W - 1 - m\sigma (N_x - 1) \\ H - 1 - m\sigma (N_y - 1) \end{array}\right], \quad \bar T = \left[\begin{array}{c} T_x - m\sigma \frac{N_x - 1}{2} \\ T_y - m\sigma \frac{N_y - 1}{2} \end{array}\right] \]

and we impose that the bin size $ m \sigma $ is integer as well.
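A minimal sketch of this sampling rule along one axis is given below. It assumes that the only constraint is that every bin center of a descriptor falls inside the inclusive range [boundMin, boundMax], so the upper-left bin center can move over boundMax - boundMin - binSize * (numBin - 1) pixels. The helper names are hypothetical and are not part of dsift.h.

```c
#include <assert.h>

/* Number of grid positions of the upper-left bin center along one axis. */
static int count_positions(int boundMin, int boundMax,
                           int numBin, int binSize, int step)
{
    assert(numBin > 0 && binSize > 0 && step > 0);
    int span = (numBin - 1) * binSize;         /* first to last bin center */
    int range = boundMax - boundMin - span;    /* room left for sampling */
    return (range >= 0) ? range / step + 1 : 0;
}

/* Coordinate of the p-th upper-left bin center (always an integer). */
static int upper_left(int boundMin, int step, int p)
{
    return boundMin + p * step;
}

/* Descriptor center for a given upper-left bin center. It is fractional
   when binSize * (numBin - 1) is odd, which is why the grid is defined
   on the upper-left bin instead. */
static double descriptor_center(int upperLeft, int numBin, int binSize)
{
    return upperLeft + binSize * (numBin - 1) / 2.0;
}
```

For example, with an image 32 pixels wide (boundMin = 0, boundMax = 31), 4 bins of size 4 and a step of 8, the upper-left bin centers are at 0, 8 and 16; the next candidate, 24, would place the last bin center at 36, outside the image.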


Definition in file dsift.h.

#include "generic.h"



Data Structures

struct  VlDsiftKeypoint_
 Dense SIFT keypoint.
struct  VlDsiftDescriptorGeometry_
 Dense SIFT descriptor geometry.
struct  VlDsiftFilter_
 Dense SIFT filter.

Functions

VL_EXPORT VlDsiftFilter * vl_dsift_new (int width, int height)
 Create a new DSIFT filter.
VL_EXPORT VlDsiftFilter * vl_dsift_new_basic (int width, int height, int step, int binSize)
 Create a new DSIFT filter (basic interface).
VL_EXPORT void vl_dsift_delete (VlDsiftFilter *self)
 Delete DSIFT filter.
VL_EXPORT void vl_dsift_process (VlDsiftFilter *self, float const *im)
 Compute keypoints and descriptors.
VL_INLINE void vl_dsift_transpose_descriptor (float *dst, float const *src, int numBinT, int numBinX, int numBinY)
 Transpose descriptor.
VL_EXPORT void _vl_dsift_update_buffers (VlDsiftFilter *self)
Setting parameters
VL_INLINE void vl_dsift_set_steps (VlDsiftFilter *self, int stepX, int stepY)
 Set steps.
VL_INLINE void vl_dsift_set_bounds (VlDsiftFilter *self, int minX, int minY, int maxX, int maxY)
 Set bounds.
VL_INLINE void vl_dsift_set_geometry (VlDsiftFilter *self, VlDsiftDescriptorGeometry const *geom)
 Set SIFT descriptor geometry.
VL_INLINE void vl_dsift_set_flat_window (VlDsiftFilter *self, int useFlatWindow)
 Set flat window flag.
Retrieving data and parameters
VL_INLINE float const * vl_dsift_get_descriptors (VlDsiftFilter const *self)
 Get descriptors.
VL_INLINE int vl_dsift_get_descriptor_size (VlDsiftFilter const *self)
 Get descriptor size.
VL_INLINE int vl_dsift_get_keypoint_num (VlDsiftFilter const *self)
 Get number of keypoints.
VL_INLINE VlDsiftKeypoint const * vl_dsift_get_keypoints (VlDsiftFilter const *self)
 Get keypoints.
VL_INLINE void vl_dsift_get_bounds (VlDsiftFilter const *self, int *minX, int *minY, int *maxX, int *maxY)
 Get bounds.
VL_INLINE void vl_dsift_get_steps (VlDsiftFilter const *self, int *stepX, int *stepY)
 Get steps.
VL_INLINE VlDsiftDescriptorGeometry const * vl_dsift_get_geometry (VlDsiftFilter const *self)
 Get SIFT descriptor geometry.
VL_INLINE vl_bool vl_dsift_get_flat_window (VlDsiftFilter const *self)
 Get flat window flag.

Function Documentation

VL_EXPORT void _vl_dsift_update_buffers ( VlDsiftFilter *  self  )

For internal use only.

Definition at line 329 of file dsift.c.

Referenced by _vl_dsift_alloc_buffers(), vl_dsift_new(), vl_dsift_set_bounds(), vl_dsift_set_geometry(), and vl_dsift_set_steps().

VL_EXPORT void vl_dsift_delete ( VlDsiftFilter *  self  )

Parameters:
self filter to delete.

Definition at line 472 of file dsift.c.

References _vl_dsift_free_buffers(), and vl_free().

void vl_dsift_get_bounds ( VlDsiftFilter const *  self,
int *  minX,
int *  minY,
int *  maxX,
int *  maxY 
)

Parameters:
self DSIFT filter.
minX bounding box minimum X coordinate.
minY bounding box minimum Y coordinate.
maxX bounding box maximum X coordinate.
maxY bounding box maximum Y coordinate.

Definition at line 189 of file dsift.h.

int vl_dsift_get_descriptor_size ( VlDsiftFilter const *  self  ) 

Parameters:
self DSIFT filter.
Returns:
size of a descriptor.

Definition at line 125 of file dsift.h.

Referenced by _vl_dsift_alloc_buffers(), _vl_dsift_with_flat_window(), _vl_dsift_with_gaussian_window(), and vl_dsift_process().

float const * vl_dsift_get_descriptors ( VlDsiftFilter const *  self  ) 

Parameters:
self DSIFT filter.
Returns:
descriptors.

Definition at line 137 of file dsift.h.

int vl_dsift_get_flat_window ( VlDsiftFilter const *  self  ) 

Parameters:
self DSIFT filter.
Returns:
TRUE if the DSIFT filter uses a flat window.

Definition at line 205 of file dsift.h.

VlDsiftDescriptorGeometry const * vl_dsift_get_geometry ( VlDsiftFilter const *  self  ) 

Parameters:
self DSIFT filter.
numBinT 
numBinX 
numBinY 
binSizeX 
binSizeY 

Definition at line 174 of file dsift.h.

Referenced by vl_dsift_new_basic().

int vl_dsift_get_keypoint_num ( VlDsiftFilter const *  self  ) 

Parameters:
self DSIFT filter.

Definition at line 159 of file dsift.h.

Referenced by _vl_dsift_alloc_buffers().

VlDsiftKeypoint const * vl_dsift_get_keypoints ( VlDsiftFilter const *  self  ) 

Parameters:
self DSIFT filter.

Definition at line 148 of file dsift.h.

void vl_dsift_get_steps ( VlDsiftFilter const *  self,
int *  stepX,
int *  stepY 
)

Parameters:
self DSIFT filter.
stepX sampling step along X.
stepY sampling step along Y.

Definition at line 218 of file dsift.h.

VL_EXPORT VlDsiftFilter* vl_dsift_new ( int  imWidth,
int  imHeight 
)

Parameters:
imWidth width of the image.
imHeight height of the image.
Returns:
new filter.

Definition at line 400 of file dsift.c.

References _vl_dsift_update_buffers(), VL_FALSE, and vl_malloc().

Referenced by vl_dsift_new_basic().

VL_EXPORT VlDsiftFilter* vl_dsift_new_basic ( int  imWidth,
int  imHeight,
int  step,
int  binSize 
)

Parameters:
imWidth width of the image.
imHeight height of the image.
step sampling step.
binSize bin size.
The descriptor geometry matches the standard SIFT descriptor.

Returns:
new filter.

Definition at line 454 of file dsift.c.

References VlDsiftDescriptorGeometry_::binSizeX, VlDsiftDescriptorGeometry_::binSizeY, vl_dsift_get_geometry(), vl_dsift_new(), vl_dsift_set_geometry(), and vl_dsift_set_steps().

void vl_dsift_set_bounds ( VlDsiftFilter *  self,
int  minX,
int  minY,
int  maxX,
int  maxY 
)

Parameters:
self DSIFT filter.
minX bounding box minimum X coordinate.
minY bounding box minimum Y coordinate.
maxX bounding box maximum X coordinate.
maxY bounding box maximum Y coordinate.

Definition at line 253 of file dsift.h.

References _vl_dsift_update_buffers().

void vl_dsift_set_flat_window ( VlDsiftFilter *  self,
int  useFlatWindow 
)

Parameters:
self DSIFT filter.
useFlatWindow true if the DSIFT filter should use a flat window.

Definition at line 284 of file dsift.h.

void vl_dsift_set_geometry ( VlDsiftFilter *  self,
VlDsiftDescriptorGeometry const *  geom 
)

Parameters:
self DSIFT filter.
geom descriptor geometry parameters.

Definition at line 270 of file dsift.h.

References _vl_dsift_update_buffers().

Referenced by vl_dsift_new_basic().

void vl_dsift_set_steps ( VlDsiftFilter *  self,
int  stepX,
int  stepY 
)

Parameters:
self DSIFT filter.
stepX sampling step along X.
stepY sampling step along Y.

Definition at line 234 of file dsift.h.

References _vl_dsift_update_buffers().

Referenced by vl_dsift_new_basic().

VL_INLINE void vl_dsift_transpose_descriptor ( float *  dst,
float const *  src,
int  numBinT,
int  numBinX,
int  numBinY 
)

Parameters:
dst destination buffer.
src source buffer.
numBinT number of orientation bins.
numBinX number of spatial bins along X.
numBinY number of spatial bins along Y.
The function writes to dst the transpose of the SIFT descriptor src. Let I be an image. The transpose operator satisfies the equation transpose(dsift(I,x,y)) = dsift(transpose(I),y,x).

Definition at line 306 of file dsift.h.
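One way such a transpose could be written is sketched below; it is not taken from dsift.h. The sketch assumes that a descriptor is stored with the orientation index varying fastest, then X, then Y, and that orientation bin t is centered at the angle 2*pi*t/numBinT. Transposing the image swaps the two gradient components, which maps an angle theta to pi/2 - theta, so bin t moves to bin (numBinT/4 - t) modulo numBinT; this is exact only when numBinT is a multiple of 4. The function name is hypothetical.

```c
#include <assert.h>

/* Writes to dst the descriptor that the transposed image would produce.
   dst has numBinY bins along X and numBinX bins along Y. */
static void transpose_descriptor_sketch(float *dst, const float *src,
                                        int numBinT, int numBinX, int numBinY)
{
    assert(numBinT > 0 && numBinT % 4 == 0);
    for (int y = 0; y < numBinY; ++y) {
        for (int x = 0; x < numBinX; ++x) {
            int offset  = numBinT * (x + y * numBinX);  /* bin (x, y) in src */
            int offsetT = numBinT * (y + x * numBinY);  /* bin (y, x) in dst */
            for (int t = 0; t < numBinT; ++t) {
                /* Reflect the orientation about the diagonal. */
                int tT = (numBinT / 4 - t + numBinT) % numBinT;
                dst[offsetT + tT] = src[offset + t];
            }
        }
    }
}
```

Under these assumptions the operation is its own inverse: applying it a second time, with numBinX and numBinY exchanged, restores the original descriptor.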