Local feature frames

This page introduces the notion of local feature frame used extensively in VLFeat. A feature frame, or simply a frame, is a geometric object such as a point, a circle, or an ellipse representing the location and shape of an image feature. Frame types are closed under certain classes of transformations of the plane (for example, circles are closed under similarity transformations) and can be used in the corresponding covariant feature detectors.

Types of frames

VLFeat uses five types of frames:

  • points defined by their center $(x,y)$;
  • circles defined by their center $(x,y)$ and radius $\sigma$;
  • ellipses defined by their center $T = (x,y)$, and a positive definite matrix $\Sigma$ such that the ellipse is the set of points $\{\bx \in \real^2: (\bx-T)^\top\Sigma^{-1}(\bx-T)=1\}$;
  • oriented circles defined by their center $(x,y)$, their radius $\sigma$, and rotation $\theta$;
  • and oriented ellipses defined by an affine transformation $(A,T)$, where $A\in\real^{2\times2}$ is the linear component and $T\in\real^2$ the translation.

A frame of each of these types can then be represented by 2, 3, 5, 4, or 6 numbers respectively, packed into a frame vector using the conventions detailed in vl_plotframe.
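The packing can be sketched as follows. The layouts for oriented circles and oriented ellipses match the code fragments used later on this page; the ellipse layout (the three distinct entries of $\Sigma$ in the order $\Sigma_{11}, \Sigma_{12}, \Sigma_{22}$) is assumed here from the vl_plotframe conventions and should be checked against its documentation. All numeric values and variable names are made up for illustration.

```matlab
% Illustrative values only; the variable names are not part of VLFeat.
T     = [10 ; 20] ;          % center (x,y)
sigma = 3 ;                  % radius
theta = pi/6 ;               % rotation
Sigma = [4 1 ; 1 2] ;        % symmetric positive definite shape matrix
A     = [2 0 ; 0.5 1] ;      % linear part of an affine transformation

f_point   = T ;                                           % 2 numbers
f_circle  = [T ; sigma] ;                                 % 3 numbers
f_ellipse = [T ; Sigma(1,1) ; Sigma(1,2) ; Sigma(2,2)] ;  % 5 numbers (assumed order)
f_ocircle = [T ; sigma ; theta] ;                         % 4 numbers
f_oell    = [T ; A(:)] ;                                  % 6 numbers, A stacked by columns

% Each frame type has a distinct vector length.
lengths = [numel(f_point) numel(f_circle) numel(f_ellipse) numel(f_ocircle) numel(f_oell)] ;
disp(lengths) ;
```

Because every type has a different length, the number of rows of a frame vector is enough to tell the frame types apart.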

Feature frames as geometric frames

The purpose of a frame is twofold. First, it specifies a local image region. Second, and perhaps more importantly, it specifies an image transformation. A frame instance can in fact be thought of as a transformed variant of a canonical or standard frame.

For example, a point $(x,y)$ can be seen as the translated version of the origin $(0,0)$ taken as canonical point frame. Likewise, a circle with center $(x,y)$ and radius $\sigma$ can be seen as the translated and rescaled version of a unit circle centered at the origin, taken as canonical circular frame.

In general, different classes of frames are closed under different classes of 2D transformations. For instance, points are closed under all transformations, while circles are closed under translations, rigid motions, and similarities, but not under general affine transformations. Within a class of compatible transformations, a frame specifies a transformation uniquely if it can be obtained from the standard frame in only one way. For instance, a point $(x,y)$ can be obtained from $(0,0)$ through a unique translation $T=(x,y)$. Likewise, a circle can be obtained from the standard circle by a unique translation and rescaling. However, neither a point nor a circle is sufficient to fully specify a similarity transformation (e.g. a circle leaves the rotation undetermined).

Since frames specify transformations of the image domain, i.e. coordinate changes, they are surrogates of geometric reference frames. In particular, the mapping from a standard frame to one measured by a local feature detector is often undone to normalize the local image appearance, a key process in the computation of invariant feature descriptors.
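As a minimal sketch of this idea, the fragment below treats a frame as the affine map $\bx = A\mathbf{u} + T$ from standard coordinates $\mathbf{u}$ to image coordinates $\bx$, and undoes it with $\mathbf{u} = A^{-1}(\bx - T)$. The values of A and T and the variable names are illustrative assumptions, and only point coordinates are normalized here, not image pixels.

```matlab
% Illustrative affine frame (A,T); not taken from a real detector.
A = [2 0 ; 1 3] ;
T = [5 ; 7] ;

% Sample the standard frame: points on the unit circle centered at the origin.
t = linspace(0, 2*pi, 9) ;
u = [cos(t) ; sin(t)] ;

% Forward map: standard frame -> image frame.
x = A * u + repmat(T, 1, size(u,2)) ;

% Inverse map (normalization): image frame -> standard frame.
u_back = A \ (x - repmat(T, 1, size(x,2))) ;

% The recovered points lie on the unit circle again (up to rounding error).
disp(max(abs(sum(u_back.^2, 1) - 1))) ;
```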

Oriented frames

While unoriented frames (points, circles, and ellipses) are easy to understand, a few words should be spent illustrating their oriented variants. Intuitively, an oriented circle (ellipse) is a circle (ellipse) with a radius representing its orientation, such as the following:

The standard oriented frame: a unit circle, centered at the origin, with a radius pointing downwards. This frame can be seen as an oriented disc with null translation, unit radius, and null rotation, encoded as the 4D vector [0;0;1;0]; alternatively, it can be seen as an oriented ellipse with affine transformation $(I,0)$ encoded as a 6D vector [0;0;1;0;0;1]. Figure generated by vl_demo_frame.

This figure was generated by using the vl_plotframe function:

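A = eye(2) ;          % identity linear component
T = [0;0] ;           % null translation
f = [T ; A(:)] ;      % oriented ellipse: center followed by the columns of A
vl_plotframe(f) ;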

This particular oriented frame is conventionally deemed to be standard and, as shown in the code fragment above, it corresponds to the identity affine transformation. Since this ellipse is also a circle, the frame can equivalently be represented by an oriented circle with unit radius and null orientation:

radius = 1 ;
theta = 0 ;
f = [T ; radius ; theta] ;
vl_plotframe(f) ;

A positive rotation of the frame appears clockwise because the image coordinate system is left-handed (Y axis pointing downwards):

A frame rotated by 45 degrees; note that the rotation is clockwise: this is because the image uses a left-handed coordinate system (Y axis pointing downwards). Figure generated by vl_demo_frame.

radius = 1 ;
theta = pi/4 ;
f = [T ; radius ; theta] ;
vl_plotframe(f) ;

As indicated above, frames are often used to specify image transformations. In particular, oriented ellipses and oriented circles can be obtained by a unique affine transformation of the standard oriented circle shown above (the difference is that, unlike oriented ellipses, oriented circles are not closed under all affine transformations).
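For an oriented circle the affine transformation is implicit in the radius and the rotation. A plausible reading, assuming the usual 2D rotation matrix $R(\theta)$ (an assumption, not stated on this page), is $A = \sigma R(\theta)$. The sketch below builds the equivalent oriented ellipse vector under that assumption and checks where the downward-pointing radius of the standard frame ends up.

```matlab
T      = [0 ; 0] ;
radius = 1 ;
theta  = pi/4 ;

% Assumed convention: A = radius * R(theta), with R the usual rotation matrix.
R = [cos(theta) -sin(theta) ; sin(theta) cos(theta)] ;
A = radius * R ;

f_ocircle = [T ; radius ; theta] ;   % oriented circle, 4 numbers
f_oell    = [T ; A(:)] ;             % same frame as an oriented ellipse, 6 numbers

% Image of the standard radius, which points along +Y (downwards in the image).
tip = A * [0 ; 1] + T ;
disp(tip') ;
```

With theta = pi/4 the tip of the radius moves from (0, 1) to about (-0.71, 0.71), that is, down and to the left, which reads as a clockwise rotation when the Y axis points downwards.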

For the oriented ellipse, this affine transformation $(A,T)$ is encoded explicitly in the frame vector used to represent it numerically. For example, the code fragment

f = [T ; A(:)] ;
vl_plotframe(f) ;

produces the plot

An oriented ellipse is specified as the affine transformation $(A,T)$ of the standard oriented frame shown above. Figure generated by vl_demo_frame.

Note that, when features extracted by a detector such as vl_covdet or vl_sift are normalized, this is done by applying the affine transformation that is the inverse of the one specified by the feature frame; in this way, the frame is transformed back to its standardized version.

Similarly, unoriented frames can all be seen as affine transformations of the standard unoriented frame (the unit circle centered at the origin). In this case, however, the affine transformation $(A,T)$ is determined only up to a rotation $R$: any $(AR, T)$ produces the same frame. When this ambiguity exists and an affine transformation $(A,T)$ needs to be selected, it is customary to choose $R$ such that the Y axis of the image is mapped onto itself (see below).
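The ambiguity can be checked numerically. An unoriented ellipse with shape matrix $\Sigma$ is the image of the unit circle under any $A$ with $AA^\top = \Sigma$, and replacing $A$ by $AR$ for a rotation $R$ leaves $AA^\top$ unchanged. The values below are illustrative.

```matlab
% Illustrative linear component and an arbitrary rotation angle.
A   = [2 0 ; 1 3] ;
phi = 0.7 ;
R   = [cos(phi) -sin(phi) ; sin(phi) cos(phi)] ;

Sigma1 = A * A' ;               % shape matrix of the ellipse traced by A
Sigma2 = (A * R) * (A * R)' ;   % shape matrix after inserting the rotation R

% The two shape matrices agree up to rounding error.
disp(norm(Sigma1 - Sigma2)) ;
```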

Converting between frame types

The function vl_frame2oell can be used to convert any frame type to an oriented ellipse.

Since all oriented frames are special cases of oriented ellipses, this transformation is trivial for oriented circles and ellipses. On the other hand, rewriting unoriented frames as oriented ellipses requires assigning (arbitrarily) an orientation to them.

By default, when an arbitrary orientation has to be selected in the conversion, this is done in such a way that the affine transformation $(A,T)$ is upright. This means that $A$ maps the Y axis to itself:

\[ A\begin{bmatrix}0\\ 1\end{bmatrix} \propto \begin{bmatrix}0\\ 1\end{bmatrix}. \]
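One way to obtain such an upright $A$ from an unoriented ellipse with shape matrix $\Sigma$ is to take a lower-triangular factor with $AA^\top = \Sigma$, for example with MATLAB's chol(Sigma, 'lower'). This is a sketch of the idea under that assumption and not necessarily how vl_frame2oell is implemented; the numeric values are illustrative.

```matlab
T     = [3 ; 4] ;
Sigma = [4 1 ; 1 2] ;           % symmetric positive definite shape matrix

% Lower-triangular factor: A * A' equals Sigma and A(1,2) is zero.
A = chol(Sigma, 'lower') ;

% The Y direction is mapped to a positive multiple of itself.
y_image = A * [0 ; 1] ;

f_oell = [T ; A(:)] ;           % oriented ellipse vector, 6 numbers
disp(y_image') ;
```

Because the factor is lower triangular with a positive diagonal, the first component of y_image is exactly zero and the second is positive, so the radius of the converted frame keeps pointing downwards.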

This effect can be better understood by starting from some oriented frames, removing the orientation, and then using vl_frame2oell to convert them back to oriented ellipses. In the process, the orientation information is lost and replaced by a conventional orientation:

Top: randomly sampled oriented ellipses. Middle: the same ellipses with the orientation removed. Bottom: oriented ellipses again, obtained by calling vl_frame2oell; note that the orientation is upright. Figure generated by vl_demo_frame.