VLBenchmarks is organised into four parts, corresponding to an equal number of MATLAB packages (namespaces):
- localFeatures. This package contains wrappers for feature detectors and descriptors. Add your own wrapper here to evaluate your features.
- datasets. This package contains code that manages (downloads and reads) benchmark data. The most common use is to adopt one of the supported standard benchmarks, but you may want to add a wrapper for your own dataset here.
- benchmarks. This package contains the benchmarking code.
- helpers. This package contains classes and functions used in the rest of the project.
Options of functions and of class objects are passed as optional function/constructor arguments in the form of 'OptionName',OptionValue pairs.
Available options are listed in the help string of each function or class. It is not possible to change the options of an already constructed object.
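As a minimal sketch of this convention, the snippet below constructs a feature extractor with one option. It assumes VLBenchmarks is installed and on the MATLAB path, and that the extractor accepts the logger option 'VerboseLevel' through its constructor (as described for classes inheriting from helpers.Logger); the variable names are illustrative.

```matlab
% Sketch only: assumes VLBenchmarks is installed and on the MATLAB path.
% 'VerboseLevel' is the logger option described in the logging part of
% this page; the value 2 corresponds to DEBUG output.
detector = localFeatures.VlFeatSift('VerboseLevel', 2);

% Options are fixed at construction time. To use a different value,
% construct a new object instead of modifying the existing one.
quietDetector = localFeatures.VlFeatSift('VerboseLevel', 0);
```

Because options cannot be changed afterwards, keeping one object per configuration is the simplest way to compare settings.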
All classes in the localFeatures package are subclasses of GenericLocalFeaturesExtractor. The following feature extractor wrappers are already implemented in VLBenchmarks:
Class name | Extracts feat. | Extracts descr. | WIN | LNX | OS X | Note |
---|---|---|---|---|---|---|
VlFeatSift | Y | Y | Y | Y | Y | Built-in [1], DoG detector, SIFT descriptor |
VlFeatCovdet | Y | Y | Y | Y | Y | Built-in, DoG, Hessian and Harris detectors and their variants. |
VlFeatMser | Y | N | Y | Y | Y | Built-in [1] |
VggAffine | Y | N | N | Y | N | Affine feature frame detection [2] |
VggDescriptor | N | Y | N | Y | N | Descriptor calculation from [2] |
Ebr | Y | N | N | Y | N | Edge based region detector [2] |
Ibr | Y | N | N | Y | N | Intensity based region detector [2] |
CmpBinHessian | Y | N | N | Y | N | Hessian affine [3] |
The VLBenchmarks framework is easily extensible with your own image feature extraction algorithm. Your feature extractor wrapper has to be a subclass of the GenericLocalFeaturesExtractor class and has to implement the method extractFeatures(imgPath) and/or extractDescriptors(imgPath, frames), depending on the abilities of your algorithm.
You can use the existing infrastructure of the benchmark. For example, by inheriting from GenericLocalFeatureExtractor you also inherit the helpers.Logger class, which implements a simple logger. See Logging for details.
Another helper class, used with most of the built-in detectors, is helpers.GenericInstaller. This class handles the installation process and lets a class declare dependencies on web-located archives, MEX files and other classes.
This example shows a feature extractor which supports both feature frame detection and descriptor calculation. This class supports downloading and compiling mex source code of the detector.
The method extractFeatures(imgPath) can be called with one output argument when only feature frames need to be detected. When called with two output arguments, it extracts both feature frames and descriptors. This may seem to duplicate the extractDescriptors() method; however, some detectors do not support computing descriptors of given frames.
To cache your detected features, you can use the obj.loadFeatures() and obj.storeFeatures() methods. However, these methods require you to implement the method obj.getSignature(), which generates a unique string signature of the detector parameters. It is also useful to include the signature of the detector binary file (helpers.fileSignature). Caching can be enabled and disabled with the methods obj.enableCaching() and obj.disableCaching().
For details about logging, class options and the installation framework, see the localFeatures.ExampleLocalFeatureExtractor class.
All datasets are subclasses of the abstract class GenericDataset, which defines the method getImagePath(imageNumber) to access dataset images. The number of images in the dataset can be obtained from the object property obj.NumImages.
The repeatability benchmark needs datasets which inherit from the class GenericTransfDataset. This benchmark needs the method getTransformation(imageNumber), which returns the homography between a dataset image and a reference image (the first image).
A wrapper of the datasets from [2] is implemented in the class VggAffineDataset, which is a subclass of GenericTransfDataset. The dataset data are downloaded from the VGG website.
Available categories (accessible through the option 'Category') are:
Category Name | 'Category' option value | Image transformation |
---|---|---|
Wall | 'wall' | Viewpoint angle |
Boat | 'boat' | Scale changes |
Bark | 'bark' | Scale changes |
Bikes | 'bikes' | Increasing blur |
Trees | 'trees' | Increasing blur |
Leuven | 'leuven' | Decreasing light |
UBC | 'ubc' | JPEG compression |
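The dataset interface described above can be sketched as follows. The snippet assumes VLBenchmarks is on the MATLAB path and uses only the class, option, methods and property named in the text; constructing the dataset may trigger a download of the image data, and the variable names are illustrative.

```matlab
% Sketch only: assumes VLBenchmarks is on the MATLAB path. Constructing
% the dataset may download the image data on first use.
dataset = datasets.VggAffineDataset('Category', 'boat');

refImagePath = dataset.getImagePath(1);   % reference (first) image
fprintf('Reference image: %s\n', refImagePath);

for imageNumber = 2:dataset.NumImages
  imagePath = dataset.getImagePath(imageNumber);
  % Homography relating this image and the reference image.
  tf = dataset.getTransformation(imageNumber);
  fprintf('%s: transformation of size %dx%d\n', ...
          imagePath, size(tf, 1), size(tf, 2));
end
```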
For the retrieval benchmark, VggRetrievalDataset wraps the datasets introduced in [4] and [5]. These datasets provide both a set of images and a set of queries. Each query is specified by the query image, a bounding box and three subsets of images.
A query is accessible through the method getQuery(queryNum), which returns a structure describing the query.
The image subsets are numeric arrays with image IDs. To obtain the path of a certain image from these subsets, call getImagePath(imageId) with its ID.
The number of queries in the dataset is stored in the property obj.NumQueries.
Currently, two dataset categories are available:
Category Name | 'Category' option value | Number of images | Number of queries | Source |
---|---|---|---|---|
The Oxford Buildings Dataset [4] | 'oxbuild' | 5062 | 55 | link |
The Paris Dataset [5] | 'paris' | 6412 | 55 | link |
These datasets usually contain thousands of images. It is possible to work with only a subset of them, specified by constructor parameters.
Subsets are sampled using uniform random sampling. You can change the particular sampling by changing the seed of the random number generator with the parameter SamplingSeed.
All benchmark classes are subclasses of the abstract class GenericBenchmark.
Currently three benchmarks are implemented.
RepeatabilityBenchmark is based on the tests introduced in [2]. For details about this test, see the help string of the RepeatabilityBenchmark class or the repeatability tutorial.
Because this test mostly reimplements the original one, a wrapper of the original benchmark, IjcvOriginalBenchmark, is also available. This class directly calls the repeatability.m script.
The RetrievalBenchmark class implements a simple retrieval benchmark of image feature detectors. For details, see the help string of the class or the image retrieval benchmark tutorial.
Currently the image retrieval benchmark depends on the Yael library which is not available for Microsoft Windows platforms.
The repeatability and retrieval benchmarks are parallelised in different ways, but both approaches use the Matlab Parallel Computing Toolbox. For details about this toolbox, see its documentation.
The repeatability benchmark can be parallelised by replacing for with parfor at any loop level.
This is possible because the cache system depends only on the file system and does not share any variables. However, it does not implement any synchronisation mechanism. If the loops are properly designed, each process works with different data. Cache autoclear nevertheless has to be disabled, as otherwise two processes may try to delete the same files.
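A minimal sketch of this pattern is shown below, applied to feature extraction over the images of one dataset. It assumes VLBenchmarks is on the MATLAB path and a parallel pool is available; it uses only classes and methods named on this page, and it additionally assumes that feature frames are returned one per column. A benchmark call could be placed inside the loop in the same way.

```matlab
% Sketch only: assumes VLBenchmarks is on the MATLAB path and that the
% Parallel Computing Toolbox is available.
dataset = datasets.VggAffineDataset('Category', 'boat');
detector = localFeatures.VlFeatSift();

% Disable cache autoclear before entering parallel code, so that two
% workers never try to delete the same cached files.
helpers.DataCache.disableAutoClear();

numImages = dataset.NumImages;
numFrames = zeros(1, numImages);
parfor i = 1:numImages
  % Each iteration works on a different image, so the workers touch
  % different data.
  frames = detector.extractFeatures(dataset.getImagePath(i));
  numFrames(i) = size(frames, 2);   % assumption: one frame per column
end

% Re-enable autoclear from the serial part of the code.
helpers.DataCache.enableAutoClear();
```

Both autoclear calls are made outside the parfor loop because they create and delete a lock file.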
The image retrieval benchmark is already parallelised: the class RetrievalBenchmark uses several parfor loops for both feature extraction and KNN search.
In the case of KNN search you can take advantage of both symmetric multiprocessing and distributed computing, as the Yael KNN uses OpenMP to run its algorithms in multiple threads.
The helpers package contains several classes and functions which are used in the rest of the project. The Logger and DataCache classes are used in almost all classes and are therefore described in more detail in the following text.
This project extensively caches its results using a key-value caching system. The key string is usually created from the properties of the input data and the processing algorithm. The data itself can be any Matlab data object and is stored as a *.mat file in the directory ./data/cache. The target file name is the MD5 sum of the key.
Binary data are usually distinguished by a signature of the source file. A file signature contains the file name and the last modification date of the file. This is also used for algorithm binaries. In addition, the signature usually contains the algorithm parameters, as their values affect the results.
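The mapping from a key string to a cache file name can be sketched as follows. This is an illustration under stated assumptions, not the code of helpers.DataCache: the key contents are made up, and the MD5 digest is computed through the Java runtime bundled with MATLAB.

```matlab
% Hypothetical key: combines an algorithm name, its parameters and an
% input file signature (all values are made up for illustration).
key = 'MyDetector;Threshold=0.5;img1.ppm;2013-01-01 12:00:00';

% MD5 digest of the key via the Java runtime bundled with MATLAB.
md = java.security.MessageDigest.getInstance('MD5');
digest = typecast(md.digest(uint8(key)), 'uint8');   % 16 bytes

% Hex-encode the digest (two characters per byte) and build the path
% of the cache file.
hexDigest = lower(reshape(dec2hex(digest, 2)', 1, []));
cacheFile = fullfile('.', 'data', 'cache', [hexDigest '.mat']);
disp(cacheFile);
```

Since the file name depends only on the key, any change in the parameters or in the file signature leads to a different cache entry.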
Caching is implemented in the class helpers.DataCache, which has only static methods so that it can be used in parallel code where no memory is shared among processes. This class implements the following methods:
- data = DataCache.getData(key): Get data identified by a string key. If the data has not been found, return [].
- storeData(data,key): Store data identified by key.
- res = hasData(key): Check whether data are cached.
- removeData(key): Remove particular data from the cache.
- clearCache(): Delete the least recently used data to limit the overall size of the cached data to the DataCache.maxDataSize limit.
- deleteAllCachedData(): Delete all cached data.
- disableAutoClear(): Temporarily disable the autoclear function. Cannot be called in a parallel part of the code as it creates a lock file.
- enableAutoClear(): Enable autoclear after disableAutoClear. Cannot be called in a parallel part of the code as it deletes a lock file.
The caching properties can be changed only in the helpers/DataCache.m source code by changing the class constant properties:
- maxDataSize: Maximal storage space occupied by cached data, in bytes.
- dataPath: Directory where the cached data are stored.
- dataFileVersion: Version of the .mat file used for data storage. See 'help save' for details.
- autoClear: Check after each storeData method call whether the storage size has exceeded the allowed size. If so, the least recently used data are removed.
- disable: Globally disable caching.
Please note that the standard Matlab behaviour is to set these values only once, when the class file is loaded for the first time. Therefore, to apply changed options you must reload the class definition, for example by calling clear classes.
The caching system implements a way to limit the overall size of the cached data, further denoted as the autoclear function. It checks the size of the files in the cache directory and removes the least recently used items (based on the file modification date) to limit the storage usage.
To globally disable autoclear, set the class property helpers.DataCache.autoClear to false. You can also disable it temporarily by calling the method helpers.DataCache.disableAutoClear(), which creates a lock file. It is recommended to call this method before running parallel code to prevent two processes from deleting the same files.
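The autoclear idea can be illustrated with a short, self-contained sketch. It is not the implementation in helpers.DataCache: the temporary directory, the demo files and the size limit are all hypothetical, and eviction is based purely on the file modification date as described above.

```matlab
% Demo cache directory with a few .mat files (stand-ins for cached data).
cacheDir = tempname;
mkdir(cacheDir);
for k = 1:3
  data = rand(50);                                   %#ok<NASGU>
  save(fullfile(cacheDir, sprintf('item%d.mat', k)), 'data');
  pause(1.1);   % make the modification dates distinct
end

maxDataSize = 30000;   % hypothetical limit in bytes

% Sort the cached files from most to least recently modified.
files = dir(fullfile(cacheDir, '*.mat'));
[~, order] = sort([files.datenum], 'descend');
files = files(order);

% Keep the newest files while their running total fits into the limit
% and delete all older ones.
runningTotal = cumsum([files.bytes]);
for k = find(runningTotal > maxDataSize)
  delete(fullfile(cacheDir, files(k).name));
end
```

Nothing in this loop coordinates between processes, which is why running such a cleanup from several workers at once can lead to two of them deleting the same file.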
The cache can be disabled globally by setting the constant property helpers.DataCache.disable to true. Several framework classes also implement the methods disableCaching() and enableCaching() to control the caching behaviour of a single class.
The cache does not support data invalidation. However, to invalidate the cached data of a certain detector you can use a simple trick: change the modification date of the detector binary, which alters its file signature.
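One way to change a modification date from within Matlab is sketched below. The file used here is a temporary stand-in for a detector binary (its path is hypothetical), and the date is updated through the bundled Java runtime; on Unix-like systems the touch command has the same effect.

```matlab
% Stand-in for a detector binary; in practice this would be the path
% of the real binary (hypothetical here).
binPath = [tempname '.bin'];
fclose(fopen(binPath, 'w'));

% Set the last modification time of the file to the current time.
binFile = java.io.File(binPath);
ok = binFile.setLastModified(java.lang.System.currentTimeMillis());
if ~ok
  warning('Could not update the modification date of %s.', binPath);
end
```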
As the cache depends heavily on the file system, it is limited by the number of files allowed in a single directory, which depends on your operating system. Big directory structures can lead to slow file operations, and therefore the cache operations can take longer. Thus it is recommended to clear the cache once in a while (especially when using a network file system).
Several classes in the framework use logging implemented in the helpers.Logger class. This class supports both sending the events to standard output and writing them to a log file.
The following events, inspired by the Apache Log4j framework, are supported:
Event name | 'VerboseLevel' value | Usage |
---|---|---|
TRACE | 3 | Finer-grained informational events than DEBUG. |
DEBUG | 2 | Information useful for debugging. |
INFO | 1 | Informational messages about the application progress. |
WARN | 0 | Logs potentially harmful situations. Calls warn command to show the backtrace. |
ERROR | -1 | Severe errors which do not allow the application to continue. Calls the Matlab error function. |
The verbose level can be set either globally, by editing the constant properties of the class helpers.Logger, or locally for each object which inherits from the Logger class, by providing 'OptionName',OptionValue parameters to the class constructor call. The supported options include 'VerboseLevel', whose values are listed in the table above.
If you want to change the format of the log output, just adjust the methods
helpers.Logger.displayLog
and
helpers.Logger.logToFile
according to your
aesthetic preferences.
You can also use the logger in your own class simply by specifying helpers.Logger as a superclass. Here is a simple template showing how to do it and how to properly configure the logger:
Constructing your object then produces:
When your computations run in several processes, possibly on several computers, having a single log file may lead to inconsistent data. That is why we would like to create a separate file for each Matlab process. This is generally a difficult task, as the parfor environment does not include any information about the process.
A simple trick can be used, however. Matlab creates default values for object properties only once, when the class file is loaded for the first time. In a classic matlabpool each lab has its own MATLAB process, therefore the log file name in helpers.Logger can be set, for example, to a value composed of hostname() and a random string.
To test this, print the log file name from inside a parfor loop: each process generates the random string only once. If the parfor were distributed over more computers, the hostname() value would change as well.
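The trick can be sketched with a small stand-alone class. PerProcessLogName is a hypothetical class, not part of VLBenchmarks; it obtains the host name through the bundled Java runtime instead of the hostname() helper mentioned above, and it would be saved in its own file PerProcessLogName.m.

```matlab
% File PerProcessLogName.m (hypothetical class, not part of VLBenchmarks).
classdef PerProcessLogName
  properties (Constant)
    % Evaluated once per MATLAB process, when the class is first loaded:
    % host name plus eight random lower-case letters.
    FileName = fullfile(tempdir, sprintf('log_%s_%s.txt', ...
      char(java.net.InetAddress.getLocalHost.getHostName), ...
      char(randi([double('a') double('z')], 1, 8))));
  end
end

% Usage, from a script or the command line:
%   parfor i = 1:8
%     fprintf('%d: %s\n', i, PerProcessLogName.FileName);
%   end
```

Iterations executed by the same worker should print the same file name. Whether two workers get different names depends on the state of their random number generators, so a process identifier could be added to the name if collisions are a concern.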