VL_NNCONV - CNN convolution.

Y = VL_NNCONV(X, F, B) computes the convolution of the image X with the filter bank F and biases B. If B is the empty matrix, then no biases are added. If F is the empty matrix, then the function does not filter the image, but still adds the biases and applies downsampling and padding as explained below.

X is an array of dimension H x W x C x N where (H,W) are the height and width of the image stack, C is the number of feature channels, and N is the number of images in the batch.

F is an array of dimension FH x FW x FC x K where (FH,FW) are the filter height and width and K is the number of filters in the bank. FC is the number of feature channels in each filter and must match the number of feature channels C in X. Alternatively, FC can divide C; in this case, the filters are assumed to form G = C/FC groups of equal size (where G must divide K). Each group of filters works on a consecutive subset of feature channels of the input array X.
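The channel bookkeeping for filter groups can be sketched as follows. This is an illustrative Python helper, not part of MatConvNet: the name group_channel_ranges is hypothetical, and it assumes the grouped convention in which FC divides C, the number of groups is G = C/FC, and G divides K. Indices are zero-based here, whereas MATLAB indices start at 1.

```python
def group_channel_ranges(c, fc, k):
    """Return one (input_channels, filters) pair of ranges per filter group.

    Hypothetical helper: c is the number of input channels, fc the number
    of channels per filter, and k the number of filters in the bank.
    """
    if c % fc != 0:
        raise ValueError("FC must divide C")
    g = c // fc  # number of groups
    if k % g != 0:
        raise ValueError("the number of groups G = C/FC must divide K")
    per_group = k // g  # filters in each group
    return [
        (range(i * fc, (i + 1) * fc), range(i * per_group, (i + 1) * per_group))
        for i in range(g)
    ]


# C = 6 input channels, FC = 3 channels per filter, K = 4 filters:
# two groups, each applying two filters to three consecutive channels.
groups = group_channel_ranges(c=6, fc=3, k=4)
```

With these values, the first group pairs channels 0 to 2 with filters 0 and 1, and the second pairs channels 3 to 5 with filters 2 and 3.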

[DX, DF, DB] = VL_NNCONV(X, F, B, DY) computes the derivatives of the operator projected onto DY. DX, DF, DB, and DY have the same dimensions as X, F, B, and Y, respectively. In particular, if B is the empty matrix, then DB is also empty.
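To make the meaning of the projected derivatives concrete, here is a minimal one-dimensional sketch in plain Python. It is not MatConvNet code: the function names are hypothetical, and it assumes unit stride, no padding, a single channel, and the correlation form of the operator (no filter flipping). The returned values are the derivatives of the scalar obtained by taking the inner product of DY with the output Y, taken with respect to X, F, and B.

```python
def conv1d_forward(x, f, b):
    """y[i] = b + sum_j x[i + j] * f[j] (unit stride, no padding)."""
    n_out = len(x) - len(f) + 1
    return [b + sum(x[i + j] * f[j] for j in range(len(f))) for i in range(n_out)]


def conv1d_backward(x, f, dy):
    """Derivatives of sum_i dy[i] * y[i] with respect to x, f, and b."""
    dx = [0.0] * len(x)
    df = [0.0] * len(f)
    for i, g in enumerate(dy):
        for j in range(len(f)):
            dx[i + j] += g * f[j]  # y[i] depends on x[i + j] through f[j]
            df[j] += g * x[i + j]  # y[i] depends on f[j] through x[i + j]
    db = sum(dy)                   # the bias is added to every output
    return dx, df, db


x = [1.0, 2.0, -1.0, 0.5, 3.0]
f = [0.5, -1.0, 2.0]
b = 0.1
dy = [1.0, -2.0, 0.5]
dx, df, db = conv1d_backward(x, f, dy)


def projected(xv):
    """Scalar projection of the output onto dy."""
    return sum(g * y for g, y in zip(dy, conv1d_forward(xv, f, b)))


# Finite-difference check of one entry of dx.
eps = 1e-6
x_plus = list(x)
x_plus[2] += eps
numeric = (projected(x_plus) - projected(x)) / eps
```

Note that dx has the same length as x and df the same length as f, mirroring the statement that the derivatives have the same dimensions as the corresponding inputs.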

VL_NNCONV() implements a special fully-connected mode: when the support of the filters matches exactly the support of the input image, the code uses an optimized path for faster computation.
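One way to see why this case admits a simpler computation: when the filter support equals the input support, there is only one output location per filter, so each output value is a single inner product between the flattened image and the flattened filter. The Python sketch below illustrates only this arithmetic, under the assumption of no padding and unit stride; the function name is hypothetical and it says nothing about how the optimized path is actually implemented.

```python
def fully_connected(x_flat, filters_flat, biases):
    """Return one output per filter: bias plus inner product with the input.

    x_flat is the input image flattened to a list, filters_flat holds one
    flattened filter per entry (same length as x_flat), and biases holds
    one scalar per filter.
    """
    return [
        b + sum(xv * fv for xv, fv in zip(x_flat, f))
        for f, b in zip(filters_flat, biases)
    ]


# A 2 x 2 single-channel image and K = 2 filters of the same support.
outputs = fully_connected(
    [1, 2, 3, 4],
    [[1, 0, 0, 1], [0.5, 0.5, 0.5, 0.5]],
    [0.0, 1.0],
)
```

For a whole batch, the same computation amounts to one matrix product between the K flattened filters and the N flattened images.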

VL_NNCONV(..., 'option', value, ...) accepts the following options:

The filter size must not be larger than the padded image, i.e.

  1 <= FH <= H + PADTOP + PADBOTTOM,
  1 <= FW <= W + PADLEFT + PADRIGHT.

The output Y is an array of dimension YH x YW x K x N of N images with K feature channels and size:

  YH = floor((H + (PADTOP+PADBOTTOM) - FH)/STRIDEY) + 1,
  YW = floor((W + (PADLEFT+PADRIGHT) - FW)/STRIDEX) + 1.

Accounting for dilation, the formulas become:

  YH = floor((H + (PADTOP+PADBOTTOM) - (FH-1)*DILATEY - 1)/STRIDEY) + 1,
  YW = floor((W + (PADLEFT+PADRIGHT) - (FW-1)*DILATEX - 1)/STRIDEX) + 1.
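The output size along one spatial dimension can be sketched in Python as below. The helper conv_output_size is hypothetical and not part of MatConvNet. It assumes that a filter of size FH with dilation DILATEY spans (FH-1)*DILATEY + 1 pixels, which reduces to the undilated formulas when the dilation is 1, and it applies the filter-size constraint to that span.

```python
def conv_output_size(in_size, filt_size, pad_before=0, pad_after=0, stride=1, dilate=1):
    """Output length along one spatial dimension (height or width)."""
    padded = in_size + pad_before + pad_after
    # Number of input pixels spanned by the (possibly dilated) filter.
    extent = (filt_size - 1) * dilate + 1
    if filt_size < 1 or extent > padded:
        raise ValueError("filter does not fit inside the padded input")
    # Integer division implements the floor() in the formulas.
    return (padded - extent) // stride + 1


# H = 32, FH = 5, PADTOP = PADBOTTOM = 1, STRIDEY = 2.
yh = conv_output_size(32, 5, pad_before=1, pad_after=1, stride=2)

# W = 32, FW = 3, no padding, unit stride, DILATEX = 2.
yw = conv_output_size(32, 3, dilate=2)
```

Here yh is 15 and yw is 28. Calling the helper once per dimension gives YH and YW.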

Arguments can be SINGLE or DOUBLE and CPU or GPU arrays; however, they must all be of the same type (unless empty).

CUDNN SUPPORT

If compiled in, the function will use cuDNN convolution routines (with the exception of asymmetric left-right or top-bottom padding, which is not supported by cuDNN). You can use the 'NoCudnn' option to disable cuDNN or 'Cudnn' to re-enable it (the choice sticks until MATLAB purges the MEX files for any reason).

Some cuDNN algorithms may use a very large amount of GPU memory as workspace. By default, MatConvNet limits this to 512MB. To change this behavior, use the CudnnWorkspaceLimit option to specify the maximum size of the workspace in bytes. Set this parameter to +inf to remove the limit, and use the Verbose flag to check how much memory is being used.