DAGNN - Directed acyclic graph neural network

DagNN is a CNN wrapper alternative to SimpleNN. It is object-oriented and allows constructing networks with a directed acyclic graph (DAG) topology. It is therefore far more flexible than SimpleNN, although a little more complex and slightly slower for small CNNs.

A DAG object contains the following data members:

- layers: the network layers.
- vars: the network variables.
- params: the network parameters.
- meta: additional information relative to the CNN (e.g. the input image format specification).

There are additional transient data members:

- mode ['normal']: the evaluation mode, either 'normal' or 'test'; the latter switches layers such as dropout and batch normalization to their test-time behaviour.
- accumulateParamDers [false]: whether to accumulate parameter derivatives instead of overwriting them.
- conserveMemory [true]: whether to discard intermediate variable values as soon as they are no longer needed.
- device ['cpu']: the device ('cpu' or 'gpu') where the network currently resides.

The DagNN is a copyable handle, i.e. it allows creating a deep copy with the copy operator: deep_copy = copy(dagnet);. In all cases the deep copy is located in CPU memory (i.e. it is transferred from the GPU before copying). Remark: as a side effect the original network is reset (all variables are cleared); only the network structure and parameters are copied.
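For example, a minimal sketch (net is a hypothetical DagNN instance):

    netCopy = copy(net) ;  % deep copy, located in CPU memory
    % Side effect: net's variables are now cleared; its structure
    % and parameters are unchanged.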

See Also: matlab.mixin.Copyable

DAGNN - Initialize an empty DaG

OBJ = DAGNN() initializes an empty DaG.

See Also addLayer(), loadobj(), saveobj().

GETINPUTS - Get the names of the input variables

INPUTS = GETINPUTS(obj) returns a cell array containing the names of the input variables of the DaG obj, i.e. the sources of the DaG (excluding the network parameters, which can also be considered sources).

GETOUTPUTS - Get the names of the output variables

OUTPUTS = GETOUTPUTS(obj) returns a cell array containing the names of the output variables of the DaG obj, i.e. the sinks of the DaG.
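For instance, a sketch (the variable names are illustrative of a typical classification network):

    ins = net.getInputs() ;    % e.g. {'data', 'label'}
    outs = net.getOutputs() ;  % e.g. {'objective'}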

GETLAYERINDEX - Get the index of a layer

INDEX = GETLAYERINDEX(obj, NAME) returns the index of the layer NAME. NAME can also be a cell array of strings. If no layer with such a name is found, the value NaN is returned for the index.

Layers can then be accessed as the obj.layers(INDEX) property of the DaG.

Indexes are stable unless the DaG is modified (e.g. by adding or removing layers); hence they can be cached for faster layer access.
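A hedged sketch ('conv1' is a hypothetical layer name); the same pattern applies to getVarIndex() and getParamIndex() below:

    idx = net.getLayerIndex('conv1') ;  % NaN if no layer has this name
    if ~isnan(idx)
      layer = net.layers(idx) ;         % idx stays valid until layers are added/removed
    end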

See Also getParamIndex(), getVarIndex().

GETVARINDEX - Get the index of a variable

INDEX = GETVARINDEX(obj, NAME) obtains the index of the variable with the specified NAME. NAME can also be a cell array of strings. If no variable with such a name is found, the value NaN is returned for the index.

Variables can then be accessed as the obj.vars(INDEX) property of the DaG.

Indexes are stable unless the DaG is modified (e.g. by adding or removing layers); hence they can be cached for faster variable access.

See Also getParamIndex(), getLayerIndex().

GETPARAMINDEX - Get the index of a parameter

INDEX = GETPARAMINDEX(obj, NAME) obtains the index of the parameter with the specified NAME. NAME can also be a cell array of strings. If no parameter with such a name is found, the value NaN is returned for the index.

Parameters can then be accessed as the obj.params(INDEX) property of the DaG.

Indexes are stable unless the DaG is modified (e.g. by adding or removing layers); hence they can be cached for faster parameter access.

See Also getVarIndex(), getLayerIndex().

GETLAYER - Get a copy of a layer definition

LAYER = GETLAYER(obj, NAME) returns a copy of the layer definition structure with the specified NAME. NAME can also be a cell array of strings or an array of indexes. If no layer with a specified name or index exists, an error is thrown.

See Also getLayerIndex().

GETVAR - Get a copy of a network variable

VAR = GETVAR(obj, NAME) returns a copy of the network variable with the specified NAME. NAME can also be a cell array of strings or an array of indexes. If no variable with a specified name or index exists, an error is thrown.

See Also getVarIndex().

GETPARAM - Get a copy of a network parameter

PARAM = GETPARAM(obj, NAME) returns a copy of the network parameter with the specified NAME. NAME can also be a cell array of strings or an array of indexes. If no parameter with a specified name or index exists, an error is thrown.

See Also getParamIndex().

GETLAYEREXECUTIONORDER - Get the order in which layers are evaluated

ORDER = GETLAYEREXECUTIONORDER(obj) returns a vector with the indexes of the layers in the order in which they are executed. This need not be the trivial order 1, 2, ..., L, as it depends on the graph topology.
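A small sketch of how the order might be inspected:

    order = net.getLayerExecutionOrder() ;
    executionNames = {net.layers(order).name} ;  % layer names in execution order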

SETPARAMETERSERVER - Set a parameter server for the parameter derivatives

SETPARAMETERSERVER(obj, PS) uses the specified ParameterServer PS to store and accumulate parameter derivatives across multiple MATLAB processes.

After setting this option, net.params.der is always empty and the derivative value must be retrieved from the server.

CLEARPARAMETERSERVER - Remove the parameter server

CLEARPARAMETERSERVER(obj) stops using the parameter server.

ADDVAR - Add a variable to the DaG

V = ADDVAR(obj, NAME) adds a variable with the specified NAME to the DaG. This is an internal function; variables are automatically added when adding layers to the network.

ADDPARAM - Add a parameter to the DaG

V = ADDPARAM(obj, NAME) adds a parameter with the specified NAME to the DaG. This is an internal function; parameters are automatically added when adding layers to the network.

ADDLAYER - Adds a layer to a DagNN

ADDLAYER(NAME, LAYER, INPUTS, OUTPUTS, PARAMS) adds the specified layer to the network. NAME is a string with the layer name, used as a unique identifier. LAYER is the object implementing the layer, which should be a subclass of dagnn.Layer. INPUTS and OUTPUTS are cell arrays of variable names, and PARAMS of parameter names.
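For instance, a minimal sketch building a two-layer network (all layer, variable, and parameter names are illustrative):

    net = dagnn.DagNN() ;
    net.addLayer('conv1', dagnn.Conv('size', [5 5 1 10]), ...
                 {'input'}, {'conv1out'}, {'conv1f', 'conv1b'}) ;
    net.addLayer('loss', dagnn.Loss('loss', 'softmaxlog'), ...
                 {'conv1out', 'label'}, {'objective'}) ;
    net.initParams() ;  % randomly initialize the new parameters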

See Also REMOVELAYER().

EVAL - Evaluate the DAGNN

EVAL(obj, inputs) evaluates the DaG for the specified input values. inputs is a cell array of the type {'inputName', inputValue, ...}. This call results in a forward pass through the graph, computing the values of the output variables. These can then be accessed using the obj.vars(outputIndex) property of the DaG object. The index of an output can be obtained using the obj.getOutputIndex(outputName) call.

EVAL(obj, inputs, derOutputs) evaluates the DaG forward and then backward, performing backpropagation. Similar to inputs, derOutputs is a cell array of the type {'outputName', outputDerValue, ...} of output derivatives.
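For example, a hedged sketch (the variable names 'data', 'label', 'prediction', 'objective' and the arrays images, labels are illustrative):

    % Forward pass only:
    net.eval({'data', images}) ;
    scores = net.vars(net.getVarIndex('prediction')).value ;

    % Forward and backward pass, backpropagating from the loss:
    net.eval({'data', images, 'label', labels}, {'objective', 1}) ;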

Understanding backpropagation

Only those outputs for which outputDerValue is non-empty are involved in backpropagation; the others are ignored. This is useful to attach auxiliary layers to the graph to compute errors or other statistics without involving them in backpropagation.

Usually one starts backpropagation from scalar outputs, corresponding to loss functions. In this case outputDerValue can be interpreted as the weight of that output and is usually set to one. For example: {'objective', 1} backpropagates from the 'objective' output variable with a weight of 1.

However, in some cases the DaG may contain more than one such node, for example because it has more than one loss function. In this case {'objective1', w1, 'objective2', w2, ...} allows balancing the different objectives, as in the sketch below.
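A hypothetical two-loss example, weighting the first objective more heavily:

    net.eval(inputs, {'objective1', 0.7, 'objective2', 0.3}) ;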

Finally, one can backpropagate from outputs that are not scalars. While this is unusual, it is possible by specifying a value of outputDerValue that has the same dimensionality as the output; in this case, this value is used as a matrix of weights, or projection.

Factors affecting evaluation

There are several factors affecting evaluation:

- The mode property: in 'normal' mode the network behaves as in training, while in 'test' mode layers such as dropout and batch normalization switch to their test-time behaviour.
- The conserveMemory property: when enabled, intermediate variable values are discarded as soon as they are no longer needed, saving memory at the cost of not being able to inspect them after the call.

FROMSIMPLENN - Initialize a DagNN object from a SimpleNN network

FROMSIMPLENN(NET) initializes the DagNN object from the specified CNN using the SimpleNN format.

SimpleNN objects are linear chains of computational layers. These layers exchange information through variables and parameters that are not explicitly named. Hence, FROMSIMPLENN() uses a number of rules to assign such names automatically:

- From the input to the output of the CNN, variables are called x0 (the input of the first layer), x1, x2, ..., so that xi is the output of the i-th layer.
- Loss layers require a second input, the label; these inputs are called label for the first such layer, then label2, label3, ... for any further ones.

Additionally, given the option CanonicalNames the function can change the names of some variables to make them more convenient to use. With this option turned on:

- The network input is called input instead of x0.
- The output of each loss layer is called objective (or objective1, objective2, ... if there are several).
- The output of the last non-loss layer is called prediction (or prediction1, prediction2, ... if there are several).

FROMSIMPLENN(___, 'OPT', VAL, ...) accepts the following options:

- CanonicalNames [false]: if true, rename some variables as described above.

GETVARRECEPTIVEFIELDS - Get the receptive field of a variable

RFS = GETVARRECEPTIVEFIELDS(OBJ, VAR) gets the receptive fields RFS of all the variables of the DagNN OBJ into the variable VAR. VAR is a variable name or index.

RFS has one entry for each variable in the DagNN, following the same format as DAGNN.GETRECEPTIVEFIELDS(). For example, RFS(i) is the receptive field of the i-th variable in the DagNN into the variable VAR. If the i-th variable is not a descendant of VAR in the DAG, then there is no receptive field, indicated by RFS(i).size == []. If the receptive field cannot be computed (e.g. because it depends on the values of the variables and not just on the network topology, or because it cannot be expressed as a sliding window), then RFS(i).size == [NaN NaN].
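A sketch of typical usage ('input' and 'prediction' are hypothetical variable names):

    rfs = net.getVarReceptiveFields('input') ;
    rfs(net.getVarIndex('prediction')).size  % receptive field size, or [] / [NaN NaN]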

GETVARSIZES - Get the size of the variables

SIZES = GETVARSIZES(OBJ, INPUTSIZES) computes the SIZES of the DagNN variables given the size of the inputs. INPUTSIZES is a cell array of the type {'inputName', inputSize, ...}. The function returns a cell array with the sizes of all network variables.

Example: compute the storage needed for a batch size of 256 for an ImageNet-like network:

    batch_size = 256 ; single_num_bytes = 4 ;
    input_size = [net.meta.normalization.imageSize, batch_size] ;
    var_sizes = net.getVarSizes({'data', input_size}) ;
    fprintf('Network activations will take %.2f GiB in single precision.\n', ...
            sum(cellfun(@prod, var_sizes)) * single_num_bytes / 1024^3) ;

INITPARAM - Initialize the parameters of the DagNN

OBJ.INITPARAM() uses the INIT() method of each layer to initialize the corresponding parameters (usually randomly).

LOADOBJ - Initialize a DagNN object from a structure.

OBJ = LOADOBJ(S) initializes a DagNN object from the structure S. It is the opposite of S = OBJ.SAVEOBJ(). If S is a string, it initializes the DagNN object with data from the MAT file S. Otherwise, if S is already an instance of dagnn.DagNN, it returns S.
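For instance, the common pattern for loading a pre-trained model from disk (the file name is illustrative):

    net = dagnn.DagNN.loadobj(load('my-pretrained-model.mat')) ;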

MOVE - Move the DagNN to either CPU or GPU

MOVE(obj, 'cpu') moves the DagNN obj to the CPU.

MOVE(obj, 'gpu') moves the DagNN obj to the GPU.

PRINT - Print information about the DagNN object

PRINT(OBJ) displays a summary of the functions and parameters in the network. STR = PRINT(OBJ) returns the summary as a string instead of printing it.

PRINT(OBJ, INPUTSIZES), where INPUTSIZES is a cell array of the type {'input1name', input1size, 'input2name', input2size, ...}, prints information using the specified size for each of the listed inputs.
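For example, a sketch with a hypothetical 224x224 RGB input and a batch size of 16:

    net.print({'data', [224 224 3 16]}) ;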

PRINT(___, 'OPT', VAL, ...) accepts the following options:

See also: DAGNN, DAGNN.GETVARSIZES().

REBUILD - Rebuild the internal data structures of a DagNN object

REBUILD(obj) rebuilds the internal data structures of the DagNN obj. It is a helper function used internally to update the network when layers are added or removed.

REMOVELAYER - Remove a layer from the network

REMOVELAYER(OBJ, NAME) removes the layer NAME from the DagNN object OBJ. NAME can be a string or a cell array of strings.

RENAMELAYER - Rename a layer

RENAMELAYER(OLDNAME, NEWNAME) changes the name of the layer OLDNAME into NEWNAME. NEWNAME should not be the name of an existing layer.

RENAMEPARAM - Rename a parameter

RENAMEPARAM(OLDNAME, NEWNAME) changes the name of the parameter OLDNAME into NEWNAME. NEWNAME should not be the name of an existing parameter.

RENAMEVAR - Rename a variable

RENAMEVAR(OLDNAME, NEWNAME) changes the name of the variable OLDNAME into NEWNAME. NEWNAME should not be the name of an existing variable.

RESET - Reset the DagNN

RESET(obj) resets the DagNN obj. The function clears any intermediate value stored in the DagNN object, including parameter gradients. It also calls the reset function of every layer.

SAVEOBJ - Save a DagNN to a vanilla MATLAB structure

S = OBJ.SAVEOBJ() saves the DagNN OBJ to a vanilla MATLAB structure S. This is particularly convenient for preserving future compatibility and for shipping networks as pure structures, without embedding dependencies on code.

The object can be reconstructed by obj = DagNN.loadobj(s).

As a side effect, the network is reset (all variables are cleared) and transferred to the CPU.
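A sketch of the usual save pattern (the file name is illustrative):

    net_ = net.saveobj() ;
    save('my-model.mat', '-struct', 'net_') ;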

See Also: dagnn.DagNN.loadobj, dagnn.DagNN.reset

SETLAYERINPUTS - Set or change the inputs to a layer

Example: NET.SETLAYERINPUTS('layerName', {'input1', 'input2', ...})

SETLAYEROUTPUTS - Set or change the outputs of a layer

Example: NET.SETLAYEROUTPUTS('layerName', {'output1', 'output2', ...})

SETLAYERPARAMS - Set or change the parameters of a layer

Example: NET.SETLAYERPARAMS('layerName', {'param1', 'param2', ...})