# DAGNN - Directed acyclic graph neural network

DagNN is a CNN wrapper alternative to SimpleNN. It is object-oriented and allows constructing networks with a directed acyclic graph (DAG) topology. It is therefore far more flexible, although a little more complex and slightly slower for small CNNs.

A DAG object contains the following data members:

• layers: The network layers.

• vars: The network variables.

• params: The network parameters.

• meta: Additional information relative to the CNN (e.g. input image format specification).


There are additional transient data members:

• mode [normal]

This flag can either be normal or test. In the latter case, certain blocks switch to a test mode suitable for validation or evaluation as opposed to training. For instance, dropout becomes a pass-through block in test mode.

• accumulateParamDers [false]

If this flag is set to true, then the derivatives of the network parameters are accumulated rather than rewritten the next time the derivatives are computed.

• conserveMemory [true]

If this flag is set to true, the DagNN will discard intermediate variable values as soon as they are not needed anymore in the calculations. This is particularly important to save memory on GPUs.

• device [cpu]

This flag tells whether the DagNN resides in CPU or GPU memory. Use the DagNN.move() function to move the DagNN between devices.

The DagNN is a copyable handle, i.e. it allows creating a deep copy with the copy operator: deep_copy = copy(dagnet);. The deep copy is always located in CPU memory (i.e. it is transferred from the GPU before copying). Remark: as a side effect the original network is reset (all variables are cleared); only the network structure and parameters are copied.
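For instance, the following sketch (assuming a DagNN instance net already exists; the flag values shown are illustrative) copies a network and adjusts the transient members:

```matlab
% Deep-copy the network; the copy resides in CPU memory.
net2 = copy(net) ;

% Switch blocks such as dropout to their test behaviour.
net.mode = 'test' ;

% Keep intermediate variable values for inspection.
net.conserveMemory = false ;
```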

## DAGNN - Initialize an empty DaG

OBJ = DAGNN() initializes an empty DaG.

## GETINPUTS - Get the names of the input variables

INPUTS = GETINPUTS(obj) returns a cell array containing the names of the input variables of the DaG obj, i.e. the sources of the DaG (excluding the network parameters, which can also be considered sources).

## GETOUTPUTS - Get the names of the output variables

OUTPUTS = GETOUTPUTS(obj) returns a cell array containing the names of the output variables of the DaG obj, i.e. the sinks of the DaG.

## GETLAYERINDEX - Get the index of a layer

INDEX = GETLAYERINDEX(obj, NAME) returns the index of the layer NAME. NAME can also be a cell array of strings. If no layer with such a name is found, the value NaN is returned for the index.

Layers can then be accessed as the obj.layers(INDEX) property of the DaG.

Indexes are stable unless the DaG is modified (e.g. by adding or removing layers); hence they can be cached for faster layer access.

## GETVARINDEX - Get the index of a variable

INDEX = GETVARINDEX(obj, NAME) obtains the index of the variable with the specified NAME. NAME can also be a cell array of strings. If no variable with such a name is found, the value NaN is returned for the index.

Variables can then be accessed as the obj.vars(INDEX) property of the DaG.

Indexes are stable unless the DaG is modified (e.g. by adding or removing layers); hence they can be cached for faster variable access.
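As a hedged example of this caching pattern (the variable name 'prediction' and the loop body are hypothetical):

```matlab
% Look the index up once, outside the loop.
predIdx = net.getVarIndex('prediction') ;
for t = 1:numBatches
  % ... evaluate the network on batch t ...
  p = net.vars(predIdx).value ;  % faster than resolving the name each time
end
```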

## GETPARAMINDEX - Get the index of a parameter

INDEX = GETPARAMINDEX(obj, NAME) obtains the index of the parameter with the specified NAME. NAME can also be a cell array of strings. If no parameter with such a name is found, the value NaN is returned for the index.

Parameters can then be accessed as the obj.params(INDEX) property of the DaG.

Indexes are stable unless the DaG is modified (e.g. by adding or removing layers); hence they can be cached for faster parameter access.

## GETLAYER - Get a copy of a layer definition

LAYER = GETLAYER(obj, NAME) returns a copy of the layer definition structure with the specified NAME. NAME can also be a cell array of strings or an array of indexes. If no layer with a specified name or index exists, an error is thrown.

## GETVAR - Get a copy of a network variable

VAR = GETVAR(obj, NAME) returns a copy of the network variable with the specified NAME. NAME can also be a cell array of strings or an array of indexes. If no variable with a specified name or index exists, an error is thrown.

## GETPARAM - Get a copy of a layer parameter

PARAM = GETPARAM(obj, NAME) returns a copy of the network parameter with the specified NAME. NAME can also be a cell array of strings or an array of indexes. If no parameter with a specified name or index exists, an error is thrown.

## GETLAYEREXECUTIONORDER - Get the order in which layers are evaluated

ORDER = GETLAYEREXECUTIONORDER(obj) returns a vector with the indexes of the layers in the order in which they are executed. This need not be the trivial order 1, 2, ..., L, as it depends on the graph topology.

## SETPARAMETERSERVER - Set a parameter server for the parameter derivatives

SETPARAMETERSERVER(obj, PS) uses the specified ParameterServer PS to store and accumulate parameter derivatives across multiple MATLAB processes.

After setting this option, net.params.der is always empty and the derivative value must be retrieved from the server.

## CLEARPARAMETERSERVER - Remove the parameter server

CLEARPARAMETERSERVER(obj) stops using the parameter server.

## ADDVAR - Add a variable to the DaG

V = ADDVAR(obj, NAME) adds a variable with the specified NAME to the DaG. This is an internal function; variables are automatically added when adding layers to the network.

## ADDPARAM - Add a parameter to the DaG

V = ADDPARAM(obj, NAME) adds a parameter with the specified NAME to the DaG. This is an internal function; parameters are automatically added when adding layers to the network.

## ADDLAYER - Adds a layer to a DagNN

ADDLAYER(NAME, LAYER, INPUTS, OUTPUTS, PARAMS) adds the specified layer to the network. NAME is a string with the layer name, used as a unique identifier. LAYER is the object implementing the layer, which should be a subclass of dagnn.Layer. INPUTS and OUTPUTS are cell arrays of variable names, and PARAMS of parameter names.
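A minimal sketch of adding a layer (the layer name, variable names, and filter size are illustrative; dagnn.Conv is assumed to be available as one of the standard layer blocks):

```matlab
net = dagnn.DagNN() ;
% A 3x3 convolution with 1 input channel and 8 filters.
net.addLayer('conv1', dagnn.Conv('size', [3 3 1 8]), ...
             {'input'}, {'conv1out'}, {'conv1f', 'conv1b'}) ;
```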

## EVAL - Evaluate the DAGNN

EVAL(obj, inputs) evaluates the DaG for the specified input values. inputs is a cell array of the type {'inputName', inputValue, ...}. This call results in a forward pass through the graph, computing the values of the output variables. These can then be accessed using the obj.vars(outputIndex) property of the DaG object. The index of an output can be obtained using the obj.getOutputIndex(outputName) call.

EVAL(obj, inputs, derOutputs) evaluates the DaG forward and then backward, performing backpropagation. Similar to inputs, derOutputs is a cell array of the type {'outputName', outputDerValue, ...} of output derivatives.
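A hedged sketch of both calls (the variable names 'input', 'label', 'objective', and 'prob' are assumptions about the network at hand):

```matlab
% Forward pass only:
net.eval({'input', images}) ;

% Forward and backward pass from a scalar loss:
net.eval({'input', images, 'label', labels}, {'objective', 1}) ;

% Retrieve an output value after evaluation:
probIdx = net.getVarIndex('prob') ;
scores = net.vars(probIdx).value ;
```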

### Understanding backpropagation

Only those outputs for which outputDerValue is non-empty are involved in backpropagation; the others are ignored. This is useful for attaching auxiliary layers to the graph to compute errors or other statistics without involving them in backpropagation.

Usually one starts backpropagation from scalar outputs, corresponding to loss functions. In this case outputDerValue can be interpreted as the weight of that output and is usually set to one. For example, {'objective', 1} backpropagates from the 'objective' output variable with a weight of 1.

However, in some cases the DaG may contain more than one such node, for example because one has more than one loss function. In this case {'objective1', w1, 'objective2', w2, ...} makes it possible to balance the different objectives.

Finally, one can backpropagate from outputs that are not scalars. While this is unusual, it is possible by specifying a value of outputDerValue that has the same dimensionality as the output; in this case, this value is used as a matrix of weights, or projection.
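For example, assuming two loss outputs named 'objective1' and 'objective2' (hypothetical names), the objectives could be weighted as:

```matlab
% Backpropagate from two objectives with weights 0.7 and 0.3.
net.eval(inputs, {'objective1', 0.7, 'objective2', 0.3}) ;
```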

### Factors affecting evaluation

There are several factors affecting evaluation:

• The evaluation mode can be either normal or test. Layers may behave differently depending on the mode. For example, dropout becomes a pass-through layer in test mode and batch normalization uses fixed moments (this usually improves the test performance significantly).

• By default, the DaG aggressively conserves memory. This is particularly important on the GPU, where memory is scarce. However, this also means that the values of most variables and of their derivatives are dropped during the computation. For debugging purposes, it may be interesting to observe these variables; in this case you can set the obj.conserveMemory property of the DaG to false. It is also possible to preserve individual variables by setting the property obj.vars(v).precious to true.
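For instance, a single variable can be preserved while keeping memory conservation on (the variable name 'conv1out' is hypothetical):

```matlab
v = net.getVarIndex('conv1out') ;
net.vars(v).precious = true ;      % do not discard this variable
net.eval({'input', images}) ;
act = net.vars(v).value ;          % still available after the forward pass
```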

## FROMSIMPLENN - Initialize a DagNN object from a SimpleNN network

FROMSIMPLENN(NET) initializes the DagNN object from the specified CNN using the SimpleNN format.

SimpleNN objects are linear chains of computational layers. These layers exchange information through variables and parameters that are not explicitly named. Hence, FROMSIMPLENN() uses a number of rules to assign such names automatically:

• From the input to the output of the CNN, variables are called x0 (input of the first layer), x1, x2, .... In this manner xi is the output of the i-th layer.

• Any loss layer requires two inputs, the second being a label. These are called label (for the first such layer), and then label2, label3, ... for any other similar layer.

Additionally, given the option CanonicalNames the function can change the names of some variables to make them more convenient to use. With this option turned on:

• The network input is called input instead of x0.

• The output of each SoftMax layer is called prob (or prob2, ...).

• The output of each Loss layer is called objective (or objective2, ...).

• The input of each SoftMax or Loss layer of type softmax log loss is called prediction (or prediction2, ...). If a Loss layer immediately follows a SoftMax layer, then the rule above takes precedence and the input name is not changed.

FROMSIMPLENN(___, 'OPT', VAL, ...) accepts the following options:

• CanonicalNames [false]

If true use the rules above to assign more meaningful names to some of the variables.
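A typical conversion might look as follows (a sketch, assuming simpleNet holds a SimpleNN-format network, e.g. loaded from a .mat file):

```matlab
net = dagnn.DagNN.fromSimpleNN(simpleNet, 'CanonicalNames', true) ;
```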

## GETVARRECEPTIVEFIELDS - Get the receptive field of a variable

RFS = GETVARRECEPTIVEFIELDS(OBJ, VAR) gets the receptive fields RFS of all the variables of the DagNN OBJ into variable VAR. VAR is a variable name or index.

RFS has one entry for each variable in the DagNN following the same format as DAGNN.GETRECEPTIVEFIELDS(). For example, RFS(i) is the receptive field of the i-th variable in the DagNN into variable VAR. If the i-th variable is not a descendant of VAR in the DAG, then there is no receptive field, indicated by RFS(i).size == []. If the receptive field cannot be computed (e.g. because it depends on the values of variables and not just on the network topology, or if it cannot be expressed as a sliding window), then RFS(i).size = [NaN NaN].

## GETVARSIZES - Get the size of the variables

SIZES = GETVARSIZES(OBJ, INPUTSIZES) computes the SIZES of the DagNN variables given the size of the inputs. INPUTSIZES is a cell array of the type {'inputName', inputSize, ...}. It returns a cell array with the sizes of all network variables.

Example: compute the storage needed for a batch size of 256 for an ImageNet-like network:

    batch_size = 256 ; single_num_bytes = 4 ;
    input_size = [net.meta.normalization.imageSize, batch_size] ;
    var_sizes = net.getVarSizes({'data', input_size}) ;
    fprintf('Network activations will take %.2f MiB in single precision.\n', ...
            sum(cellfun(@prod, var_sizes)) * single_num_bytes / 1024^2) ;


## INITPARAM - Initialize the parameters of the DagNN

OBJ.INITPARAM() uses the INIT() method of each layer to initialize the corresponding parameters (usually randomly).

## LOADOBJ - Initialize a DagNN object from a structure.

OBJ = LOADOBJ(S) initializes a DagNN object from the structure S. It is the opposite of S = OBJ.SAVEOBJ(). If S is a string, initializes the DagNN object with data from a mat-file S. Otherwise, if S is an instance of dagnn.DagNN, returns S.

## MOVE - Move the DagNN to either CPU or GPU

MOVE(obj, 'cpu') moves the DagNN obj to the CPU.

MOVE(obj, 'gpu') moves the DagNN obj to the GPU.
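For example (a sketch; a CUDA-capable GPU is assumed for the 'gpu' call):

```matlab
net.move('gpu') ;   % parameters and variables become gpuArray objects
% ... train or evaluate on the GPU ...
net.move('cpu') ;   % bring everything back, e.g. before saving
```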

## PRINT - Print information about the DagNN

PRINT(OBJ) displays a summary of the functions and parameters in the network. STR = PRINT(OBJ) returns the summary as a string instead of printing it.

PRINT(OBJ, INPUTSIZES) where INPUTSIZES is a cell array of the type {'input1name', input1size, 'input2name', input2size, ...} prints information using the specified size for each of the listed inputs.

PRINT(___, 'OPT', VAL, ...) accepts the following options:

• All [false]

Display all the information below.

• Layers ['*']

Specify which layers to print. This can be either a list of indexes, a cell array of layer names, or the string '*', meaning all layers.

• Parameters ['*']

Specify which parameters to print, similar to the option above.

• Variables [[]]

Specify which variables to print, similar to the option above.

• Dependencies [false]

Whether to display the dependency (geometric transformation) of each variable on each input.

• Format ['ascii']

Choose between ascii, latex, csv, digraph, and dot. The first three formats print tables; digraph uses the plot function for a digraph object (supported in MATLAB >= R2015b), and dot prints a graph in the dot format. When called with zero output arguments, it attempts to compile and visualise the dot graph using the dot command and start (Windows), display (Linux) or open (Mac OS X) on your system. In the latter case, all variables and layers are included in the graph, regardless of the other parameters.

• FigurePath ['tempname.pdf']

Sets the path where any generated dot figure is saved. Currently, this is useful only in combination with the dot format. By default, a unique temporary filename is used (tempname is replaced with a tempname() call). The extension specifies the output format (passed to dot as the -T parameter). If no extension is provided, PDF is used by default. Additionally, the .dot file used to generate the figure is stored at the same location.

• dotArgs ['']

Additional dot arguments, e.g. '-Gsize="7"' to generate a smaller output (for a review of the network structure, etc.).

• MaxNumColumns [18]

Maximum number of columns in each table.

## REBUILD - Rebuild the internal data structures of a DagNN object

REBUILD(obj) rebuilds the internal data structures of the DagNN obj. It is a helper function used internally to update the network when layers are added or removed.

## REMOVELAYER - Remove a layer from the network

REMOVELAYER(OBJ, NAME) removes the layer NAME from the DagNN object OBJ. NAME can be a string or a cell array of strings.

## RENAMELAYER - Rename a layer

RENAMELAYER(OLDNAME, NEWNAME) changes the name of the layer OLDNAME into NEWNAME. NEWNAME should not be the name of an existing layer.

## RENAMEPARAM - Rename a parameter

RENAMEPARAM(OLDNAME, NEWNAME) changes the name of the parameter OLDNAME into NEWNAME. NEWNAME should not be the name of an existing parameter.

## RENAMEVAR - Rename a variable

RENAMEVAR(OLDNAME, NEWNAME) changes the name of the variable OLDNAME into NEWNAME. NEWNAME should not be the name of an existing variable.

## RESET - Reset the DagNN

RESET(obj) resets the DagNN obj. The function clears any intermediate value stored in the DagNN object, including parameter gradients. It also calls the reset function of every layer.

## SAVEOBJ - Save a DagNN to a vanilla MATLAB structure

S = OBJ.SAVEOBJ() saves the DagNN OBJ to a vanilla MATLAB structure S. This is particularly convenient to preserve future compatibility and to ship networks that are pure structures, instead of embedding dependencies on code.

The object can be reconstructed by obj = DagNN.loadobj(s).

As a side effect the network is reset (all variables are cleared) and transferred to the CPU.
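A common save/load round trip might therefore look like the following (a sketch; the file name is illustrative):

```matlab
netStruct = net.saveobj() ;
save('mynet.mat', '-struct', 'netStruct') ;
% Later, reconstruct the DagNN object:
net2 = dagnn.DagNN.loadobj(load('mynet.mat')) ;
```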