Convolutional Neural Networks for Adjacency Matrices

A suggested BrainNetCNN architecture.

Our work, BrainNetCNN, was published in NeuroImage a while ago,

Kawahara, J., Brown, C. J., Miller, S. P., Booth, B. G., Chau, V., Grunau, R. E., Zwicker, J. G., Hamarneh, G. (2017). BrainNetCNN: Convolutional neural networks for brain networks; towards predicting neurodevelopment. NeuroImage, 146(Feb), 1038–1049. http://doi.org/10.1016/j.neuroimage.2016.09.046

and I’ve been meaning to write a blog post about it. We recently released our code for BrainNetCNN on GitHub (based on Caffe), which implements the proposed filters designed for adjacency matrices.

We called this library Ann4Brains. In hindsight, we could have called it something more general and cumbersome like Ann4AdjacencyMatrices, but I still like the zombie feel that Ann4Brains has.

We designed BrainNetCNN specifically with brain connectome data in mind. Thus the tag line of,

“Convolutional Neural Networks for Brain Networks”

seemed appropriate. However, after receiving some emails about using BrainNetCNN for other types of (non-connectome) data, I’ll emphasize that this approach can be applied to any sort of adjacency matrix, and not just brain connectomes.

The core contribution of this work is the filters designed for adjacency matrices themselves. So we’ll go through each of them. But first, let’s make sure we are clear on what the brain connectome (or adjacency matrix) is.

Connectome – contains edge weights between brain regions

You have a brain. Your brain gets scanned by some device (e.g., diffusion tensor MRI). This scan of your brain contains a huge amount of data! In order to reduce some of this data, we summarize it by considering only a few regions of the brain (90 brain regions in our case).

We can measure the connections between these different brain regions (e.g., the movement of water between two brain regions). These different brain regions and the connections between them can be represented as a graph.

Let each brain region be a node in the graph, and each connection between two regions be an edge. We now have a fully-connected undirected graph which captures the brain regions and the connections between them.

For simplicity, imagine that we have just 4 regions of the brain. If they are fully connected to each other (i.e., all nodes/regions have an edge between them), then it would look something like this,

A simple graph with 4 nodes. We call this a fully connected graph, since there is an edge between all nodes.

A common way to represent this graph is with an adjacency matrix, where we place the nodes (brain regions) along each axis, and each entry in the matrix holds the weight of the edge between a pair of nodes.

An image of an adjacency matrix should help make this clearer.

An example adjacency matrix for 4 nodes. Each element in this matrix encodes the edge strength between two nodes. There are no self edges (e.g., the edge from node A to itself has a strength of zero), and the matrix is symmetric (e.g., element i,j is the same as element j,i).
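To make the caption concrete, here is a minimal NumPy sketch of a 4-node adjacency matrix with those two properties (zero diagonal, symmetric). The edge weights and the variable names `upper` and `adjacency` are made up for illustration; they are not taken from real connectome data.

```python
import numpy as np

# Hypothetical edge weights for nodes A, B, C, D (values are made up).
# Only the upper triangle is filled in; the diagonal stays zero (no self edges).
upper = np.array([
    [0.0, 0.8, 0.3, 0.5],
    [0.0, 0.0, 0.6, 0.2],
    [0.0, 0.0, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.0],
])

# Mirror the upper triangle so that element (i, j) equals element (j, i).
adjacency = upper + upper.T

print(adjacency)
```

Since the graph is undirected, only the 6 values above the diagonal are free; the rest of the matrix follows from symmetry.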

So there we go! This is a rough example of how we can take a complex scan of the brain, and represent it as an adjacency matrix.

Edge-to-Edge Filter

The key idea here is that we are doing a convolution over neighbouring edges. So for a given edge \boldsymbol{e_{ij}}, which connects node \boldsymbol{i} to node \boldsymbol{j}, we define the neighbouring edges of \boldsymbol{e_{ij}} as all edges that connect to either node \boldsymbol{i} or node \boldsymbol{j}.

In words, this simple idea sounds confusing. A diagram might help clear this up.

Animated edge-to-edge filter showing input and output.
An Edge-to-Edge filter convolves over neighbouring edges of a given edge. This is represented in the yellow cross (filter) sliding over the input (left figure) to produce the output responses (right figure).
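In matrix terms, the neighbouring edges of element (i, j) are row i and column j of the adjacency matrix, which is where the cross shape comes from. Here is a minimal NumPy sketch of that idea. It is not the Caffe code from Ann4Brains: the function name `edge_to_edge` and the arguments `row_weights` and `col_weights` are illustrative, and I'm assuming a single input channel, a single filter, no bias, and no non-linearity.

```python
import numpy as np

def edge_to_edge(adjacency, row_weights, col_weights):
    """Single-channel Edge-to-Edge response (illustrative sketch).

    out[i, j] = sum_k row_weights[k] * adjacency[i, k]
              + sum_k col_weights[k] * adjacency[k, j]
    """
    row_response = adjacency @ row_weights  # shape (d,): weighted sum of row i
    col_response = col_weights @ adjacency  # shape (d,): weighted sum of column j
    # Broadcast so element (i, j) is row i's response plus column j's response.
    return row_response[:, None] + col_response[None, :]

# Made-up symmetric 4-node adjacency matrix with a zero diagonal.
adjacency = np.array([
    [0.0, 0.8, 0.3, 0.5],
    [0.8, 0.0, 0.6, 0.2],
    [0.3, 0.6, 0.0, 0.9],
    [0.5, 0.2, 0.9, 0.0],
])

ones = np.ones(4)  # with all-ones weights, the cross is a plain sum
response = edge_to_edge(adjacency, ones, ones)
print(response.shape)
```

The output has the same spatial size as the input, so several of these filters can be stacked one after another.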

Edge-to-Node Filter

While in the edge-to-edge filter, we do a convolution (weighted sum) over neighbouring edges with respect to an edge, in the edge-to-node filter we do a convolution over the neighbouring edges with respect to a node.

What does that mean exactly? Well, for a given node \boldsymbol{i}, we do a convolution (weighted sum) over its neighbouring edges, where neighbouring edges are defined as all edges directly connected to node \boldsymbol{i}.

An Edge-to-Node filter convolves over neighbouring edges of a given node. The left figure shows the input, while the right figure shows the output responses.

When applied to the connectome/adjacency matrix, this is equivalent to sliding a cross-shaped filter along the diagonal of the matrix (note that we could also use a single bar rather than a cross, and in some tests we did that). This has the effect of reducing the spatial dimensions of the output responses.
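A minimal NumPy sketch of the Edge-to-Node idea, under the same simplifying assumptions (single channel, single filter, no bias, no non-linearity). The name `edge_to_node` and its arguments are illustrative, not the Ann4Brains API. Passing `col_weights=None` gives the single-bar variant mentioned above.

```python
import numpy as np

def edge_to_node(adjacency, row_weights, col_weights=None):
    """Single-channel Edge-to-Node response (illustrative sketch).

    out[i] = sum_k row_weights[k] * adjacency[i, k]
           + sum_k col_weights[k] * adjacency[k, i]
    """
    row_response = adjacency @ row_weights  # weighted sum over row i
    if col_weights is None:
        return row_response  # single bar instead of a cross
    col_response = col_weights @ adjacency  # weighted sum over column i
    return row_response + col_response

# Made-up symmetric 4-node adjacency matrix with a zero diagonal.
adjacency = np.array([
    [0.0, 0.8, 0.3, 0.5],
    [0.8, 0.0, 0.6, 0.2],
    [0.3, 0.6, 0.0, 0.9],
    [0.5, 0.2, 0.9, 0.0],
])

ones = np.ones(4)
cross_response = edge_to_node(adjacency, ones, ones)
bar_response = edge_to_node(adjacency, ones)
print(cross_response.shape, bar_response.shape)
```

Either way, a d-by-d input is reduced to one response per node (a length-d vector here, which plays the role of the d-by-1 output).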

Node-to-Graph Filter

The Node-to-Graph filter is equivalent to a fully-connected layer when applied after an Edge-to-Node filter. The Edge-to-Node filter summarizes the neighbouring edges with respect to a single node (and the output represents the weighted sum of edges adjacent to a specific node). Applying a fully connected filter to the responses of the Edge-to-Node filter yields a single response that summarizes all the nodes into a single graph response.
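For a single filter this boils down to one weighted sum over the node responses. A minimal sketch, where the function name `node_to_graph`, the example values, and the weights are all made up for illustration (no bias or non-linearity):

```python
import numpy as np

def node_to_graph(node_responses, node_weights):
    """Collapse per-node responses into one graph-level response."""
    return float(node_weights @ node_responses)

# Hypothetical Edge-to-Node output for a 4-node graph.
node_responses = np.array([3.2, 3.2, 3.6, 3.2])
# Hypothetical learned weights, one per node.
node_weights = np.array([0.25, 0.25, 0.25, 0.25])

graph_response = node_to_graph(node_responses, node_weights)
print(graph_response)
```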

Putting it all together

We stack these layers together, and this forms our BrainNetCNN architecture. Note that the order is important!

An Edge-to-Edge filter takes as input a \boldsymbol{d \times d} matrix (in the above example d=4), and also outputs a \boldsymbol{d \times d} matrix. Thus, the Edge-to-Edge filter can only be applied directly after the input, or after another Edge-to-Edge filter (since the spatial dimensions do not change).

An Edge-to-Node filter takes as input a \boldsymbol{d \times d} matrix, and outputs a \boldsymbol{d \times 1} matrix. Thus, an Edge-to-Node filter can only be applied to the input or after an Edge-to-Edge filter, since it needs a \boldsymbol{d \times d} input and its own output is no longer \boldsymbol{d \times d}.

A Node-to-Graph filter can only occur after an Edge-to-Node filter. While a fully connected filter can be applied anywhere, it cannot be interpreted as a Node-to-Graph filter unless it occurs right after the Edge-to-Node filter.
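The ordering constraints are easiest to see by tracking shapes through a forward pass. Below is a rough NumPy sketch with d=90 regions, random weights, and a random symmetric matrix standing in for a connectome. The helper names are illustrative, and the sketch assumes a single channel per layer with no bias or non-linearities, so it is far simpler than the actual Caffe model.

```python
import numpy as np

def edge_to_edge(a, r, c):
    # (d, d) -> (d, d): row i response plus column j response.
    return (a @ r)[:, None] + (c @ a)[None, :]

def edge_to_node(a, r, c):
    # (d, d) -> (d,): row i response plus column i response.
    return a @ r + c @ a

def node_to_graph(nodes, w):
    # (d,) -> scalar: weighted sum over all node responses.
    return float(w @ nodes)

rng = np.random.default_rng(0)
d = 90  # number of brain regions

# Random symmetric matrix with a zero diagonal, standing in for a connectome.
upper = np.triu(rng.random((d, d)), k=1)
connectome = upper + upper.T

def random_weights():
    # Placeholder for learned filter weights.
    return rng.normal(scale=0.1, size=d)

x = edge_to_edge(connectome, random_weights(), random_weights())
assert x.shape == (d, d)  # same size, so another Edge-to-Edge can follow
x = edge_to_edge(x, random_weights(), random_weights())
x = edge_to_node(x, random_weights(), random_weights())
assert x.shape == (d,)    # one response per node; Edge-to-Edge no longer fits
y = node_to_graph(x, random_weights())
print(y)
```

Swapping the order (say, Edge-to-Node before Edge-to-Edge) fails immediately, because the Edge-to-Edge sketch expects a d-by-d input.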

Well, hopefully this helps give some intuition for these filters, and how they can be used within a deep neural network framework. Feel free to check out the code, and to leave a comment here if anything is unclear.

I’ve also put together some slides here that you can download in PPTx format or as a PDF.
