The header image

The header image is derived from a photo of some of the work of a friend, the celebrated artist Janet Echelman.

A graph’s adjacency matrix, and counting walks

In the movie Good Will Hunting, at the end of class the professor announces,

I also put an advanced Fourier system on the main hallway chalkboard. I’m hoping that one of you might prove it by the end of the semester. Now the person to do so will not only be in my good graces but will also go on to fame and fortune, having their accomplishment recorded and their name printed in the auspicious MIT Tech. Former winners include Nobel laureates, Fields Medal winners, renowned astrophysicists, and lowly MIT professors.

Must be a tough problem, eh?

The hero, Will Hunting, who works as a janitor at MIT, sees the problem and surreptitiously writes down the solution.

The class is abuzz over the weekend—who is the mysterious student who cracked this problem?

The problem has nothing to do with Fourier systems. It has to do with representing and manipulating a graph using a matrix.

Graphs

Informally, a graph has points, called vertices or nodes, and links, called edges. Here is a diagram of the graph appearing in Will’s problem, but keep in mind that the graph doesn’t specify the geometric positions of the nodes or edges—just which edges connect which nodes. The nodes of this graph are labeled 1, 2, 3, and 4. There are two edges with endpoints 2 and 3, one edge with endpoints 1 and 2, and so on.

Adjacency matrix

The first part of Will’s problem is to find the adjacency matrix of the graph. The adjacency matrix ${A}$ of a graph ${G}$ is the ${D\times D}$ matrix where ${D}$ is the set of node labels. In Will’s graph, ${D=\{1,2,3,4\}}$. For any pair ${i,j}$ of nodes, ${A[i,j]}$ is the number of edges with endpoints ${i}$ and ${j}$. Therefore the adjacency matrix of Will’s graph is

 1 2 3 4 1 0 1 0 1 2 1 0 2 1 3 0 2 0 0 4 1 1 0 0

Note that this matrix is symmetric. This reflects the fact that if an edge has endpoints ${i}$ and ${j}$ then it has endpoints ${j}$ and ${i}$. Another time we’ll discuss directed graphs, for which things are a bit more complicated.

Note also that the diagonal elements of the matrix are zeroes. This reflects the fact that Will’s graph has no self-loops. A self-loop is an edge whose two endpoints are the same node.

Walks

The second part of the problem addresses walks in the graph. A walk is a sequence of alternating nodes and edges

$\displaystyle v_0 \ e_0 \ v_1\ e_1\ \cdots \ e_{k-1}\ v_k$

in which each edge is immediately between its two endpoints.

Here is a diagram of Will’s graph with the edges labeled:

and here is the same diagram with the walk ${3\ c\ 2\ e\ 4\ e\ 2}$ shown:

Note that a walk can use an edge multiple times. Also, a walk need not visit all the nodes of a graph. (A walk that visits all nodes is a traveling-salesperson tour; finding the shortest such tour is a famous example of a computationally difficult problem.)

Will’s problem concerns three-step walks, by which is meant walks consisting of three edges. For example, here are all the three-step walks from node 3 to node 2:

3 c 2 e 4 e 2,  3 b 2 e 4 e 2,  3 c 2 c 3 c 2,  3 c 2 c 3 b 2,  3 c 2 b 3 c 2, 3 c 2 b 3 b 2,  3 b 2 c 3 c 2,  3 b 2 c 3 b 2,  3 b 2 b 3 c 2,  3 b 2 b 3 b 2 for a total of ten. Or, wait…. Did I miss any?

Computing the number of walks

Matrix-matrix multiplication can be used to compute, for every pair ${i,j}$ of nodes, the number of two-step walks from ${i}$ to ${j}$, the number of three-steps from ${i}$ to ${j}$, and so on.

First, note that the adjacency matrix ${A}$ itself encodes the number of one-step walks. For each pair ${i,j}$ of nodes, ${A[i,j]}$ is the number of edges with endpoints ${i}$ and ${j}$, and therefore the number of one-step walks from ${i}$-to-${j}$.

What about two-step walks? A two-step walk from ${i}$ to ${j}$ consists of a one-step walk from ${i}$ to some node ${k}$, followed by a one-step walk from ${k}$ to ${j}$. Thus the number of two-step walks from ${i}$ to ${j}$ equals

 number of one-step walks from ${i}$ to 1 ${\times}$ number of one-step walks from 1 to ${j}$ + number of one-step walks from ${i}$ to 2 ${\times}$ number of one-step walks from 2 to ${j}$ + number of one-step walks from ${i}$ to 3 ${\times}$ number of one-step walks from 3 to ${j}$ + number of one-step walks from ${i}$ to 4 ${\times}$ number of one-step walks from 4 to ${j}$

This has the form of a dot-product! Row ${i}$ of ${A}$ is a vector ${\boldsymbol{u}}$ such that ${\boldsymbol{u}[k]}$ is the number of one-step walks from ${i}$ to ${k}$. Column ${j}$ of ${A}$ is a vector ${\boldsymbol{v}}$ such that ${\boldsymbol{v}[k]}$ is the number of one-step walks from ${k}$ to ${j}$. Therefore the dot-product of row ${i}$ with column ${j}$ is the number of two-step walks from ${i}$ to ${j}$. By the dot-product definition of matrix-matrix multiplication, therefore, the product ${A A}$ encodes the number of two-step walks.

>>> D = {1,2,3,4}
>>> A = Mat((D,D), {(1,2):1, (1,4):1, (2,1):1, (2,3):2, (2,4):1, (3,2):2, (4,1):1, (4,2):1})
>>> print(A*A)

1 2 3 4
---------
1  |  2 1 2 1
2  |  1 6 0 1
3  |  2 0 4 2
4  |  1 1 2 2

Now we consider three-step walks. A three-step walk from ${i}$ to ${j}$ consists of a two-step walk from ${i}$ to some node ${k}$, followed by a one-step walk from ${k}$ to ${j}$. Thus the number of three-step walks from ${i}$ to ${j}$ equals

 number of two-step walks from ${i}$ to 1 ${\times}$ number of one-step walks from 1 to ${j}$ + number of two-step walks from ${i}$ to 2 ${\times}$ number of one-step walks from 2 to ${j}$ + number of two-step walks from ${i}$ to 3 ${\times}$ number of one-step walks from 3 to ${j}$ + number of two-step walks from ${i}$ to 4 ${\times}$ number of one-step walks from 4 to ${j}$

We already know that ${A A}$ gives the number of two-step walks. Again using the dot-product definition of matrix-matrix multiplication, the product ${(A A) A}$ gives the number of three-step walks:

>>> print((A*A)*A)

1  2  3 4
-----------
1  |  2  7  2 3
2  |  7  2 12 7
3  |  2 12  0 2
4  |  3  7  2 2

Oops, there are not ten but twelve three-step walks from 3 to 2. I missed the walks ${3\ c\ 2\ a\ 1\ a\ 2}$ and ${3\ b\ 2\ a\ 1\ a\ 2}$. Anyway, we’re half the way towards solving Will’s problem. The problems about generating functions are not much harder; they make use of polynomials and determinants. We will not be able to cover generating functions here, but rest assured that they are not beyond your ability and are quite elegant.

Why (apart from undying fame) would you want to compute the number of ${k}$-step walks between pairs of nodes in a graph? These numbers can serve as a crude way to measure how closely coupled a pair of nodes are in a graph modeling a social network, although there are assuredly better and faster ways to do so.  We will see some in future posts.