What Are Block Matrices? An Introduction to Block, or Partitioned, Matrices
When dealing with large matrices that have a known structure, it can be more convenient to express the matrix in terms of what is known as a block, or partitioned, matrix.
What is a Partitioned Matrix?
This matrix:
$$A = \begin{pmatrix} 3&1&4&1&0\\0&0&1&0&1\\0&0&0&4&2\end{pmatrix}$$
can also be written as a $2\times 2$ partitioned (or block) matrix:
\begin{equation*} A = \begin{pmatrix}A _{1,1} & A _{1,2} \\ A _{2,1} & A _{2,2} \end{pmatrix} \end{equation*}
where the entries of $A$ are the blocks
\begin{equation*}
A_{1,1} = \begin{pmatrix} 3 & 1 & 4 \\ 0 & 0 & 1\end{pmatrix}, \quad A_{1,2} = I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad A_{2,1} = \begin{pmatrix} 0 & 0 & 0 \end{pmatrix}, \quad A_{2,2} = \begin{pmatrix} 4 & 2 \end{pmatrix}
\end{equation*}
We partitioned our matrix into four blocks, each of which has different dimensions. The matrix could also, for example, be partitioned into five $3\times 1$ blocks, or three $1\times 5$ blocks. A matrix can be partitioned into blocks in many different ways. Depending on the application at hand, there may be a particular partitioning that is useful or needed.
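To make this concrete, here is a minimal NumPy sketch (using the matrix and the partition from above; variable names are my own) that slices $A$ into its four blocks and reassembles them:

```python
import numpy as np

# The 3x5 matrix A from above.
A = np.array([[3, 1, 4, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 4, 2]])

# Partition A into the four blocks used in the text:
# rows are split after row 2, columns after column 3.
A11 = A[:2, :3]   # 2x3 block
A12 = A[:2, 3:]   # 2x2 block (the identity I_2)
A21 = A[2:, :3]   # 1x3 block of zeros
A22 = A[2:, 3:]   # 1x2 block

# Reassembling the blocks recovers the original matrix.
assert np.array_equal(np.block([[A11, A12], [A21, A22]]), A)
```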
Applications of Partitioned Matrices
Partitioned matrices can be used to describe the structure of a matrix, or to organize a computation.
For example, when solving a linear system $A\vec x = \vec b$ to determine $\vec x$, linear algebra students learn that we can construct and row reduce an augmented matrix of the form $$X = \begin{pmatrix} A & \vec b\end{pmatrix}$$ The reduced form of the augmented matrix gives us the solution to that system. And the augmented matrix $X$ consists of two sub-matrices, $A$ and $\vec b$, meaning that it can be viewed as a block matrix.
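As a quick illustration, the sketch below (the small system here is made up for illustration) forms the augmented block matrix $X = \begin{pmatrix} A & \vec b\end{pmatrix}$ with NumPy and solves the system, which plays the role of the row reduction:

```python
import numpy as np

# A small made-up system A x = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([[3.0],
              [4.0]])

# The augmented matrix X = (A | b) is a 1x2 block matrix.
X = np.hstack([A, b])

# NumPy does not expose row reduction directly; solve() returns the
# same solution vector the reduced augmented matrix would reveal.
x = np.linalg.solve(A, b)
print(x)
```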
Another application of block matrices arises when using the singular value decomposition (SVD), which is a popular tool in data science. The SVD uses a matrix, $\Sigma$, of the form
$$\Sigma = \begin{pmatrix} D & \mathbf{0}\\\mathbf{0}&\mathbf{0}\end{pmatrix}$$
Matrix $D$ is a diagonal matrix, and each $\mathbf{0}$ is a zero matrix. Writing $\Sigma$ in terms of these sub-matrices makes its structure immediately visible.
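The sketch below (with a made-up rank-deficient matrix) computes an SVD with NumPy and rebuilds $\Sigma$, making the diagonal-block-plus-zeros structure explicit:

```python
import numpy as np

# A made-up 4x3 matrix of rank 2, so Sigma has a zero block.
M = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

U, s, Vt = np.linalg.svd(M)
print(np.round(s, 6))   # two nonzero singular values, one (numerically) zero

# Sigma is 4x3: a diagonal block D in the top-left corner, zeros elsewhere.
Sigma = np.zeros(M.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

assert np.allclose(U @ Sigma @ Vt, M)
```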
Yet another block matrix arises when introducing a procedure for computing the inverse of an $n\times n$ matrix. To compute the inverse of matrix $A$, we construct the matrix
$$X = \begin{pmatrix} A & I \end{pmatrix}$$
and row reduce it. If $A$ is invertible, the reduced form of $X$ is $\begin{pmatrix} I & A^{-1} \end{pmatrix}$, so the inverse can be read off from the right-hand block. More generally, a block matrix is a matrix that is interpreted as having been broken into sections called blocks, or submatrices. Intuitively, a matrix interpreted as a block matrix can be visualized as the original matrix with a collection of horizontal and vertical lines that partition it into a collection of smaller matrices.
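As a quick check of this idea, the following sketch (with a made-up invertible $2\times 2$ matrix) forms $X = \begin{pmatrix} A & I \end{pmatrix}$ and uses the fact that, for invertible $A$, the reduced form of $X$ is $A^{-1}X = \begin{pmatrix} I & A^{-1} \end{pmatrix}$:

```python
import numpy as np

# A small invertible made-up matrix.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])

# The block matrix X = (A | I).
X = np.hstack([A, np.eye(2)])

# For invertible A, the fully reduced form of (A | I) equals A^{-1} (A | I) = (I | A^{-1}).
reduced = np.linalg.solve(A, X)

assert np.allclose(reduced[:, :2], np.eye(2))          # left block becomes I
assert np.allclose(reduced[:, 2:], np.linalg.inv(A))   # right block is A^{-1}
```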
Partitioned Matrix Addition
If $m\times n$ matrices $A$ and $B$ are partitioned in exactly the same way, then their sum can be computed block by block: each block of $A+B$ is the sum of the corresponding blocks of $A$ and $B$. For example, if \begin{equation*} A = \begin{pmatrix}A _{1,1} & A _{1,2} \\ A _{2,1} & A _{2,2} \end{pmatrix} \end{equation*} and \begin{equation*} B = \begin{pmatrix}B _{1,1} & B _{1,2} \\ B _{2,1} & B _{2,2} \end{pmatrix} \end{equation*} then
\begin{equation*} C = A+B = \begin{pmatrix}A _{1,1}+B_{1,1} & A _{1,2} + B_{1,2}\\ A _{2,1} + B_{2,1} & A _{2,2}+B_{2,2} \end{pmatrix} \end{equation*}
In other words, as long as the matrices have the same partitioning, addition is calculated block by block.
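A small NumPy sketch (with made-up $3\times 4$ matrices and an arbitrary partition) confirms that block-by-block addition matches ordinary addition:

```python
import numpy as np

# Two made-up 3x4 matrices partitioned the same way.
A = np.arange(12.0).reshape(3, 4)
B = np.ones((3, 4))

# Partition each into a 2x2 grid of blocks: rows split after row 2,
# columns split after column 2.
def blocks(M):
    return [[M[:2, :2], M[:2, 2:]],
            [M[2:, :2], M[2:, 2:]]]

Ab, Bb = blocks(A), blocks(B)

# Adding block by block gives the same result as adding the whole matrices.
C_blockwise = np.block([[Ab[i][j] + Bb[i][j] for j in range(2)] for i in range(2)])
assert np.array_equal(C_blockwise, A + B)
```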
Block Matrix Multiplication
Recall the row-column method for matrix multiplication.
Let $A$ be an $m \times n$ matrix and $B$ an $n \times p$ matrix. Then the $(i,j)$ entry of $AB$ is
\begin{equation*}
\operatorname{row}_i A \cdot \operatorname{col}_j B .
\end{equation*}
Partitioned matrices can be multiplied using the same method, treating each block as if it were a scalar, provided the matrices are partitioned so that every block product involved is defined.
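The sketch below (with made-up matrices and an arbitrary conformal partition) multiplies two matrices block by block and checks the result against the ordinary product:

```python
import numpy as np

# Made-up conformally partitioned matrices: A is 4x5, B is 5x3.
A = np.arange(20.0).reshape(4, 5)
B = np.arange(15.0).reshape(5, 3)

# Split A's columns and B's rows at the same place (after index 2),
# so every block product below is defined.
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]
B11, B12 = B[:2, :2], B[:2, 2:]
B21, B22 = B[2:, :2], B[2:, 2:]

# Multiply blockwise exactly as if each block were a scalar entry.
AB_blockwise = np.block([
    [A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
    [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22],
])

assert np.allclose(AB_blockwise, A @ B)
```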
Example 1: Computing $A^2$
Block matrices can be useful in cases where a matrix has a particular structure. For example, suppose $A$ is the $n\times n$ block matrix
$$A = \begin{pmatrix} X & \mathbf{0} \\ \mathbf{0} & Y \end{pmatrix}$$
where $X$ and $Y$ are $p\times p$ matrices, $\mathbf{0}$ is a $p\times p$ zero matrix, and $2p = n$. Then
$$A^2 = AA = \begin{pmatrix} X & \mathbf{0} \\ \mathbf{0} & Y \end{pmatrix}\begin{pmatrix} X & \mathbf{0} \\ \mathbf{0} & Y \end{pmatrix} = \begin{pmatrix} X^2 & \mathbf{0} \\ \mathbf{0} & Y^2 \end{pmatrix}$$
Computing $A^2$ therefore only requires computing $X^2$ and $Y^2$. Taking advantage of the block structure of $A$ leads to a more efficient computation than a naive row-column multiplication that ignores the structure of the matrix.
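Here is a small NumPy sketch (with made-up $2\times 2$ blocks, so $p=2$ and $n=4$) verifying that squaring the diagonal blocks reproduces $A^2$:

```python
import numpy as np

# Made-up p x p blocks X and Y on the diagonal of A (here p = 2, n = 4).
X = np.array([[1.0, 2.0],
              [0.0, 1.0]])
Y = np.array([[3.0, 0.0],
              [1.0, 1.0]])
Z = np.zeros((2, 2))

A = np.block([[X, Z],
              [Z, Y]])

# Squaring block by block only needs X @ X and Y @ Y.
A_sq_blocks = np.block([[X @ X, Z],
                        [Z, Y @ Y]])

assert np.allclose(A_sq_blocks, A @ A)
```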
Example 2: Computing the Matrix Product $AB$
Suppose $A$ and $B$ are the matrices
\begin{align*}
A &= \begin{pmatrix} 1&0&1\\0&1&1 \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \end{pmatrix} \\ B &= \begin{pmatrix} 2 & -1 \\ 0 & -1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} B_{11} \\ B_{21} \end{pmatrix}
\end{align*}
where
$$A_{11} = \begin{pmatrix} 1 & 0 \\0 & 1 \end{pmatrix}, \quad A_{12} = \begin{pmatrix} 1\\1 \end{pmatrix}, \quad B_{11} = \begin{pmatrix} 2 & -1 \\ 0 & -1 \end{pmatrix}, \quad B_{21} = \begin{pmatrix} 0 & 1 \end{pmatrix}$$
If we compute the matrix product using the given partitioning we obtain
$$AB = \begin{pmatrix} A_{11} & A_{12} \end{pmatrix} \begin{pmatrix} B_{11} \\ B_{21} \end{pmatrix}=
\begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} \end{pmatrix}$$
where
\begin{align*}A_{11}B_{11} &= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 2 & -1 \\ 0 & -1 \end{pmatrix} \\ A_{12}B_{21} &= \begin{pmatrix} 1\\1 \end{pmatrix}\begin{pmatrix} 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}
\end{align*}
Therefore
\begin{align*}AB = \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} \end{pmatrix} &= \begin{pmatrix} 2 & -1 \\ 0 & -1 \end{pmatrix} + \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}
\end{align*}
Computing $AB$ directly with the row-column method confirms our result.
$$AB = \begin{pmatrix} 1&0&1\\0&1&1 \end{pmatrix}\begin{pmatrix}2&-1\\0&-1\\0&1 \end{pmatrix}=\begin{pmatrix} 2+0+0&-1+0+1\\0+0+0&0-1+1 \end{pmatrix}=\begin{pmatrix} 2&0\\0&0 \end{pmatrix}$$
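The same check can be run in NumPy, using the blocks defined above (a minimal sketch; variable names are my own):

```python
import numpy as np

# The matrices from Example 2 and their blocks.
A11 = np.eye(2)
A12 = np.array([[1.0], [1.0]])
B11 = np.array([[2.0, -1.0],
                [0.0, -1.0]])
B21 = np.array([[0.0, 1.0]])

A = np.hstack([A11, A12])   # 2x3
B = np.vstack([B11, B21])   # 3x2

# The block product A11 B11 + A12 B21 agrees with the ordinary product AB.
assert np.allclose(A11 @ B11 + A12 @ B21, A @ B)
print(A @ B)   # [[2. 0.] [0. 0.]]
```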
Block Matrix Inversion
In some cases, matrix partitioning can be used to give us convenient expressions for the inverse of a matrix. Recall that the inverse of an $n\times n$ matrix $A$ is a matrix $B$ that has the same dimensions as $A$ and satisfies
$$AB= BA = I$$
where $I$ is the $n\times n$ identity matrix. As we will see in the next example, we can use this equation to construct expressions for the inverse of a matrix.
Example 3: Block Matrix Inversion
Recall the formula for the inverse of a $2\times 2$ upper triangular matrix:
\begin{equation}
\begin{pmatrix}
a & b \\ 0 & c
\end{pmatrix}^{-1} = \frac{1}{ac} \begin{pmatrix}
c & -b \\ 0 & a
\end{pmatrix} = \begin{pmatrix} 1/a & -b/(ac) \\ 0 & 1/c \end{pmatrix} \tag{1}
\end{equation}
provided that $ac\ne 0$. Now suppose $A$ and $C$ are invertible $n\times n$ matrices, $B$ is any $n\times n$ matrix, and we wish to construct an expression for the inverse of the block matrix $$ P =
\begin{pmatrix}
A & B \\ 0 & C
\end{pmatrix}
$$ To construct the inverse of $P$, we can write
$$PP^{-1} = P^{-1}P = I_n$$
where $P^{-1}$ is the matrix we seek. If we let $P^{-1}$ be the block matrix
$$P^{-1} = \begin{pmatrix} W & X \\ Y & Z \end{pmatrix}$$
we can determine $P^{-1}$ by solving $PP^{-1} = I$ or $P^{-1}P=I$. Solving $PP^{-1} = I$ gives us:
\begin{align*}
I_n &= PP^{-1} \\ \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} &= \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}\begin{pmatrix} W & X \\ Y & Z \end{pmatrix} \\
\begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} &= \begin{pmatrix} AW+BY & AX+BZ \\ CY & CZ \end{pmatrix}
\end{align*}
The above matrix equation gives us a set of four equations that can be solved to determine $W$, $X$, $Y$, and $Z$. The block in the second row and first column gives us $CY = 0$. It was given that $C$ is an invertible matrix, so $Y$ is a zero matrix because
\begin{align*} CY &= 0 \\ C^{-1}CY &= C^{-1}0 \\ IY &=0 \\ Y &= 0\end{align*}
Likewise the block in the second row and second column yields $CZ = I$, so
\begin{align*}
CZ &= I \\
C^{-1}CZ &= C^{-1}I \\
Z &= C^{-1}
\end{align*}
Now that we have expressions for $Y$ and $Z$ we can solve the remaining two equations for $W$ and $X$. Solving for $X$ gives us the following expression.
\begin{align*}
AX+BZ &= 0 \\
AX+BC^{-1} &= 0 \\
AX &= -BC^{-1} \\
A^{-1}AX &= -A^{-1}BC^{-1} \\
X &= -A^{-1}BC^{-1}
\end{align*}
Solving for $W$:
\begin{align*}
AW+BY &= I \\ AW+B0 &= I \\ AW &= I \\ W &= A^{-1}
\end{align*}
We now have our expression for $P^{-1}$:
$$P^{-1} = \begin{pmatrix} W & X \\ Y & Z \end{pmatrix} = \begin{pmatrix} A^{-1} & -A^{-1}BC^{-1} \\ 0 & C^{-1} \end{pmatrix}$$
Note that in the special case $n=1$, each block is a scalar, $P$ is a $2\times 2$ matrix, and our expression is equivalent to Equation (1).
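A quick numerical sanity check of this formula, using made-up random blocks (with $A$ and $C$ constructed so that they are invertible in practice):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# Random made-up blocks; A and C are shifted toward the identity so they are invertible.
A = rng.standard_normal((n, n)) + n * np.eye(n)
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n)) + n * np.eye(n)

P = np.block([[A, B],
              [np.zeros((n, n)), C]])

A_inv = np.linalg.inv(A)
C_inv = np.linalg.inv(C)

# The block expression for P^{-1} derived above.
P_inv_blocks = np.block([[A_inv, -A_inv @ B @ C_inv],
                         [np.zeros((n, n)), C_inv]])

assert np.allclose(P_inv_blocks, np.linalg.inv(P))
```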
Summary
In this post we used partitioned matrices to solve problems involving matrix multiplication and matrix inversion. Partitioned matrices can be multiplied block by block, treating each block as if it were a scalar, provided the matrices are partitioned so that every block product is defined. Matrix partitioning can also help in deriving new algorithms, because it gives a more concise representation of a matrix and of operations on matrices.