Review of Block Codes

Theory

Linear Block Codes

Linear Block Codes are arguably the most popular class of forward error correcting codes used in communication systems.

A linear block code C\mathcal{C} has three important parameters of interest, the blocklength nn, the rate RR of the code given by the ratio k/nk/n of its dimension kk to its length nn, and its minimum distance, which is the least Hamming distance between any pair of distinct codewords. In the case of linear block codes, the minimum distance of the code is exactly equal to the minimum weight of any non-zero codeword in the code. We denote the minimum distance of the code as dmin(C)d_{\min}(\mathcal{C}).

Binary Linear Block Codes, Codewords, Dimension of a Code, Number of Codewords, and Rate

Our focus is on binary linear block codes. A binary linear block code is a subspace of the nn-dimensional vector space F2n\mathbb{F}^n_2 over the finite field F2\mathbb{F}_2 with two elements {0,1}\{0,1\}, with the operations addition and multiplication modulo 22.

The dimension of a linear code is typically denoted by kk. Observe that 1kn1\leq k \leq n. We call such a binary linear code as [n,k][n,k] binary linear code, where nn refers to the blocklength of the code, and kk refers to its dimension. Since the dimension of the code subspace is kk, the code itself contains 2k2^k codewords, corresponding to all possible linear combinations. The rate of a [n,k][n,k] linear block code is defined as the ratio kn\frac{k}{n}, which signifies the fraction of the message bits to the number of bits in the codeword.

Generator Matrix and the code as its Rowspace

A generator matrix GG of an [n,k][n,k] binary linear code C\cal C is a k×nk\times n matrix over F2\mathbb{F}_2, containing a basis of the code space as its rows. In other words, the code C\cal C is the rowspace of the matrix GG. Let us see an example.

G1=[100011101010110011101] \begin{aligned} G_1=\begin{bmatrix} 1&0&0&0&1&1&1\\ 0&1&0&1&0&1&1\\ 0&0&1&1&1&0&1 \end{bmatrix} \end{aligned} The rowspace of G1G_1 is the code C1\mathcal{C}_1 is the set of all the linear combinations of all rows of G1G_1. Since the three rows of GG are linearly independent, we will obtain 88 unique codewords, as follows. C1={(0000000)(1000111)(0101011)(0011101)(0011101)(1101100)(1011001)(0110110)} \mathcal{C}_1= \begin{Bmatrix} (0&0&0&0&0&0&0)\\ (1&0&0&0&1&1&1)\\ (0&1&0&1&0&1&1)\\ (0&0&1&1&1&0&1)\\ (0&0&1&1&1&0&1)\\ (1&1&0&1&1&0&0)\\ (1&0&1&1&0&0&1)\\ (0&1&1&0&1&1&0) \end{Bmatrix} Note that the dimension of this code is k=3k=3, its blocklength is n=7n=7, and its rate is 37\frac{3}{7}. Observe that the all-zero vector is a codeword of this code. Indeed, the all-zero vector is a member of every codeword. We also observe that the minimum weight of any non-zero codeword in C1\cal{C}_1 is 44. Thus, the minimum distance of the code is dmin(C1)=4d_{\min}(\cal{C}_1)=4.

Parity Check Matrix of a Code

A parity check matrix of an [n,k][n,k] binary linear code C\mathcal{C} is l×nl\times n matrix HH, with entries from F2\mathbb{F}_2 such that HH contains exactly nkn-k linearly independent rows and HcT=0,cCH\boldsymbol{c}^T=\boldsymbol{0}, \forall \boldsymbol{c}\in\cal{C}. Thus, the codewords of C\cal{C} are orthogonal to the rows of HH. If the generator matrix GG of the code C\cal C is of the form [IkP][I_k | P] (a column-wise concatenation of a k×kk\times k identity matrix IkI_k, and a k×nkk\times n-k matrix PP), then a parity check matrix for the code can be written as [PTInk][P^T|I_{n-k}], where PTP^T denotes the transpose of the matrix PP. For the example code C1\cal{C}_1 with generator matrix G1G_1 described before, we see that such parity check matrix is given by H1=[0111000101010011000101110001]. H_1=\begin{bmatrix} 0&1&1&1&0&0&0\\ 1&0&1&0&1&0&0\\ 1&1&0&0&0&1&0\\ 1&1&1&0&0&0&1 \end{bmatrix}. Observe that nk=4n-k=4 in this case, and thus we see the 4×44\times 4 identity submatrix in H1H_1. However, this is not the only parity check matrix for C1\cal{C}_1. The following matrix H2H_2 is also a valid parity check matrix for the code. H2=[1101100101010011000100010011]. H_2=\begin{bmatrix} 1&1&0&1&1&0&0\\ 1&0&1&0&1&0&0\\ 1&1&0&0&0&1&0\\ 0&0&1&0&0&1&1 \end{bmatrix}. How do we verify that both H1H_1 and H2H_2 are valid parity check matrices for C1\cal{C}_1? We do this by observing that both have rank nk=4n-k=4 (they contain 44 linearly independent rows over F2\mathbb{F}_2) and also that H1cT=H2cT=0H_1\boldsymbol{c}^T=H_2\boldsymbol{c}^T=\boldsymbol{0}, for all the 88 codewords cC1\boldsymbol{c}\in\cal{C}_1 shown above.

Observe that the property that H1cT=0H_1\boldsymbol{c}^T=\boldsymbol{0} signifies that the codeword c\boldsymbol{c} (which is a nn-tuple c=(c0,...,cn1)\boldsymbol{c}=(c_0,...,c_{n-1})) satisfies nkn-k linearly independent equations. These are as follows. c1+c2+c3=0c0+c2+c4=0c0+c1+c5=0c0+c1+c2+c6=0. \begin{aligned} c_1+c_2+c_3&=0\\ c_0+c_2+c_4&=0\\ c_0+c_1+c_5&=0\\ c_0+c_1+c_2+c_6&=0. \end{aligned} Further, we also observe that, since HH has rank exactly equal to nkn-k, the code C\mathcal{C} is precisely the set of all nn-length vectors v\boldsymbol{v} taken from F2n\mathbb{F}_2^n which satisfy the equations given by HvT=0H\boldsymbol{v}^T=\boldsymbol{0}. That is, we can write C={vF2n:HvT=0}. \mathcal{C}=\{\boldsymbol{v}\in\mathbb{F}_2^n : H\boldsymbol{v}^T=\boldsymbol{0} \}. For the example discussed in this page, you can observe that, out of the set of all 272^7 binary vectors of length 77, it is exactly the set of 88 vectors vF2n\boldsymbol{v}\in\mathbb{F}_2^n, which are the codewords of C\mathcal{C} which satisfy the equation HvT=0H\boldsymbol{v}^T=\boldsymbol{0}.

Encoding

The way an [n,k][n,k] code C\cal C is used in communication is as follows. The message vector m\boldsymbol{m} of length kk is multiplied with the generator matrix GG to obtain a codeword c\boldsymbol{c} of length nn. This codeword is then transmitted through the communication channel. For example, consider the message vector m=(1,0,1)\boldsymbol{m}=(1,0,1) for the code C1\cal{C}_1. The corresponding codeword is given as mG1=(1,0,1)[100011101010110011101]=(1,0,1,1,0,1,0). \boldsymbol{m}G_1=(1,0,1)\begin{bmatrix} 1&0&0&0&1&1&1\\ 0&1&0&1&0&1&1\\ 0&0&1&1&1&0&1 \end{bmatrix}=(1,0,1,1,0,1,0). This codeword (1,0,1,1,0,1,0)(1,0,1,1,0,1,0) is transmitted through the channel, when we want to communicate the 33-bit message vector (1,0,1)(1,0,1). Indeed, each codeword in C1\cal{C}_1 is the codeword obtained by encoding a particular 33-length message vector. Similarly, for any [n,k][n,k] linear code C\cal{C}, every codeword is obtained by encoding specific kk-length message vector. Corresponding to the 2k2^k message vectors in F2k\mathbb{F}_2^k, there are 2k2^k codewords of length nn in C\cal{C}.

We will discuss decoding aspects of these codes in the subsequent experiments.