Linear Algebra I

\( \newcommand\C{\mathbb{C}} \newcommand\R{\mathbb{R}} \newcommand\Q{\mathbb{Q}} \newcommand\Z{\mathbb{Z}} \newcommand\N{\mathbb{N}} \newcommand\ve[1]{\boldsymbol{#1}} \newcommand\norm[1]{\left\Vert #1 \right\Vert} \def\u{\ve{u}} \def\v{\ve{v}} \def\w{\ve{w}} \def\e{\ve{e}} \def\span{\operatorname{span}} \def\Id{\operatorname{Id}} \def\kernel{\operatorname{ker}} \def\image{\operatorname{im}} \def\L{\mathscr{L}} \def\T{\mathscr{T}} \def\ev{\operatorname{ev}} \)

Linear maps

A linear map \(T\colon V \to W\) between vector spaces (over a field \(F\)) is such that \(T(\v + \w) = T(\v) + T(\w)\) and \(T(\lambda\v) = \lambda T(\v)\) for all \(\v, \w \in V\) and scalars \(\lambda\).

This immediately forces some basic properties: for instance, \(T(\ve0_V) = \ve0_W\) (take \(\lambda = 0\)), and \(T\) sends any linear combination of vectors to the same linear combination of their images.

Familiar examples of linear maps include the zero map, the identity map on any vector space, the scaling maps \(x \mapsto \lambda x\) on \(\R\), and the map \(\v \mapsto A\v\) from \(\R^n\) to \(\R^m\) given by an \(m \times n\) matrix \(A\).

Indeed, all linear maps from \(\R\) to \(\R\) are scaling maps, since linearity forces \(T(x) = xT(1)\).
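
The defining conditions can also be spot-checked numerically. The sketch below (in Python with numpy; the matrix and the helper is_linear_sample are made up for illustration) is of course not a proof, only a way to falsify non-examples such as a translation.

```python
import numpy as np

rng = np.random.default_rng(1)

def is_linear_sample(T, dim, trials=100):
    """Spot-check T(u + v) = T(u) + T(v) and T(c u) = c T(u) on random inputs."""
    for _ in range(trials):
        u, v = rng.normal(size=dim), rng.normal(size=dim)
        c = rng.normal()
        if not np.allclose(T(u + v), T(u) + T(v)):
            return False
        if not np.allclose(T(c * u), c * T(u)):
            return False
    return True

A = np.array([[2.0, -1.0], [0.0, 3.0]])
print(is_linear_sample(lambda v: A @ v, 2))        # True: matrix maps are linear
print(is_linear_sample(lambda v: A @ v + 1.0, 2))  # False: translation breaks linearity
```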

Note that any linear combination of linear maps is also a linear map. This hints at the fact that the set of all linear maps between \(V\) and \(W\) forms a vector space of its own.

Rank and Nullity

Let \(T\colon V\to W\) be a linear map. The null space or kernel of \(T\) is the set of all vectors in \(V\) which get mapped to \(\ve0_W\). The range or image of \(T\) is the set of all vectors in \(W\) which have a pre-image in \(V\). \[ \begin{align}\kernel{T} &= \{\v \in V: T(\v) = \ve0_W\}, \\ \image{T} &= \{T(\v): \v \in V\}. \end{align} \] It is easily verified that \(\kernel{T}\) and \(\image{T}\) are subspaces of \(V\) and \(W\) respectively.

If \(V\) is finite dimensional, we define the nullity of \(T\) as the dimension of its kernel, and the rank of \(T\) as the dimension of its image.

Note that if \(T\) is injective, \(\kernel{T} = \{\ve0_V\}\). Conversely, if \(\kernel{T} = \{\ve0_V\}\), given any \(\v_1, \v_2 \in V\) such that \(T(\v_1) = T(\v_2)\), we have \(T(\v_1 - \v_2) = \ve0_W\), which forces \(\v_1 - \v_2 = \ve0_V\), i.e. \(T\) is injective. Thus, a linear map is injective iff its nullity is zero.

Rank-Nullity Theorem

Let \(V\) be a finite dimensional vector space and let \(T\colon V\to W\) be a linear map. Then, \[ \dim(V) = \dim(\image{T}) + \dim(\kernel{T}). \] In other words, the dimension \(n\) of \(V\) is precisely the sum of the rank and nullity of \(T\).

As a corollary, suppose that \(\dim{V} = \dim{W}\), where \(V\) is finite dimensional. Then, the linear map \(T\colon V \to W\) is injective iff \(T\) is surjective.

As an aside, note that if \(L\colon V \to V\) is a linear bijection, then the inverse map \(L^{-1}\) is also linear. With that noted, we begin our proof of the theorem.

Let \(k = \dim(\kernel{T}) \leq \dim(V) = n\), and let \(\{\v_1, \dots \v_k\}\) be a basis of \(\kernel{T}\). Using the Replacement Theorem, we extend this to a basis \(\beta = \{\v_1, \dots, \v_k, \v_{k + 1}, \dots, \v_n\}\) of \(V\). Note that \(\{T(\v_{1}), \dots, T(\v_n)\}\) spans \(\image{T}\), and \(T(\v_i) = \ve0\) for \(i = 1, \dots, k\), so \(\{T(\v_{k + 1}), \dots, T(\v_n)\}\) already spans \(\image{T}\). We claim that it is in fact a basis of \(\image{T}\); it is sufficient to show that this set is linearly independent. Suppose that for \(\lambda_i \in F\), \[ \lambda_{k + 1}T(\v_{k + 1}) + \dots + \lambda_nT(\v_n) = \ve0. \] Using linearity, we see that \(T\left(\sum_{i > k} {\lambda_i\v_i}\right) = \ve0\), i.e. \(\sum_{i > k}\lambda_i\v_i \in \kernel{T}\). We can thus find \(\mu_1, \dots, \mu_k\) such that \[ \lambda_{k + 1}\v_{k + 1} + \dots + \lambda_n\v_n = \mu_1\v_1 + \dots + \mu_k\v_k. \] Rearranging and using the linear independence of \(\beta\), we see that all the \(\lambda_i\) and \(\mu_i\) are zero, so the set \(\{T(\v_{k + 1}), \dots, T(\v_{n})\}\) is linearly independent, hence a basis of \(\image{T}\). Thus, \(\dim(\image{T}) = n - k\), which proves the theorem.
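
As a numerical illustration (a sketch in numpy, with an arbitrary example matrix viewed as a linear map from \(\R^6\) to \(\R^4\)), the rank and nullity can be computed from the singular value decomposition and do indeed add up to the dimension of the domain.

```python
import numpy as np

# A numerical sketch of the Rank-Nullity Theorem for an arbitrary matrix,
# viewed as a linear map T: R^6 -> R^4.
rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(4, 6)).astype(float)

n = A.shape[1]                          # dim V
rank = np.linalg.matrix_rank(A)         # dim(im T)

# The kernel is spanned by the right singular vectors whose singular
# value is (numerically) zero, so the nullity is n minus the number of
# nonzero singular values.
s = np.linalg.svd(A, compute_uv=False)
nullity = n - np.sum(s > 1e-10 * s.max())

assert rank + nullity == n
print(f"dim V = {n}, rank = {rank}, nullity = {nullity}")
```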

Vector spaces of linear maps

We denote \[ \L(V, W) = \{T\colon V \to W, T \text{ is a linear map}\}.\] We also use the shorthand \(\L(V, V) = \L(V)\).

It is easily seen that \(\L(V, W)\) is a vector space, with the standard definitions of addition and scaling of functions. The zero map \(\mathbb{O}\) is its additive identity.

Question: What is the dimension of \(\L(\R, \R^n)\)?
Response: Consider \(T \in \L(\R, \R^n)\). Since \(\{1\}\) is a basis of \(\R\), \(T\) is fully determined by \(T(1)\). Thus, we look at the evaluation map \[ \ev\colon \L(\R, \R^n) \to \R^n, \quad T \mapsto T(1). \] This is clearly a linear map. Furthermore, we note that \(\ev\) is a bijection: given any \(\v \in \R^n\), there is always a linear map \(T\) such that \(T(1) = \v\), namely \(T(\lambda) = \lambda\v\) for all \(\lambda \in \R\), and this is the only such map by the linearity of \(T\). This means that \(\ev^{-1}\) is also a linear bijection. Thus, we have an identification between elements of \(\L(\R, \R^n)\) and \(\R^n\), in the form of the linear isomorphism \(\ev\). We will later see that this means \(\dim\L(\R, \R^n) = n\).

Note that we can similarly identify elements of \(\L(\R^n, \R^m)\) with elements of \(M_{m\times n}(\R)\).
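
Concretely, the matrix attached to a map \(T \in \L(\R^n, \R^m)\) has \(T(\e_j)\) as its \(j\)-th column. Below is a small numpy sketch of this identification; the particular map T is an arbitrary choice for illustration.

```python
import numpy as np

# A sketch of the identification L(R^n, R^m) <-> M_{m x n}(R):
# a linear map is determined by its values on the standard basis,
# and those values are exactly the columns of the corresponding matrix.

def map_to_matrix(T, n):
    """Return the m x n matrix whose j-th column is T(e_j)."""
    eye = np.eye(n)
    return np.column_stack([T(eye[:, j]) for j in range(n)])

# A hypothetical linear map T: R^3 -> R^2 (chosen for illustration).
def T(v):
    x, y, z = v
    return np.array([2 * x - y, y + 3 * z])

A = map_to_matrix(T, 3)
v = np.array([1.0, 4.0, -2.0])
assert np.allclose(A @ v, T(v))   # the matrix reproduces the map
print(A)
```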

Linear isomorphisms

A linear isomorphism \(T\colon V \to W\) is a linear map which is injective and surjective.

Two vector spaces are called isomorphic if there exists a linear isomorphism between them.

Note that isomorphism is an equivalence relation. Every vector space is isomorphic to itself (reflexivity), if \(V\) is isomorphic to \(W\), then \(W\) is isomorphic to \(V\) (symmetry), and if \(V\), \(W\) and \(W\), \(U\) are isomorphic, then so are \(V\), \(U\) (transitivity). Thus, we can partition the set of all vector spaces over a field \(F\) using this relation. We will see that vector spaces in the same equivalence class share the same dimension.

Let \(T \in \L(V, W)\). If \(\{\w_1, \dots, \w_n\} \subset W\) is linearly independent, and \(T\) is surjective, with \(T(\v_i) = \w_i\), then the set \(\{\v_1, \dots, \v_n\}\) is also linearly independent. If \(\{\v_1, \dots, \v_n\} \subset V\) is linearly independent, and \(T\) is injective with \(T(\v_i) = \w_i\), then the set \(\{\w_1, \dots, \w_n\}\) is linearly independent. Both follow immediately by applying the linearity of \(T\) on the linear combinations \(\sum \mu_i\v_i\) and \(\sum \lambda_i\w_i\).

If \(T\) is a linear isomorphism, and \(\{\v_1, \dots, \v_n\}\) is a basis of \(V\), then \(\{\w_1, \dots, \w_n\}\) where \(T(\v_i) = \w_i\) is a basis of \(W\). The converse is also true, since \(T^{-1}\) is also a linear isomorphism. In particular, \(\dim{V} = \dim{W}\) whenever either space is finite dimensional.

Ordered Basis

Let \(V\) be a finite dimensional vector space. An ordered basis of \(V\) is a basis \(\beta\) of \(V\), endowed with a specific order.

We show that any two vector spaces of the same finite dimension (over the same field) are isomorphic. To prove this, suppose \(V\) and \(W\) have the same finite dimension \(n\), and let \(\beta = \{\v_1, \dots \v_n\}\) and \(\gamma = \{\w_1, \dots, \w_n\}\) be ordered bases of \(V\) and \(W\) respectively. Define the linear map \(T\colon V \to W\) by first declaring \(\v_i \mapsto \w_i\). Since we want \(T\) to be linear, we also require \[ \lambda_1\v_1 + \dots + \lambda_n\v_n \mapsto \lambda_1\w_1 + \dots + \lambda_n\w_n. \] This makes \(T\) well-defined on all of \(V\), because every vector in \(V\) can be written as a linear combination of the basis vectors \(\{\v_i\}\), and uniquely so. Moreover, \(T\) is surjective, since any vector in \(W\) can be written as a linear combination of the basis vectors \(\{\w_i\}\), and the linear combination of the \(\{\v_i\}\) with the same coefficients is a pre-image. From the Rank-Nullity Theorem, we see that \(T\) must also be injective. Thus, \(T\) is a linear isomorphism.

As a corollary, any vector space \(V\) of dimension \(n\) over \(\R\) is isomorphic to \(\R^n\). If we choose the standard basis \(\{\e_1, \dots, \e_n\}\) and an ordered basis \(\beta = \{\v_1, \dots, \v_n\}\) of \(V\), we see that \[ T\colon \R^n \to V,\quad T(\lambda_1, \dots, \lambda_n) = \lambda_1\v_1 + \dots + \lambda_n\v_n \] is a linear isomorphism.
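
As a concrete sketch, take \(V = P_2(\R)\) with the ordered basis \(\{1, x, x^2\}\). The code below (using numpy's Polynomial class; the coefficient triples are arbitrary) implements the map \(T\) above and checks linearity on one example.

```python
import numpy as np
from numpy.polynomial import Polynomial

# A sketch of the isomorphism R^3 -> P_2(R) determined by the ordered
# basis {1, x, x^2}: (l1, l2, l3) |-> l1 + l2*x + l3*x^2.
beta = [Polynomial([1]), Polynomial([0, 1]), Polynomial([0, 0, 1])]

def T(lam):
    """Map a coordinate triple to the corresponding polynomial."""
    return sum(l * b for l, b in zip(lam, beta))

p = T([2.0, -1.0, 3.0])     # 2 - x + 3x^2
q = T([0.0, 5.0, 0.0])      # 5x

# Linearity: T(u) + T(v) agrees with T(u + v), coefficient by coefficient.
assert np.allclose((p + q).coef, T([2.0, 4.0, 3.0]).coef)
print(p)
```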

Let \(T\colon V\to W\) be a linear map. If we denote the range of \(T\) as \(\image{T}\), note that \(T\colon V \to \image{T}\) is surjective. Denote the null space of \(T\) as \(\kernel{T}\). Consider the map \[\T\colon V/\kernel{T} \to \image{T}, \quad [\v] \mapsto T(\v).\] This is well-defined: \([\v_1] = [\v_2]\) iff \(\v_1 - \v_2 \in \kernel{T}\), which means that \(T(\v_1 - \v_2) = \ve0\), i.e. \(\T[\v_1] = \T[\v_2]\). All these implications were bidirectional, so we have also shown that \(\T\) is injective. The surjectivity of \(\T\) follows from the surjectivity of \(T\colon V \to \image{T}\) by construction. The linearity of \(\T\) follows from that of \(T\), so \(\T\) is a linear isomorphism.

If \(V\) is finite dimensional, then \(\image{T}\) is the span of \(T(\v_i)\) for the finite basis \(\{\v_i\}\) of \(V\), so \(\image{T}\) is also finite dimensional. As \(\kernel{T}\) is a subspace of \(V\), it is also finite dimensional. It can be shown that \(\dim(V/\kernel{T}) = \dim(V) - \dim(\kernel{T})\). Thus, we obtain \[ \dim(V) - \dim(\kernel{T}) = \dim(\image{T}). \]

Note that \(V/W\) can be finite dimensional even if \(V\) and \(W\) are not. Consider \(V = P(\R)\) and let \(W \subset V\) be the set of polynomials with no constant term. Thus, each element of \(V/W\) is of the form \([\lambda]\) for a constant \(\lambda\), since all polynomials in \(V\) having the same constant term are equivalent. Define \[ T\colon V/W \to \R, \quad [p(x)] \mapsto p(0). \] This map is well-defined and surjective, and we can show that this is a linear isomorphism. Thus, \(V/W\) has a dimension of \(1\).

Coordinate vectors

Given a finite dimensional vector space \(V\) with an ordered basis \(\beta = \{\v_1, \dots, \v_n\}\), we can uniquely write every \(\v \in V\) as a linear combination \( \lambda_1\v_1 + \dots + \lambda_n\v_n\). Then, \[ [\v]^\beta = (\lambda_1 \; \dots \; \lambda_n)^\top \] is called the coordinate vector of \(\v\) with respect to the ordered basis \(\beta\).
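
Numerically, computing \([\v]^\beta\) in \(\R^n\) amounts to solving a linear system: if the basis vectors are the columns of a matrix \(B\), then \([\v]^\beta\) is the solution of \(B\ve{x} = \v\). A small numpy sketch with an arbitrary basis of \(\R^3\):

```python
import numpy as np

# A sketch: computing the coordinate vector [v]^beta for a basis of R^3.
# If the basis vectors are the columns of B, then [v]^beta solves B x = v.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([1.0, 1.0, 0.0])
v3 = np.array([0.0, 1.0, 1.0])
B = np.column_stack([v1, v2, v3])       # invertible, so beta is a basis

v = np.array([2.0, 3.0, 5.0])
coords = np.linalg.solve(B, v)          # the unique coefficients lambda_i

assert np.allclose(coords[0] * v1 + coords[1] * v2 + coords[2] * v3, v)
print(coords)
```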

Matrix representation

Let \(T\colon V \to W\) be a linear map, and let \(\beta = \{\v_1, \dots, \v_n \}\) and \(\gamma = \{\w_1, \dots, \w_m\}\) be ordered bases for \(V\) and \(W\) respectively. Note that \[ T(\v_j) = \sum_{i = 1}^m a_{ij} \w_i, \] for unique scalars \(a_{ij}\). Thus, these scalars form an \(m \times n\) matrix \(A = [a_{ij}]\).

We seek a relationship between \([\v]^\beta\) and \([T\v]^\gamma\). Suppose \(\v = x_1\v_1 + \dots + x_n\v_n\). Then, \[ \begin{align} T(\v) &= x_1T(\v_1) + \dots + x_nT(\v_n) \\ &= \sum_{i}x_1a_{i1}\w_i + \dots + \sum_{i}x_na_{in}\w_i \\ &= \sum_{i, j}x_ja_{ij}\w_i. \end{align} \] Therefore, \[ [T\v]^\gamma = \left(\sum_{j}a_{1j}x_j\; \dots\; \sum_{j}a_{mj}x_j\right)^\top = A[\v]^\beta. \]

We call the matrix \(A\) the matrix representation of \(T\) with respect to \(\beta\) and \(\gamma\). We write \(A = [T]_\beta^\gamma\). Thus, \[ [T\v]^\gamma \;=\; [T]_\beta^\gamma\,[\v]^\beta. \] Note that \[ [T]_\beta^\gamma \;=\; \Big(\;[T\v_1]^\gamma \quad\cdots\quad [T\v_n]^\gamma\;\Big). \] By convention, if \(\beta = \gamma\), we simply write \([T]_\beta^\gamma = [T]_\beta\).
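
A numpy sketch of this relation for a map from \(\R^2\) to \(\R^2\) follows; the matrix describing \(T\) in standard coordinates and the two ordered bases are arbitrary choices. Each column of \([T]_\beta^\gamma\) is obtained by expressing \(T(\v_j)\) in the basis \(\gamma\).

```python
import numpy as np

# A sketch of [Tv]^gamma = [T]_beta^gamma [v]^beta for a map T: R^2 -> R^2
# given (in standard coordinates) by an arbitrary matrix A_std, with
# non-standard ordered bases beta and gamma.
A_std = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

B = np.column_stack([np.array([1.0, 1.0]), np.array([1.0, -1.0])])  # beta
C = np.column_stack([np.array([2.0, 0.0]), np.array([1.0, 1.0])])   # gamma

# Column j of [T]_beta^gamma is [T(v_j)]^gamma, i.e. solve C x = A_std v_j.
T_bg = np.linalg.solve(C, A_std @ B)

v = np.array([3.0, -2.0])
v_beta = np.linalg.solve(B, v)            # [v]^beta
Tv_gamma = np.linalg.solve(C, A_std @ v)  # [Tv]^gamma, computed directly

assert np.allclose(T_bg @ v_beta, Tv_gamma)
print(T_bg)
```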

For example, the identity map \(\Id\colon V \to V\), where \(\beta = \{\v_1, \dots, \v_n\}\) is an ordered basis of \(V\), has the matrix representation \([\Id]_\beta = I_n\), the \(n \times n\) identity matrix.

Suppose \(T\colon V \to V\) for some finite dimensional vector space \(V\). We choose a basis \(\{\v_1, \dots \v_k\}\) of the null space \(\kernel{T}\), and extend this to an ordered basis \(\beta = \{\v_1, \dots, \v_k, \v_{k + 1}, \dots, \v_n\}\) of \(V\). Set \(\w_j = T(\v_j)\) for \(j > k\). By the argument in the proof of the Rank-Nullity Theorem, the set \(L = \{\w_{k + 1}, \dots, \w_n\}\) is linearly independent; indeed, \(L\) is a basis of the range \(\image{T}\), which has dimension \(n - k\). We extend \(L\) to an ordered basis \(\gamma = \{\w_1, \dots, \w_k, \w_{k + 1}, \dots, \w_n\}\) of \(V\). Therefore, we obtain \[ [T]_\beta^\gamma = \begin{pmatrix}0_{k\times k} & 0_{k\times(n - k)} \\ 0_{(n-k)\times k} & I_{n-k}\end{pmatrix}.\]
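
This construction can be carried out numerically. The sketch below (for an arbitrary singular \(3\times 3\) matrix, using the singular value decomposition to produce a basis of the kernel and the required extensions) recovers the promised block form.

```python
import numpy as np

# A sketch of the construction above for an arbitrary singular matrix A,
# viewed as T: R^3 -> R^3.  The SVD supplies a basis of ker T, an
# extension of it to a basis beta of R^3, and an extension of the
# vectors w_j = T(v_j) (j > k) to a basis gamma of R^3.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
tol = 1e-10 * s.max()
k = int(np.sum(s < tol))                 # nullity

null_basis = Vt[3 - k:].T                # columns: basis of ker T
rest = Vt[:3 - k].T                      # completes it to a basis of R^3
B = np.column_stack([null_basis, rest])  # ordered basis beta (as columns)

W_rest = A @ rest                        # w_j = T(v_j) for j > k
W_extend = U[:, 3 - k:]                  # orthogonal to im T, so independent
C = np.column_stack([W_extend, W_rest])  # ordered basis gamma (as columns)

T_bg = np.linalg.solve(C, A @ B)         # [T]_beta^gamma
print(np.round(T_bg, 10))                # block form: zeros, then I_{n-k}
```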

It can be shown that the space \(\L(V, W)\) is isomorphic to \(M_{m\times n}(F)\), where \(V\) and \(W\) are finite dimensional vector spaces over \(F\) with \(\dim{V} = n\) and \(\dim{W} = m\). Simply consider the map \[ \Phi\colon\L(V, W) \to M_{m\times n}(F), \quad T \mapsto [T]_\beta^\gamma, \] where \(\beta\) and \(\gamma\) are ordered bases of \(V\) and \(W\) respectively.

Question: What is the matrix representation of the differentiation map, \[ \mathscr{D}\colon P_3(\R) \to P_2(\R), \quad p \mapsto p', \] under the standard bases \(\beta = \{1, x, x^2, x^3\}\) and \(\gamma = \{1, x, x^2\}\)?
Response: It is easily seen that \[ [\mathscr{D}]_\beta^\gamma = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}.\] This is because \(a + bx + cx^2 + dx^3 \mapsto b + 2cx + 3dx^2\). Thus, \(1 \mapsto \ve{0}\), \(x \mapsto 1\), \(x^2 \mapsto 2x\) and \(x^3 \mapsto 3x^2\).

If we instead regard the differentiation map as \(\mathscr{D}\colon P_3(\R) \to P_3(\R)\), then \([\mathscr{D}]_\beta\) simply acquires an additional row of zeros. \[ [\mathscr{D}]_\beta = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}.\] Note that the determinant of \([\mathscr{D}]_\beta\) is zero, so \(\mathscr{D}\) is not an isomorphism.
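
As a small numpy check (the cubic below is an arbitrary choice), we can apply \([\mathscr{D}]_\beta\) to coordinate vectors and confirm that its determinant vanishes.

```python
import numpy as np

# The matrix of D: P_3(R) -> P_3(R) in the basis {1, x, x^2, x^3},
# applied to coefficient vectors (a numerical sketch).
D = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 3.0],
              [0.0, 0.0, 0.0, 0.0]])

# p(x) = 4 + 3x - x^2 + 2x^3 has coordinate vector (4, 3, -1, 2).
p = np.array([4.0, 3.0, -1.0, 2.0])
dp = D @ p                         # coordinates of p' = 3 - 2x + 6x^2
assert np.allclose(dp, [3.0, -2.0, 6.0, 0.0])

print(np.linalg.det(D))            # 0.0, so D is not an isomorphism
```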

Dual spaces

Let \(V\) be a vector space over \(\R\). The dual of \(V\), denoted as \(V^*\), is defined to be \(\L(V, \R)\).

Note that if \(V\) is finite dimensional, then the dimensions of \(V\) and \(V^*\) are the same.

Consider a linear map \(T\colon \R^n \to \R\). Suppose \(T(\e_i) = \lambda_i\). Then, \[ T(\v) = v_1T(\e_1) + \dots + v_nT(\e_n) = \lambda_1v_1 + \dots + \lambda_nv_n. \] Thus, we can identify \(T\) with the vector \(\Lambda = (\lambda_1, \dots, \lambda_n)\). This defines a map from \((\R^n)^*\) to \(\R^n\).

Define the projection maps \[ T_i\colon \R^n \to \R, \quad (v_1, \dots, v_n) \mapsto v_i. \] Thus, \[ T(\v) = \lambda_1v_1 + \dots + \lambda_nv_n = \lambda_1T_1(\v) + \dots + \lambda_nT_n(\v). \] Thus, \(\{T_1, \dots, T_n\}\) spans \((\R^n)^*\); since it is also linearly independent (evaluate any vanishing linear combination on the \(\e_i\)), it forms a basis of \((\R^n)^*\). The elements of this space are called covectors or linear functionals.

Choose an ordered basis \(\beta = \{\v_1, \dots, \v_n\}\) of \(V\). Define the linear maps \[ \v_i^*\colon V \to \R, \quad \lambda_1\v_1 + \dots + \lambda_n\v_n \mapsto \lambda_i. \] It can be shown that \(\beta^* = \{\v_1^*, \dots, \v_n^*\}\) is a basis of \(V^*\). Indeed, for arbitrary \(\v^* \in V^*\), pick \(\v = \lambda_1\v_1 + \dots + \lambda_n\v_n\). Since \(\lambda_i = \v_i^*\v\), we have \[ \v^*\v = \lambda_1\v^*\v_1 + \dots + \lambda_n\v^*\v_n = (\v^*\v_1)\v_1^*\v + \dots + (\v^*\v_n)\v_n^*\v. \] Thus, \(\v^*\) is a linear combination of \(\{\v_i^*\}\), with coefficients \(\mu_i = \v^*\v_i\). \[ \v^* = \mu_1\v_1^* + \dots + \mu_n\v_n^*. \] The set \(\{\v_i^*\}\) is also linearly independent: applying a vanishing linear combination \(\sum\mu_i\v_i^*\) to each \(\v_j\) gives \(\mu_j = 0\). Note that \(\v_i^*\v_j = \delta_{ij}\).
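
In \(\R^n\), the dual basis can be computed explicitly: if the basis vectors \(\v_i\) form the columns of an invertible matrix \(B\), then \(\v_i^*\) is given by pairing against the \(i\)-th row of \(B^{-1}\), since \(B^{-1}B = I\) says precisely that \(\v_i^*\v_j = \delta_{ij}\). A numpy sketch with an arbitrary basis of \(\R^3\):

```python
import numpy as np

# A sketch of the dual basis in R^3: if the basis vectors v_i are the
# columns of an invertible matrix B, then v_i^* is "dot with the i-th
# row of B^{-1}", since (B^{-1} B)_{ij} = delta_{ij}.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([1.0, 1.0, 0.0])
v3 = np.array([0.0, 1.0, 1.0])
B = np.column_stack([v1, v2, v3])
dual = np.linalg.inv(B)                 # row i represents v_i^*

basis = [v1, v2, v3]
for i in range(3):
    for j in range(3):
        # v_i^*(v_j) should be the Kronecker delta.
        assert np.isclose(dual[i] @ basis[j], float(i == j))

# v_i^*(v) recovers the i-th coordinate of v with respect to the basis.
v = 2 * v1 - 3 * v2 + 5 * v3
assert np.allclose(dual @ v, [2.0, -3.0, 5.0])
```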

Consider the linear map \[ T_\beta\colon V \to V^*, \quad \lambda_1\v_1 + \dots + \lambda_n\v_n \mapsto \lambda_1\v_1^* + \dots + \lambda_n\v_n^*. \] It can be shown that this is a linear isomorphism.

Suppose \(V = \R^n\). Define \[ \v^*\colon\R^n \to \R, \quad \w \mapsto \v\cdot\w = \sum v_iw_i. \] It is easily verified that \(\v^*\) is a linear map. Thus, each \(\v^*\) is a dual vector of \(\R^n\), and by the discussion above, every element of \((\R^n)^*\) arises in this way.

Suppose \(V\) is the set of all continuous functions \(f\colon [0, 2\pi] \to \R\). Consider the linear maps \[ S_n\colon V \to \R, \quad f\mapsto \frac{1}{2\pi}\int_0^{2\pi} f(t)\sin{nt}\:dt, \] \[ C_n\colon V \to \R, \quad f\mapsto \frac{1}{2\pi}\int_0^{2\pi} f(t)\cos{nt}\:dt. \] These numbers, \(\{C_n(f), S_n(f)\}_{n \in \Z}\), are called the Fourier coefficients of \(f\).

Suppose \(V = P_1(\R)\). Consider the linear maps \[ f_1\colon V \to \R, \quad p \mapsto \int_0^1 p(t)\:dt, \] \[ f_2\colon V \to \R, \quad p \mapsto \int_0^2 p(t)\:dt. \] Note that \(\{1, x\}\) is a basis of \(P_1(\R)\). Now, \(f_1(1) = 1\), \(f_1(x) = 1/2\), \(f_2(1) = 2\), \(f_2(x) = 2\).

We claim that \(\{f_1, f_2\}\) is a basis of \(V^*\). Since \(\dim{V^*} = \dim{V} = 2\), it suffices to show that \(f_1\) and \(f_2\) are linearly independent. Suppose \(af_1 + bf_2 = 0\) for scalars \(a, b\). Evaluating the left hand side on the basis vectors \(\{1, x\}\), we see that \(a + 2b = 0\) and \(a/2 + 2b = 0\), from which it follows that \(a = b = 0\). This proves our claim.

Question: Is there a basis \(\{p_1, p_2\}\) of \(P_1(\R)\) such that \(\{f_1, f_2\}\) is the dual basis?
Response: We seek \(p_1 = a + bx\), \(p_2 = c + dx\) such that \(p_1^* = f_1\) and \(p_2^* = f_2\). This means that \(f_ip_j = \delta_{ij}\). This requires \[ a + b/2 = 1, \quad 2a + 2b = 0, \quad c + d/2 = 0, \quad 2c + 2d = 1. \] This gives \(p_1 = 2 - 2x\) and \(p_2 = -1/2 + x\). This is indeed the required basis of \(P_1(\R)\).
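
The same computation can be phrased as a pair of \(2\times 2\) linear systems and handed to numpy; the sketch below records \(f_i\) evaluated on the basis \(\{1, x\}\) in a matrix \(M\), so that the dual-basis condition reads \(MP = I\).

```python
import numpy as np

# A numerical sketch of the computation above: the functional f_1 sends
# a + b x to a + b/2, and f_2 sends it to 2a + 2b, so each condition
# f_i(p_j) = delta_{ij} is a 2x2 linear system in the coefficients.
M = np.array([[1.0, 0.5],    # row i gives (f_i(1), f_i(x))
              [2.0, 2.0]])

# Columns of P are the coefficient vectors (constant, linear) of p_1, p_2,
# and the condition f_i(p_j) = delta_{ij} reads M P = I, i.e. P = M^{-1}.
P = np.linalg.inv(M)
print(P[:, 0])   # [ 2. -2.]  ->  p_1 = 2 - 2x
print(P[:, 1])   # [-0.5  1.] ->  p_2 = -1/2 + x
```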

Dual maps

Let \(T\colon V \to W\) be a linear map. We define its dual \(T^*\), such that \[ T^*\colon W^* \to V^*, \quad L \mapsto L\circ T. \] It is easily verified that \(T^*\) is linear.

Consider \(V^{**} = (V^*)^* = \L(\L(V, \R), \R)\), the double dual of \(V\). Each of its elements is a linear map \(L\colon V^* \to \R\). For example, let \(\v \in V\). The evaluation map \(\ev_\v\colon V^* \to \R, T \mapsto T(\v)\) is an element of the double dual of \(V\).

Consider the evaluation map \[ \Phi\colon V \to V^{**}, \quad \v \mapsto \ev_\v. \] When \(V\) is finite dimensional, \(\Phi\) is a linear isomorphism. Thus, there is a way of identifying elements \(\v \in V\) with the elements of the double dual.

Let \(T\colon V \to W\) be a linear map. If \(\beta\) and \(\gamma\) are ordered bases of \(V\) and \(W\) respectively, then \[ \left([T]_\beta^\gamma\right)^\top = [T^*]_{\gamma^*}^{\beta^*}. \]

It can be seen from this that the ranks of \(T\) and \(T^*\) are the same.
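
For the standard bases of \(\R^n\) and \(\R^m\) (and their dual bases), this can be seen concretely: a functional on \(\R^m\) represented by a vector \(\ve{\lambda}\) pulls back under \(T\) to the functional represented by \(A^\top\ve{\lambda}\). A numpy sketch with arbitrary choices of \(A\), \(\ve{\lambda}\) and \(\v\):

```python
import numpy as np

# A sketch of [T^*]_{gamma^*}^{beta^*} = ([T]_beta^gamma)^T for the standard
# bases of R^3 and R^2: a functional L in (R^2)^* is "dot with lam", and
# its pullback T^* L = L o T is "dot with A^T lam".
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])        # T: R^3 -> R^2

lam = np.array([2.0, -1.0])            # represents L(w) = lam . w
v = np.array([1.0, 4.0, -2.0])

lhs = lam @ (A @ v)                    # (T^* L)(v) = L(T v)
rhs = (A.T @ lam) @ v                  # the functional represented by A^T lam
assert np.isclose(lhs, rhs)
print(A.T)                             # the matrix of T^* in the dual bases
```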

Consider the conjugation map \[ T\colon \C \to \C, \quad z \mapsto \bar{z}, \] where \(\C\) is regarded as a vector space over \(\R\). Let \(\beta = \{1, i\} = \{\e_1, \e_2\}\) be the standard basis. The dual basis \(\beta^* = \{\e_1^*, \e_2^*\}\) satisfies \(\e_i^*\e_j = \delta_{ij}\). Thus, \(\e_1^*(a + ib) = a\) and \(\e_2^*(a + ib) = b\). Note that \[ T^*(\e_i^*)(\e_j) = \e_i^*(T(\e_j)). \] This means that \[ [T]_\beta = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}, \qquad [T^*]_{\beta^*} = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}. \]