\( \newcommand\C{\mathbb{C}} \newcommand\R{\mathbb{R}} \newcommand\Q{\mathbb{Q}} \newcommand\Z{\mathbb{Z}} \newcommand\N{\mathbb{N}} \newcommand\ve[1]{\boldsymbol{#1}} \newcommand\norm[1]{\left\Vert #1 \right\Vert} \newcommand\ip[2]{\langle #1,\, #2 \rangle} \def\u{\ve{u}} \def\v{\ve{v}} \def\w{\ve{w}} \def\e{\ve{e}} \def\span{\operatorname{span}} \def\Id{\operatorname{Id}} \def\kernel{\operatorname{ker}} \def\image{\operatorname{im}} \def\L{\mathscr{L}} \def\T{\mathscr{T}} \def\ev{\operatorname{ev}} \def\trace{\operatorname{trace}} \def\diag{\operatorname{diag}} \def\rank{\operatorname{rank}} \def\adj{\operatorname{adj}} \def\proj{\operatorname{proj}} \)
Let \(T\colon V \to V\) be a linear map. We note that if \([T]_\beta\) is a diagonal matrix for some ordered basis \(\beta\) of \(V\), then quantities such as the trace and determinant of \(T^k\) become simple to compute. If such a \(\beta\) exists, the linear map \(T\) is called diagonalizable.
A matrix \(A \in M_n(F)\) is called diagonalizable if \(A\) is similar to a diagonal matrix.
Suppose \(T\) is diagonalizable, and \([T]_\beta\) is diagonal with respect to the basis \(\beta = \{\v_1, \dots, \v_n\}\). Then, \[[T]_\beta = \diag(\lambda_1, \dots, \lambda_n)\] for scalars \(\lambda_i\). This means that \(T(\v_j) = \lambda_j\v_j\). Such a basis \(\beta\) is called an eigenbasis.
A non-zero vector \(\v\in V\) is called an eigenvector of \(T\) if \(T\v = \lambda\v\) for some scalar \(\lambda\); the scalar \(\lambda\) is called the eigenvalue corresponding to \(\v\).
Note that all non-zero vectors in the null space of a linear map are eigenvectors, with eigenvalue zero.
The only possible eigenvalues of a projection map are \(0\) and \(1\). This is because \(P^2 = P\), so if \(P\v = \lambda\v\) for \(\v \neq \ve0\), then \(\lambda^2\v = \lambda\v\), whence \(\lambda^2 = \lambda\) and \(\lambda \in \{0, 1\}\).
The following conditions are equivalent: \(\lambda\) is an eigenvalue of \(T\); \(T - \lambda I_V\) is not invertible; \(\det(T - \lambda I_V) = 0\).
Let \(T\colon V \to V\) be a linear map. If \(\lambda\) is an eigenvalue, its eigenspace \(E_\lambda\) is defined as the set of all vectors \(\v\) satisfying \(T\v = \lambda\v\), \[ E_{\lambda} = \{\v\in V: T\v = \lambda\v\}. \] If \(T\) is diagonalizable with distinct eigenvalues \(\lambda_1, \dots, \lambda_k\), then there exists a basis \(\beta = \{\v_1, \dots, \v_n\}\) of eigenvectors, i.e. an eigenbasis. We can rearrange the \(\v_j\) into \(k\) groups such that the \(i^\text{th}\) group of \(r_i\) vectors have eigenvalue \(\lambda_i\). Clearly, \(r_1 + \dots + r_k = n\). The \(i^\text{th}\) group of eigenvectors, say \(\beta_i\), is then a basis of the eigenspace \(E_{\lambda_i}\). Thus, it follows that \[ V = E_{\lambda_1} \oplus \dots \oplus E_{\lambda_k}. \]
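As a quick numerical illustration (a sketch in numpy; the matrix \(A\) is an arbitrary example with distinct eigenvalues), we can compute an eigenbasis and verify that \(A\) is similar to a diagonal matrix.

```python
import numpy as np

# An arbitrary example matrix with distinct eigenvalues 1 and 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# The columns of P are eigenvectors of A; since the eigenvalues are
# distinct, they form an eigenbasis.
eigenvalues, P = np.linalg.eig(A)
D = np.diag(eigenvalues)

# A is similar to the diagonal matrix D: A = P D P^{-1}.
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True
```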
Note that for a projection map \(P\), the eigenspaces \(E_0\) and \(E_1\) are precisely the kernel and image of \(P\).
For a linear map \(T\colon V \to V\), we consider the polynomial \[ p_T(x) = \det(T - xI_V), \] called the characteristic polynomial of \(T\). The roots of this polynomial are the eigenvalues of \(T\).
The Cayley-Hamilton theorem states that a matrix satisfies its characteristic polynomial, i.e. \(p_A(A) = 0\).
The unique monic polynomial of minimal degree which is satisfied by \(A\) is called its minimal polynomial.
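As a numerical sanity check of Cayley-Hamilton (a sketch in numpy; the matrix is an arbitrary example, and np.poly returns the coefficients of \(\det(xI - A)\), which has the same roots as \(p_A\)):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Monic characteristic polynomial det(xI - A), highest degree first.
coeffs = np.poly(A)

# Evaluate the polynomial at A itself via Horner's rule, replacing
# scalar multiplication by matrix multiplication.
p_of_A = np.zeros_like(A)
for c in coeffs:
    p_of_A = p_of_A @ A + c * np.eye(A.shape[0])

print(np.allclose(p_of_A, 0))  # True: A satisfies its characteristic polynomial
```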
Similar matrices have the same characteristic and minimal polynomials.
For vector spaces over \(\R\) or \(\C\), we may introduce additional structure in order to eventually talk about lengths and angles.
An inner product space is a vector space \(V\) over \(F\) (\(= \R\) or \(\C\)) equipped with a map \[ \ip{\cdot}{\cdot}\colon V\times V \to F, \] called an inner product, which satisfies the following: linearity in the first argument, \(\ip{\lambda\u + \v}{\w} = \lambda\ip{\u}{\w} + \ip{\v}{\w}\); conjugate symmetry, \(\ip{\v}{\w} = \overline{\ip{\w}{\v}}\); and positivity, \(\ip{\v}{\v} \geq 0\), with equality iff \(\v = \ve0\).
Note that even if \(V\) is defined over the field \(\C\), conjugate symmetry demands that \(\ip{\v}{\v} = \overline{\ip{\v}{\v}}\), which means that the inner product \(\ip{\v}{\v}\) must be real.
Some more properties are immediately evident: the inner product is conjugate-linear in the second argument, \(\ip{\v}{\lambda\w + \w'} = \overline{\lambda}\ip{\v}{\w} + \ip{\v}{\w'}\), and \(\ip{\ve0}{\v} = \ip{\v}{\ve0} = 0\).
The length of a vector \(\v\) in an inner product space is defined as \(\norm{\v} = \ip{\v}{\v}^{1 /2}\). Note that by positivity, the length of \(\v\) is zero iff \(\v = \ve0\).
Note that if \(\ip{\v_1}{\v} = \ip{\v_2}{\v}\) for all \(\v\in V\), then \(\v_1 = \v_2\). We show this by noting that linearity gives \(\ip{\v_1 - \v_2}{\v} = 0\) for all \(\v\). In particular, setting \(\v = \v_1 - \v_2\) gives \(\norm{\v_1 - \v_2}^2 = 0\), which forces \(\v_1 - \v_2 = \ve0\) by positivity.
It can be shown that the length satisfies the parallelogram law, \[ \norm{\v + \w}^2 + \norm{\v - \w}^2 = 2\norm{\v}^2 + 2\norm{\w}^2. \]
This immediately follows from \[\begin{align} \norm{\v + \w}^2 &= \norm{\v}^2 + \ip\v\w + \ip\w\v + \norm{\w}^2, \\ \norm{\v - \w}^2 &= \norm{\v}^2 - \ip\v\w - \ip\w\v + \norm{\w}^2. \end{align}\]
The Cauchy-Schwarz inequality states that \[ |\ip{\v}{\w}| \leq \norm{\v}\cdot\norm{\w}. \]
This is trivially true when \(\w = \ve0\). When \(\w \neq \ve0\), set \(\lambda = \ip{\v}{\w}/\norm{\w}^2\). Thus, \[ 0 \leq \norm{\v - \lambda\w}^2 = \ip{\v}{\v} - \ip{\lambda\w}{\v} - \ip{\v}{\lambda\w} + \ip{\lambda\w}{\lambda\w}. \] We can take the factors of \(\lambda\) outside, noting that \(\ip{\lambda\v}{\mu\w} = \lambda\overline{\mu}\ip\v\w\). Simplifying, \[ 0 \leq \norm{\v}^2 - \frac{|\ip{\v}{\w}|^2}{\norm{\w}^2}. \] Rearrangement gives the desired inequality.
Equality holds iff \(\v\) and \(\w\) are linearly dependent, i.e. \(\v = \lambda\w\) for some scalar \(\lambda\), or \(\w = \ve0\).
This leads to the triangle inequality \[ \norm{\v + \w} \leq \norm{\v} + \norm{\w}. \]
We note that \[ \norm{\v + \w}^2 = \norm{\v}^2 + \ip\v\w + \ip\w\v + \norm{\w}^2 \leq \norm{\v}^2 + 2|\ip\v\w| + \norm{\w}^2. \] This follows since for \(z = a + ib\), we have \(z + \overline{z} = 2a\) and \(|z| = \sqrt{a^2 + b^2} \geq a\); apply this to \(z = \ip\v\w\), for which \(\overline{z} = \ip\w\v\) by conjugate symmetry. Cauchy-Schwarz gives \[ \norm{\v + \w}^2 \leq \norm{\v}^2 + 2\norm{\v}\norm{\w} + \norm{\w}^2 = \left(\norm{\v} + \norm{\w}\right)^2. \]
The standard inner product on \(\C^n\) is defined as \[ \ip{\v}{\w} = \sum_i v_i\overline{w_i}. \]
An inner product on continuous complex-valued functions on \([0, 2\pi]\) may be defined as \[ \ip{f}{g} = \frac{1}{2\pi}\int_0^{2\pi} f(x)\overline{g(x)}\:dx.\]
An inner product on matrices over \(\C\) may be defined as \[ \ip{A}{B} = \trace(B^*A). \] Note that \(B^*\) is the conjugate transpose of \(B\).
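These examples are easy to verify numerically; the following is a small sketch in numpy (the vectors and matrices are arbitrary random examples, and note that np.vdot conjugates its first argument, so the arguments are swapped relative to our convention).

```python
import numpy as np

rng = np.random.default_rng(0)

# Standard inner product on C^n: <v, w> = sum_i v_i conj(w_i).
v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
w = rng.standard_normal(4) + 1j * rng.standard_normal(4)
ip_vw = np.vdot(w, v)  # np.vdot conjugates its first argument

# Cauchy-Schwarz: |<v, w>| <= ||v|| ||w||.
print(abs(ip_vw) <= np.linalg.norm(v) * np.linalg.norm(w))  # True

# Inner product on matrices: <A, B> = trace(B* A).
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
ip_AA = np.trace(A.conj().T @ A)

# <A, A> is real and non-negative, as positivity demands.
print(np.isclose(ip_AA.imag, 0) and ip_AA.real >= 0)  # True
```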
\(\ip\cdot\cdot\) is an inner product on \(\R^n\) iff \(\ip\v\w = \v^\top A\w\) for a symmetric matrix \(A\) whose eigenvalues are strictly positive.
Similarly, \(\ip\cdot\cdot\) is an inner product on \(\C^n\) iff \(\ip\v\w = \v^* A\w\) for a self-adjoint matrix \(A\) (i.e. \(A^* = A\)) whose eigenvalues are strictly positive.
In real vector spaces, note that \[ -\norm{\v}\norm{\w} \leq \ip\v\w \leq \norm{\v}\norm{\w}. \] This allows us to define the angle between two real vectors. Thus, \(\ip\v\w = \norm{\v}\norm{\w}\cos\theta\), where \(\theta\) is a unique angle in \([0, \pi]\). In other words, the angle between \(\v\) and \(\w\) is defined as \[ \theta = \arccos\frac{\ip\v\w}{\norm{\v}\norm{\w}}. \]
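For instance (a minimal sketch in numpy with arbitrary example vectors in \(\R^2\)):

```python
import numpy as np

v = np.array([1.0, 0.0])
w = np.array([1.0, 1.0])

cos_theta = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))  # clip guards against round-off

print(np.degrees(theta))  # ~45, the angle between v and w
```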
Two vectors are orthogonal when \(\ip\v\w = 0\), i.e. \(\theta = \pi/2\). We write \(\v\perp\w\).
When \(\v \perp \w\), the Pythagorean Theorem holds. \[ \norm{\v + \w}^2 = \norm{\v}^2 + \norm{\w}^2. \] More generally, if \(\v_1, \dots, \v_n\) are mutually (pairwise) orthogonal, then \[ \norm{\v_1 + \dots + \v_n}^2 = \norm{\v_1}^2 + \dots + \norm{\v_n}^2. \]
An orthonormal set is a collection \(\{\v_1, \dots, \v_n\} \subset V\) such that \(\ip{\v_i}{\v_j} = \delta_{ij}\).
If \(V\) is an inner product space of dimension \(n\), then any orthonormal set has at most \(n\) elements. Moreover, any orthonormal set of size \(n\) must be a basis.
An orthonormal set \(\{\v_1, \dots, \v_k\}\) is linearly independent. Consider \[ \lambda_1\v_1 + \dots + \lambda_k\v_k = \ve0. \] Taking successive inner products with \(\v_i\), we see that \(\lambda_i = 0\).
If \(\beta = \{\v_1, \dots, \v_n\}\) is an orthonormal basis of \(V\), then any \(\v \in V\) is written uniquely as \[ \v = \lambda_1\v_1 + \dots + \lambda_n\v_n, \] where \(\lambda_i = \ip{\v}{\v_i}\).
Define the projection of \(\v\) along \(\w\) as \[ \proj(\v, \w) = \frac{\ip{\v}{\w}}{\ip{\w}{\w}}\w. \] Also define the normalised vector \[ \hat{\v} = \frac{\v}{\norm{\v}}. \] Given a basis \(\beta = \{\v_1, \dots, \v_n\}\) of \(V\), we construct \[\begin{align} \w_1 &= \v_1, \\ \w_2 &= \v_2 - \proj(\v_2, \w_1), \\ \w_3 &= \v_3 - \proj(\v_3, \w_1) - \proj(\v_3, \w_2), \\ \vdots &\quad \vdots \\ \w_n &= \v_n - \proj(\v_n, \w_1) - \dots - \proj(\v_n, \w_{n - 1}), \end{align}\] i.e. in general, \[ \w_k = \v_k - \sum_{i = 1}^{k - 1} \proj(\v_k, \w_i). \] Then \(\gamma = \{\hat{\w}_1, \dots, \hat{\w}_n\}\) is an orthonormal basis of \(V\). This process of constructing an orthonormal basis is called the Gram-Schmidt process.
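The process translates directly into code; here is a sketch in numpy (the helper gram_schmidt and the example basis are our own, and the standard inner product is assumed):

```python
import numpy as np

def gram_schmidt(basis):
    """Orthonormalise a list of linearly independent vectors.

    Follows the construction above: subtract from each v_k its
    projections onto the previously built vectors, then normalise.
    """
    ortho = []
    for v in basis:
        w = v.astype(complex)
        for u in ortho:
            # proj(w, u) = (<w, u> / <u, u>) u; the u built so far are
            # unit vectors, so <u, u> = 1.  np.vdot conjugates its
            # first argument, matching <w, u> in our convention.
            w = w - np.vdot(u, w) * u
        ortho.append(w / np.linalg.norm(w))
    return ortho

basis = [np.array([1.0, 1.0, 0.0]),
         np.array([1.0, 0.0, 1.0]),
         np.array([0.0, 1.0, 1.0])]

Q = np.column_stack(gram_schmidt(basis))
print(np.allclose(Q.conj().T @ Q, np.eye(3)))  # True: columns are orthonormal
```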
Bessel's inequality states that given an orthonormal set \(\{\e_1, \dots, \e_k\}\), we have \[ \norm{\v}^2 \geq |\ip{\v}{\e_1}|^2 + \dots + |\ip{\v}{\e_k}|^2. \]
To show this, simply extend the orthonormal set to an orthonormal basis \(\{\e_1, \dots, \e_n\}\), and write \[ \v = \ip{\v}{\e_1}\e_1 + \dots + \ip{\v}{\e_n}\e_n. \] The Pythagorean Theorem gives \(\norm{\v}^2 = \sum_{i = 1}^{n} |\ip{\v}{\e_i}|^2 \geq \sum_{i = 1}^{k} |\ip{\v}{\e_i}|^2\), which is the desired inequality.
Note that we have equality when the given orthonormal set is also a basis.
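A quick numerical check of Bessel's inequality (a sketch in numpy; the orthonormal set is a subset of the standard basis of \(\R^3\), and the vector is an arbitrary example):

```python
import numpy as np

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
v = np.array([3.0, 4.0, 12.0])

# |<v, e1>|^2 + |<v, e2>|^2 = 9 + 16 = 25, while ||v||^2 = 169.
bessel_sum = np.dot(v, e1) ** 2 + np.dot(v, e2) ** 2
print(bessel_sum <= np.dot(v, v))  # True
```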
Let \(W\) be a subspace of an inner product space \(V\). Choose a basis \(\{\w_1, \dots, \w_k\}\) of \(W\), and apply the Gram-Schmidt process to obtain an orthonormal basis \(\gamma = \{\v_1, \dots, \v_k\}\) of \(W\). Extend this to a basis of \(V\) and apply Gram-Schmidt again to obtain an orthonormal basis \(\beta = \{\v_1, \dots, \v_n\}\) of \(V\); the first \(k\) vectors are unchanged, since they are already orthonormal.
The orthogonal complement \(W^\perp\) of \(W\) is defined as the set \(\{\v \in V: \ip{\v}{\w} = 0 \text{ for all }\w \in W\}\). We note that \(W^\perp\) is also a subspace of \(V\).
Note that \(W\cap W^\perp = \{\ve0\}\). It can be shown that \(\dim W^\perp = \dim{V} - \dim{W}\). Indeed, \(V = W\oplus W^\perp\). Thus, every vector \(\v \in V\) can be written uniquely as \(\v = \w + \w'\), where \(\w \in W\) and \(\w' \in W^\perp\). The map \[ P\colon V \to V, \qquad \v \mapsto \w, \] is called the orthogonal projection of \(V\) onto \(W\). Note that \(\image{P} = W\) and \(\kernel{P} = W^\perp\).
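The orthogonal projection is easy to realise concretely: if the columns of \(Q\) form an orthonormal basis of \(W\), then \(P = QQ^*\). Below is a sketch in numpy (the subspace \(W\) is an arbitrary example, and the QR decomposition stands in for Gram-Schmidt):

```python
import numpy as np

# W = span of two vectors in R^3; the columns of Q are an orthonormal
# basis of W, obtained via the QR decomposition.
W_basis = np.column_stack([[1.0, 1.0, 0.0],
                           [0.0, 1.0, 1.0]])
Q, _ = np.linalg.qr(W_basis)

# Orthogonal projection onto W.
P = Q @ Q.T

v = np.array([1.0, 2.0, 3.0])
w = P @ v          # component of v in W
w_perp = v - w     # component of v in W^perp

print(np.allclose(P @ P, P))               # P is a projection
print(np.isclose(np.dot(w, w_perp), 0.0))  # the two components are orthogonal
```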
Given a finite dimensional inner product space \(V\), for each \(\v\in V\), we have the linear map \[ \ip{\cdot}{\v}\colon V \to F, \qquad \w \mapsto \ip\w\v. \] This leads to the map \[ \Phi_V\colon V \to V^*, \qquad \v \mapsto \ip{\cdot}{\v}, \] which is linear when \(F = \R\) and conjugate-linear when \(F = \C\). Furthermore, note that \(\Phi_V\) is a bijection.
The finite dimensional Riesz representation theorem states that for a linear map \(T\colon V \to F\) where \(V\) is a finite dimensional inner product space, there exists a unique \(\v_0 \in V\) such that \(T\v = \ip{\v}{\v_0}\) for all \(\v \in V\).
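For example, on \(\C^n\) with the standard inner product, the functional \(T(\v) = a_1v_1 + \dots + a_nv_n\) is represented by \(\v_0 = (\overline{a_1}, \dots, \overline{a_n})\), since \(\ip{\v}{\v_0} = \sum_i v_i\overline{\overline{a_i}} = \sum_i a_iv_i\).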
The adjoint of a linear map \(T\colon V \to W\) is the map \(W \to V\) defined as the composite \[ W \overset{\Phi_W}{\longrightarrow} W^* \overset{T^\vee}{\longrightarrow} V^*\overset{(\Phi_V)^{-1}}{\longrightarrow} V, \] where \(T^\vee\colon W^* \to V^*\) is the dual map \(\varphi \mapsto \varphi\circ T\). This composite is denoted as \(T^*\).
Let \(\w \in W\) and consider \(T^*\w\). Note that \(T^*\w = \Phi_V^{-1}\ip{T(\cdot)}{\w}_W\). Thus, \[ \ip{T(\cdot)}{\w}_W = \ip{\cdot}{T^*\w}_V. \] This property is enough to define the adjoint, i.e. the adjoint is the map \(T^*\) satisfying \(\ip{T\v}{\w}_W = \ip{\v}{T^*\w}_V\) for all \(\v \in V\), \(\w \in W\).
Note that \((S + T)^* = S^* + T^*\), \((\lambda T)^* = \overline{\lambda}\,T^*\), \((ST)^* = T^*S^*\) and \(T^{**} = T\).
Let \(V\) be a finite dimensional inner product space and let \(\beta\) be an orthonormal basis. Then, \([T^*]_\beta = [T]_\beta^*\).
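In coordinates, this says the adjoint is the conjugate transpose. A quick numerical check of the defining property (a sketch in numpy; the matrix and vectors are arbitrary random examples, with the standard inner product on \(\C^3\)):

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
T_adj = T.conj().T  # conjugate transpose

v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# Defining property of the adjoint: <T v, w> = <v, T* w>.
# np.vdot conjugates its first argument, so arguments are swapped.
lhs = np.vdot(w, T @ v)      # <T v, w>
rhs = np.vdot(T_adj @ w, v)  # <v, T* w>
print(np.isclose(lhs, rhs))  # True
```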
A linear map \(T\colon V \to V\) is called self-adjoint if \(T = T^*\). Equivalently, \(T\) is self-adjoint iff \(\ip{T\v}{\w} = \ip{\v}{T\w}\) for all \(\v, \w \in V\).
\(T\) is self-adjoint iff \([T]_\beta\) is Hermitian for any orthonormal basis \(\beta\), i.e. \([T]_\beta = [T]_\beta^*\).
Let \(T\colon V \to V\) be a self-adjoint operator and suppose \(\lambda \in \C\) is an eigenvalue, with \(T\v = \lambda\v\) for some \(\v \neq \ve{0}\). Then, \(\lambda \in \R\). Moreover, the eigenspaces associated with distinct eigenvalues are orthogonal.
The spectral theorem for real self-adjoint maps says that a linear map \(T\colon V \to V\) on a finite dimensional real inner product space is self-adjoint iff there exists an orthonormal eigenbasis.
The spectral theorem for normal operators says that a linear map \(T\colon V \to V\) on a finite dimensional complex inner product space is normal, i.e. \(TT^* = T^*T\), iff there exists an orthonormal eigenbasis.
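Numerically, this diagonalization is provided by np.linalg.eigh for self-adjoint matrices (a sketch; the Hermitian matrix below is an arbitrary random example):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T  # A is self-adjoint (Hermitian) by construction

# eigh returns real eigenvalues and an orthonormal eigenbasis
# (the columns of U), so that A = U diag(w) U*.
w, U = np.linalg.eigh(A)

print(np.allclose(U.conj().T @ U, np.eye(4)))       # eigenbasis is orthonormal
print(np.allclose(A, U @ np.diag(w) @ U.conj().T))  # A = U D U*
```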
The distance between vectors \(\v\) and \(\w\) in an inner product space \(V\) is defined to be \(\norm{\v - \w}\).
A linear map \(T\colon V \to V\) is called an isometry if it is distance preserving. \[ \norm{T\v - T\w} = \norm{\v - \w}, \quad\text{for all }\v, \w \in V. \] Equivalently, \[ \norm{T\v} = \norm{\v}, \quad \text{for all }\v \in V. \] It can be shown that \(T\) is an isometry iff \(\ip{T\v}{T\w} = \ip{\v}{\w}\). Thus, isometries are also angle preserving.
It can also be shown that an isometry satisfies \(T^* T = I_V\).
For any orthonormal basis \(\beta\), the matrix \([T]_\beta\) of an isometry is unitary, i.e. \([T]_\beta^*[T]_\beta = I\).
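As a quick illustration (a sketch in numpy; the unitary matrix is manufactured via the QR decomposition of an arbitrary random matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(M)  # Q is unitary

print(np.allclose(Q.conj().T @ Q, np.eye(3)))  # Q* Q = I

# Unitary maps preserve lengths: ||Q v|| = ||v||.
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
print(np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v)))  # True
```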