\marginnote{For a brief and \emph{concrete} explanation of tensors, I warmly recommend the following \href{https://youtu.be/f5liqUk0ZTw}{youtube video by Dan Fleisch} and \cite[Chapter XIV]{book:lieber}. For the role of tensors in the context of machine learning, you can have a look at the \href{https://ems.press/content/serial-article-files/25550}{following article in the EMS Magazine 126 (2022)}.}
Many of the spaces that we have encountered so far are particular examples of a much larger class of objects.
In this chapter we are going to introduce all the necessary algebraic concepts.
We have seen that covectors in $V^*$ are real linear maps $V\to\R$ from the underlying space $V$ while, through the double dual, vectors can be understood as real linear maps $V^*\to\R$ from the dual space $V^*$.
In practice, \emph{tensors} are just multilinear real-valued maps on cartesian products of the form $V^*\times \cdots \times V^* \times V \times \cdots \times V$.
We have already encountered some examples: covectors, inner products and even determinants are all tensors, as the numerical sketch after this list also illustrates:
\begin{itemize}
\item a scalar product is a bilinear map $\langle\cdot,\cdot\rangle:V\times V\to \R$;
\item the signed area spanned by two vectors is a bilinear map $\R^2\times\R^2\to\R$ defined by $\mathrm{area}(u,v) := u\wedge v = u^1v^2-u^2v^1$;
\item the determinant\footnote{In fact, the signed area is the determinant of the $2\times 2$ matrix $(u \, v)$...} of a square matrix in $\mathrm{Mat}(n, \R)$, viewed as a function $\det: \LaTeXunderbrace{\R^n\times\cdots\times\R^n}_{n\mbox{ times}}\to\R$ is an $n$-linear map.
\end{itemize}
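These examples are easy to probe numerically.
The following minimal sketch (in Python with NumPy; purely illustrative, and the variable names are mine) checks the bilinearity of the signed area and the $n$-linearity of the determinant on randomly drawn vectors:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def area(u, v):
    # signed area u ^ v = u^1 v^2 - u^2 v^1
    return u[0] * v[1] - u[1] * v[0]

u, v, w = rng.standard_normal((3, 2))
a, b = 2.0, -3.0

# bilinearity in the first argument
assert np.isclose(area(a * u + b * w, v),
                  a * area(u, v) + b * area(w, v))

# the determinant is n-linear in its columns (here n = 3)
c1, c2, c3, c = rng.standard_normal((4, 3))
lhs = np.linalg.det(np.column_stack([a * c1 + b * c, c2, c3]))
rhs = a * np.linalg.det(np.column_stack([c1, c2, c3])) \
    + b * np.linalg.det(np.column_stack([c, c2, c3]))
assert np.isclose(lhs, rhs)
\end{verbatim}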
So functions of several vectors or covectors that are linear in each argument are also called multilinear forms or tensors.
It should not come as a surprise that multilinear functions of tangent vectors and covectors to manifolds appear naturally in different geometrical and physical contexts.
In this chapter we are going to discuss the general definitions and notions that interest us, some of which may be just refreshing what you have seen in multivariable analysis in the context of general vector spaces $V$.
Keep in mind that, at a certain point, we will replace $V$ with the tangent spaces $T_pM$ of a smooth manifold $M$.
\section{Tensors}
\begin{definition}
Let $V$ be an $n$-dimensional vector space and $V^*$ its dual.
Let
\begin{equation}
\mathrm{Mult}(V_1, \ldots, V_k)
\end{equation}
denote the space of multilinear maps $V_1\times\cdots\times V_k\to\R$.
A multilinear map
\begin{equation}
\tau : \LaTeXoverbrace{V^*\times \cdots \times V^*}^{r\mbox{ times}} \times \LaTeXunderbrace{V \times \cdots \times V}_{s\mbox{ times}} \to \R
\end{equation}
is called a \emph{tensor of type $(r,s)$}, an $r$-contravariant $s$-covariant tensor, or an $(r,s)$-tensor.
Similarly as we did for the dual pairing, when convenient we define the pairing
\begin{equation}
\tau\left(\omega^1, \ldots, \omega^r; v_1, \ldots, v_s\right)
=: \left(\tau \mid \omega^1, \ldots, \omega^r; v_1, \ldots, v_s \right).
\end{equation}
For tensors $\tau_1$ and $\tau_2$ of the same type $(r,s)$ and $\alpha_1, \alpha_2\in\R$ we define
\begin{equation}
\left(\alpha_1\tau_1 + \alpha_2\tau_2 | \ldots \right) := \alpha_1\left(\tau_1 | \ldots \right) + \alpha_2 \left(\tau_2 | \ldots \right).
\end{equation}
This equips the space
\begin{equation}
T^r_s(V) := \mathrm{Mult}(\LaTeXoverbrace{V^*,\ldots,V^*}^{r \mbox{ times}}, \LaTeXunderbrace{V, \ldots, V}_{s \mbox{ times}})
\end{equation}
of tensors of type $(r,s)$ with the structure of a real vector space\footnote{Be careful when reading books and papers, for tensor spaces the literature is wild: there are so many different conventions and notations that there is not enough space on this margin to mention them all. Note that the book of Lee inverts the order of superscripts and subscripts in $T^r_s$.}. %of dimension $(\dim V)^{r+s}$.
In particular, $V^* = T_1^0(V)$ and $V=T_0^1(V)$.
\end{definition}
\begin{example}
\begin{itemize}
\item An inner product on $V$, e.g. the scalar product in $\R^n$, is a $(0,2)$-tensor.
This means, for example, that the aforementioned scalar product is an element of $T^0_2(\R^n)$.
\item The determinant, thought of as a function of $n$ vectors, is a tensor in $T_n^0(\R^n)$.
\item Covectors are elements of $T_1^0(T_pM)$ while tangent vectors are elements of $T_0^1(T_pM)$.
\end{itemize}
\end{example}
Take now, for example, two covectors $\omega^1, \omega^2 \in V^*$. We can define the bilinear map
\begin{equation}
\omega^1\otimes \omega^2 : V\times V \to \R,\quad
\omega^1\otimes \omega^2(v_1, v_2) = \omega^1(v_1)\,\omega^2(v_2),
\end{equation}
called the \emph{tensor product} of $\omega^1$ and $\omega^2$.
This can be generalized immediately to general tensors in order to define new higher order tensors.
\begin{definition}
Let $V$ be an $n$-dimensional vector space, $\tau_1\in T_s^r(V)$, $\tau_2\in T_{s'}^{r'}(V)$.
We define the \emph{tensor product} $\tau_1\otimes\tau_2$ as the $(r+r', s+s')$-tensor defined by
\begin{align}
& \tau_1\otimes\tau_2(\omega^1,\ldots,\omega^{r+r'}, v_1,\ldots,v_{s+s'}) \\
& = \tau_1(\omega^1,\ldots,\omega^{r}, v_1,\ldots,v_{s}) \cdot \tau_2(\omega^{r+1},\ldots,\omega^{r+r'}, v_{s+1},\ldots,v_{s+s'}).
\end{align}
\end{definition}
This definition immediately implies that the map
\begin{equation}
\otimes : T_s^r(V)\times T_{s'}^{r'}(V) \to T_{s+s'}^{r+r'}(V)
\end{equation}
is associative and distributive but not commutative (why?).
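Non-commutativity is already visible for products of two covectors.
Here is a minimal NumPy sketch (illustrative only; we represent covectors by their component arrays, so $\omega\otimes\nu$ becomes an outer product):
\begin{verbatim}
import numpy as np

omega = np.array([1.0, 0.0])   # components of omega in (R^2)*
nu    = np.array([0.0, 1.0])   # components of nu

# (omega x nu)(v, w) = omega(v) nu(w); components = outer product
t1 = np.einsum('i,j->ij', omega, nu)
t2 = np.einsum('i,j->ij', nu, omega)

v, w = np.array([1.0, 2.0]), np.array([3.0, 4.0])
print(np.einsum('ij,i,j->', t1, v, w))  # omega(v) nu(w) = 4.0
print(np.einsum('ij,i,j->', t2, v, w))  # nu(v) omega(w) = 6.0
\end{verbatim}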
\begin{exercise}
Give a tensor in $T^2_0$ which is a linear combination of tensor products but cannot be written as a tensor product.
Justify your answer.\\
\textit{\small Hint: one of the examples at the beginning of the chapter can help.}
\end{exercise}
Of course, there is no reason to restrict ourselves to just two tensors: the definition above immediately generalizes to arbitrary tensor products, with an extra interesting property.
\marginnote{A more general approach to this proposition is by proving the universal property of tensor spaces. See for instance~\cite[Propositions 12.5, 12.7 and 12.8]{book:lee}.}
\begin{proposition}\label{prop:tensorbasis}
Let $V$ be an $n$-dimensional vector space.
Let $\{e_j\}$ and $\{\varepsilon^i\}$ denote bases of $V=T_0^1(V)$ and $V^*=T_1^0(V)$ respectively.
Then, every $\tau\in T_s^r(V)$ can be uniquely written as the linear combination\marginnote{\textit{Exercise}: expand Einstein's notation to write the full sum on the left with the relevant indices.}% In the expression, all indices $j_1,\ldots,j_r$, $i_1,\ldots,i_s$ run from $1$ to $n$.}
\begin{equation}\label{eq:tensor:decomposition}
\tau = \tau^{j_1\cdots j_r}_{i_1\cdots i_s} \, e_{j_1}\otimes\cdots\otimes e_{j_r}\otimes \varepsilon^{i_1}\otimes \cdots\otimes \varepsilon^{i_s},
\end{equation}
where the coefficients $\tau^{j_1\cdots j_r}_{i_1\cdots i_s}\in\R$.
%
Thus the $n^{r+s}$ tensor products
\begin{equation}\label{eq:tensor:decompo}
e_{j_1}\otimes\cdots\otimes e_{j_r}\otimes \varepsilon^{i_1}\otimes \cdots\otimes \varepsilon^{i_s}, \quad j_1,\ldots,j_r, i_1,\ldots,i_s = 1,\ldots,n,
\end{equation}
form a basis of $T_s^r(V)$, and $T_s^r(V)$ has dimension $n^{r+s}$.
\end{proposition}
\begin{proof}
\marginnote{A linear map is uniquely specified by its action on a basis, which in particular means that dual bases are unique.}
Define
\begin{equation}
\tau^{j_1\cdots j_r}_{i_1\cdots i_s} := \tau(\varepsilon^{j_1}, \ldots, \varepsilon^{j_r}, e_{i_1}, \ldots, e_{i_s}).
\end{equation}
Then, on any element of the form $(\varepsilon^{j_1}, \ldots, \varepsilon^{j_r}, e_{i_1}, \ldots, e_{i_s})$, we trivially have the decomposition~\eqref{eq:tensor:decomposition}.
By multilinearity of all the terms involved,~\eqref{eq:tensor:decomposition} holds for any element $(\omega^1, \ldots, \omega^r, v_1, \ldots, v_s)$ after decomposing it on the basis.
Uniqueness follows from the linear independence of the tensor products $e_{j_1}\otimes\cdots\otimes e_{j_r}\otimes \varepsilon^{i_1}\otimes \cdots\otimes \varepsilon^{i_s}$ proceeding by contradiction.
\end{proof}
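The proof is entirely constructive and can be mimicked numerically: sample the multilinear map on the bases to get the components, then check that they determine the whole map.
A small NumPy sketch for a $(1,1)$-tensor on $\R^3$ (illustrative only; we identify basis vectors and covectors with the rows of the identity matrix):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))   # encodes tau(omega, v) := omega(A v)

def tau(omega, v):
    return omega @ A @ v

E = np.eye(n)   # rows double as the basis e_i and its dual basis eps^i
# components tau^j_i := tau(eps^j, e_i), exactly as in the proof
comp = np.array([[tau(E[j], E[i]) for i in range(n)]
                 for j in range(n)])

# the components reproduce the full multilinear map
omega, v = rng.standard_normal(n), rng.standard_normal(n)
assert np.isclose(np.einsum('ji,j,i->', comp, omega, v), tau(omega, v))
\end{verbatim}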
\begin{exercise}
Formalize in detail the last step of the proof: uniqueness follows from the linear independence of the tensor products.
\end{exercise}
\begin{remark}
There is a canonical isomorphism such that
\begin{equation}
T^r_s(V) \simeq \LaTeXoverbrace{V\otimes \cdots \otimes V}^{r\mbox{ times}} \otimes \LaTeXunderbrace{V^*\otimes \cdots \otimes V^*}_{s\mbox{ times}}.
\end{equation}
This allows us to choose whichever interpretation\footnote{If you are not familiar with tensor products of vector spaces and want to know more, you can start from this \href{https://web.archive.org/web/20231106094511/https://math.stackexchange.com/questions/2138459/understanding-the-definition-of-tensors-as-multilinear-maps/2141663\#2141663}{detailed Stack Exchange answer}. If you want to go deeper, I warmly recommend \href{https://web.archive.org/web/20230922070935/https://www.dpmms.cam.ac.uk/~wtg10/tensors3.html}{Tim Gower's blog post \emph{How to lose your fear of tensor products}}.} is more convenient for the problem at hand: be it a multilinear map on a cartesian product of spaces or an element of the tensor product of spaces.
\end{remark}
Let's go back for a moment to the example of inner products.
\begin{definition}\label{def:metric}
\marginnote[1em]{In Definition~\ref{def:metric} and Example~\ref{ex:musicaliso} we are slightly abusing notation: as we will see, the ``true'' pseudo-metric, metric and symplectic tensors will be sections of certain tensor bundles over manifolds which pointwise share the properties presented here. Nevertheless, this definition provides a good intuition for what is coming.}
We call \emph{pseudo-metric tensor}, any tensor $g\in T_2^0(V)$ that is
\begin{enumerate}
\item symmetric, i.e. $g(v,w) = g(w,v)$ for all $v,w\in T_0^1(V)$;
\item positive semidefinite, i.e. $g(v,v)\geq0$ for all $v\in V$.
\end{enumerate}
We call \emph{non-degenerate} any tensor $g\in T_2^0(V)$ such that
\begin{equation}
g(v,w) = 0 \quad\forall w\in V \qquad\Longrightarrow\qquad v=0.
\end{equation}
A \emph{metric tensor} or \emph{inner product} is a non-degenerate pseudo-metric tensor, that is, a symmetric, positive definite $(0,2)$-tensor.
We will briefly see later that a Riemannian metric is one such object and it provides an inner product on the tangent spaces of a manifold.
An example of a non-degenerate tensor which is not a metric is the so-called \emph{symplectic tensor}: a skew-symmetric non-degenerate $(0,2)$-tensor related to the symplectic form, a fundamental object in topology, classical mechanics and the study of Hamiltonian systems.
\end{definition}
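For the sake of intuition, here is the standard example of a symplectic tensor (stated only as an aside, it will not be needed in what follows): on $\R^{2n}$ one can define
\begin{equation}
\omega_0(u,v) := u^T J v, \qquad J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix},
\end{equation}
which is skew-symmetric since $J^T = -J$, and non-degenerate since $J^2 = -\mathrm{Id}$ implies that $J$ is invertible.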
\begin{example}\label{ex:musicaliso}
Let $V$ be an $n$-dimensional real vector space with an inner product $g(\cdot, \cdot)$.
%
Denote by $\{e_1, \ldots, e_n\}$ a basis for $V$ and by $\{e^1, \ldots, e^n\}$ its dual basis\footnote{This is what we called $\{\varepsilon^1, \ldots,\varepsilon^n\}$ before. In this example we follow a common practice in Riemannian geometry: the same letter is used for basis elements of both spaces and the position of the indices is then what discerns where the elements belong.} for $V^*$.
As a bilinear map on $V\times V$, the inner product is uniquely associated to a matrix $[g_{ij}]$ by $g_{ij} := g(e_i, e_j)$.
We already mentioned that in this case we can canonically identify $V$ with $V^*$.
Indeed, the inner product defines the isomorphisms\footnote{Often called \emph{musical isomorphisms} or index raising and index lowering operators.}
\begin{equation}
{}^\flat: V \to V^*,\; v\mapsto g(v, \cdot),
\quad\mbox{and its inverse}\quad
{}^\sharp: V^*\to V.
\end{equation}
\marginnote{That ${}^\flat$ is an isomorphism follows immediately from the linearity and the fact that non-degeneracy implies that its kernel contains only the zero vector.}
The matrix of ${}^\flat$, by definition, is $[g_{ij}]$, that is,
\begin{equation}
(v^\flat)_i = g_{ij} v^j,
\end{equation}
where the $v^j$ are the components of $v$.
Therefore, the matrix of ${}^\sharp$ is the inverse\footnote{Using lower indices for matrix entries and upper indices for the entries of the inverse is very common. It turns out to be an especially convenient notation, which simplifies many formulas in general relativity and classical mechanics.} $[g^{ij}]$ of the inner product matrix, that is,
\begin{equation}
(\omega^\sharp)^i = g^{ij}\omega_j,
\end{equation}
where the $\omega_j$ are the components of $\omega$.
\marginnote[2em]{To add to the confusion: in the physics literature, for $v\in V$, the components $v_j$ of $v^\flat$ are often called covariant components of $v$ while the components $v^j$ of $v$ are called its contravariant components.}
Note that, in general, $e^\flat_i\neq e^i$: indeed, by definition $e^\flat_i = g_{ij}e^j$.
It turns out that these operators can be applied to tensors to produce new tensors.
For example, if $\tau$ is a $(0,2)$-tensor we can define an associated tensor $\tau'$ of type $(1,1)$ by $\tau'(\omega, v) = \tau(\omega^\sharp, v)$.
Its components are $(\tau')_i^j = g^{jk}\tau_{ki}$.
\end{example}
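Once a basis is fixed, the musical isomorphisms are plain matrix algebra, so they are easy to experiment with.
A minimal NumPy sketch (illustrative only; the helper names \texttt{flat} and \texttt{sharp} are mine):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
n = 3
M = rng.standard_normal((n, n))
g = M @ M.T + n * np.eye(n)   # a symmetric positive definite [g_ij]
g_inv = np.linalg.inv(g)      # [g^ij]

flat  = lambda v: g @ v       # (v^flat)_i  = g_ij v^j
sharp = lambda w: g_inv @ w   # (w^sharp)^i = g^ij w_j

v = rng.standard_normal(n)
assert np.allclose(sharp(flat(v)), v)   # sharp inverts flat

# raising the first index of a (0,2)-tensor: (tau')^j_i = g^jk tau_ki
tau = rng.standard_normal((n, n))
tau_prime = np.einsum('jk,ki->ji', g_inv, tau)

omega = rng.standard_normal(n)
assert np.isclose(np.einsum('ji,j,i->', tau_prime, omega, v),
                  np.einsum('ki,k,i->', tau, sharp(omega), v))
\end{verbatim}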
\begin{exercise}\label{exe:iso_vs_endo}
Let $V$ be a finite-dimensional vector space.
\begin{enumerate}
\item Show that the space $T^1_1(V)$ is canonically isomorphic to the space of endomorphisms of $V$, that is, of linear maps $L:V\to V$.
\item If $\ell\in T^1_1(V)$ is the tensor associated to an endomorphism $A$, show that its components $\ell_i^j$ are precisely the entries of the matrix of $A$.
\item Of course, given the previous example, $T^1_1(V)$ is also canonically isomorphic to the space of endomorphisms of $V^*$, that is, of linear maps $\Lambda:V^*\to V^*$.
Prove the claim by explicitly constructing the mapping $\ell \leftrightarrow \Lambda$.
\end{enumerate}
\textit{\small Hint: definitions can look rather tautological when dealing with tensors... think carefully about domains and co-domains.}
\end{exercise}
We are now in a good place to discuss how tensors are affected by changes of basis.
Let $L: V\to V$ be an isomorphism; then we can define a new basis $\{\widetilde e_i\}$ of $V$ by $\widetilde e_i := L e_i$. For convenience, we denote its dual basis by $\{\widetilde e^i\}$ in contrast with our initial notation.
Thinking in linear algebraic terms, what is the linear map $\Lambda:V^*\to V^*$ that relates the dual bases? This must be determined somehow by
\begin{align}\label{eq:vs_ch_bases}
\delta^i_j = (\widetilde e^i \mid \widetilde e_j) = (\Lambda e^i \mid L e_j).
\end{align}
It is convenient, at this point, to introduce the \emph{dual map} $L^* : V^* \to V^*$ to $L$.
This is defined as the map such that for all $v\in V$ and for all $w\in V^*$
\begin{equation}
(w | Lv) = (L^* w | v).
\end{equation}
But we already know how to define this kind of map! This is like a pullback: $L^*(w) := w \circ L$ for $w\in V^*$.
\emph{As matrices}, this means that $w^T L v = (L^T w)^T v$, so the matrix of $L^*$ is the transpose of the matrix of $L$.
Let's go back to \eqref{eq:vs_ch_bases}:
\begin{align}
& \delta_j^i = (\Lambda e^i \mid L e_j) = (L^* \Lambda e^i \mid e_j) \\
& \mbox{that is, } \Lambda := (L^*)^{-1}.
\end{align}
So, \emph{as matrices}, $\Lambda = (L^T)^{-1}$ is the inverse of the transpose of the matrix of $L$.
If $[l_i^j]$ is the matrix associated\footnote{Does exercise~\ref{exe:iso_vs_endo} ring any bell here?} to the endomorphism $L: V \to V$, that is, $l_i^j := (e^j | L e_i)$, and $[\lambda_i^j]$ the one associated to the endomorphism $\Lambda: V^* \to V^*$, then the above translates into
\begin{align}
& \delta_j^i = (\Lambda e^i \mid L e_j) = (\lambda_{k_1}^i e^{k_1} \mid l_j^{k_2} e_{k_2}) = \lambda_k^i l_j^k.
\end{align}
That is, as we already saw, \emph{as matrices} they are inverses of each other.
We can transport this fact to general tensors to obtain that the components of an arbitrary tensor $\tau\in T^r_s(V)$ transform as follows.
Since
\begin{align}
\tau
& =
\tau^{j_1\cdots j_r}_{i_1\cdots i_s} \, e_{j_1}\otimes\cdots\otimes e_{j_r}\otimes \varepsilon^{i_1}\otimes \cdots\otimes \varepsilon^{i_s} \\
& = \widetilde\tau^{k_1\cdots k_r}_{h_1\cdots h_s} \, \widetilde e_{k_1}\otimes\cdots\otimes \widetilde e_{k_r}\otimes \widetilde\varepsilon^{h_1}\otimes \cdots\otimes \widetilde\varepsilon^{h_s},
\end{align}
applying the previous reasoning and comparing term by term we get
\begin{align}
& \tau^{j_1\cdots j_r}_{i_1\cdots i_s} = \widetilde\tau^{k_1\cdots k_r}_{h_1\cdots h_s} l_{k_1}^{j_1}\cdots l_{k_r}^{j_r}\, \lambda_{i_1}^{h_1}\cdots \lambda_{i_s}^{h_s} \\
& \mbox{or} \\
& \widetilde\tau^{k_1\cdots k_r}_{h_1\cdots h_s} = \tau^{j_1\cdots j_r}_{i_1\cdots i_s} \lambda_{j_1}^{k_1}\cdots \lambda_{j_r}^{k_r}\, l_{h_1}^{i_1}\cdots l_{h_s}^{i_s}.
\end{align}
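These transformation rules are mechanical enough that a computer can confirm them on random data.
A minimal NumPy sketch for a $(1,1)$-tensor (illustrative only): with $[l_i^j]$ the matrix of $L$ and $[\lambda_i^j]$ its inverse as above, each upper index transforms with $\lambda$ and each lower index with $l$, and transforming back and forth recovers the original components.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
n = 3
L = rng.standard_normal((n, n)) + n * np.eye(n)  # l^j_i (generically invertible)
lam = np.linalg.inv(L)                           # lambda^i_k, lambda l = id

tau = rng.standard_normal((n, n))                # tau^j_i, a (1,1)-tensor

# new components: tilde tau^k_h = lambda^k_j l^i_h tau^j_i
tau_new = np.einsum('kj,ih,ji->kh', lam, L, tau)

# transform back, exchanging the roles of l and lambda
tau_back = np.einsum('jk,hi,kh->ji', L, lam, tau_new)
assert np.allclose(tau_back, tau)
\end{verbatim}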
\begin{remark}
An important consequence of this fact is that we can use a metric tensor, and the associated musical isomorphisms ${}^\flat$ and ${}^\sharp$, to canonically identify a tensor space $T_s^r(V)$ with $T_r^s(V)$, $T_0^{r+s}(V)$ and $T_{r+s}^0(V)$ by concatenating the correct number of maps, for example
\begin{align}
\cI = \cI_g : T_s^r (V) \to T_r^s(V) \\
\cI : \tau \mapsto \tau \circ (\LaTeXunderbrace{{\cdot}^\flat, \ldots, {\cdot}^\flat}_{r\mbox{ times}}, \LaTeXoverbrace{{\cdot}^\sharp, \ldots, {\cdot}^\sharp}^{s\mbox{ times}}).
\end{align}
In general, one can use the metric tensor to raise or lower arbitrary indices, changing the tensor type from $(r,s)$ to $(r+1, s-1)$ or $(r-1, s+1)$.
A neat application of this is showing that a non-degenerate bilinear map $g\in T_2^0(V)$ can be lifted to a non-degenerate bilinear map on arbitrary tensors, that is
\begin{equation}
G: T_s^r(V)\times T_s^r(V) \to \R,
\quad
G(\tau, \widetilde\tau) := (\cI_g(\tau), \widetilde\tau),
\end{equation}
where the scalar product is defined via the requirement that tensor products of basis elements of $V$ are orthonormal and that $G$ is invariant under the musical isomorphism.
In particular\footnote{Exercise: write down the detailed proof of this statement.}, if $g$ is a metric tensor on $V$, then $G$ is a metric tensor on $T_s^r(V)$.
\end{remark}
\begin{exercise}
\begin{enumerate}
\item What do the canonical identifications of $T_s^r(V)$ with $T_0^{r+s}(V)$ and $T_{r+s}^0(V)$ look like?
\item Fix a basis for $V$. What does $\cI_g$ in the previous remark look like with respect to this basis?
\item Write down $G$ with respect to the basis from the previous point.
\end{enumerate}
%In coordinates the one mapping $(r,s)$-tensors to $(r+s, 0)$-tensors is $I_g(\tau) = \tau_{i_1\ldots i_r}^{j_1\ldots j_s} g_{j_1 j_{r+1}} \ldots g_{j_s j_{r+s}}}$
%If \((g_{ij})\) and \((g^{ij})\) are the matrix element of the matrices representing metric tensor and its inverse resp, then \[ G(\sigma, \tau) = \langle\sigma, \tau\rangle_g := g^{k_1 l_1}\cdots g^{k_rl_r} g_{i_1j_1} \cdots g_{i_sj_s} \sigma_{k_1,\ldots,k_r}^{i_1,\ldots,i_s} \tau_{l_1,\ldots,l_r}^{j_1,\ldots,j_s} \]
\end{exercise}
\begin{remark}
Interestingly, even though none of the tensor spaces $T_s^r(V)$ are algebras, the map $\otimes$ makes the collection of all tensor spaces
\marginnote{This is a so-called \emph{graded algebra}, since $\otimes : T_s^r(V)\times T_{s'}^{r'}(V) \to T_{s+s'}^{r+r'}(V)$ adds the degrees of its factors.}
\begin{equation}
T(V) := \bigoplus_{r,s\geq 0} T_s^r(V), \qquad T_0^0(V):= \R,
\end{equation}
an algebra, called \emph{tensor algebra}.
Here, for $r=s=0$, we define tensor multiplication by a scalar as the standard scalar multiplication: $\alpha\otimes\tau := \alpha\tau$ for $\alpha\in T_0^0(V)=\R$ and any tensor $\tau$.
\end{remark}
Before moving on, there is an important operation on tensors that will come back later on and is worth introducing in full generality.
\begin{definition}
Let $V$ be a vector space and fix $r,s\geq0$.
For $h\leq r$ and $k\leq s$, we define the \emph{$(h,k)$-contraction} of a tensor as the linear mapping $T_s^r(V)\to T_{s-1}^{r-1}(V)$ defined through
\begin{align}
v_1 & \otimes\cdots\otimes v_r\otimes\omega^1\otimes\cdots\otimes\omega^s \\
& \mapsto \omega^k(v_h)\, v_1\otimes\cdots\otimes v_{h-1}\otimes v_{h+1}\cdots\otimes v_r\otimes\omega^1\otimes\cdots\otimes\omega^{k-1}\otimes\omega^{k+1}\cdots\otimes\omega^s,
\end{align}
and then extended by linearity, thus mapping $\tau \mapsto \widetilde\tau$ where, with an implicit sum over $i$ as per Einstein's convention,
\begin{align}
\widetilde\tau & (\nu^1,\ldots,\nu^{r-1}, v_1,\ldots,v_{s-1}) \\
& = \tau(\nu^1,\ldots,\LaTeXunderbrace{e^i}_{h\mbox{th index}},\ldots,\nu^{r-1},v_1,\ldots,\LaTeXunderbrace{e_i}_{k\mbox{th index}},\ldots,v_{s-1}).
\end{align}
\end{definition}
\begin{notation}[Hat notation for erased elements]\label{notation:hat}
It is common to use a hat to denote elements that have been removed from the tensor product.
For instance, the contraction above would look like
\begin{align}
v_1 & \otimes\cdots\otimes v_r\otimes\omega^1\otimes\cdots\otimes\omega^s \\
& \mapsto \omega^k(v_h)\, v_1\otimes\cdots\otimes v_{h-1}\otimes v_{h+1}\cdots\otimes v_r\otimes\omega^1\otimes\cdots\otimes\omega^{k-1}\otimes\omega^{k+1}\cdots\otimes\omega^s \\
&\qquad =: \omega^k(v_h)\, v_1\otimes\cdots\otimes \widehat{v}_{h} \otimes \cdots\otimes v_r\otimes\omega^1\otimes\cdots\otimes\widehat{\omega}^{k}\otimes\cdots\otimes\omega^s.
\end{align}
\end{notation}
\begin{example}
To better understand the definition of the contraction, it is worth looking at an example over a decomposable element.
For simplicity, assume $(r,s) = (2,3)$ and $\tau = v_1\otimes v_2\otimes\omega^1\otimes\omega^2\otimes\omega^3$.
Then $\tau$ corresponds to a multilinear function
\begin{equation}
\tau(\nu^1,\nu^2,w_1,w_2,w_3) = \nu^1(v_1)\nu^2(v_2)\omega^1(w_1)\omega^2(w_2)\omega^3(w_3).
\end{equation}
By definition, the $(1,2)$-contraction is
\begin{align}
\widetilde\tau(\nu^1,w_1,w_2) & = \tau(e^i,\nu^1,w_1,e_i,w_2) \\
& = e^i(v_1)\;\nu^1(v_2)\omega^1(w_1)\omega^2(e_i)\omega^3(w_2) \\
& = \LaTeXunderbrace{e^i(v_1)\omega^2(e_i)}_{=\omega^2_i e^i (v_1) =\omega^2(v_1)}\nu^1(v_2)\omega^1(w_1)\omega^3(w_2) \\
& = \omega^2(v_1)\; v_2\otimes\omega^1\otimes\omega^3 (\nu^1,w_1,w_2).
\end{align}
\end{example}
\begin{example}
For $a\in T_1^1(V)$, the contraction $\mathrm{tr} (a) := a^i_i$ is called the trace of $a$ and is the usual trace of the corresponding endomorphism $A:V\to V$.
\end{example}
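In components, the $(h,k)$-contraction is just a sum over one paired upper and lower index, which is exactly what NumPy's \texttt{einsum} expresses.
A small sketch (illustrative only) for the $(2,3)$-example above and for the trace:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(4)
n = 3
# a (2,3)-tensor with components tau^{j1 j2}_{i1 i2 i3}
tau = rng.standard_normal((n, n, n, n, n))

# (1,2)-contraction: pair the 1st upper with the 2nd lower index
tau_c = np.einsum('abcad->bcd', tau)   # a (1,2)-tensor

# for a (1,1)-tensor, the (1,1)-contraction is the trace a^i_i
a = rng.standard_normal((n, n))
assert np.isclose(np.einsum('ii->', a), np.trace(a))
\end{verbatim}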
\section{Tensor bundles}
It is time to leave the abstract world of vector spaces and start getting closer to our main focus: manifolds.
In the previous chapters we have shown that the tangent bundle and the cotangent bundle are families of vector spaces built over $M$ that are dual to each other.
We now have the tools to further extend this idea and define tensor bundles as families of tensor spaces built on top of the fibres of the tangent bundle.
\begin{definition}
The \emph{$(r,s)$-tensor bundle over $M$} is the bundle
\begin{equation}
T_s^r M = \bigsqcup_{p\in M}\left(\{p\}\times T_s^r(T_p M)\right)
\end{equation}
of tensors of type $(r,s)$, with the projection on the first component $\pi:T_s^r M\to M$.
\end{definition}
Here pullback and differential\footnote{Now you see why some people call it the pushforward...} turn out to be life-saviours:
any atlas $\{(U_i, \varphi_i)\}$ of $M$ can be naturally mapped to an atlas on $T_s^r M$ via $\{(T_s^r U_i, \widetilde\varphi_i)\}$ where
\begin{align}
\widetilde\varphi_i : T_s^r U_i \to T_s^r\varphi_i(U_i)
\end{align}
is defined by linearity on the fibres via
\marginnote{Study hint: look carefully at the domains and codomains of all the maps involved and make sure that you understand how this is defined.}
\begin{align}
\widetilde\varphi_i & (p, e_{j_1}\otimes\cdots\otimes e_{j_r}\otimes \varepsilon^{i_1}\otimes \cdots\otimes \varepsilon^{i_s}) \\
& :=(\varphi_i(p), d(\varphi_i)_p e_{j_1}\otimes\cdots\otimes d(\varphi_i)_p e_{j_r}\otimes d(\varphi_i^{-1})^*\varepsilon^{i_1}\otimes \cdots\otimes d(\varphi_i^{-1})^*\varepsilon^{i_s}).
\end{align}
\begin{exercise}
Let $M$ be an $m$-dimensional smooth manifold.
Show that $T^r_sM$ is a vector bundle of rank $m^{r+s}$.
\end{exercise}
In analogy to the definition of vector fields, we can introduce tensor fields: these will just be smooth assignments of a tensor to each point.
\begin{definition}
A section of $T_s^r M$, that is, a smooth map $\tau : M \to T_s^r M$ such that $\pi\circ \tau = \id_M$, is called a \emph{tensor field} of type $(r,s)$.
We denote the space of tensor fields of type $(r,s)$ by $\cT_s^r(M)$ and define $\cT_0^0(M) := C^\infty(M)$.
\end{definition}
\begin{example}
With the definition above we have that $\fX(M) = \cT_0^1(M)$ and $\fX^*(M) = \cT_1^0(M)$.
\end{example}
Locally, we can express any tensor field in terms of the coordinate bases.
On a chart for $M$ with local coordinates $(x^i)$, our analysis of the change of basis tells us that $\tau\in\cT_s^r(M)$ has the form
\begin{equation}
\tau = \tau^{j_1\cdots j_r}_{i_1\cdots i_s}\; \frac{\partial}{\partial x^{j_1}}\otimes\cdots\otimes\frac{\partial}{\partial x^{j_r}}\otimes dx^{i_1}\otimes\cdots\otimes dx^{i_s},
\end{equation}
where $\tau^{j_1\cdots j_r}_{i_1\cdots i_s}\in C^\infty(M)$.
\begin{definition}
The \emph{support} of a tensor field $\tau\in\cT_s^r(M)$ is defined as the set
\begin{equation}
\supp \tau := \overline{\{p\in M\mid \tau(p) \neq 0\}} \subset M.
\end{equation}
We say that $\tau\in\cT_s^r(M)$ is \emph{compactly supported} if $\supp\tau$ is a compact set.
\end{definition}
Again in analogy with what we saw on tangent and cotangent bundles, we can provide a general definition of pullback and pushforward on tensor bundles.
This will be extremely useful soon, when we start dealing with differential forms.
\begin{definition}\label{def:pullback0s}
Let $F:M\to N$ be a smooth map between smooth manifolds and let $\omega \in \cT_s^0(N)$ be a $(0,s)$-tensor field on $N$. We define the \emph{pullback of $\omega$ by $F$} as the $(0,s)$-tensor field $F^*\omega \in \cT_s^0(M)$ on $M$ defined for any $p\in M$ by
\begin{align}
& F^* : \cT_s^0(N) \to \cT_s^0(M), \\
& F^*\omega|_p := dF^*_p(\omega|_{F(p)}) \quad \forall p\in M,
\end{align}
where
\begin{equation}
dF^*_p(\omega|_{F(p)})(v_1, \ldots, v_s) := \omega|_{F(p)} (dF_p v_1, \ldots, dF_p v_s), \quad\forall v_1, \ldots, v_s \in T_p M.
\end{equation}
\end{definition}
To be consistent with this definition, if $f\in C^\infty(M) = \cT_0^0(M)$ and $\omega \in\cT_s^0(N)$, then we define $f\otimes \omega := f\omega$ and $F^* f := f\circ F$.
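In coordinates, the pullback of a $(0,2)$-tensor field amounts to $[F^*\omega] = J^T [\omega] J$ with $J$ the Jacobian of $F$.
As a sanity check, here is a small SymPy sketch (illustrative only) pulling the Euclidean metric $dx\otimes dx + dy\otimes dy$ on $\R^2$ back through the polar-coordinates map $F(r,\theta) = (r\cos\theta, r\sin\theta)$; the result is the familiar $dr\otimes dr + r^2\, d\theta\otimes d\theta$.
\begin{verbatim}
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
F = sp.Matrix([r * sp.cos(th), r * sp.sin(th)])  # (x, y) = F(r, theta)
J = F.jacobian([r, th])                          # the differential dF

g = sp.eye(2)                      # Euclidean metric dx(x)dx + dy(x)dy
pullback = sp.simplify(J.T * g * J)              # (F*g)_ab

print(pullback)                    # Matrix([[1, 0], [0, r**2]])
\end{verbatim}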
\begin{exercise}
Show that the tensor pullback satisfies the following properties.
Let $F:M\to N$ and $G:N\to P$ be smooth maps and $\nu, \omega \in\cT_s^0(N)$ and $f\in C^\infty(N)$, then the following hold
\begin{enumerate}
\item $F^*(f\otimes\omega) = F^*(f \omega) = (f\circ F) F^*\omega = (F^* f)(F^*\omega)$;
\item $F^*(\omega\otimes\nu) = F^*\omega\otimes F^*\nu$;
\item $F^*(\omega + \nu) = F^*\omega + F^*\nu$;
\item $(G\circ F)^*\omega = F^*(G^* \omega)$;
\item $(\id_N)^* \omega = \omega$.
\end{enumerate}
\end{exercise}
\begin{example}[Change of coordinates for tensor fields]
Let, as usual, $(U,\varphi)$ be a chart on $M$ with local coordinates $(x^i)$.
If $\{e^i:\R^n\to\R\}$ are the standard euclidean coordinates\footnote{Have a look at Notation~\ref{ntn:coords} if you don't remember what I am talking about.
Here we are using the notation $e^i \equiv r^i$ since now we know that $\{e^i\}$ is just the dual basis to $\{e_i\}$. } and $\{e_i\}$ are the standard basis vectors in $\R^n$, then the coordinate $1$-forms and the coordinate vector fields on $U\subset M$ are given by
\begin{equation}
dx^i = \varphi^* de^i
\quad\mbox{and}\quad
\frac{\partial}{\partial x^i} = (\varphi^{-1})_* e_i.
\end{equation}
This immediately exposes the transformation laws for the change of coordinates: let $(U, \psi)$ be another chart on $U$ with local coordinates $(y^i)$, then $dy^i = \psi^* de^i$ and $\frac{\partial}{\partial y^i} = (\psi^{-1})_* e_i$. If we denote $\sigma = \psi\circ\varphi^{-1}$ the transition map in $\R^n$, we get
\begin{align}
\frac{\partial}{\partial x^i} & = (\varphi^{-1})_* e_i \\
& = (\varphi^{-1})_* \LaTeXunderbrace{(\sigma^{-1})_*\sigma_*}_{\id} e_i \\
& = (\varphi^{-1}\circ \sigma^{-1})_* (\sigma_* e_i) \\
& = (\psi^{-1})_* ((D\sigma)_i^j e_j) \\
& = (D\sigma)_i^j \frac{\partial}{\partial y^j},
\end{align}
which may be easier to think about in terms of the following diagram
\begin{equation}\nonumber
\begin{tikzcd}[row sep=large, column sep=tiny]
& \frac{\partial}{\partial x^i} \in \cT_0^1(U) \ni (D\sigma)_i^j \frac{\partial}{\partial y^j} \arrow[dl, "\varphi_*" description] \arrow[dr, "\psi_*" description] & \\
e_i \in \cT_0^1(V) \arrow[rr, "\sigma_*" description] & & \cT_0^1(W) \ni \LaTeXunderbrace{\sigma_* e_i}_{= (D\sigma)_i^j e_j}
\end{tikzcd}
\end{equation}
where $V = \varphi(U)$ and $W = \psi(U)$.
From this, we immediately get $dy^j = (D\sigma)_i^j dx^i$ and, therefore, $dx^i = (D\sigma^{-1})_j^i dy^j$.
\end{example}
\begin{exercise}
Let $F:N\to M$ be a smooth map between smooth manifolds.
Show that a function $f\in C^\infty(M)$ is constant on $F(N)\subset M$ if and only if $F^* df \equiv 0$.\\
\textit{\small Hint: if you get stuck start by looking at a simple example, like $N=M$ and $F=\id_M$.}
\end{exercise}
We can also use the pullback to construct a diffeomorphism between tensor bundles of the same type from a diffeomorphism $\varphi:M\to N$ between manifolds.
\begin{proposition}
Let $\varphi:M\to N$ be a diffeomorphism between smooth manifolds.
Then $\varphi$ induces a diffeomorphism $T_s^rM \to T_s^r N$.
\end{proposition}
\begin{proof}
\newthought{Step I}.
Let $p\in M$.
We have already seen that the pullback induces a diffeomorphism of cotangent bundles, acting linearly on the fibres:
\begin{equation}
T^*N \to T^* M, \quad
(q,\omega) \mapsto (\varphi^*\omega)_q = \left(\varphi^{-1}(q), d\varphi^*\omega|_{\varphi^{-1}(q)}\right).
\end{equation}
This can be inverted, giving rise to the so-called \emph{cotangent lift}\footnote{This is often also denoted $\varphi_\sharp$.}
\begin{equation}
d\varphi^\dagger :=(d\varphi)^\dagger := (\varphi^{-1})^*: T^*M \to T^*N.
\end{equation}
\marginnote{An aid to understand this map is the following commuting diagram:
\begin{equation}\nonumber
\begin{tikzcd}[row sep=normal, column sep=normal, ampersand replacement=\&]
T_p^* M \arrow[d, "\pi_M" left] \arrow[r, "d\varphi^\dagger"] \& T^*_{\varphi(p)}N \arrow[d, "\pi_N"] \\
M \arrow[r, "\varphi" below] \& N
\end{tikzcd}
\end{equation}}
For any $\omega\in T_p^* M$ and any $v\in T_pM$, we have
\begin{align}
(d\varphi^\dagger_p \omega \;\mid\; d\varphi_p v)_{\varphi(p)} & = \left( d(\varphi^{-1})^*\omega \right)\big|_{\varphi(p)}\left( d\varphi_p v \right) \\
& = \omega_p(d\varphi^{-1}_{\varphi(p)} \circ d\varphi_p v ) \\
& = \omega_p(v) = (\omega \mid v)_p.
\end{align}
\newthought{Step II}.
Chaining $d$ and $d^\dagger$ on the appropriate components of the tensor, we obtain a diffeomorphism of arbitrary tensor bundles:
\begin{equation}
d\varphi \otimes\cdots\otimes d\varphi \otimes d\varphi^\dagger \otimes\cdots\otimes d\varphi^\dagger : T_s^rM \to T_s^r N,
\end{equation}
defined on the product elements as
\begin{align}
d\varphi & \otimes\cdots\otimes d\varphi \otimes d\varphi^\dagger \otimes\cdots\otimes d\varphi^\dagger (p, v_1 \otimes \cdots\otimes v_r \otimes \omega^1\otimes\cdots\otimes\omega^s) \\
& := (\varphi(p), d\varphi\; v_1 \otimes \cdots\otimes d\varphi\; v_r \otimes d\varphi^\dagger \omega^1\otimes\cdots\otimes d\varphi^\dagger \omega^s),
\end{align}
which extends to the whole fibres by linearity.
\end{proof}
With this diffeomorphism at hand, we can finally generalize the pushforward.
\begin{definition}
Let $F:M\to N$ be a diffeomorphism between smooth manifolds.
We define \emph{pushforward of $(r,s)$-tensor fields} by $F$ as the map $F_*: \cT_s^r(M) \to \cT_s^r(N)$ for which the following diagram commutes:
\begin{equation}
\begin{tikzcd}[row sep=huge, column sep=huge]
M \arrow[r, "F"] \arrow[d, "\tau" left]
&[10em] N \arrow[d, "F_* \tau" cyan] \\
T_s^r M \arrow[r, "dF \otimes\cdots\otimes dF \otimes dF^\dagger \otimes\cdots\otimes dF^\dagger "]
& T_s^r N
\end{tikzcd}.
\end{equation}
That is, for $\tau\in\cT_s^r(M)$ we define
\begin{equation}
F_*\tau = \LaTeXunderbrace{dF \otimes\cdots\otimes dF}_{r\mbox{ times}} \otimes \LaTeXoverbrace{dF^\dagger \otimes\cdots\otimes dF^\dagger }^{s\mbox{ times}} \circ \tau \circ F^{-1}.
\end{equation}
\end{definition}
\begin{example}
Let $f\in\cT_0^0(M)$, then $F_* f = f\circ F^{-1}$.
Similarly, for $X\in\cT_0^1(M)$ we have the pushforward $F_* X = dF\circ X \circ F^{-1}$, in line with the definition of pushforward of vector fields that we gave in the previous chapter.
An interesting, though not really surprising (right?), property is the following: $F_* df = d(F_* f)$.
\end{example}
\begin{exercise}
Let $F:M\to N$ and $G:N\to P$ be two diffeomorphisms of smooth manifolds.
\begin{enumerate}
\item Show that the chain rule $(G\circ F)_* = G_* \circ F_*$ holds.
\item Show that our previous definition\footnote{That is, Definition~\ref{def:pullback0s} -- which includes the pullback from Definition~\ref{def:pullback1f}.} of pullback is a particular case of the following general definition of a \emph{pullback of $(r,s)$-tensor fields by $F$}:
\begin{equation}
F^* := (F^{-1})_* : \cT_s^r(N) \to \cT_s^r(M).
\end{equation}
\end{enumerate}
\textit{\small Hint: always work on a product tensor and extend by linearity.}
\end{exercise}
Note that thanks to this duality between pullback and pushforward, the dual pairing is always invariant under diffeomorphisms:
\marginnote[-2em]{In general, \eqref{eq:pairdualitypull} is not true for scalar products: one has to require that the diffeomorphism leaves the metric invariant, i.e. $g_N(F_* v, F_* w)\circ F = g_M(v,w)$ where $g_M\in T_2^0(M)$ and $g_N\in T_2^0(N)$.
You encounter this if you study isometries for pseudo-Riemannian metrics or canonical transformations in classical mechanics.}
\begin{equation}\label{eq:pairdualitypull}
(F_* \omega \mid F_* v) = (\omega \mid v).
\end{equation}
Can you show why?
\newthought{If you work on Riemannian geometry or in Relativity}, you cannot avoid hearing about tensors.
Let's briefly look at the reason.
\begin{definition}
A non-degenerate symmetric bilinear form $g\in \cT_2^0(M)$ is a \emph{pseudo-Riemannian metric} and the pair $(M,g)$ a \emph{pseudo-Riemannian} (or \emph{semi-Riemannian}) \emph{manifold}.
If $g$ is also fibre-wise positive definite, then $g$ is a \emph{Riemannian metric} and $(M,g)$ is a \emph{Riemannian manifold}.
From this you see that a Riemannian metric is just a smoothly varying inner product on the fibres of the tangent bundle of the manifold.
\end{definition}
\begin{example}
\begin{enumerate}
\item The euclidean space $\R^n$ is a Riemannian manifold with the usual scalar product, which we can represent as $g = \sum_{i=1}^n dx^i\otimes dx^i$ (What is its matrix form?).
\item If $M=\R^4$, an example of pseudo-Riemannian metric is the Minkowski metric $g = g_{ij} dx^i\otimes dx^j$ where $[g_{ij}] = {\left(\begin{smallmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{smallmatrix}\right)}$. The pseudo-Riemannian manifold $(M, g)$ is the space-time manifold of special relativity, where $x^1 = t$ is the time coordinate and $(x^2, x^3, x^4) = (x,y,z)$ are the space coordinates.
\end{enumerate}
\end{example}
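In the Minkowski example one sees the failure of positive definiteness explicitly: for $v = v^i \frac{\partial}{\partial x^i}$ one has
\begin{equation}
g(v,v) = -(v^1)^2 + (v^2)^2 + (v^3)^2 + (v^4)^2,
\end{equation}
which can be negative, zero or positive; in relativity such vectors are called \emph{timelike}, \emph{lightlike} and \emph{spacelike} respectively.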
A metric on a manifold $M$ provides an inner product at each point of the tangent bundle: this allows us to compute lengths of curves and angles between vectors. We can then induce a distance function $d: M \times M \to [0, \infty)$ by defining
\begin{equation}
d(p,q) = \inf_{\gamma \in \cC(p,q)} \ell(\gamma),
\end{equation}
where
\begin{equation}
\cC(p,q) = \left\{\gamma : [0,1] \to M \mbox{ piecewise smooth} \;\mid\; \gamma(0)=p, \gamma(1)=q\right\}
\end{equation}
and
\begin{equation}
\ell(\gamma) := \int_0^1 \sqrt{g_{\gamma(t)}(\gamma'(t), \gamma'(t))}\, dt.
\end{equation}
This makes $(M, d)$ a metric space whose metric topology coincides with the original manifold topology.
We will not go further into proving these claims or discussing their many interesting consequences.
If you are interested or curious, you can look at any good book on Riemannian geometry;
a good and concise one is \cite[Chapter 6]{book:lee:riemannian}.
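The length functional itself is straightforward to approximate numerically.
As a toy illustration (not part of the text; the helper \texttt{length} is mine), the following NumPy sketch evaluates $\ell(\gamma)$ by a midpoint quadrature for the polar-coordinate metric $dr\otimes dr + r^2\,d\theta\otimes d\theta$ along the unit circle, whose length is $2\pi$:
\begin{verbatim}
import numpy as np

def length(gamma, dgamma, g, N=2000):
    # midpoint approximation of the integral of
    # sqrt(g_gamma(t)(gamma'(t), gamma'(t))) over [0, 1]
    t = (np.arange(N) + 0.5) / N
    total = 0.0
    for ti in t:
        v = dgamma(ti)
        total += np.sqrt(v @ g(gamma(ti)) @ v) / N
    return total

g = lambda p: np.diag([1.0, p[0] ** 2])            # dr(x)dr + r^2 dth(x)dth
gamma  = lambda t: np.array([1.0, 2 * np.pi * t])  # the unit circle
dgamma = lambda t: np.array([0.0, 2 * np.pi])

print(length(gamma, dgamma, g))                    # ~ 6.28318 = 2 pi
\end{verbatim}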
The relation between manifolds and metric spaces does not end here. As it happens, all smooth manifolds are \emph{metrizable}\footnote{In fact, all the topological manifolds are metrizable. This more general statement is far harder to prove~\cite[Theorem 34.1 and Exercise 1 of Chapter 4.36]{book:munkres:topology} or \cite{nlab:urysohn_metrization_theorem}. Note that not all topological spaces are metrizable: for example, a space with more than one point endowed with the trivial topology is not. The line with two origins from Exercise~\ref{exe:line-two-origins} is also not metrizable (but it is locally metrizable). You can find plenty more non-trivial examples on \href{https://topology.pi-base.org/spaces?q=\%7EMetrizable}{$\pi$-Base}. Even if a topological space is metrizable, the metric will be far from unique: for example, proportional metrics generate the same collection of open sets.}: there exists some distance on the manifold that induces the given topology on it.
We will show here the proof in the case of smooth manifolds, since it is a relatively simple consequence of the existence of partitions of unity.
\begin{theorem}
Every smooth manifold admits a Riemannian metric.
\end{theorem}
\begin{proof}
Let $M$ be a smooth $m$-dimensional manifold, let $\{(U_i, \varphi_i)\}_{i\in I}$ be a countable atlas for the manifold and let $\{\rho_i\}_{i\in I}$ be a partition of unity adapted to it. See Section~\ref{sec:partition_of_unity}.
Denote by $g_{\R^m}$ the Euclidean metric on $\R^m$, that is, for any $x \in \R^m$ and any $v,w \in T_x\R^m\simeq\R^m$, $g_{\R^m}(v, w) = v \cdot w$.
We define the metric\footnote{Exercise: check that it is actually a metric} $g$ on $M$ by setting
\begin{align}
g := \sum_{i\in I} \rho_i \hat{g}_i,
\quad\mbox{where}\quad
\hat{g}_i := \begin{cases}
\varphi_i^* g_{\R^m} & \mbox{on } U_i, \\
0 & \mbox{otherwise}
\end{cases},
\end{align}
Since the sum is locally finite, $g$ is a well-defined smooth symmetric $(0,2)$-tensor field, and it is positive definite at every point because the $\rho_i$ are non-negative and sum to $1$ while each $\hat g_i$ is positive definite where $\rho_i > 0$, concluding the proof.
\end{proof}
Once we have a Riemannian metric, we can define the distance as described above.