Reliable Tensor Clusters
2021 James B. Wilson, Colorado State University
All tensors are multiplication
- All tensor products are matrix multiplication
Vectors, matrices, and hypermatrices
\(K\) is a type of coefficient, e.g. float (\(\mathbb{R}\)) or int (\(\mathbb{Z}\)), etc.
\(X\in K^{a\times b}\) is data indexed by
- rows \(1\leq i\leq a\)
- columns \(1\leq j\leq b\)
written \(X_{ij}\in K\).
\(u\in K^{a}\) is data
\[u_{i}\in K,\] indexed by \(1\leq i\leq a\)
\(\Gamma\in K^{d_1\times \cdots\times d_{\ell}}\) is data indexed by
- axes \(1\leq a\leq \ell\), and
- coordinate \(1\leq i_a\leq d_a\)
written \(\Gamma_{i_1\ldots i_{\ell}}\in K\).
Hypermatrices, multi-way arrays, "tensors"
\[\otimes:K^{a}\times K^{b}\rightarrowtail K^{a\times b}\]
\[[u\otimes v]_{ij} = u_{i}\times v_{j}.\]
Aliases: tensor product, outer product, matrix product
\(\times\) is the product in \(K\)
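A minimal sketch of this product in plain Python, assuming the coefficients are ordinary Python numbers and that indices start at 0 rather than 1. The function name `outer` is illustrative and not from the slides.

```python
def outer(u, v):
    """Return the a x b matrix with entries u[i] * v[j], as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

# Row i of the result is u[i] times v, so every row is a multiple of v.
print(outer([1, 2], [3, 4]))  # [[3, 4], [6, 8]]
```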
\[(|\cdots|):K^{a}\times K^{b}\times K^{c}\rightarrowtail K^{a\times b\times c}\]
\[(|u, v, w|)_{ijk} = (u_{i}\times v_{j})\times w_k.\]
This is a ternary product, i.e. one operation requiring 3 parameters
Assuming \(\times\) is associative, i.e.
\(x\times(y\times z)=(x\times y)\times z\),
you can ignore the parentheses.
\[(|\cdots|):K^{d_1}\times\cdots\times K^{d_{\ell}}\rightarrowtail K^{d_1\times \cdots \times d_{\ell}}\]
\[(|u_1,\ldots,u_{\ell}|)_{i_1\cdots i_{\ell}} = (\cdots((u_1)_{i_1}\times (u_2)_{i_2})\times\cdots)\times (u_{\ell})_{i_{\ell}}\]
More compactly,
\[(|\cdots|): \prod_{d\in D} K^d \rightarrowtail K^{\Pi D}\]
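\[(|u|)_{\iota} = \prod_{d\in D}(u_d)_{\iota(d)},\]
where \(u=(u_d)_{d\in D}\) and the index \(\iota\) picks one coordinate \(1\leq \iota(d)\leq d\) for each \(d\in D\).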
This is an \(\ell\)-ary product, i.e. one operation requiring \(\ell\) parameters
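The \(\ell\)-ary product can be sketched the same way for any number of vectors. This is only an illustration under stated assumptions: the name `multi_outer` is made up here, indices are 0-based, and the hypermatrix is stored as a dictionary keyed by the whole multi-index.

```python
from itertools import product
from math import prod

def multi_outer(*vectors):
    """Map each multi-index (i_1, ..., i_l) to u_1[i_1] * ... * u_l[i_l]."""
    index_ranges = [range(len(u)) for u in vectors]
    return {
        idx: prod(u[i] for u, i in zip(vectors, idx))
        for idx in product(*index_ranges)
    }

t = multi_outer([1, 2], [3, 4], [5, 6, 7])
print(len(t))        # 12 entries, one per multi-index of a 2 x 2 x 3 array
print(t[(1, 0, 2)])  # 2 * 3 * 7 = 42
```

All the vectors are passed to one call, which is the point of calling it one operation with \(\ell\) parameters.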
So what is \(K^a\otimes K^b\)?
\[\otimes:K^{a}\times K^{b}\rightarrowtail K^{a\times b}\]
\[[u\otimes v]_{ij} = u_{i}\times v_{j}.\]
As a function, what is its image?
\[(1,2)\otimes (3,4) = \begin{bmatrix} 1\\ 2 \end{bmatrix}\begin{bmatrix}3 & 4 \end{bmatrix}=\begin{bmatrix}3 & 4\\ 6& 8 \end{bmatrix}\]
\[\sim\begin{bmatrix}3 & 4\\ 0& 0 \end{bmatrix}\sim \begin{bmatrix} 1 & 4/3\\ 0 & 0 \end{bmatrix}\]
Repeat with some other vectors, then row-reduce.
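For instance, one more pair of vectors (chosen here only for illustration) row-reduces the same way:

\[(2,5)\otimes (1,3) = \begin{bmatrix} 2\\ 5 \end{bmatrix}\begin{bmatrix}1 & 3 \end{bmatrix}=\begin{bmatrix}2 & 6\\ 5& 15 \end{bmatrix}\sim\begin{bmatrix}2 & 6\\ 0& 0 \end{bmatrix}\sim \begin{bmatrix} 1 & 3\\ 0 & 0 \end{bmatrix}\]

The reason is that row \(i\) of \(u\otimes v\) is \(u_i\) times \(v\), so every row is a multiple of the same vector.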
- The image only ever has matrices of rank 1 or 0.
- In fact all rank 0,1 matrices are in the image.
- Not every matrix has rank at most 1 (once \(a,b\geq 2\)).
- So there is no way that \(\otimes\) is surjective.
Here are two rank 1 matrices
\[(1,0)\otimes (1,0)=\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\qquad (0,1)\otimes (0,1)=\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}\]
Add them together and the sum is no longer rank 1...
\[(1,0)\otimes (1,0)+(0,1)\otimes (0,1)=\begin{bmatrix} 1 & 0 \\ 0& 1 \end{bmatrix}\]
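The ranks in this example can be checked mechanically. The sketch below assumes rational coefficients and uses exact Gaussian elimination; `rank` and `outer` are illustrative helper names, not part of the slides.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by Gaussian elimination over the rationals (exact arithmetic)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear the column below the pivot.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

A = outer([1, 0], [1, 0])
B = outer([0, 1], [0, 1])
S = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]
print(rank(A), rank(B), rank(S))  # 1 1 2
```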
- As a function \(\otimes\) is not surjective.
- In fact the image is not even a subspace.
- Hard to do linear algebra if you cannot add and rescale.
- Decision. Replace the function image with the subspace it spans, i.e.: \[K^a\otimes K^b=\mathrm{Span}\{u\otimes v\mid u\in K^a, v\in K^b\}.\]
How far off is the image of \(\otimes\) from \(K^a\otimes K^b\)?
- How likely are you to throw two darts so that they land on the same line through the bull's eye? (Answer: never gonna happen.)
- Dart A is column 1 and dart B is column 2. Conclusion: a random \(2\times 2\) matrix has rank 2.
- You should consider it a miracle when you find a tensor that IS in the image of \(\otimes\).
Exercise
Write down the actual probabilities that a random matrix is full rank.
Option 1: Assume the coefficients are finite precision (a finite field, finite floating point, etc.)
Option 2: Assume a random matrix is made by sampling each vector from a Gaussian distribution centered at the origin.
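One possible way to start on Option 1 is to count by brute force in the smallest case. The sketch assumes \(2\times 2\) matrices with entries in the integers mod a prime \(p\), sampled uniformly; the function names are hypothetical. It compares the count with the standard formula \((p^2-1)(p^2-p)\) for invertible matrices: the first column is any nonzero vector and the second is any vector that is not a multiple of the first.

```python
from fractions import Fraction
from itertools import product

def full_rank_probability_2x2(p):
    """Fraction of 2 x 2 matrices over the integers mod p with nonzero determinant."""
    invertible = sum(
        1
        for a, b, c, d in product(range(p), repeat=4)
        if (a * d - b * c) % p != 0
    )
    return Fraction(invertible, p ** 4)

def closed_form_2x2(p):
    """(p^2 - 1)(p^2 - p) invertible matrices out of p^4 in total."""
    return Fraction((p * p - 1) * (p * p - p), p ** 4)

for p in (2, 3, 5):
    assert full_rank_probability_2x2(p) == closed_form_2x2(p)
    print(p, full_rank_probability_2x2(p))  # 2 3/8, then 3 16/27, then 5 96/125
```

The probability climbs toward 1 as \(p\) grows, which matches the dart picture: the finer the coefficients, the rarer a rank drop.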
A natural basis ...
\[e_i = (0,\ldots,\overset{i}{1},\ldots,0)\in K^a\]
\[E_{ij}=\begin{array}{c|ccccc|} & & \cdots & j & \cdots & \\ \hline & & & \vdots & & \\ & & & 0 & &\\ i & \cdots & 0 & 1 & 0 & \cdots \\ & & & 0 & & \\ & & & \vdots & & \\ \hline \end{array}\]
\[E_{ij} = e_i\otimes e_j\]
Unit vectors/matrices
\(K^a=\mathrm{Span}\{e_1,\ldots,e_{a}\}\)
\(K^a\otimes K^b=\mathrm{Span}\left\{\begin{matrix} E_{11}, & \ldots, & E_{1b},\\ & \vdots & \\ E_{a1}, & \ldots, & E_{ab}\end{matrix}\right\}\)
- Warning: \(K^a\otimes K^b\) is not the image of \(\otimes\)!!
- Really! Almost nothing in \(K^a\otimes K^b\) is in the image of \(\otimes\).
- Fact: \(K^a\otimes K^b\) is the span of the image of \(\otimes\).
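The span statement can be made concrete: every matrix \(X\) is the sum of \(X_{ij}\,(e_i\otimes e_j)\) over all \(i,j\), even though \(X\) itself is usually not a single \(u\otimes v\). A minimal sketch, assuming 0-based indices and plain Python lists; the helper names are illustrative.

```python
def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def unit(i, n):
    """The unit vector e_i in K^n, with 0-based i."""
    return [1 if k == i else 0 for k in range(n)]

def from_unit_matrices(X):
    """Rebuild X as the sum over i, j of X[i][j] * (e_i (x) e_j)."""
    a, b = len(X), len(X[0])
    total = [[0] * b for _ in range(a)]
    for i in range(a):
        for j in range(b):
            E = outer(unit(i, a), unit(j, b))  # the unit matrix E_ij
            for r in range(a):
                for c in range(b):
                    total[r][c] += X[i][j] * E[r][c]
    return total

X = [[1, 2, 3], [4, 5, 6]]
print(from_unit_matrices(X) == X)  # True
```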
Relating products
\[(|K^a, K^b, K^c|)\neq (K^a\otimes K^b)\otimes K^c\]
\[(|K^a, K^b, K^c|)\cong (K^a\otimes K^b)\otimes K^c\]
Difference in code
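// (|A,B,C|): one 4 x 5 x 6 block, addressed by a single three-part index
t = new Array[Int](4,5,6)
t[1,2,3] = 9
// (A (x) B) (x) C: an array of arrays of arrays, addressed one axis at a time
t = new Array[Array[Array[Int](4)](5)](6)
t[1][2][3] = 9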
The differences are natural to relate,
but they are differences.
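The same contrast can be sketched in runnable Python, under the assumption that a dictionary keyed by multi-indices stands in for the flat array and nested lists stand in for the array of arrays. The helper `to_nested` is a hypothetical name for the natural map between the two.

```python
from itertools import product

dims = (4, 5, 6)

# (|A,B,C|): one table keyed by a whole multi-index (i, j, k).
flat = {idx: 0 for idx in product(*(range(d) for d in dims))}
flat[(1, 2, 3)] = 9

# (A (x) B) (x) C: lists of lists of lists, indexed one axis at a time.
nested = [[[0] * dims[2] for _ in range(dims[1])] for _ in range(dims[0])]
nested[1][2][3] = 9

def to_nested(table, dims):
    """Send a multi-index table to the corresponding nested lists."""
    return [[[table[(i, j, k)] for k in range(dims[2])]
             for j in range(dims[1])]
            for i in range(dims[0])]

print(to_nested(flat, dims) == nested)  # True: the data correspond
print(type(flat) == type(nested))       # False: the types differ
```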
Still to come
- What is \(U\otimes V\)?
- What is \(U\otimes_A V\)?