Reliable Tensor Clusters

 

2021 James B. Wilson, Colorado State University

  • All tensors are multiplication

  • All tensor products are matrix multiplication

Vectors, matrices, and hypermatrices


\(K\) is a type of coefficient, e.g. float \(\mathbb{R}\), int \(\mathbb{Z}\), etc.

\(X\in K^{a\times b}\) is data indexed by

  • rows \(1\leq i\leq a\)
  • columns \(1\leq j\leq b\)

written \(X_{ij}\in K\).

\(u\in K^{a}\) is data

\[u_{i}\in K,\] indexed by \(1\leq i\leq a\)

\(\Gamma\in K^{d_1\times \cdots\times d_{\ell}}\) is data indexed by

  • axes \(1\leq a\leq \ell\), and
  • coordinates \(1\leq i_a\leq d_a\)

written \(\Gamma_{i_1\ldots i_{\ell}}\in K\).

Hypermatrices, multi-way arrays, "tensors"

\[\otimes:K^{a}\times K^{b}\rightarrowtail K^{a\times b}\]

\[[u\otimes v]_{ij} = u_{i}\times v_{j}.\]

Aliases: tensor product, outer product, matrix product

\(\times\) is the product in \(K\)
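The entrywise formula for \(u\otimes v\) can be sketched in a few lines. This is a minimal illustration, assuming plain Python lists stand in for \(K^a\) and \(K^b\); the function name `outer` is a label chosen here, not something from the slides.

```python
def outer(u, v):
    """Return the matrix u (x) v whose (i, j) entry is u[i] * v[j]."""
    return [[ui * vj for vj in v] for ui in u]

# Python counts from 0, so X[i][j] here is X_{i+1, j+1} in the slide notation.
X = outer([1, 2], [3, 4])
print(X)  # [[3, 4], [6, 8]]
```

Any coefficient type with a `*` operation would work in place of the integers used here.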

\[(|\cdots|):K^{a}\times K^{b}\times K^{c}\rightarrowtail K^{a\times b\times c}\]

\[(|u, v, w|)_{ijk} = (u_{i}\times v_{j})\times w_k.\]

This is a ternary product, i.e. one operation requiring 3 parameters

assuming \(\times\) is associative, i.e.

\(x\times(y\times z)=(x\times y)\times z\)

you can ignore the parentheses

\[(|\cdots|):K^{d_1}\times\cdots\times K^{d_{\ell}}\rightarrowtail K^{d_1\times \cdots \times d_{\ell}}\]

\[(|u_1,\ldots,u_{\ell}|)_{i_1\ldots i_{\ell}} = (\cdots((u_1)_{i_1}\times (u_2)_{i_2})\times\cdots)\times (u_{\ell})_{i_{\ell}}\]

 

More compactly,

 

\[(|\cdots|): \prod_{d\in D} K^d \rightarrowtail K^{\Pi D}\]

\[(|u|)_{\iota} = \prod_{d\in D}(u_d)_{\iota(d)}.\]

This is an \(\ell\)-ary product, i.e. one operation requiring \(\ell\) parameters
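The \(\ell\)-ary product can be sketched as one function taking any number of vectors. This is an illustration only, assuming lists for vectors and a dictionary keyed by index tuples for the hypermatrix; the name `multi_outer` is hypothetical.

```python
from itertools import product
from math import prod

def multi_outer(*vectors):
    """Map each index tuple (i_1, ..., i_l) to u_1[i_1] * ... * u_l[i_l]."""
    index_ranges = [range(len(u)) for u in vectors]
    return {
        idx: prod(u[i] for u, i in zip(vectors, idx))
        for idx in product(*index_ranges)
    }

T = multi_outer([1, 2], [3, 4], [5, 6, 7])
print(len(T))        # 12 entries = 2 * 2 * 3
print(T[(1, 0, 2)])  # 2 * 3 * 7 = 42
```

The index tuple plays the role of \(\iota\): it picks one coordinate from each axis, and the entry is the product of the chosen coordinates.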

So what is \(K^a\otimes K^b\)?

\[\otimes:K^{a}\times K^{b}\rightarrowtail K^{a\times b}\]

\[[u\otimes v]_{ij} = u_{i}\times v_{j}.\]

As a function, what is its image?

\[(1,2)\otimes (3,4) = \begin{bmatrix} 1\\ 2 \end{bmatrix}\begin{bmatrix}3 & 4 \end{bmatrix}=\begin{bmatrix}3 & 4\\ 6& 8 \end{bmatrix}\]

\[\sim\begin{bmatrix}3 & 4\\ 0& 0 \end{bmatrix}\sim \begin{bmatrix} 1 & 4/3\\ 0 & 0 \end{bmatrix}\]

Repeat with some other vectors, then row-reduce.
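The "repeat, then row-reduce" experiment can be automated. The sketch below assumes exact rational arithmetic via the standard `fractions` module and uses a plain Gaussian elimination written for this illustration; `outer` and `rank` are names chosen here.

```python
from fractions import Fraction

def outer(u, v):
    """Return the matrix u (x) v whose (i, j) entry is u[i] * v[j]."""
    return [[ui * vj for vj in v] for ui in u]

def rank(matrix):
    """Count pivots found by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

print(rank(outer([1, 2], [3, 4])))         # 1
print(rank(outer([2, -1, 5], [7, 0, 3])))  # 1
print(rank(outer([0, 0], [3, 4])))         # 0
```

Every nonzero input pair gives rank 1, because each row of \(u\otimes v\) is a multiple of \(v\).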

  • The image only ever has matrices of rank 1 or 0.
  • In fact all rank 0,1 matrices are in the image.
  • Not all matrices have rank at most 1.
  • So there is no way that \(\otimes\) is surjective.

Here are two rank 1 matrices

\[(1,0)\otimes (1,0)=\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\qquad (0,1)\otimes (0,1)=\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}\]

Add them together and it ain't rank 1...

\[(1,0)\otimes (1,0)+(0,1)\otimes (0,1)=\begin{bmatrix} 1 & 0 \\ 0& 1 \end{bmatrix}\]
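The same computation can be checked mechanically. This sketch relies on the fact that a \(2\times 2\) matrix has rank at most 1 exactly when its determinant is 0; the helper names `outer`, `add`, and `det2` are chosen for this illustration.

```python
def outer(u, v):
    """Return the matrix u (x) v whose (i, j) entry is u[i] * v[j]."""
    return [[ui * vj for vj in v] for ui in u]

def add(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def det2(X):
    """Determinant of a 2 x 2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

A = outer([1, 0], [1, 0])
B = outer([0, 1], [0, 1])
S = add(A, B)
print(det2(A), det2(B))  # 0 0
print(S, det2(S))        # [[1, 0], [0, 1]] 1
```

Both summands have determinant 0, but the sum has determinant 1, so the sum has rank 2 and cannot be written as a single \(u\otimes v\).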

  • As a function \(\otimes\) is not surjective.
  • In fact the image is not even a subspace.
  • Hard to do linear algebra if you cannot add and rescale.
  • Decision. Replace the function image with the subspace it spans, i.e.: \[K^a\otimes K^b=\mathrm{Span}\{u\otimes v\mid u\in K^a, v\in K^b\}.\]

How far off is the image of \(\otimes\) from \(K^a\otimes K^b\)?

  • How likely are you to throw two darts so that they land on the same line as the bull's eye? (Answer: never gonna happen.)
  • Dart A is column 1 and dart B is column 2. Conclusion: a random \(2\times 2\) matrix has rank 2.
  • You should consider it a miracle when you find a tensor that IS in the image of \(\otimes\).

Exercise

Write down the actual probabilities that a random matrix is full rank. 

Option 1: Assume the coefficients are finite precision (a finite field or finite floating point etc.)

Option 2: Assume a random matrix is made by sampling each column vector from a Gaussian distribution centered at the origin.
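For Option 1 with a finite field, one way to check an answer is to compare the standard count of invertible matrices with a brute-force enumeration. The sketch below assumes a field with \(q\) elements and square \(n\times n\) matrices with uniformly random entries; the function names are hypothetical, and the brute-force check covers only \(2\times 2\) matrices over the integers mod 2.

```python
from fractions import Fraction
from itertools import product

def prob_full_rank(n, q):
    """Probability that a uniform n x n matrix over a q-element field is invertible.

    Uses the count prod_{i=0}^{n-1} (q^n - q^i) of invertible matrices,
    divided by the total number q^(n*n) of matrices.
    """
    p = Fraction(1)
    for i in range(n):
        p *= 1 - Fraction(q**i, q**n)
    return p

def count_invertible_2x2_mod2():
    """Brute-force count of 2 x 2 matrices mod 2 with nonzero determinant."""
    count = 0
    for a, b, c, d in product(range(2), repeat=4):
        if (a * d - b * c) % 2 != 0:
            count += 1
    return count

print(prob_full_rank(2, 2))         # 3/8
print(count_invertible_2x2_mod2())  # 6 (out of 16 matrices)
```

Over a small field the chance of full rank is noticeably below 1, and it approaches 1 as \(q\) grows, which matches the dart picture for real coefficients.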

A natural basis ...

\[e_i = (0,\ldots,\overset{i}{1},\ldots,0)\in K^a\]

 

\[E_{ij}=\begin{array}{c|ccccc|} &  & \cdots & j & \cdots &  \\ \hline &  &  & \vdots &  &  \\ & & & 0 & &\\ i &  \cdots & 0 & 1 & 0 & \cdots  \\ &  &  & 0 & &  \\ & & & \vdots & &  \\ \hline \end{array}\]

 

\[E_{ij} = e_i\otimes e_j\]

Unit vectors/matrices

\(K^a=\mathrm{Span}\{e_1,\ldots,e_{a}\}\)

\(K^a\otimes K^b=\mathrm{Span}\left\{\begin{matrix} E_{11}, & \ldots, & E_{1b},\\ & \ldots & \\ E_{a1}, & \ldots, & E_{ab}\end{matrix}\right\}\)

  • Warning: \(K^a\otimes K^b\) is not the image of \(\otimes\)!!
  • Really!  Almost nothing in \(K^a\otimes K^b\) is in the image of \(\otimes\).
  • Fact: \(K^a\otimes K^b\) is the span of the image of \(\otimes\).
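The claim that the \(E_{ij}=e_i\otimes e_j\) span everything can be tested directly: any matrix \(X\) equals \(\sum_{i,j} X_{ij}\,(e_i\otimes e_j)\). The sketch below assumes list-of-lists matrices and 0-based indices; `unit_vector`, `outer`, and `from_unit_matrices` are names chosen for this illustration.

```python
def unit_vector(i, n):
    """The vector of length n with a 1 in position i and 0 elsewhere."""
    return [1 if k == i else 0 for k in range(n)]

def outer(u, v):
    """Return the matrix u (x) v whose (i, j) entry is u[i] * v[j]."""
    return [[ui * vj for vj in v] for ui in u]

def from_unit_matrices(X):
    """Rebuild X as the sum of X[i][j] * (e_i (x) e_j) over all i, j."""
    a, b = len(X), len(X[0])
    total = [[0] * b for _ in range(a)]
    for i in range(a):
        for j in range(b):
            E_ij = outer(unit_vector(i, a), unit_vector(j, b))
            total = [
                [t + X[i][j] * e for t, e in zip(row_t, row_e)]
                for row_t, row_e in zip(total, E_ij)
            ]
    return total

X = [[3, 1, 4], [1, 5, 9]]
print(from_unit_matrices(X) == X)  # True
```

Each summand is in the image of \(\otimes\) up to a scalar, yet the sum \(X\) here has rank 2 and so is not itself of the form \(u\otimes v\).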

Relating products

\[(|K^a, K^b, K^c|)\neq (K^a\otimes K^b)\otimes K^c,\qquad (|K^a, K^b, K^c|)\cong (K^a\otimes K^b)\otimes K^c\]

Difference in code

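// (|A,B,C|): one array, indexed by a whole triple at once
t = new Array[Int](4,5,6)
t[1,2,3] = 9
// (A (x) B) (x) C: an array of arrays of arrays, indexed one axis at a time
s = new Array[Array[Array[Int](4)](5)](6)
s[1][2][3] = 9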

The differences are natural to relate, 

but they are differences.
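A runnable version of the same contrast, offered as a sketch: it assumes a dictionary keyed by index triples as a stand-in for the single three-way array, and nested lists for the iterated product. The helper `nest` and the axis order (4, then 5, then 6) are choices made for this illustration.

```python
from itertools import product

dims = (4, 5, 6)

# (|A,B,C|): one table, addressed by a whole index triple at once.
flat = {idx: 0 for idx in product(*(range(d) for d in dims))}
flat[1, 2, 3] = 9

# (A (x) B) (x) C: lists inside lists, addressed one axis at a time.
nested = [[[0 for _ in range(dims[2])] for _ in range(dims[1])]
          for _ in range(dims[0])]
nested[1][2][3] = 9

def nest(table, dims):
    """Convert the triple-indexed table into nested lists."""
    a, b, c = dims
    return [[[table[i, j, k] for k in range(c)] for j in range(b)]
            for i in range(a)]

print(nest(flat, dims) == nested)  # True
print(type(flat) is type(nested))  # False
```

The conversion `nest` is the natural relation between the two, but the two values have different types and different access patterns, so they are not the same object.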

Still to come

  • What is \(U\otimes V\)?
  • What is \(U\otimes_A V\)?

Copy of Copy of Tensor Products; Versor Fractions

By James Wilson
