James B. Wilson, Colorado State University
Follow the slides at your own pace.
Open your smartphone camera and point it at this QR code, or type in the URL directly:
https://slides.com/jameswilson-3/tensors-operators/#/
Uriya First
U. Haifa
Joshua Maglione
Bielefeld
Peter Brooksbank
Bucknell
Below we explain in more detail.
Mathematics vs. Computation
Vect[K,a] = [1..a] -> K
$ v:Vect[Float,4] = [3.14,2.7,-4,9]
$ v(2) = 2.7
*:Vect[K,a] -> Vect[K,a] -> K
u * v = ( (i:[1..a]) -> u(i)*v(i) ).fold(_+_)
Matrix[K,a,b] = [1..a] -> Vect[K,b]
$ M:Matrix[Float,2,3] = [[1,2,3],[4,5,6]]
$ M(2)(1) = 4
*:Matrix[K,a,b] -> Vect[K,b] -> Vect[K,a]
M * v = (i:[1..a]) -> M(i) * v
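Here is one way to make the pseudocode above runnable, as a Scala 3 sketch (ours; the length-indexed types Vect[K,a] are approximated by plain functions on 1-based indices):

type Vect[K]   = Int => K           // the slides' [1..a] -> K
type Matrix[K] = Int => Vect[K]     // row i is M(i), a Vect of length b

def fromSeq[K](xs: Seq[K]): Vect[K] = i => xs(i - 1)   // 1-based lookup

// dot product: multiply entrywise, then fold with +
def dot(b: Int)(u: Vect[Double], v: Vect[Double]): Double =
  (1 to b).map(i => u(i) * v(i)).sum

// matrix-vector product: (i:[1..a]) -> M(i) * v
def matVec(b: Int)(m: Matrix[Double], v: Vect[Double]): Vect[Double] =
  i => dot(b)(m(i), v)

@main def demo(): Unit =
  val v: Vect[Double]   = fromSeq(Seq(3.14, 2.7, -4.0, 9.0))
  val M: Matrix[Double] = fromSeq(Seq(Seq(1.0, 2.0, 3.0), Seq(4.0, 5.0, 6.0)).map(r => fromSeq(r)))
  println(v(2))       // 2.7
  println(M(2)(1))    // 4.0
  println((1 to 2).map(matVec(3)(M, fromSeq(Seq(1.0, 1.0, 1.0)))))  // Vector(6.0, 15.0)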
Difference? Math has sets, computation has types.
But types are a mathematical invention (B. Russell); let's use types too.
Terms of types store the object plus how it was made
Implications become functions
hypothesis (domain) to conclusion (codomain)
(Union, Or) becomes "+" of types
(Intersection, And) becomes a dependent type (sketch below)
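A tiny Scala illustration of the first two items (ours; full dependent types need more machinery than shown here):

// Propositions as types, in miniature: using an implication is applying a function,
// and a proof of "P or Q" is a tagged proof of one side.
def modusPonens[P, Q](implication: P => Q, hypothesis: P): Q =
  implication(hypothesis)

type Or[P, Q] = Either[P, Q]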
Sets are the same if they have the same elements.
Are these sets the same?
We cannot always answer this, partly because of practical limits of computation, but also because some problems like these are undecidable (say, over the integers).
In types the above need not be a set.
Sets are types where a=b only by reducing b to a explicitly.
These are tensors, so what does that mean?
Explore below
See common goals below
Why the notation? Nice consequences to come, like...
Context: finite-dimensional vector spaces
Actually, versors can be defined categorically
1. An abelian group constructor
2. Together with a distributive "evaluation" function
3. Universal Mapping Property
Rewrite how we evaluate ("uncurry"):
Practice
Evaluation
Definition. A tensor space is a vector space \(T\) equipped with a linear map \(\langle \cdot |\) into a space of multilinear maps, i.e.:
Tensors are elements of a tensor space.
The frame is
The axes are the
The valence is the size of the frame.
* This is actually where we need types. It would not have been possible even 5 years ago, as types did not have the solid time-complexity foundations they have now (Accattoli-Dal Lago).
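For instance (an illustrative example of the definition, in the convention used below where the codomain sits on axis 0): matrix multiplication is a tensor of valence 3,

\[
  \langle t \mid \colon K^{a \times b} \times K^{b \times c} \longrightarrow K^{a \times c},
  \qquad
  \langle t \mid X, Y \rangle = X Y,
\]

with frame \( (K^{a\times b}, K^{b\times c}, K^{a\times c}) \), one axis per space, hence valence 3.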
These are basis independent (in fact functors),
E.g.:
Rule: If shuffling through index 0, collect a dual.
And so duals are applied in 0 and 1
In Levi-Civita index calculus this is "raising-lowering" an index.
In algebra these are transposes (here we use dagger) and opposites "ops". Knuth & Liebler gave general interpretations.
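A small worked illustration (ours, in the notation above): for a bilinear \( \langle t| : U_2 \times U_1 \to U_0 \), swapping the two domain axes is the ordinary transpose and needs no dual,

\[
  \langle t^{\dagger} \mid v, u \rangle = \langle t \mid u, v \rangle,
\]

while shuffling axis 1 through axis 0 moves a domain space to the codomain side and so collects a dual: the result is a bilinear map \( U_2 \times U_0^{*} \to U_1^{*} \).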
Despite the long history, these ideas are often rediscovered independently, and sometimes the rediscovery fails to use proper duals!
Whenever possible we shuffle problems to work on the 0 axis. We shall do so even without alerting the reader. We shall also work with single axes even when multiple axes are possible.
Side effects include confusion, loss of torsion, and incorrect statements in infinite dimensions. The presenter and his associates assume no blame for researchers harmed by these choices.
Glue together tensors with a common set of axes.
Interprets systems of forms, see below...
Commonly introduced with homogeneous polynomials.
E.g. quadratic forms.
where
Stack Gram matrices
because
Cut a tensor along an axis.
Slices recover systems of forms, see this.
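A small example of both operations (ours): two symmetric forms on \(K^2\) glued into a single \(2 \times 2 \times 2\) tensor,

\[
  F_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
  \qquad
  F_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},
  \qquad
  \langle t \mid u, v \rangle = \bigl( u^{\top} F_1 v,\; u^{\top} F_2 v \bigr) \in K^2,
\]

and cutting \(t\) along the value axis returns the slices \(F_1, F_2\), recovering the system of forms.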
Area = Length x Width
Volume = Length x Width x Height
Universality? Facing Torsion? Look down.
this defines matrices as a tensor product
There is an onto linear map:
Quotient by an ideal to get exact sequence of tensors.
(Rows are exact sequences, columns are Curried bilinear maps.)
As with rings, ideals have quotients.
Fraction-inspired notation for known constructions, e.g.:
More accurately, there are natural maps, invertible when finite-dimensional over fields
Add the values along the axis.
Perhaps weight by another tensor
then add.
The simplest example is the standard dot product.
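A one-function Scala sketch of this (ours): contract the second axis of an (a x b)-array against a weight vector.

// Weight each value along the axis, then add.  With w all ones this is the plain
// "add the values along the axis"; a general w gives a weighted sum per row.
def contract(t: Vector[Vector[Double]], w: Vector[Double]): Vector[Double] =
  t.map(row => row.zip(w).map((x, y) => x * y).sum)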
Uses? Averages, weighted averages, matrix multiplication; let's explore one called "convolution"...
Goal: sharpen an image, find edges, find textures, etc.
In the past it was for "photo-shop" (is this trademarked?).
Today it's image feature extraction to feed into machine learning.
We turn the image into a (3 x 3 x ab)-tensor where every slice is a (3 x 3)-subimage.
Convolution with a target shape is a contraction on the (3 x 3)-face. The result is an ab-tensor (another (a x b)-image).
Do this with k targets and get an (a x b x k)-tensor with the metadata of our image.
A machine learner tries to learn these tensors.
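A Scala sketch of that contraction (ours; it assumes zero padding at the border, a single 3 x 3 target, and illustrative filter values):

// Convolution as contraction: each pixel's 3 x 3 patch is multiplied entrywise
// against the target and summed (zero padding outside the image).
def convolve(img: Vector[Vector[Double]], target: Vector[Vector[Double]]): Vector[Vector[Double]] =
  val a = img.length
  val b = img.head.length
  def at(r: Int, c: Int): Double =
    if r < 0 || r >= a || c < 0 || c >= b then 0.0 else img(r)(c)
  Vector.tabulate(a, b) { (r, c) =>
    (for
       dr <- -1 to 1
       dc <- -1 to 1
     yield target(dr + 1)(dc + 1) * at(r + dr, c + dc)
    ).sum
  }

// One illustrative target (a Sobel-style vertical-edge detector); applying k such
// targets and stacking the outputs gives the (a x b x k) feature tensor.
val verticalEdge = Vector(
  Vector(-1.0, 0.0, 1.0),
  Vector(-2.0, 0.0, 2.0),
  Vector(-1.0, 0.0, 1.0)
)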
[Figure: a sample image block and its horizontal-edge convolution output.]
Some convolutions detect horizontal edges
[Figure: the same image block and its vertical-edge convolution output.]
Some convolutions detect vertical edges
[Figure: the same image block and its all-edge convolution output.]
Some convolutions are good at all edges.
Stuff an infinite sequence
into a finite-dimensional space
and you get a dependence.
So begins the story of annihilator polynomials and eigenvalues.
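For a single operator \(X\) this is the familiar picture: the powers \(I, X, X^2, \ldots\) cannot stay independent, so some relation

\[
  c_0 I + c_1 X + \cdots + c_d X^{d} = 0
\]

holds with the \(c_i\) not all zero; the monic generator of all such relations is the minimal polynomial of \(X\), and its roots are the eigenvalues.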
An infinite lattice in finite-dimensional space makes even more dependencies.
(and the ideal these generate)
> M := Matrix(Rationals(), 2,3,[[1,0,2],[3,4,5]]);
> X := Matrix(Rationals(), 2,2,[[1,0],[0,0]] );
> Y := Matrix(Rationals(), 3,3,[[0,0,0],[0,1,0],[0,0,0]]);
> seq := [ < i, j, X^i * M * Y^j > : i in [0..2], j in [0..3]];
> U := Matrix( [ Eltseq(s[3]) : s in seq ] );  // flatten each 2 x 3 matrix to a length-6 row
i j X^i * M * Y^j
0 0 [ 1, 0, 2, 3, 4, 5 ]
1 0 [ 1, 0, 2, 0, 0, 0 ]
2 0 [ 1, 0, 2, 0, 0, 0 ]
0 1 [ 0, 0, 0, 0, 4, 0 ]
1 1 [ 0, 0, 0, 0, 0, 0 ]
2 1 [ 0, 0, 0, 0, 0, 0 ]
0 2 [ 0, 0, 0, 0, 4, 0 ]
1 2 [ 0, 0, 0, 0, 0, 0 ]
2 2 [ 0, 0, 0, 0, 0, 0 ]
0 3 [ 0, 0, 0, 0, 4, 0 ]
1 3 [ 0, 0, 0, 0, 0, 0 ]
2 3 [ 0, 0, 0, 0, 0, 0 ]
Step out the bi-sequence
> E, T := EchelonForm( U ); // E = T*U
0 0 [ 1, 0, 2, 3, 4, 5 ] 1
1 0 [ 1, 0, 2, 0, 0, 0 ] x
0 1 [ 0, 0, 0, 0, 4, 0 ] y
Choose pivots
Write null space rows as relations in pivots.
> A<x,y> := PolynomialRing( Rationals(), 2 );
> row2poly := func< k | &+[ T[k][1+i+3*j]*x^i*y^j :
i in [0..2], j in [0..3] ] >;
> polys := [ row2poly(k) : k in [(Rank(E)+1)..Nrows(E)] ];
2 0 [ 1, 0, 2, 0, 0, 0 ] x^2 - x
1 1 [ 0, 0, 0, 0, 0, 0 ] x*y
2 1 [ 0, 0, 0, 0, 0, 0 ] x^2*y
0 2 [ 0, 0, 0, 0, 4, 0 ] y^2 - y
1 2 [ 0, 0, 0, 0, 0, 0 ] x*y^2
2 2 [ 0, 0, 0, 0, 0, 0 ] x^2*y^2
0 3 [ 0, 0, 0, 0, 4, 0 ] y^3 - y
1 3 [ 0, 0, 0, 0, 0, 0 ] x*y^3
2 3 [ 0, 0, 0, 0, 0, 0 ] x^2*y^3
> ann := ideal< A | polys >;
> GroebnerBasis(ann);
x^2 - x,
x*y,
y^2 - y
Take Groebner basis of relation polynomials
Groebner bases in a bounded number of variables can be computed in polynomial time (Bardet-Faugere-Salvy).
Same tensor,
different operators,
can give different annihilators.
Different tensor,
same operators,
can give different annihilators.
Data
Action by polynomials
Resulting annihilating ideal
Could this be wild? Read below.
Mal'cev showed that representations of 2-generated algebras are "wild" in that their theory is undecidable.
However, we have two features: our variables commute, and our operators are transverse.
Still, maybe this is wild?
This prescribes a bimap
The image of this map we call the transverse operators.
These generate all other operators.
Explore a proof below or move on to more generality
Theorem. A Groebner basis for this annihilator can be computed in polynomial time.
A trait is an element of the Groebner basis of a prime decomposition of the annihilator.
Traits generalize eigenvalues.
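One way to see this (our reading of the definition above): for a single operator \(X\) over an algebraically closed field,

\[
  \operatorname{ann}(X) = \bigl( m_X(x) \bigr),
  \qquad
  m_X(x) = \prod_i (x - \lambda_i)^{e_i},
\]

the minimal primes are the ideals \((x - \lambda_i)\), each with Groebner basis \(\{x - \lambda_i\}\), so the traits here are the polynomials \(x - \lambda_i\), i.e. the eigenvalues.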
are all prime, and the minimal primes are unique.
where
As a variety we only see the radical, i.e. that x=0 and y=0.
There is nothing about the tensor in this.
Need to look at the scheme -- need to focus on the xy.
Examples
This ideal is the intersection of annihilators for each operator. So the dimension of the ideal grows.
This is how we isolate traits of high dimension -- use many operators.
T-sets are to traits what eigenvectors are to eigenvalues.
This is a ternary Galois connection.
For trinomial ideals, all geometries can arise, so classification beyond this point is essentially impossible.
Idea: study the hypersurface traits -- these capture the largest swath of operators. Reciprocally, what we find applies to very few tensors, perhaps even just the one we study.
Fact: in projective geometry there is a well-defined notion of a generic degree-d hypersurface:
Seems tractable only for the linear case
Treating 0 as contravariant, the natural hyperplane is:
That is, the generic linear trait is simply to say that operators are derivations!
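Concretely (in our assumed notation, with axis 0 on the codomain), for a bilinear \( \langle t| : U_2 \times U_1 \to U_0 \) the derivation condition on a triple \((X_2, X_1, X_0)\) is the product rule:

\[
  \langle t \mid u X_2,\; v \rangle + \langle t \mid u,\; v X_1 \rangle
  = \langle t \mid u,\; v \rangle X_0 .
\]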
However, the schemes Z(S,P) are not the same as Z(P), so generic here is not the same as generic there... work is required.
Theorem (FMW). If
Then
(If 1 fails, extend the field; if 2 is affine, shift; if 3 fails, the result holds over the support of P.)
Since Whitney's 1938 paper, tensors have been grounded in associative algebras.
Derivations form natural Lie algebras.
If associative operators define tensor products but Lie operators are universal, who is right?
Theorem (FMW). If
Then for all but at most 2 values of a
In particular, to be an associative algebra we are limited to at most 2 coordinates. Whitney's definition is a fluke.
As valence grows we act on a linear number of spaces but leave out exponentially many possible actions.
Lie tensor products act on all sides.
Densors are
I.e. operators that, on the indices in A, are restricted to the U's.
Claim. Singularity at U if, and only if, monomial trait on A.
Singularities come with traits that are in bijection with Stanley-Reisner rings, and so with simplicial complexes.
Shaded regions are 0.
Theorem (FMW). If for every S and P
then
If
then the converse holds.
(We speculate this is an if and only if.)
In many settings a symmetry is required. The correspondence applies here, but the classifications just given all evolve. There is much to be learned here. E.g.