Optimization Informed by Geometric Invariant Theory and Symplectic Geometry

Clayton Shonkwiler

Colorado State University

shonkwiler.org/icerm24 (this talk!)

Recent Progress on Optimal Point Distributions and Related Fields

June 3, 2024

Collaborators

Tom Needham

Florida State University

Dustin Mixon

The Ohio State University

Soledad Villar

Johns Hopkins University

Anthony Caine

Colorado State University

Funding

National Science Foundation (DMS–2107700)

Simons Foundation (#709150)

Take-Home Message

Symmetry + geometry sometimes tells you an optimization problem is easier than expected.

Equal-Norm Parseval Frames

A frame is a spanning set \(\{f_1, \dots , f_n\}\) of \(\mathbb{C}^d\).

Collect the frame vectors as the columns of a matrix \(F = [f_1 \cdots f_n] \in \mathbb{C}^{d \times n}\).

Definition.

\(\{f_1,\dots, f_n\}\subset \mathbb{C}^d\) is a Parseval frame if \(\operatorname{Id}_{d\times d}=FF^*=f_1f_1^*+\dots+f_nf_n^*\).

An equal-norm Parseval frame (ENP frame) is a Parseval frame \(f_1,\dots , f_n\) with \(\|f_i\|^2=\|f_j\|^2\) for all \(i\) and \(j\).

For a Parseval frame, \(\sum_{i=1}^n \|f_i\|^2=\operatorname{tr}F^*F=\operatorname{tr}FF^*=\operatorname{tr}\operatorname{Id}_{d \times d} = d\), so each vector of an ENP frame has \(\|f_i\|^2=\frac{d}{n}\).
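As a quick numerical sanity check of these definitions, the sketch below builds a harmonic frame from the first \(d\) rows of the \(n \times n\) unitary DFT matrix and verifies both the Parseval and the equal-norm conditions. This is a standard example, not a construction taken from the talk, and the function name `harmonic_frame` is hypothetical.

```python
import numpy as np

def harmonic_frame(d, n):
    """First d rows of the n x n unitary DFT matrix, as a d x n array."""
    rows = np.arange(d).reshape(-1, 1)
    cols = np.arange(n).reshape(1, -1)
    return np.exp(2j * np.pi * rows * cols / n) / np.sqrt(n)

d, n = 3, 7
F = harmonic_frame(d, n)

# Parseval condition: F F^* = Id
is_parseval = np.allclose(F @ F.conj().T, np.eye(d))

# Equal-norm condition: every column has squared norm d / n
sq_norms = np.sum(np.abs(F) ** 2, axis=0)
is_equal_norm = np.allclose(sq_norms, d / n)
```

Every entry of this matrix has modulus \(1/\sqrt{n}\), which is why each column automatically has squared norm \(\frac{d}{n}\).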

Frame Potential

Definition [Benedetto–Fickus, Casazza–Fickus]

The frame potential is

\(\operatorname{FP}(F) = \|FF^\ast\|_{\operatorname{Fr}}^2\)

Proposition [cf. Welch]

The equal-norm Parseval frames are exactly the global minima of \(\operatorname{FP}|_{\text{equal norm}}\).
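One way to see the lower bound, sketched under the assumption that the equal-norm constraint is normalized to \(\|f_i\|^2 = \frac{d}{n}\), so that \(\operatorname{tr} FF^\ast = d\): let \(\lambda_1, \dots, \lambda_d \geq 0\) be the eigenvalues of \(FF^\ast\). By Cauchy–Schwarz,

\(\operatorname{FP}(F) = \sum_{k=1}^d \lambda_k^2 \geq \frac{1}{d}\left(\sum_{k=1}^d \lambda_k\right)^2 = \frac{d^2}{d} = d,\)

with equality exactly when every \(\lambda_k = 1\); that is, when \(FF^\ast = \operatorname{Id}_{d \times d}\). The bound is attained whenever ENP frames exist, for example by harmonic frames when \(n \geq d\).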

Theorem [Benedetto–Fickus]

As a function on equal-norm frames with fixed \(d\) and \(n\), \(\operatorname{FP}\) has no spurious local minima.

Frame Potential

Optimization

Theorem [with Mixon, Needham, and Villar]

On the space of equal-norm frames, consider the initial value problem

\(\Gamma(F_0,0) = F_0, \qquad \frac{d}{dt}\Gamma(F_0,t) = -\operatorname{grad}\operatorname{FP}(\Gamma(F_0,t))\).

If \(F_0\) has full spark, then \(\lim_{t \to \infty} \Gamma(F_0,t)\) is an ENP frame.
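The flow can be imitated numerically. The sketch below is a fixed-step projected gradient descent on the product of spheres of radius \(\sqrt{d/n}\); it is an Euler-type discretization swapped in for the continuous flow in the theorem, not the authors' method. The step size, iteration count, random seed, and the names `frame_potential` and `fp_descent` are all assumptions chosen for illustration.

```python
import numpy as np

def frame_potential(F):
    """FP(F) = ||F F^*||_Fr^2."""
    return np.linalg.norm(F @ F.conj().T, "fro") ** 2

def fp_descent(F0, step=0.02, iters=20000):
    """Discretized descent of FP over frames whose columns all have norm sqrt(d/n)."""
    d, n = F0.shape
    r = np.sqrt(d / n)
    F = F0 * (r / np.linalg.norm(F0, axis=0))      # rescale columns to norm r
    for _ in range(iters):
        G = 4 * (F @ F.conj().T) @ F               # Euclidean gradient of FP
        # Project out the radial part of each column (tangent to the spheres)
        radial = np.real(np.sum(G * F.conj(), axis=0)) / r**2
        F = F - step * (G - F * radial)
        F = F * (r / np.linalg.norm(F, axis=0))    # retract back onto the spheres
    return F

d, n = 3, 5
rng = np.random.default_rng(0)
F0 = rng.standard_normal((d, n)) + 1j * rng.standard_normal((d, n))
F_inf = fp_descent(F0)
fp_final = frame_potential(F_inf)
```

A random Gaussian starting frame has full spark with probability one, so the theorem suggests that `fp_final` should approach the minimum value \(d\) and that `F_inf @ F_inf.conj().T` should approach the identity.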

Theorem [with Needham]

The analogous result holds for fusion frames.

Why Not the Other Way?

Definition [cf. Bodmann–Casazza]

The normalizing potential is

\(\operatorname{NP}(f_1,\ldots,f_n) = \sum_{i=1}^n \|f_i\|^4.\)

Proposition [Bodmann–Haas]

The ENP frames are exactly the global minima of \(\operatorname{NP}|_{\text{Parseval}}\).

Theorem [with Caine and Needham]

On the space of Parseval frames, consider the initial value problem

\(\widetilde{\Gamma}(F_0,0) = F_0, \qquad \frac{d}{dt}\widetilde{\Gamma}(F_0,t) = -\operatorname{grad} \operatorname{NP}(\widetilde{\Gamma}(F_0,t))\).

If \(F_0\) is full spark, then \(\lim_{t \to \infty} \widetilde{\Gamma}(F_0,t)\) is an ENP frame.
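A numerical counterpart, again only a sketch: fixed-step descent of \(\operatorname{NP}\) over Parseval frames, using a tangent projection followed by a polar-factor retraction computed from an SVD. This discretization stands in for the continuous flow in the theorem and is not taken from the talk; the step size, iteration count, seed, and the names `normalizing_potential`, `to_parseval`, and `np_descent` are assumptions.

```python
import numpy as np

def normalizing_potential(F):
    """NP(F) = sum_i ||f_i||^4."""
    return np.sum(np.sum(np.abs(F) ** 2, axis=0) ** 2)

def to_parseval(F):
    """Polar factor (F F^*)^(-1/2) F, which satisfies F F^* = Id."""
    U, _, Vh = np.linalg.svd(F, full_matrices=False)
    return U @ Vh

def np_descent(F0, step=0.02, iters=20000):
    """Discretized descent of NP over Parseval frames."""
    F = to_parseval(F0)
    for _ in range(iters):
        sq = np.sum(np.abs(F) ** 2, axis=0)
        G = 4 * F * sq                              # column i is 4 ||f_i||^2 f_i
        M = G @ F.conj().T
        G_tan = G - 0.5 * (M + M.conj().T) @ F      # project onto the tangent space
        F = to_parseval(F - step * G_tan)           # retract back to F F^* = Id
    return F

d, n = 3, 5
rng = np.random.default_rng(1)
F0 = rng.standard_normal((d, n)) + 1j * rng.standard_normal((d, n))
F_inf = np_descent(F0)
np_final = normalizing_potential(F_inf)
```

For a Parseval frame the squared norms sum to \(d\), so the minimum of \(\operatorname{NP}\) is \(\frac{d^2}{n}\), attained exactly when all squared norms equal \(\frac{d}{n}\).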

Normal Matrices

Definition.

\(A \in \mathbb{C}^{d \times d}\) is normal if \(AA^\ast = A^\ast A\).

Equivalently,

\(0 = AA^\ast - A^\ast A = [A,A^\ast]\).

Define the non-normal energy \(\operatorname{E}:\mathbb{C}^{d \times d} \to \mathbb{R}\) by

\(\operatorname{E}(A) := \|[A,A^\ast]\|^2.\)

Obvious Fact.

The normal matrices are the global minima of \(\operatorname{E}\).

Theorem [with Needham]

The only critical points of \(\operatorname{E}\) are the global minima; i.e., the normal matrices.

\(\operatorname{E}\) is not quasiconvex!

Gradient Descent

Let \(\mathcal{F}: \mathbb{C}^{d \times d} \times \mathbb{R} \to \mathbb{C}^{d \times d}\) be the negative gradient flow of \(\operatorname{E}\); i.e.,

\(\mathcal{F}(A_0,0) = A_0, \qquad \frac{d}{dt}\mathcal{F}(A_0,t) = -\nabla \operatorname{E}(\mathcal{F}(A_0,t))\).

Theorem [with Needham]

For any \(A_0 \in \mathbb{C}^{d \times d}\), the matrix \(A_\infty := \lim_{t \to \infty} \mathcal{F}(A_0,t)\) exists, is normal, has the same eigenvalues as \(A_0\), and is real if \(A_0\) is.
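To illustrate, here is a minimal forward-Euler sketch of this flow, assuming the norm in \(\operatorname{E}\) is the Frobenius norm, in which case a direct computation gives \(\nabla \operatorname{E}(A) = 4[[A,A^\ast],A]\). Euler steps only approximately preserve eigenvalues, so the spectrum of the output matches that of \(A_0\) up to discretization error. The step size, iteration count, test matrix, and the names `comm`, `non_normal_energy`, and `normalize_flow` are assumptions.

```python
import numpy as np

def comm(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def non_normal_energy(A):
    """E(A) = ||[A, A^*]||^2, with the Frobenius norm assumed."""
    return np.linalg.norm(comm(A, A.conj().T), "fro") ** 2

def normalize_flow(A0, step=1e-3, iters=10000):
    """Forward-Euler discretization of dA/dt = -grad E(A)."""
    A = np.asarray(A0)
    for _ in range(iters):
        C = comm(A, A.conj().T)
        A = A - step * 4 * comm(C, A)   # grad E(A) = 4 [[A, A^*], A]
    return A

# A real, non-normal matrix with eigenvalues 1 and -1
A0 = np.array([[1.0, 1.0], [0.0, -1.0]])
A_inf = normalize_flow(A0)
energy_final = non_normal_energy(A_inf)
eigs_final = np.sort(np.linalg.eigvals(A_inf).real)
```

Since the update direction is a commutator with \(A\), the continuous flow is isospectral, which is consistent with the eigenvalue statement in the theorem.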

Balancing Graphs

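Define the unbalanced energy \(\operatorname{B}(A) := \|\mathrm{diag}([A,A^\ast])\|^2 = \sum_{i=1}^d \left(\|A_i\|^2 - \|A^i\|^2\right)^2\), where \(A_i\) is the \(i\)th row of \(A\) and \(A^i\) is the \(i\)th column.

If \(A = \left(a_{ij}\right)_{i,j} \in \mathbb{R}^{d \times d}\) satisfies \(\mathrm{diag}([A,A^\ast]) = 0\), then \(\widehat{A} = \left(a_{ij}^2\right)_{i,j}\) is the adjacency matrix of a balanced multigraph: at every vertex, the total weight of outgoing edges equals the total weight of incoming edges.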

Let \(\mathscr{F}\) be the negative gradient flow of \(\operatorname{B}\); i.e.,

\(\mathscr{F}(A_0,0) = A_0, \qquad \frac{d}{dt}\mathscr{F}(A_0,t) = - \nabla \operatorname{B}(\mathscr{F}(A_0,t))\).

Theorem [with Needham]

For any \(A_0 \in \mathbb{C}^{d \times d}\), the matrix \(A_\infty := \lim_{t \to \infty} \mathscr{F}(A_0,t)\) exists, is balanced, has the same eigenvalues and principal minors as \(A_0\), and has zero entries wherever \(A_0\) does.
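A small numerical illustration, as a sketch only: writing \(D = \mathrm{diag}([A,A^\ast])\) with diagonal entries \(d_i\), a direct computation gives \(\nabla \operatorname{B}(A) = 4[D,A]\), whose \((i,j)\) entry is \(4(d_i - d_j)a_{ij}\). Each entry is therefore only rescaled, which is why zeros and diagonal entries persist. The forward-Euler discretization, step size, iteration count, example graph, and the names `unbalanced_energy` and `balance_flow` are assumptions.

```python
import numpy as np

def unbalanced_energy(A):
    """B(A) = sum_i (||row_i||^2 - ||col_i||^2)^2."""
    sq = np.abs(A) ** 2
    return np.sum((sq.sum(axis=1) - sq.sum(axis=0)) ** 2)

def balance_flow(A0, step=0.01, iters=2000):
    """Forward-Euler discretization of dA/dt = -grad B(A)."""
    A = np.asarray(A0, dtype=float)
    for _ in range(iters):
        sq = np.abs(A) ** 2
        dvec = sq.sum(axis=1) - sq.sum(axis=0)      # diagonal of [A, A^*]
        grad = 4 * (dvec[:, None] - dvec[None, :]) * A
        A = A - step * grad
    return A

# Directed 3-cycle 1 -> 2 -> 3 -> 1 with squared edge weights 4, 1, 1
A0 = np.array([[0.0, 2.0, 0.0],
               [0.0, 0.0, 1.0],
               [1.0, 0.0, 0.0]])
A_inf = balance_flow(A0)
W = A_inf ** 2                                      # adjacency matrix of the balanced graph
```

For this cycle, balance forces all three edge weights to agree, so the rows and columns of `W` end up with matching sums while the zero pattern of `A0` is untouched.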