Lie Theory: Leverage Map
A. EXISTENCE JUSTIFICATION
Symmetry kept appearing with a strange property: it was continuous.
Rotating a sphere by any angle is a symmetry. Translating space by any amount is a symmetry. These aren’t discrete operations like permutations—they form a continuum of transformations, smoothly parameterized.
Lie theory exists because we needed to study groups that are also manifolds—continuous families of symmetries with both algebraic structure (composition, inverses) and geometric structure (smoothness, tangent spaces).
The miraculous discovery (Sophus Lie, 1870s): To understand an infinite continuous group, you only need to study an infinitesimal neighborhood of the identity. The tangent space at the identity—the Lie algebra—captures almost everything about the group. Finite transformations are built by exponentiating infinitesimal ones.
Why this is profound:
- A Lie group is a curved manifold with a nonlinear multiplication law
- Its Lie algebra is a vector space of the same dimension (finite-dimensional for every finite-dimensional Lie group)
- Vector spaces are easy; manifolds are hard
- The algebra determines the group (locally, and often globally)
The core move: Replace the study of global symmetry transformations with the study of infinitesimal generators. The commutator of generators becomes the Lie bracket. Representation theory of groups becomes representation theory of algebras (linear!). Everything simplifies.
B. CORE OBJECTS & MORPHISMS
| Object | What it is | Notation |
|---|---|---|
| Lie group | A smooth manifold G with smooth group operations (multiplication, inverse) | G, H, K |
| Lie algebra | The tangent space at identity, with Lie bracket [·,·] | 𝔤, 𝔥, 𝔨 (fraktur) |
| Lie bracket | Bilinear, antisymmetric, satisfies Jacobi identity: [X,[Y,Z]] + cyclic = 0 | [X, Y] |
| Exponential map | exp: 𝔤 → G, sends algebra elements to group elements | exp(X) or eˣ |
| One-parameter subgroup | A smooth homomorphism ℝ → G; image of a line under exp | γ(t) = exp(tX) |
| Adjoint representation | G acting on 𝔤 by conjugation: Ad_g(X) = gXg⁻¹ (for matrix groups) | Ad: G → GL(𝔤) |
| adjoint representation | 𝔤 acting on itself by bracket: ad_X(Y) = [X,Y] | ad: 𝔤 → 𝔤𝔩(𝔤) |
| Killing form | The natural bilinear form: B(X,Y) = Tr(ad_X ∘ ad_Y) | B or κ |
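| Root | Nonzero simultaneous eigenvalue of the adjoint action of the Cartan subalgebra on 𝔤, viewed as a linear functional on 𝔥 | α ∈ 𝔥* |
| Weight | Simultaneous eigenvalue of the Cartan action on a representation, viewed as a linear functional on 𝔥 | λ ∈ 𝔥* |
| Cartan subalgebra | Maximal abelian subalgebra whose elements are all ad-diagonalizable (for semisimple 𝔤) | 𝔥 ⊂ 𝔤 |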
| Simple / Semisimple | No nontrivial ideals / direct sum of simples | Classification via Dynkin diagrams |
Key examples:
| Lie Group | Manifold | Lie Algebra | Dimension |
|---|---|---|---|
| GL(n,ℝ) | Open subset of ℝⁿ² | 𝔤𝔩(n,ℝ) = all n×n matrices | n² |
| SL(n,ℝ) | det = 1 | 𝔰𝔩(n,ℝ) = traceless matrices | n²-1 |
| O(n) | Orthogonal matrices | 𝔬(n) = antisymmetric matrices | n(n-1)/2 |
| SO(n) | O(n) ∩ {det=1} | 𝔰𝔬(n) = 𝔬(n) | n(n-1)/2 |
| U(n) | Unitary matrices | 𝔲(n) = anti-Hermitian matrices | n² |
| SU(n) | U(n) ∩ {det=1} | 𝔰𝔲(n) = traceless anti-Hermitian | n²-1 |
| Sp(n) | Symplectic matrices | 𝔰𝔭(n) | n(2n+1) |
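A quick numerical check of two rows of this table, as a sketch with NumPy: exponentiating an antisymmetric matrix should give an element of SO(3), and exponentiating a traceless matrix should give determinant 1, because det(exp X) = exp(Tr X). The helper `expm_series` and the random test matrices are illustrative assumptions, not part of the notes.

```python
import numpy as np

def expm_series(X, terms=60):
    """Matrix exponential via a truncated power series (adequate for small matrices)."""
    X = np.asarray(X, dtype=complex if np.iscomplexobj(X) else float)
    result = np.eye(X.shape[0], dtype=X.dtype)
    term = np.eye(X.shape[0], dtype=X.dtype)
    for k in range(1, terms):
        term = term @ X / k          # term now holds X^k / k!
        result = result + term
    return result

rng = np.random.default_rng(0)

# An element of so(3): antisymmetric by construction.
A = 0.5 * rng.standard_normal((3, 3))
X_so = A - A.T
R = expm_series(X_so)
orth_gap = np.abs(R.T @ R - np.eye(3)).max()   # how far R is from orthogonal
det_R = np.linalg.det(R)

# An element of sl(3, R): subtract the trace part.
M = 0.5 * rng.standard_normal((3, 3))
X_sl = M - (np.trace(M) / 3) * np.eye(3)
det_S = np.linalg.det(expm_series(X_sl))

print(orth_gap, det_R, det_S)
```

The linear conditions on the algebra side (antisymmetric, traceless) are exactly what the nonlinear conditions on the group side (orthogonal, det = 1) become after differentiating at the identity.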
Morphisms:
- Lie group homomorphism: smooth map respecting multiplication
- Lie algebra homomorphism: linear map respecting bracket
- Every Lie group homomorphism induces a Lie algebra homomorphism (differentiate at identity)
C. CENTRAL INVARIANTS
The Lie bracket [X,Y]:
For matrix groups: [X,Y] = XY - YX (commutator)
This measures “how much X and Y fail to commute.” If [X,Y] = 0, the flows generated by X and Y commute. If [X,Y] ≠ 0, there’s a “twist”—doing X then Y differs from Y then X.
Properties defining a Lie algebra:
- Bilinearity: [aX + bY, Z] = a[X,Z] + b[Y,Z]
- Antisymmetry: [X,Y] = -[Y,X]
- Jacobi identity: [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0
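For matrix algebras the three axioms can be checked numerically. The sketch below uses random 3×3 matrices and the commutator as bracket; the names `bracket`, `a`, `b` and the random inputs are assumptions made for illustration.

```python
import numpy as np

def bracket(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

rng = np.random.default_rng(0)
X, Y, Z = (rng.standard_normal((3, 3)) for _ in range(3))
a, b = 2.0, -0.5

# Each gap is the largest entry of (left side - right side) of an axiom.
bilinear_gap = np.abs(
    bracket(a * X + b * Y, Z) - (a * bracket(X, Z) + b * bracket(Y, Z))
).max()
antisym_gap = np.abs(bracket(X, Y) + bracket(Y, X)).max()
jacobi_gap = np.abs(
    bracket(X, bracket(Y, Z))
    + bracket(Y, bracket(Z, X))
    + bracket(Z, bracket(X, Y))
).max()

print(bilinear_gap, antisym_gap, jacobi_gap)
```

All three gaps should vanish up to floating-point rounding; for the commutator, the Jacobi identity is a consequence of associativity of matrix multiplication.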
The Killing form B(X,Y) = Tr(ad_X ∘ ad_Y):
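This is the natural invariant symmetric bilinear form on a Lie algebra. It is not positive definite in general, so it plays the role of an inner product only up to sign, and only in the compact case.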
- B is symmetric and Ad-invariant
- B is non-degenerate iff 𝔤 is semisimple
- The sign structure of B determines the real form (compact vs. non-compact)
For 𝔰𝔲(n): B(X,Y) = 2n·Tr(XY), which is negative definite → compact group
For 𝔰𝔩(n,ℝ): B has mixed signature → non-compact group
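The two signature claims can be checked directly from the definition B(X,Y) = Tr(ad_X ∘ ad_Y). This is a sketch for n = 2 only; the helpers `coords` and `killing_form`, and the basis choice T_k = -(i/2)σ_k for 𝔰𝔲(2), are assumptions made for the example.

```python
import numpy as np

def coords(M, basis):
    """Coordinates of the matrix M in the given basis (least-squares solve)."""
    A = np.stack([b.reshape(-1) for b in basis], axis=1)
    c, *_ = np.linalg.lstsq(A, M.reshape(-1), rcond=None)
    return c

def killing_form(basis):
    """Gram matrix B_ij = Tr(ad_{b_i} ad_{b_j}) for a basis closed under [.,.]."""
    # Column j of ad_X holds the coordinates of [X, b_j].
    ads = [np.stack([coords(X @ b - b @ X, basis) for b in basis], axis=1)
           for X in basis]
    return np.array([[np.trace(a @ b) for b in ads] for a in ads])

# su(2): anti-Hermitian traceless basis built from the Pauli matrices.
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
T = [-0.5j * s for s in sigma]
B_su2 = killing_form(T).real
trace_form = np.array([[4 * np.trace(a @ b) for b in T] for a in T]).real  # 2n = 4

# sl(2, R): basis H, E, F.
H = np.array([[1.0, 0.0], [0.0, -1.0]])
E = np.array([[0.0, 1.0], [0.0, 0.0]])
F = np.array([[0.0, 0.0], [1.0, 0.0]])
B_sl2 = killing_form([H, E, F])
signs = np.sign(np.linalg.eigvalsh(B_sl2))

print(B_su2)   # negative definite
print(signs)   # mixed signs
```

For 𝔰𝔲(2) the computed form agrees with 2n·Tr(XY) and is a negative multiple of the identity in this basis; for 𝔰𝔩(2,ℝ) it has both positive and negative eigenvalues.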
Root system (for semisimple 𝔤):
Choose a Cartan subalgebra 𝔥 (maximal abelian). The rest of 𝔤 decomposes into root spaces:
$$\mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{\alpha \in \Phi} \mathfrak{g}_\alpha$$
where 𝔤_α = {X ∈ 𝔤 : [H,X] = α(H)X for all H ∈ 𝔥}.
The roots α form a highly constrained geometric pattern—the root system. This pattern completely classifies the algebra.
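To make the decomposition concrete, here is a sketch for 𝔰𝔩(3,ℂ), taking the diagonal traceless matrices as Cartan subalgebra. The basis H₁ = diag(1,-1,0), H₂ = diag(0,1,-1) and the helper `unit` are choices made for this example. Each elementary matrix E_ij (i ≠ j) spans a root space, and the root is read off from [H, E_ij] = α(H) E_ij.

```python
import numpy as np

def unit(i, j, n=3):
    """Elementary matrix E_ij with a single 1 in row i, column j."""
    m = np.zeros((n, n))
    m[i, j] = 1.0
    return m

# Basis of the Cartan subalgebra (diagonal, traceless).
cartan = [np.diag([1.0, -1.0, 0.0]), np.diag([0.0, 1.0, -1.0])]

roots = {}
for i in range(3):
    for j in range(3):
        if i == j:
            continue
        X = unit(i, j)
        alpha = []
        for H in cartan:
            comm = H @ X - X @ H
            value = comm[i, j]
            # E_ij is a simultaneous eigenvector of ad_H for every H in the Cartan.
            assert np.allclose(comm, value * X)
            alpha.append(float(value))
        roots[(i, j)] = tuple(alpha)   # (alpha(H1), alpha(H2))

total_dim = len(cartan) + len(roots)   # dim of Cartan + number of root spaces
print(roots)
print(total_dim)
```

The six roots come in ± pairs, and 2 + 6 = 8 recovers the dimension n² - 1 of 𝔰𝔩(3).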
What counts as “the same”:
- Lie groups: isomorphic as groups and diffeomorphic as manifolds
- Lie algebras: isomorphic as algebras (linear isomorphism preserving bracket)
- Key fact: complex semisimple Lie algebras with the same root system are isomorphic
D. SIGNATURE THEOREMS
1. Lie’s Fundamental Theorems
First: Every Lie group homomorphism φ: G → H induces a Lie algebra homomorphism dφ: 𝔤 → 𝔥 (its derivative at identity).
Second: Every Lie algebra homomorphism 𝔤 → 𝔥 is the derivative of some local Lie group homomorphism; when G is simply connected, it integrates to a global homomorphism G → H.
Third: For every finite-dimensional Lie algebra 𝔤, there exists a simply connected Lie group G with Lie algebra 𝔤, unique up to isomorphism. Every connected Lie group with algebra 𝔤 is a quotient of G by a discrete central subgroup.
Importance: The correspondence between Lie groups and Lie algebras is tight:
- Algebra determines local group structure completely
- Algebra determines global structure up to discrete ambiguity (covering spaces)
- Algebraic problems (linear!) capture geometric problems (nonlinear!)
2. Classification of Simple Lie Algebras
Every simple Lie algebra over ℂ is isomorphic to one of:
- Classical: A_n (𝔰𝔩_{n+1}), B_n (𝔰𝔬_{2n+1}), C_n (𝔰𝔭_{2n}), D_n (𝔰𝔬_{2n})
- Exceptional: G₂, F₄, E₆, E₇, E₈
Importance: Complete classification! There are exactly these families and five exceptions—no others. The classification is encoded in Dynkin diagrams:
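A_n: ○—○—○—...—○ (n nodes in a single chain)
B_n: ○—○—○—...—○=>○ (n nodes, double edge at the end)
C_n: ○—○—○—...—○<=○ (n nodes, double edge at the end, arrow reversed)
D_n: ○—○—○—...—○ plus two nodes both attached to the last node (fork at the end, n nodes in total)
G₂: ○≡>○ (triple edge)
F₄: ○—○=>○—○
E₆: ○—○—○—○—○ plus one node attached to the third (middle) node
E₇: ○—○—○—○—○—○ plus one node attached to the third node
E₈: ○—○—○—○—○—○—○ plus one node attached to the third node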
Each node = simple root. Edges encode angles between roots. The diagram determines everything.
3. Exponential Map Properties
For matrix Lie groups: $$\exp(X) = \sum_{n=0}^{\infty} \frac{X^n}{n!} = I + X + \frac{X^2}{2!} + \cdots$$
Key properties:
- exp(0) = I (identity)
- exp((s+t)X) = exp(sX)exp(tX)
- exp is a local diffeomorphism near 0
- d/dt|₀ exp(tX) = X
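These properties are easy to confirm numerically. The sketch below uses a hand-rolled power series `expm_series` (an illustrative helper, not a library routine) on a random 3×3 matrix, and a central finite difference for the derivative at t = 0.

```python
import numpy as np

def expm_series(X, terms=60):
    """Matrix exponential via a truncated power series (adequate for small matrices)."""
    X = np.asarray(X, dtype=complex if np.iscomplexobj(X) else float)
    result = np.eye(X.shape[0], dtype=X.dtype)
    term = np.eye(X.shape[0], dtype=X.dtype)
    for k in range(1, terms):
        term = term @ X / k          # term now holds X^k / k!
        result = result + term
    return result

rng = np.random.default_rng(1)
X = 0.5 * rng.standard_normal((3, 3))
s, t = 0.3, 1.1

# exp(0) = I
identity_gap = np.abs(expm_series(np.zeros((3, 3))) - np.eye(3)).max()

# exp((s+t)X) = exp(sX) exp(tX): the one-parameter subgroup law
subgroup_gap = np.abs(
    expm_series((s + t) * X) - expm_series(s * X) @ expm_series(t * X)
).max()

# d/dt exp(tX) at t = 0 equals X (central difference approximation)
h = 1e-5
derivative = (expm_series(h * X) - expm_series(-h * X)) / (2 * h)
derivative_gap = np.abs(derivative - X).max()

print(identity_gap, subgroup_gap, derivative_gap)
```

Note that the subgroup law only involves multiples of a single X, which commute with each other; for two different generators the product rule picks up bracket corrections.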
Importance: The exponential map is how you go from infinitesimal (Lie algebra) to finite (Lie group). One-parameter subgroups are exactly curves of the form exp(tX). Rotations by angle θ around axis n̂ are exp(θ n̂·J) where J are rotation generators.
4. Baker-Campbell-Hausdorff Formula
If X, Y ∈ 𝔤, then exp(X)exp(Y) = exp(Z) where: $$Z = X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}[X,[X,Y]] - \frac{1}{12}[Y,[X,Y]] + \cdots$$
Importance: Group multiplication becomes algebra operations (brackets). If [X,Y] = 0, then exp(X)exp(Y) = exp(X+Y)—just add. The correction terms involve nested brackets. This is how non-commutativity at the group level emerges from the bracket at the algebra level.
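A numerical sanity check of the truncated formula, as a sketch: for small X and Y, exp(X)exp(Y) should be matched much more closely by exp of the third-order BCH expression than by exp(X+Y). The scale factor 0.01, the random matrices, and the helper names are assumptions for the example.

```python
import numpy as np

def expm_series(X, terms=60):
    """Matrix exponential via a truncated power series (adequate for small matrices)."""
    X = np.asarray(X, dtype=complex if np.iscomplexobj(X) else float)
    result = np.eye(X.shape[0], dtype=X.dtype)
    term = np.eye(X.shape[0], dtype=X.dtype)
    for k in range(1, terms):
        term = term @ X / k          # term now holds X^k / k!
        result = result + term
    return result

def bracket(A, B):
    return A @ B - B @ A

rng = np.random.default_rng(2)
X = 0.01 * rng.standard_normal((3, 3))
Y = 0.01 * rng.standard_normal((3, 3))

product = expm_series(X) @ expm_series(Y)

# First-order guess: just add the generators.
Z1 = X + Y
# BCH through third order in X and Y.
XY = bracket(X, Y)
Z3 = X + Y + XY / 2 + bracket(X, XY) / 12 - bracket(Y, XY) / 12

err_first_order = np.abs(product - expm_series(Z1)).max()
err_third_order = np.abs(product - expm_series(Z3)).max()

# Commuting case: X and 2X commute, so the exponents simply add.
commuting_gap = np.abs(
    expm_series(X) @ expm_series(2 * X) - expm_series(3 * X)
).max()

print(err_first_order, err_third_order, commuting_gap)
```

The series is only guaranteed to converge for X and Y sufficiently close to 0, which is why the example keeps both generators small.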
5. Peter-Weyl Theorem
For a compact Lie group G, the matrix coefficients of the irreducible unitary representations, each rescaled by the square root of the representation's dimension, form an orthonormal basis of L²(G).
Importance: Representation theory gives you Fourier analysis on the group. For G = S¹ = U(1), this is ordinary Fourier series. For G = SU(2), you get the Wigner D-functions, and the spherical harmonics are the ones that descend to the sphere S² = SU(2)/U(1). The irreps are the “frequencies”; the matrix elements are the “basis functions.”
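The U(1) case can be checked on a grid. The irreps of U(1) are the characters θ ↦ e^{inθ}, and the sketch below verifies their orthonormality under the normalized Haar measure dθ/2π, then reads off the Fourier coefficients of a test function. The grid size and the test function f are arbitrary choices for illustration.

```python
import numpy as np

N = 256
theta = 2 * np.pi * np.arange(N) / N          # uniform grid on the circle
ns = np.arange(-3, 4)                         # irreps labeled by n = -3..3
chars = np.exp(1j * np.outer(ns, theta))      # chars[a] is the character for n = ns[a]

# Gram matrix of <chi_m, chi_n> = (1/2pi) * integral of conj(chi_m) * chi_n
gram = chars.conj() @ chars.T / N

# Expand f(theta) = 1 + 2 cos(2 theta) = 1 + e^{2i theta} + e^{-2i theta}
f = 1.0 + 2.0 * np.cos(2 * theta)
coeffs = chars.conj() @ f / N                 # coefficient of each character

print(np.round(gram.real, 12))
print(np.round(coeffs.real, 12))
```

Only the n = 0 and n = ±2 characters appear in the expansion, each with coefficient 1, as the closed form of f predicts.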
E. BRIDGES TO OTHER DOMAINS
| Domain | Connection |
|---|---|
| Physics (QM) | Observables generate symmetries via exp(iHt). Angular momentum operators are 𝔰𝔲(2) or 𝔰𝔬(3). Commutators become Poisson brackets in classical limit. |
| Particle Physics | The Standard Model is SU(3) × SU(2) × U(1). Quarks/leptons are representations. Gauge bosons live in the Lie algebra. The whole theory is Lie-theoretic. |
| Differential Geometry | The frame bundle has structure group GL(n) or O(n). Connections are Lie-algebra-valued 1-forms. Curvature is Lie-algebra-valued 2-form. |
| Representation Theory | We did this! Representations of G induce representations of 𝔤. Weights are eigenvalues of Cartan action. Highest weight theory classifies irreps. |
| Gauge Theory | Gauge transformations form (infinite-dimensional) Lie group. Gauge fields are connections on principal bundles. Yang-Mills equation is Lie-theoretic. |
| Control Theory | System symmetries are Lie groups. Controllability relates to Lie algebra generated by controls. The Lie bracket measures “new directions” from nested controls. |
| Robotics | SE(3) = rigid motions of 3D space. Robot kinematics is Lie group theory. Exponential coordinates for rotations (axis-angle). Screw theory. |
| Geometric Deep Learning | Equivariant networks respect Lie group symmetries. Convolutional = translation equivariant. Steerable = rotation equivariant. The Lie algebra tells you the “infinitesimal constraints.” |
| Integrable Systems | Many integrable systems have Lie group symmetry. Moment maps, Hamiltonian reduction. The KdV equation has infinite-dimensional Lie symmetry. |
Pattern-linking gold:
The exponential map is universal. Whenever you have:
- Small thing → big thing via iteration
- Infinitesimal rate → finite change
- Generator → transformation
…you’re using the exponential map of some (possibly infinite-dimensional) Lie group.
- Time evolution in QM: U(t) = exp(-iHt/ℏ)
- Rotation by angle θ: R = exp(θJ)
- Flow of a vector field: φ_t = exp(tX)
- Transfer matrix in statistical mechanics: T = exp(-βH)
The bracket is universal: [X,Y] measures:
- Failure of flows to commute
- Quantum uncertainty (canonical commutation)
- New directions accessible by nested operations
- Curvature (for connection 1-forms)
F. COMMON MISCONCEPTIONS
“Lie algebra = tangent space” — The Lie algebra is the tangent space at the identity, with the bracket structure. Any tangent space is just a vector space; the bracket makes it an algebra.
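“exp is always surjective” — Not in general. exp is surjective for compact connected groups such as SO(3), but it can fail otherwise. SL(2,ℝ) is connected, yet the matrix with rows (-1, 1) and (0, -1) is not of the form exp(X). And exp never leaves the identity component, so it misses every reflection in O(3).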
“The Lie algebra determines the Lie group” — Only up to covering and discrete quotients. 𝔰𝔬(3) and 𝔰𝔲(2) are isomorphic algebras, but SO(3) and SU(2) are different groups (SU(2) double covers SO(3)).
“Semisimple means simple” — Semisimple means direct sum of simples. SU(2) × SU(2) is semisimple but not simple.
“Root systems are just a classification tool” — Roots have geometric content: they’re the “resonant frequencies” of the algebra, the directions where the algebra “responds” to the Cartan subalgebra. They control everything: representations, Weyl group, Dynkin diagram.
“Compact and non-compact are just different real forms” — They’re profoundly different geometrically and representation-theoretically. The irreducible unitary representations of a compact group are all finite-dimensional and labeled by a discrete set of highest weights. Non-compact groups such as SL(2,ℝ) also have continuous series of infinite-dimensional ones, which makes their unitary representation theory much harder.
“Lie theory is just matrix groups” — While matrix groups are the main examples, Lie theory is broader. The universal cover of SL(2,ℝ) and the metaplectic group are Lie groups with no faithful finite-dimensional matrix representation, and diffeomorphism groups give infinite-dimensional analogues.
“The bracket is the commutator” — For matrix algebras, yes. But the bracket is defined abstractly by the three axioms. Other realizations exist, for example vector fields acting on functions, with [X,Y]f = X(Yf) - Y(Xf).
G. NOTATION SURVIVAL KIT
| Symbol | Meaning |
|---|---|
| G, H, K | Lie groups |
| 𝔤, 𝔥, 𝔨 | Lie algebras (fraktur letters) |
| [X, Y] | Lie bracket |
| exp: 𝔤 → G | Exponential map |
| Ad_g | Adjoint action of g ∈ G on 𝔤: Ad_g(X) = gXg⁻¹ (for matrix groups) |
| ad_X | adjoint action of X ∈ 𝔤: ad_X(Y) = [X,Y] |
| B(X,Y) or κ(X,Y) | Killing form: Tr(ad_X ad_Y) |
| 𝔥 | Cartan subalgebra |
| Φ or Δ | Root system |
| α, β, γ | Roots (elements of 𝔥*) |
| 𝔤_α | Root space for root α |
| W | Weyl group (symmetries of root system) |
| λ | Weight (eigenvalue of Cartan action on representation) |
| ρ | Representation, or half-sum of positive roots |
| GL(n), SL(n), O(n), SO(n), U(n), SU(n), Sp(n) | Classical matrix Lie groups |
| 𝔤𝔩, 𝔰𝔩, 𝔬, 𝔰𝔬, 𝔲, 𝔰𝔲, 𝔰𝔭 | Their Lie algebras |
| A_n, B_n, C_n, D_n | Classical simple Lie algebra series |
| E_6, E_7, E_8, F_4, G_2 | Exceptional simple Lie algebras |
| Spin(n) | Double cover of SO(n) |
| T or U(1)ⁿ | Maximal torus (maximal connected abelian subgroup of a compact group) |
| G/H | Homogeneous space (coset space) |
H. ONE WORKED MICRO-EXAMPLE
SO(3) and 𝔰𝔬(3): The rotation group
Setup: SO(3) = 3×3 real orthogonal matrices with determinant 1 = rotations of ℝ³.
The Lie algebra 𝔰𝔬(3):
Tangent space at identity = antisymmetric 3×3 matrices (differentiating RᵀR = I gives Xᵀ + X = 0).
Basis: $$J_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad J_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad J_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
The bracket: $$[J_i, J_j] = \epsilon_{ijk} J_k$$
This is the defining relation of angular momentum / rotation generators.
Explicitly: [J₁, J₂] = J₃, [J₂, J₃] = J₁, [J₃, J₁] = J₂ (and antisymmetric).
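The bracket relations can be verified by direct matrix multiplication. This sketch hard-codes the three generators above (indexed 0, 1, 2 in code) and builds the Levi-Civita symbol by hand; the array names are illustrative.

```python
import numpy as np

# The generators J_1, J_2, J_3 of so(3), stored as J[0], J[1], J[2].
J = np.array([
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
], dtype=float)

# Levi-Civita symbol: +1 on cyclic index triples, -1 on swapped ones.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[j, i, k] = -1.0

# Largest deviation from [J_i, J_j] = sum_k eps_ijk J_k over all index pairs.
max_gap = 0.0
for i in range(3):
    for j in range(3):
        lhs = J[i] @ J[j] - J[j] @ J[i]
        rhs = sum(eps[i, j, k] * J[k] for k in range(3))
        max_gap = max(max_gap, np.abs(lhs - rhs).max())

print(max_gap)
```

Because all entries are 0 or ±1, the check is exact rather than approximate.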
Exponential map:
Rotation by angle θ around unit vector n̂ = (n₁, n₂, n₃):
$$R = \exp(\theta(n_1 J_1 + n_2 J_2 + n_3 J_3))$$
This is Rodrigues’ formula in disguise.
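To see the connection, write K = n₁J₁ + n₂J₂ + n₃J₃. Since K³ = -K for a unit axis, the exponential series collapses to the closed form R = I + sin θ · K + (1 - cos θ) · K². The sketch below compares that closed form with a brute-force power series; the axis, angle, and helper `expm_series` are arbitrary choices for illustration.

```python
import numpy as np

def expm_series(X, terms=60):
    """Matrix exponential via a truncated power series (adequate for small matrices)."""
    X = np.asarray(X, dtype=float)
    result = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k          # term now holds X^k / k!
        result = result + term
    return result

J = np.array([
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
], dtype=float)

n = np.array([1.0, 2.0, 2.0]) / 3.0           # unit axis
theta = 0.7                                   # rotation angle in radians
K = n[0] * J[0] + n[1] * J[1] + n[2] * J[2]   # K v = n x v (cross product)

R_exp = expm_series(theta * K)
R_rodrigues = np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

# A quarter turn about the z-axis should send the x-axis to the y-axis.
quarter_turn = expm_series((np.pi / 2) * J[2])
image_of_x = quarter_turn @ np.array([1.0, 0.0, 0.0])

print(np.abs(R_exp - R_rodrigues).max())
print(image_of_x)
```

The rotation axis itself is fixed by R, since K n = n × n = 0.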
The double cover SU(2) → SO(3):
𝔰𝔲(2) = 2×2 traceless anti-Hermitian matrices.
Basis (Pauli matrices times -i/2): $$\sigma_k' = -\frac{i}{2}\sigma_k$$
With this sign the bracket relations are identical: [σ₁′, σ₂′] = σ₃′, etc. (With +i/2 instead, the brackets pick up an overall minus sign.)
So 𝔰𝔲(2) ≅ 𝔰𝔬(3) as Lie algebras.
But SU(2) is simply connected; SO(3) ≅ SU(2)/{±I} has π₁ = ℤ/2.
Importance: Electrons are spin-½ = representations of SU(2) that don’t descend to SO(3). A 2π rotation gives a factor of -1. This is why spinors exist—the algebra is the same, but the groups differ, and spinors see the covering group.
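The sign flip can be seen by exponentiating the same generator in both groups. This sketch uses the basis T_k = -(i/2)σ_k for 𝔰𝔲(2), with the sign chosen so that [T₁, T₂] = T₃ matches the J_k relations; the variable names and the series helper are illustrative assumptions.

```python
import numpy as np

def expm_series(X, terms=60):
    """Matrix exponential via a truncated power series (adequate for small matrices)."""
    X = np.asarray(X, dtype=complex if np.iscomplexobj(X) else float)
    result = np.eye(X.shape[0], dtype=X.dtype)
    term = np.eye(X.shape[0], dtype=X.dtype)
    for k in range(1, terms):
        term = term @ X / k          # term now holds X^k / k!
        result = result + term
    return result

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
T = [-0.5j * s for s in sigma]                 # basis of su(2)
J3 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)

# Same bracket relation as in so(3): [T_1, T_2] = T_3.
bracket_gap = np.abs(T[0] @ T[1] - T[1] @ T[0] - T[2]).max()

full_turn_so3 = expm_series(2 * np.pi * J3)      # rotation by 2*pi in SO(3)
full_turn_su2 = expm_series(2 * np.pi * T[2])    # same generator, in SU(2)
double_turn_su2 = expm_series(4 * np.pi * T[2])  # rotation by 4*pi in SU(2)

print(np.round(full_turn_so3, 10))
print(np.round(full_turn_su2, 10))
print(np.round(double_turn_su2, 10))
```

A 2π rotation is the identity in SO(3) but -I in SU(2); only after 4π does the SU(2) element return to the identity.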
Micro-example 2: Highest weight classification
Setup: Classify irreducible representations of 𝔰𝔩(2,ℂ).
The algebra: Basis H, E, F with:
- [H, E] = 2E (E is a “raising operator,” eigenvalue +2)
- [H, F] = -2F (F is a “lowering operator,” eigenvalue -2)
- [E, F] = H
How representations work:
H is diagonalizable. Its eigenvalues are called weights.
E raises weight by 2: if Hv = λv, then H(Ev) = (λ+2)(Ev). F lowers weight by 2: H(Fv) = (λ-2)(Fv).
Irreducible representation structure:
Start with highest weight vector v₀: Ev₀ = 0, Hv₀ = λv₀.
Apply F repeatedly: v₀, Fv₀, F²v₀, … with weights λ, λ-2, λ-4, …
For finite dimension, this must terminate: Fⁿ⁺¹v₀ = 0 for some n.
Result: The representation has weights λ, λ-2, …, λ-2n, and dimension n+1.
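The relations force λ = n: applying E to Fⁿ⁺¹v₀ = 0 and using [E, F] = H gives (n+1)(λ - n)Fⁿv₀ = 0, and Fⁿv₀ ≠ 0. So the highest weight is a non-negative integer, and the weights are n, n-2, …, -n.
Classification: Finite-dimensional irreps of 𝔰𝔩(2,ℂ) are labeled by their highest weight n = 0, 1, 2, …, with dimension n+1: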
- n = 0: trivial (dimension 1)
- n = 1: standard representation (dimension 2)
- n = 2: adjoint representation (dimension 3)
- …
This is highest weight theory in its simplest form. The same pattern—Cartan subalgebra, root operators, highest weight—classifies representations of all semisimple Lie algebras.
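The construction can be written out as explicit matrices. In the basis v_k = Fᵏv₀ (k = 0, …, n), H is diagonal with entries n - 2k, F shifts v_k to v_{k+1}, and E sends v_k to k(n - k + 1)v_{k-1}. The sketch below builds these matrices and checks the three defining relations; the function name `sl2_irrep` is an illustrative choice.

```python
import numpy as np

def sl2_irrep(n):
    """Matrices (H, E, F) of the (n+1)-dimensional irrep in the basis v_k = F^k v_0."""
    dim = n + 1
    H = np.diag([float(n - 2 * k) for k in range(dim)])   # H v_k = (n - 2k) v_k
    E = np.zeros((dim, dim))
    F = np.zeros((dim, dim))
    for k in range(1, dim):
        F[k, k - 1] = 1.0                  # F v_{k-1} = v_k
        E[k - 1, k] = k * (n - k + 1)      # E v_k = k (n - k + 1) v_{k-1}
    return H, E, F

def relation_gap(n):
    """Largest violation of [H,E] = 2E, [H,F] = -2F, [E,F] = H."""
    H, E, F = sl2_irrep(n)
    gaps = [
        np.abs(H @ E - E @ H - 2 * E).max(),
        np.abs(H @ F - F @ H + 2 * F).max(),
        np.abs(E @ F - F @ E - H).max(),
    ]
    return max(gaps)

gaps = {n: relation_gap(n) for n in range(6)}
weights_n3 = np.diag(sl2_irrep(3)[0]).tolist()

print(gaps)
print(weights_n3)
```

For n = 1 this reproduces the standard 2×2 matrices H, E, F, and for n = 2 it gives a 3-dimensional representation equivalent to the adjoint.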
Leverage for your work:
Spinor networks:
Spinors are representations of Spin(n), the double cover of SO(n). The Lie algebra is the same: 𝔰𝔭𝔦𝔫(n) ≅ 𝔰𝔬(n). But the representation theory differs—spinor representations don’t descend to SO(n).
For geometric deep learning with spinors:
- Equivariance under SO(n) is about how features transform
- Spinor features transform under Spin(n) instead
- The infinitesimal generators (Lie algebra) are the same, so local computations look similar
- But global topology differs, enabling different representation content
Gauge theory / physics:
The Standard Model is SU(3) × SU(2) × U(1). Understanding this requires:
- 𝔰𝔲(3): 8-dimensional algebra, generators are Gell-Mann matrices, governs strong force
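- 𝔰𝔲(2): 3-dimensional, generators are Pauli matrices, governs weak isospin
- 𝔲(1): 1-dimensional (abelian!), governs weak hypercharge; electromagnetism is the U(1) that remains unbroken after electroweak symmetry breaking
Particles are representations. Left-handed quark doublets transform as (3, 2)_{1/6} under SU(3) × SU(2) × U(1), where the subscript is the U(1) hypercharge. The representation theory determines what interactions are possible.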
Convergence Thesis:
If optimal cognitive architectures respect symmetries, they respect the Lie group of those symmetries. The Lie algebra captures the infinitesimal constraints. Different real forms of the same complex algebra give different “flavors” of the same symmetry structure.
Constraint: your architecture must be equivariant. Lie theory tells you: what equivariance is (representation theory), how to implement it (algebra → group via exp), what the complete set of options is (classification).
The Dynkin diagram as architecture blueprint:
For a given symmetry requirement, the Dynkin diagram tells you:
- How many independent “directions” of symmetry (nodes)
- How they interact (edges)
- What representations are possible (weight lattice)
This is a complete combinatorial encoding of continuous symmetry structure. In principle, you could derive neural network architecture constraints directly from Dynkin diagrams.
Control theory connection:
The Lie bracket of two control vector fields gives a “new direction” accessible by alternating controls (Lie bracket = infinitesimal commutator). For a driftless system on a connected configuration space, if the Lie algebra generated by the control vector fields spans the full tangent space at every point, the system is controllable (the Chow-Rashevskii theorem). This is how you prove a robot can reach any configuration: pure Lie theory.
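As an illustration, take a unicycle with state (x, y, θ) and two controls: drive forward and steer. This model is an assumption chosen for the example, not something from the notes above. The two control fields span only two directions, but their bracket points sideways and completes the tangent space. The sketch uses finite-difference Jacobians and the convention [f, g] = Dg·f - Df·g.

```python
import numpy as np

def drive(q):
    """Control field 1: move forward along the current heading."""
    x, y, th = q
    return np.array([np.cos(th), np.sin(th), 0.0])

def steer(q):
    """Control field 2: rotate in place."""
    return np.array([0.0, 0.0, 1.0])

def jacobian(field, q, h=1e-6):
    """Central-difference Jacobian of a vector field at q."""
    q = np.asarray(q, dtype=float)
    cols = []
    for i in range(q.size):
        dq = np.zeros_like(q)
        dq[i] = h
        cols.append((field(q + dq) - field(q - dq)) / (2 * h))
    return np.stack(cols, axis=1)

def lie_bracket(f, g, q):
    """[f, g](q) = Dg(q) f(q) - Df(q) g(q)."""
    return jacobian(g, q) @ f(q) - jacobian(f, q) @ g(q)

q0 = np.array([0.3, -1.2, 0.8])
sideways = lie_bracket(drive, steer, q0)

rank_controls = np.linalg.matrix_rank(
    np.stack([drive(q0), steer(q0)], axis=1))
rank_with_bracket = np.linalg.matrix_rank(
    np.stack([drive(q0), steer(q0), sideways], axis=1))

print(sideways)                           # approx (sin th, -cos th, 0)
print(rank_controls, rank_with_bracket)
```

The bracket direction is perpendicular to the heading: the sideways motion of parallel parking, which neither control produces on its own.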
Next: Dynamical Systems—where all of this comes alive in motion. Stability, attractors, chaos, bifurcations. Where the geometry and algebra actually do something over time.