
Sunday, January 14, 2018

New Location

I have a new blog at github pages. All of the content of this blog has been migrated there. This blog is no longer being maintained.

Tuesday, September 15, 2015

Santalo Formula

Let M be a simple Riemannian manifold with boundary \partial M. For (x,v) \in SM, let \tau(x,v) denote the exit time of the geodesic starting at x with tangent vector v, i.e. \tau(x,v) is the (necessarily unique, and finite) time at which \exp_x(tv) \in \partial M.

We let \partial_+(SM) denote the set
\partial_+(SM) = \{(x,v) \in SM \mid x \in \partial M, \ \langle v, \nu \rangle > 0\},
where \nu denotes the inward unit normal to \partial M in M. The geodesic flow identifies SM with the set
\Omega = \{(x,v,t) \in \partial_+(SM) \times \mathbf{R} \mid 0 \leq t \leq \tau(x,v)\},
via (x,v,t) \mapsto \phi_t(x,v), where \phi_t denotes the geodesic flow (so the footpoint of \phi_t(x,v) is \exp_x(tv)). Let \Phi: \Omega \to SM denote this diffeomorphism. Then we have, for all f \in C^\infty(SM),
\int_{SM} f \, d\mathrm{vol}(SM) = \int_\Omega (\Phi^\ast f)\,(\Phi^\ast d\mathrm{vol}(SM)) = \int_{\partial_+(SM)} \int_0^{\tau(x,v)} f(\phi_t(x,v)) \, \Phi^\ast d\mathrm{vol}(SM).
Therefore, we can compute integrals of functions over SM by integrating along geodesics, provided that we can compute \Phi^\ast d\mathrm{vol}(SM). This is the content of the Santalo formula.

Theorem (Santalo formula). For all f \in C^\infty(SM), we have
\int_{SM} f \, d\mathrm{vol}(SM) = \int_{\partial_+(SM)} \int_0^{\tau(x,v)} f(\phi_t(x,v)) \, \langle v, \nu \rangle \, dt \, d\mathrm{vol}(\partial(SM)).

Proof. Necessarily, we must have
\Phi^\ast(d\mathrm{vol}(SM)) = a(x,v) \, dt \wedge d\mathrm{vol}(\partial(SM)),
for some function a(x,v). The reason we can assume that a is independent of t is that \Phi is defined via the geodesic flow, and the geodesic flow preserves the volume form on SM. To compute the factor a(x,v), we just need to compute
i_{\partial/\partial t} \, \Phi^\ast(d\mathrm{vol}(SM)) = \Phi^\ast\left( i_{\Phi_\ast(\partial/\partial t)} \, d\mathrm{vol}(SM) \right).
From the definition of \Phi, we have that \Phi_\ast(\partial/\partial t) is the Reeb vector field on SM, i.e. the vector field generating the geodesic flow. Therefore, \Phi_\ast(\partial/\partial t) is equal, at a point (x,v), to the horizontal lift of the vector v. Therefore, using the definition of the induced volume form on a hypersurface of a Riemannian manifold, we find
i_{\partial/\partial t} \, \Phi^\ast(d\mathrm{vol}(SM)) = \langle v, \nu \rangle \, d\mathrm{vol}(\partial(SM)),
where \nu is the inward pointing unit normal to \partial(SM) in SM. This shows that a(x,v) = \langle v, \nu \rangle and completes the proof.
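
As a quick consistency check (not needed for the proof), taking f \equiv 1 in the theorem expresses the volume of the unit sphere bundle in terms of the exit time function alone:
\mathrm{vol}(SM) = \int_{\partial_+(SM)} \tau(x,v)\, \langle v, \nu \rangle \, d\mathrm{vol}(\partial(SM)).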

Friday, September 11, 2015

The Index Form

Let f: [0,T] \times (-\epsilon, \epsilon) \to M be a family of parametrized curves in a Riemannian manifold (M,g). To simplify this calculation, we assume that f(0,s) = p, \ f(T,s) = q for some p, q \in M and all s \in (-\epsilon, \epsilon). (This assumption is not necessary, but without it our variational formulae will have additional boundary terms.)

For convenience, set \dot f = \partial f/\partial t and f' = \partial f/\partial s. For each s \in (-\epsilon, \epsilon) we define the energy functional E = E(s) to be
E(s) = \frac{1}{2} \int_0^T |\dot f|^2 \, dt.
The first variation is
\frac{dE}{ds} = \int_0^T \langle \nabla_{f'} \dot f, \dot f \rangle \, dt = \int_0^T \langle \nabla_{\dot f} f', \dot f \rangle \, dt = -\int_0^T \langle f', \nabla_{\dot f} \dot f \rangle \, dt,
using \nabla_{f'} \dot f = \nabla_{\dot f} f' and then integrating by parts (the boundary terms vanish since f' = 0 at t = 0, T).

Set \gamma(t) := f(t,0) and X(t) = f'(t,0) (thought of as a vector field along \gamma). Evaluating the above at s=0 we obtain
\left.\frac{dE}{ds}\right|_{s=0} = -\int_0^T \langle X, \nabla_{\dot\gamma} \dot\gamma \rangle \, dt,
which shows immediately that

Theorem. \gamma is a critical point of the energy functional if and only if \nabla_{\dot\gamma} \dot\gamma = 0.
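
In local coordinates x^i (writing \Gamma^k_{ij} for the Christoffel symbols), the critical point equation \nabla_{\dot\gamma}\dot\gamma = 0 is just the familiar geodesic equation
\ddot\gamma^k + \Gamma^k_{ij}\, \dot\gamma^i \dot\gamma^j = 0;
in the flat case \Gamma \equiv 0 the critical points are straight lines traversed at constant speed.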


The second variation is
\begin{align*} \frac{d^2E}{ds^2} &= -\int_0^T \left( \langle \nabla_{f'} f', \nabla_{\dot f} \dot f \rangle + \langle f', \nabla_{f'} \nabla_{\dot f} \dot f \rangle \right) dt \\ &= -\int_0^T \left( \langle \nabla_{f'} f', \nabla_{\dot f} \dot f \rangle + \langle f', \nabla_{\dot f} \nabla_{f'} \dot f \rangle + \langle f', R(f', \dot f) \dot f \rangle \right) dt \\ &= -\int_0^T \left( \langle \nabla_{f'} f', \nabla_{\dot f} \dot f \rangle - \langle \nabla_{\dot f} f', \nabla_{f'} \dot f \rangle + \langle f', R(f', \dot f) \dot f \rangle \right) dt \\ &= -\int_0^T \left( \langle \nabla_{f'} f', \nabla_{\dot f} \dot f \rangle - \langle \nabla_{\dot f} f', \nabla_{\dot f} f' \rangle + \langle f', R(f', \dot f) \dot f \rangle \right) dt \end{align*}

Assume now that \gamma is a geodesic, i.e. \nabla_{\dot\gamma} \dot\gamma = 0. Then evaluating the above at s=0, we obtain
\left.\frac{d^2E}{ds^2}\right|_{s=0} = \int_0^T |\nabla_{\dot\gamma} X|^2 - \langle X, R(X, \dot\gamma) \dot\gamma \rangle \, dt.

Definition. Let \gamma be a geodesic. The index form associated to variations X, Y of \gamma is
I(X,Y) = \int_0^T \langle \nabla_{\dot\gamma} X, \nabla_{\dot\gamma} Y \rangle - \langle Y, R(X, \dot\gamma) \dot\gamma \rangle \, dt = -\int_0^T \langle Y, \nabla_{\dot\gamma}^2 X + R(X, \dot\gamma) \dot\gamma \rangle \, dt.
It follows from symmetries of the Riemann tensor that I(X,Y) = I(Y,X), and also I(X,X) = \left.\frac{d^2E}{ds^2}\right|_{s=0} as above.

Theorem. Suppose that X is the infinitesimal variation of a family of affine geodesics about a fixed geodesic \gamma. Then
\nabla_{\dot \gamma}^2 X + R(X, \dot\gamma)\dot\gamma = 0.
In particular, I(X, -) = 0.

Proof. Let f(t,s) denote the family as above. By hypothesis, we have that \nabla_{\dot f} \dot f = 0 for all s, so that
\nabla_{f'} \nabla_{\dot f} \dot f = 0.
Commuting the derivatives using the curvature tensor, we have
0 = \nabla_{\dot f} \nabla_{f'} \dot f + R(f', \dot f) \dot f.
Now use \nabla_{\dot f} f' = \nabla_{f'} \dot f and evaluate at s=0 to obtain
0 = \nabla_{\dot \gamma}^2 X + R(X, \dot \gamma)\dot\gamma.
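
As a standard special case, suppose M has constant sectional curvature \kappa and \gamma is a unit-speed geodesic. For a variation field X orthogonal to \dot\gamma we have R(X, \dot\gamma)\dot\gamma = \kappa X, so the Jacobi equation above reduces to
\nabla_{\dot\gamma}^2 X + \kappa X = 0,
whose components in a parallel orthonormal frame along \gamma solve \ddot u + \kappa u = 0; for \kappa > 0 these are multiples of \sin(\sqrt{\kappa}\, t) and \cos(\sqrt{\kappa}\, t).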

Thursday, September 3, 2015

Boundary Distance

Recently, I've been learning some topics related to machine learning, and especially manifold learning. These both fall under the general notion of inverse problems: given some mathematical object X (it could be a function f: A \to B, or a Riemannian manifold (M,g), or a probability measure \mu on some space, etc.), can we effectively reconstruct X given only the information of some auxiliary measurements? What if we can only perform finitely many measurements? What if the measurements are noisy? Can we reconstruct X at least approximately? Can we measure, in some precise way, how close our approximate reconstruction is to the unknown object X? And so on, and so forth.

Anyway, this post is about a cute observation, which I was reminded of while reading a paper on the inverse Gel'fand problem. Let M be a compact manifold with smooth boundary \partial M. Then with no additional data required, we have a Banach space L^\infty(\partial M) consisting of the essentially bounded measurable functions on the boundary. Since it is a Banach space, it comes with a complete metric d_\infty(f,g) := \|f-g\|_{L^\infty(\partial M)}.

Now, suppose that g is a Riemannian metric on M. Then we have the Riemannian distance function d_g(x,y) which is defined to be the infimum of arclengths of all smooth paths connecting x and y. For any x \in M, we obtain a function r_x \in L^\infty(\partial M) defined by
r_x(z) = d_g(x,z), \forall z \in \partial M.
This gives a map \phi_g: M \to L^\infty(\partial M), defined by x \mapsto r_x.

Theorem. Suppose that for any two distinct x,y \in M, there is a unique length-minimizing geodesic connecting x and y. Then \phi_g: M \to L^\infty(\partial M) is an isometric embedding, i.e. d_g(x,y) = d_\infty(r_x, r_y) for all x,y \in M.

Proof. Let x, y be distinct. For any point z on the boundary, the triangle inequality gives
|d_g(x,z) - d_g(y,z)| \leq d_g(x,y).
Now let \gamma be the unique length-minimizing geodesic from x to y, and extend \gamma beyond y until it hits some boundary point z_\ast. Then since x, y, z_\ast all lie on a length-minimizing geodesic, we have
d_g(x,z_\ast) - d_g(y,z_\ast) = d_g(x,y).
Therefore, the bound above is saturated at z_\ast, and we find
\sup_{z \in \partial M} |d_g(x,z) - d_g(y,z)| = d_g(x,y).
But the expression on the left is nothing but the L^\infty(\partial M)-norm of r_x-r_y, so the theorem is proved.
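
The simplest example to keep in mind is the closed unit ball M \subset \mathbf{R}^n with the flat metric, where d_g(x,y) = |x-y| and r_x(z) = |x-z| for z on the unit sphere. The boundary point z_\ast in the proof is obtained by following the segment from x through y until it exits the ball, and there
|x - z_\ast| - |y - z_\ast| = |x - y|,
so \|r_x - r_y\|_{L^\infty} = |x-y|, as the theorem asserts.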

Monday, August 31, 2015

Hamilton-Jacobi equation and Riemannian distance

Consider the cotangent bundle T^\ast X as a symplectic manifold with canonical symplectic form \omega. Consider the Hamilton-Jacobi equation
\frac{\partial S}{\partial t} + H(x, \nabla S) = 0,
for the classical Hamilton function S(x,t). Setting x=x(t), p(t) = (\nabla S)(x(t), t) one sees immediately from the method of characteristics that this PDE is solved by the classical action
S(x,t) = \int_0^t (p \dot{x} - H) ds,
where the integral is taken over the solution (x(s),p(s)) of Hamilton's equations with x(0)=x_0 and x(t) = x. The choice of basepoint x_0 amounts to a choice of overall additive constant in S, and really this solution is only valid in some neighbourhood U of x_0. (Reason: S is in general multivalued, as the differential "dS" is closed but not necessarily exact.)

Now consider the case where X is Riemannian, with Hamiltonian H(x,p) = \frac{1}{2} |p|^2. The solutions to Hamilton's equations are affinely parametrized geodesics, and by a simple Legendre transform we have
S(x, t) = \frac{1}{2} \int_0^t |\dot x|^2 ds
where the integral is along the affine geodesic with x(0) = x_0 and x(t) = x. Since x(s) is a geodesic, |\dot x(s)| is a constant (in s) and therefore
S(x, t) = \frac{t}{2} |\dot x(0)|^2.
Now consider the path \gamma(s) = x(|\dot x(0)|^{-1}s). This is an affine geodesic with \gamma(0) = x_0, \gamma(|\dot x(0)|t) = x and |\dot \gamma| = 1. Therefore, the Riemannian distance between x_0 and x (provided x is sufficiently close to x_0) is
d(x_0, x) = |\dot x(0)| t.
Combining this with the previous calculation, we see that
S(x, t) = \frac{1}{2t} d(x_0, x)^2.
Now insert this back into the Hamilton-Jacobi equation above. With a bit of rearranging, we have the following.

Theorem. Let x_0 denote a fixed basepoint of X. Then for all x in a sufficiently small neighborhood U of x_0, the Riemannian distance function satisfies the Eikonal equation
|\nabla_x d(x_0, x)|^2 = 1.
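
As a sanity check, in flat \mathbf{R}^n with x_0 = 0 we have d(x_0, x) = |x| and
\nabla |x| = \frac{x}{|x|}, \qquad \left|\nabla |x|\right|^2 = 1,
away from the origin.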

Now, for convenience set r(x) = d(x_0, x). Then |\nabla r|^2 = 1, from which we obtain (by differentiating twice and contracting)
g^{ij} g^{kl}\left(\nabla_{lki} r \nabla_j r + \nabla_{ki}r \nabla_{lj} r\right) = 0.
Quick calculation shows that
\nabla_{lki} r = \nabla_{ilk} r - \left.R_{li}\right.^{b}_k \nabla_b r
Therefore, tracing over l and k we obtain
g^{lk} \nabla_{lki} r = \nabla_i ( \Delta r) + Rc(\nabla r, -)
Plugging this back into the equation derived above, we have
\nabla r \cdot \nabla(\Delta r) + Rc(\nabla r, \nabla r) + |Hr|^2 = 0,
where Hr denotes the Hessian of r regarded as a 2-tensor. Now, using r as a local coordinate, it is easy to see that \partial_r = \nabla r (as vector fields). So we can rewrite this identity as
\partial_r (\Delta r) + Rc(\partial_r, \partial_r) + |Hr|^2 = 0.

Now, we can get a nice result out of this. First, note that the Hessian Hr always has at least one eigenvalue equal to zero, because the Eikonal equation implies that Hr(\partial_r, -)=0. Let \lambda_2, \dots, \lambda_n denote the non-zero eigenvalues of Hr. We have
|Hr|^2 = \lambda_2^2 + \dots + \lambda_n^2,
while on the other hand
|\Delta r|^2 = (\lambda_2 + \dots + \lambda_n)^2
By Cauchy-Schwarz, we have
|\Delta r|^2 \leq (n-1)|Hr|^2

Proposition. Suppose that the Ricci curvature of X satisfies Rc \geq (n-1)\kappa, and let u = (n-1)(\Delta r)^{-1}. Then
u' \geq 1 + \kappa u^2.

Proof. From the preceding formulas, |Hr|^2 can be expressed in terms of the Ricci curvature and the radial derivative of \Delta r. On the other hand, |\Delta r|^2 is bounded above by (n-1)|Hr|^2. The claimed inequality then follows from simple rearrangement.
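
In the flat model \mathbf{R}^n (so \kappa = 0) the inequality is an equality: there \Delta r = (n-1)/r, so
u = (n-1)(\Delta r)^{-1} = r, \qquad u' = 1 = 1 + \kappa u^2,
reflecting the fact that both the Cauchy-Schwarz step and the Ricci bound are saturated by the flat metric.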

Now, the amazing thing is that this deceptively simple inequality is the main ingredient of the Bishop-Gromov comparison theorem. The Bishop-Gromov comparison theorem, in turn, is the main ingredient of the proof of Gromov(-Cheeger) precompactness. I hope to discuss these topics in a future post.

Tuesday, August 18, 2015

The Classical Partition Function

Let (M, \omega) be a symplectic manifold of dimension 2n, and let H: M \to \mathbf{R} be a classical Hamiltonian. The symplectic form \omega allows us to define a measure on M, given by integration against the top form \omega^n / n!. We will denote this measure by d\mu.

We imagine that (M, \omega, H) represents some classical mechanical system. We suppose that the dynamics of this system are very complicated, e.g. a system of 10^{23} particles. The system is so complicated that we cannot solve the equations of motion exactly, and even if we could, the solutions might be so complicated that we can't expect to learn very much from them.

So instead, we ask statistical questions. Imagine that we cannot measure the state of the system exactly (e.g. particles in a box), so we try to guess a probability distribution \rho(x,p,t) on M, indicating that at time t the system has probability \rho(x,p,t)\, d\mu of being found in the phase-space volume element d\mu around (x,p). Obviously, \rho should satisfy the constraint \int_M \rho \, d\mu = 1.

How does \rho evolve in time? We know that the system obeys Hamilton's equations,
(\dot x, \dot p) = X_H = (\partial H / \partial p, -\partial H / \partial x)
in local Darboux coordinates. A particle located at (x,p) in phase space at time t will therefore be located at (x,p)+X_H \, dt at time t+dt. It follows that the probability that a particle is at the point (x,p) at time t+dt should equal the probability that it was at the point (x,p)-X_H \, dt at time t. Therefore, we have
\frac{\partial \rho}{\partial t} = \frac{\partial H}{\partial x} \frac{\partial \rho}{\partial p} - \frac{\partial H}{\partial p} \frac{\partial \rho}{\partial x} = \{H, \rho\}
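
In particular, any distribution of the form \rho = F(H) is stationary, since
\{H, F(H)\} = F'(H)\,\{H, H\} = 0.
This is consistent with the equilibrium (Gibbs) distribution found below being time-independent.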

Given a probability distribution \rho, the entropy is defined to be
S[\rho] = -\int_M  \rho \log \rho d\mu.

(A version of) the second law of thermodynamics. For a given average energy U, the system assumes a distribution of maximal possible entropy at thermodynamic equilibrium.

The goal now is to determine which distribution \rho maximizes the entropy, subject to the constraints (for fixed U)
\begin{align*} \int_M H \rho \, d\mu &= U \\ \int_M \rho \, d\mu &= 1 \end{align*}

Setting aside technical issues of convergence, etc., this variational problem is easily solved using the method of Lagrange multipliers. Introducing parameters \lambda_1, \lambda_2, we consider the modified functional
S[\rho, \lambda_1, \lambda_2, U] = \int_M\left(-\rho \log \rho +\lambda_1\rho +\lambda_2(H\rho)\right)d\mu -\lambda_1-\lambda_2 U.

Note that \partial S / \partial U = -\lambda_2, which is conventionally identified with the inverse temperature (so that \lambda_2 = -\beta, as below).

Taking the variation with respect to \rho, we find
0= \frac{\delta S}{\delta \rho} = -\log \rho-1+\lambda_1+H\lambda_2
Therefore, \rho is proportional to e^{-\beta H}, where we have set \beta=-\lambda_2. Define the partition function Z to be
Z = \int_M e^{-\beta H} d\mu.
We therefore have proved (formally and heuristically only!):

Theorem. The probability distribution \rho assumed by the system at thermodynamic equilibrium is given by
  \rho = \frac{e^{-\beta H}}{Z}
where \beta > 0 is a real parameter, called the inverse temperature.

Corollary. At thermodynamic equilibrium, the average energy is given by
U = -\frac{\partial \log Z}{\partial \beta} ,
and the entropy is given by
S = \beta U + \log Z.
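
For a concrete worked example, take the one-dimensional harmonic oscillator: M = \mathbf{R}^2 with d\mu = dx \, dp and H = \frac{1}{2}(p^2 + \omega^2 x^2). The Gaussian integral gives
Z = \int_{\mathbf{R}^2} e^{-\beta H} \, dx \, dp = \frac{2\pi}{\beta \omega},
and therefore
U = -\frac{\partial \log Z}{\partial \beta} = \frac{1}{\beta}, \qquad S = \beta U + \log Z = 1 + \log \frac{2\pi}{\beta \omega},
which is the classical equipartition result U = k_B T (in units where k_B = 1).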

Thursday, January 29, 2015

What is generalized geometry?

The following are my notes for a short introductory talk.

What is geometry?

Before trying to define generalized geometry, we should first decide what we mean by ordinary geometry. Of course, this question doesn't have a unique answer, so there are many ways to generalize the classical notions of manifolds and varieties. The viewpoint taken in generalized geometry is the following: the distinguishing feature of smooth manifolds is the existence of a tangent bundle
TM \to M
which satisfies some nice axioms. The basic idea of generalized geometry is to replace the tangent bundle with some other vector bundle L \to M, again satisfying some nice axioms. Different generalized geometries on M will correspond to different choices of bundle L \to M, as well as auxiliary data compatible with L in some appropriate sense.

Definition. A Lie algebroid over M is a smooth vector bundle L \to M together with a vector bundle map a: L \to TM called the anchor map and a bracket [\cdot, \cdot]: H^0(M, L) \otimes H^0(M, L) \to H^0(M, L) satisfying the following axioms:
  • [\cdot,\cdot] is  a Lie bracket on H^0(M, L)
  • [X, fY] = f[X,Y] + a(X)f \cdot Y for X,Y \in H^0(M,L) and f \in H^0(M, \mathcal{O}_M)
Note that we can take L to be either a real or complex vector bundle. In the latter case the anchor map should map to the complexified tangent bundle.

Example 1. We can take L to be TM with anchor map the identity.

Example 2. Let \sigma be a Poisson tensor on M. Define an anchor on T^\ast M by \alpha \mapsto \sigma(\alpha, \cdot), and a bracket on 1-forms by the Koszul bracket [\alpha, \beta] = L_{\sigma(\alpha,\cdot)} \beta - L_{\sigma(\beta,\cdot)} \alpha - d(\sigma(\alpha,\beta)). This makes T^\ast M into a Lie algebroid.

Example 3. Let M be a complex manifold of complex dimension n and let L \subset TM \otimes \mathbf{C} be the sub-bundle of vectors spanned by \{\partial / \partial z_1, \dots, \partial / \partial z_n\} in local holomorphic coordinates. Then L \to M is a (complex) Lie algebroid.


Courant Bracket

We'd like to try to fit the preceding examples into a common framework. Let \mathbf{T}M = TM \oplus T^\ast M. This bundle has a natural symmetric bilinear pairing given by
\langle X \oplus \alpha, Y \oplus \beta \rangle = \frac{1}{2} \alpha(Y) + \frac{1}{2} \beta(X)
Note that this bilinear form is of split signature (n,n). We define a bracket on sections of \mathbf{T}M by
[X\oplus \alpha, Y\oplus \beta] = [X,Y] \oplus \left(L_X \beta + \frac{1}{2}\, d(\alpha(Y)) - L_Y \alpha - \frac{1}{2}\, d(\beta(X)) \right)
Note that this bracket is not a Lie bracket. We also have an anchor map a: \mathbf{T}M \to TM which is just the projection.

 Let B be a 2-form on M. Define an action of B on sections of \mathbf TM by
X + \alpha \mapsto X + \alpha + i_X B

Proposition. This action preserves the Courant bracket if and only if B is closed.
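
One can check (I omit the computation) that for u = X + \alpha and v = Y + \beta,
[u + i_X B, v + i_Y B] = [u, v] + i_{[X,Y]} B + i_Y i_X \, dB,
so the action of B preserves the Courant bracket exactly when the error term i_Y i_X \, dB vanishes for all X, Y, i.e. when dB = 0.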

This shows that the symmetries of M as a generalized manifold form a larger group than the ordinary diffeomorphisms of M: in fact, this group is the semidirect product of the diffeomorphism group of M with the vector space of closed 2-forms.

Dirac Structures

Definition. A Dirac structure on M is a Lagrangian sub-bundle L \subset \mathbf{T}M which is closed under the Courant bracket.

Theorem (Courant). A Lagrangian sub-bundle L \subset \mathbf{T} M is a Dirac structure if and only if L \to M is a Lie algebroid over M, with bracket induced by the Courant bracket and anchor given by projection.

Example 1. TM \subset \mathbf{T}M.

Example 2. Take L to be the graph of a Poisson tensor.

Example 3. Take L to be the graph of a closed 2-form.

Admissible Functions

We now let L \to M be a Dirac structure on M.

Definition. A smooth function f on M is called admissible if there exists a vector field X_f such that (X_f, df) is a section of L.

The Poisson bracket is defined as follows. If f,g are admissible, then define
\{f, g\} = X_f g.
It is easy to check from the definitions that the bracket on admissible functions is well-defined (independent of choice of X_f) and skew-symmetric. With a little bit of calculation, we find the following.

Proposition. The vector space of admissible functions is naturally a Poisson algebra, and moreover the natural bracket satisfies the Leibniz rule.
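
For example, when L is the graph of a Poisson tensor \sigma (Example 2 above), every smooth function is admissible, since X_f = \sigma(df, \cdot) does the job, and (up to sign conventions) the bracket
\{f, g\} = X_f g = \sigma(df, dg)
is the usual Poisson bracket determined by \sigma.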


Generalized Complex Structures

Definition. A generalized complex structure is a skew endomorphism J of \mathbf T M such that J^2 = -1 and such that the +i-eigenbundle is involutive under the Courant bracket.

Equivalently: A generalized complex structure is a (complex) Dirac structure L \subset \mathbf TM satisfying the condition L \cap \overline L = 0.

Example 1. Let J be an ordinary complex structure on M. Then the endomorphism
\begin{bmatrix} -J & 0 \\ 0 & J^\ast \end{bmatrix}
defines a generalized complex structure on M.

Example 2. Let \omega be a symplectic form on M. Then the endomorphism
\begin{bmatrix} 0 & -\omega^{-1} \\ \omega & 0 \end{bmatrix}
defines a generalized complex structure on M.
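
As a quick check (using only that \omega is non-degenerate and closed), this endomorphism does square to minus the identity:
\begin{bmatrix} 0 & -\omega^{-1} \\ \omega & 0 \end{bmatrix}^2 = \begin{bmatrix} -\omega^{-1}\omega & 0 \\ 0 & -\omega\,\omega^{-1} \end{bmatrix} = -1,
and its +i-eigenbundle is the graph of -i\omega, which is closed under the Courant bracket precisely because d\omega = 0 (compare Example 3 in the Dirac structures section).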

Thus, generalized geometry gives a common framework for both complex geometry and symplectic geometry. Such a connection is exactly what is conjectured by mirror symmetry.

Example 3. Let J be a complex structure on M and let \sigma be a holomorphic Poisson tensor. Consider the subbundle L \subset \mathbf TM defined as the span of
\frac{\partial}{\partial \bar z_1}, \dots, \frac{\partial}{\partial \bar z_n}, dz_1 - \sigma(dz_1), \dots, dz_n - \sigma(dz_n)
Then L defines a generalized complex structure on M.

The last example shows that deformations of M as a generalized  complex manifold contain non-commutative deformations of the structure sheaf.  We also have the following theorem, which shows that there is an intimate relation between generalized complex geometry and holomorphic Poisson geometry.

Theorem (Bailey). Near any point of a generalized complex manifold, M is locally isomorphic to the product of a holomorphic Poisson manifold with a symplectic manifold.


Generalized Kähler Manifolds

Let (g, J, \omega) be a Kähler triple. The Kähler property requires that
\omega = g J.
Let I_1 denote the generalized complex structure induced by J, and let I_2 denote the generalized complex structure induced by the symplectic form \omega. We have
I_1 I_2 = \begin{bmatrix} - J & 0 \\ 0 & J^\ast \end{bmatrix} \begin{bmatrix} 0 & -\omega^{-1} \\ \omega & 0 \end{bmatrix} = \begin{bmatrix} 0 & g^{-1} \\ g & 0 \end{bmatrix} = I_2 I_1

Definition. A generalized Kähler manifold is a manifold with two commuting generalized complex structures I_1, I_2 such that the bilinear pairing (I_1 I_2 u, v) is positive definite.

Theorem (Gualtieri). A generalized Kähler structure on M induces a Riemannian metric g, two integrable almost complex structures J_\pm Hermitian with respect to g, and two affine connections \nabla_\pm with skew-torsion \pm H which preserve the metric and complex structure J_\pm. Conversely, these data determine a generalized Kähler structure which is unique up to a B-field transformation.

Thus the notion of generalized Kähler manifold recovers the bihermitian geometry investigated by physicists in the context of susy non-linear \sigma-models.



Generalized Calabi-Yau Manifolds

Definition. A generalized Calabi-Yau manifold is a manifold M together with a complex-valued differential form \phi of purely even or purely odd degree, which is a pure spinor for the action of Cl(\mathbf TM) and satisfies the non-degeneracy condition (\phi, \bar \phi) \neq 0.

Note that (by definition) \phi is pure if its annihilator is a maximal isotropic subspace. Let L \subset \mathbf TM be its annihilator. Then it is not hard to see that L defines a generalized complex structure on M, so indeed a generalized Calabi-Yau manifold is in particular a generalized complex manifold.

Example. If M is a complex manifold with a nowhere vanishing holomorphic (n,0) form, then it is generalized Calabi-Yau.

Example. If M is symplectic with symplectic form \omega, then \phi = \exp(i\omega) gives M the structure of a generalized Calabi-Yau manifold.

If (M, \phi) is generalized Calabi-Yau, then so is (M, \exp(B) \phi) for any closed real 2-form B. In the symplectic case, we obtain
\phi = \exp(B+i\omega)
This explains the appearance of the B-field (or "complexified Kähler form") in discussions of mirror symmetry.