
Monday, August 31, 2015

Hamilton-Jacobi equation and Riemannian distance

Consider the cotangent bundle $T^\ast X$ as a symplectic manifold with canonical symplectic form $\omega$. Consider the Hamilton-Jacobi equation
\[ \frac{\partial S}{\partial t} + H(x, \nabla S) = 0, \]
for the classical Hamilton function $S(x,t)$. Setting $x=x(t), p(t) = (\nabla S)(x(t), t)$ one sees immediately from the method of characteristics that this PDE is solved by the classical action
\[ S(x,t) = \int_0^t (p \dot{x} - H) ds, \]
where the integral is taken over the solution $(x(s),p(s))$ of Hamilton's equations with $x(0)=x_0$ and $x(t) = x$. The choice of basepoint $x_0$ amounts to an overall additive constant in $S$, and really this solution is only valid in some neighbourhood $U$ of $x_0$. (Reason: $S$ is in general multivalued, as the differential "$dS$" is closed but not necessarily exact.)

Now consider the case where $X$ is Riemannian, with Hamiltonian $H(x,p) = \frac{1}{2} |p|^2$. The solutions to Hamilton's equations are affinely parametrized geodesics, and by a simple Legendre transform we have
\[ S(x, t) = \frac{1}{2} \int_0^t |\dot x|^2 ds \]
where the integral is along the affine geodesic with $x(0) = x_0$ and $x(t) = x$. Since $x(s)$ is a geodesic, $|\dot x(s)|$ is a constant (in $s$) and therefore
\[ S(x, t) = \frac{t}{2} |\dot x(0)|^2. \]
Now consider the path $\gamma(s) = x(|\dot x(0)|^{-1} s)$. This is an affine geodesic with $\gamma(0) = x_0$, $\gamma(|\dot x(0)| t) = x$ and $|\dot \gamma| = 1$. Therefore, the Riemannian distance between $x_0$ and $x$ (provided $x$ is sufficiently close to $x_0$) is
\[ d(x_0, x) = |\dot x(0)| t. \]
Combining this with the previous calculation, we see that
\[ S(x, t) = \frac{1}{2t} d(x_0, x)^2. \]
Now insert this back into the Hamilton-Jacobi equation above. With a bit of rearranging, we have the following.

Theorem. Let $x_0$ denote a fixed basepoint of $X$. Then for all $x$ in a sufficiently small neighborhood $U$ of $x_0$, the Riemannian distance function satisfies the Eikonal equation
\[ |\nabla_x d(x_0, x)|^2 = 1. \]
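Both this and the formula $S = \frac{1}{2t} d(x_0,x)^2$ are easy to verify symbolically in the flat model $X = \mathbb{R}^3$, where $d(x_0, x) = |x - x_0|$. A minimal sanity check (flat space only, not the general Riemannian case):

```python
import sympy as sp

x1, x2, x3, a1, a2, a3 = sp.symbols('x1 x2 x3 a1 a2 a3', real=True)
t = sp.symbols('t', positive=True)

# Flat R^3 with basepoint a: here d(x0, x) = |x - a|.
r = sp.sqrt((x1 - a1)**2 + (x2 - a2)**2 + (x3 - a3)**2)

# Eikonal equation: |grad r|^2 = 1 away from the basepoint.
grad_sq = sum(sp.diff(r, v)**2 for v in (x1, x2, x3))
assert sp.simplify(grad_sq) == 1

# And S = d^2 / 2t solves the Hamilton-Jacobi equation with H = |p|^2 / 2.
S = r**2 / (2 * t)
hj = sp.diff(S, t) + sum(sp.diff(S, v)**2 for v in (x1, x2, x3)) / 2
assert sp.simplify(hj) == 0
```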

Now, for convenience set $r(x) = d(x_0, x)$. Then $|\nabla r|^2 = 1$, from which we obtain (by differentiating twice and contracting)
\[ g^{ij} g^{kl}\left(\nabla_{lki} r \nabla_j r + \nabla_{ki}r \nabla_{lj} r\right) = 0.\]
Quick calculation shows that
\[ \nabla_{lki} r = \nabla_{ilk} r - \left.R_{li}\right.^{b}_k \nabla_b r \]
Therefore, tracing over $l$ and $k$ we obtain
\[ g^{lk} \nabla_{lki} r = \nabla_i ( \Delta r) + Rc(\nabla r, -) \]
Plugging this back into the equation derived above, we have
\[ \nabla r \cdot \nabla(\Delta r) + Rc(\nabla r, \nabla r) + |Hr|^2 = 0, \]
where $Hr$ denotes the Hessian of $r$ regarded as a 2-tensor. Now, using $r$ as a local coordinate, it is easy to see that $\partial_r = \nabla r$ (as vector fields). So we can rewrite this identity as
\[ \partial_r (\Delta r) + Rc(\partial_r, \partial_r) + |Hr|^2 = 0. \]

Now, we can get a nice result out of this. First, note that the Hessian $Hr$ always has at least one eigenvalue equal to zero, because the Eikonal equation implies that $Hr(\partial_r, -)=0$. Let $\lambda_2, \dots, \lambda_n$ denote the non-zero eigenvalues of $Hr$. We have
\[ |Hr|^2 = \lambda_2^2 + \dots + \lambda_n^2, \]
while on the other hand
\[ |\Delta r|^2 = (\lambda_2 + \dots + \lambda_n)^2 \]
By Cauchy-Schwarz, we have
\[ |\Delta r|^2 \leq (n-1)|Hr|^2 \]

Proposition. Suppose that the Ricci curvature of $X$ satisfies $Rc \geq (n-1)\kappa$, and let $u = (n-1)(\Delta r)^{-1}$. Then
\[ u' \geq 1 + \kappa u^2. \]

Proof. The identity above gives
\[ \partial_r (\Delta r) = -Rc(\partial_r, \partial_r) - |Hr|^2 \leq -(n-1)\kappa - \frac{(\Delta r)^2}{n-1}, \]
using the curvature bound and the Cauchy-Schwarz inequality $(\Delta r)^2 \leq (n-1)|Hr|^2$. Since $u' = -(n-1)\, \partial_r(\Delta r) / (\Delta r)^2$, the claimed inequality follows from simple rearrangement:
\[ u' \geq \frac{(n-1)^2 \kappa}{(\Delta r)^2} + 1 = 1 + \kappa u^2. \]
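In the constant-curvature model the proposition holds with equality. A quick symbolic check for $\kappa = 1$ (a sketch, using the standard fact that $\Delta r = (n-1)\cot r$ on the round model):

```python
import sympy as sp

r = sp.symbols('r', positive=True)

# Round model, kappa = 1: Delta r = (n-1) cot(r), so
# u = (n-1)/(Delta r) = tan(r), and u' = 1 + u^2 exactly.
u = sp.tan(r)
assert sp.simplify(sp.diff(u, r) - (1 + u**2)) == 0
```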

Now, the amazing thing is that this deceptively simple inequality is the main ingredient of the Bishop-Gromov comparison theorem. The Bishop-Gromov comparison theorem, in turn, is the main ingredient of the proof of Gromov(-Cheeger) precompactness. I hope to discuss these topics in a future post.

Tuesday, August 18, 2015

The Classical Partition Function

Let $(M, \omega)$ be a symplectic manifold of dimension $2n$, and let $H: M \to \mathbf{R}$ be a classical Hamiltonian. The symplectic form $\omega$ allows us to define a measure on $M$, given by integration against the top form $\omega^n / n!$. We will denote this measure by $d\mu$.

We imagine that $(M, \omega, H)$ represents some classical mechanical system. We suppose that the dynamics of this system are very complicated, e.g. some system of $10^{23}$ particles. The system is so complicated that not only can we not solve the equations of motion exactly, but even if we could, the solutions might be so complicated that we couldn't expect to learn very much from them.

So instead, we ask statistical questions. Imagine that we cannot measure the state of the system exactly (e.g. particles in a box), so we try to guess a probability distribution $\rho(x,p,t)$ on $M$ indicating that at time $t$ the system has probability $\rho(x,p,t) d\mu$ of being in the state $(x,p)$. Obviously, $\rho$ should satisfy the constraint $\int_M \rho d\mu = 1$.

How does $\rho$ evolve in time? We know that the system obeys Hamilton's equations,
\[ (\dot x, \dot p) = X_H = (\partial H / \partial p, -\partial H / \partial x) \]
in local Darboux coordinates. A particle located at $(x,p)$ in phase space at time $t$ will be located at $(x,p)+X_H dt$ at time $t+dt$. Therefore, the probability that the system is at $(x,p)$ at time $t+dt$ equals the probability that it was at $(x,p)-X_H dt$ at time $t$, and therefore
\[ \frac{\partial \rho}{\partial t} = \frac{\partial H}{\partial x} \frac{\partial \rho}{\partial p} - \frac{\partial H}{\partial p} \frac{\partial \rho}{\partial x} = \{H, \rho\} \]
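For a concrete Hamiltonian this transport equation can be verified symbolically. A sketch for the harmonic oscillator, whose flow is rotation of phase space ($g$ stands for an arbitrary initial distribution):

```python
import sympy as sp

x, p, t = sp.symbols('x p t', real=True)
g = sp.Function('g')  # arbitrary initial distribution rho_0

# Harmonic oscillator H = (x^2 + p^2)/2; its flow rotates phase space.
H = (x**2 + p**2) / 2

# Transport rho_0 along the flow: rho(x, p, t) = rho_0(inverse flow at time t).
u = x * sp.cos(t) - p * sp.sin(t)
v = x * sp.sin(t) + p * sp.cos(t)
rho = g(u, v)

# Liouville's equation: d rho / dt = {H, rho} = H_x rho_p - H_p rho_x.
lhs = sp.diff(rho, t)
rhs = sp.diff(H, x) * sp.diff(rho, p) - sp.diff(H, p) * sp.diff(rho, x)
assert sp.simplify(lhs - rhs) == 0
```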

Given a probability distribution $\rho$, the entropy is defined to be
\[ S[\rho] = -\int_M  \rho \log \rho d\mu. \]

(A version of) the second law of thermodynamics. For a given average energy $U$, the system assumes a distribution of maximal possible entropy at thermodynamic equilibrium.

The goal now, is to determine what distribution $\rho$ will maximize the entropy, subject to the constraints (for fixed $U$)
\begin{align*} \int_M H \rho d\mu &= U \\
\int_M \rho d\mu &= 1 \end{align*}

Setting aside technical issues of convergence, etc., this variational problem is easily solved using the method of Lagrange multipliers. Introducing parameters $\lambda_1, \lambda_2$, we consider the modified functional
\[ S[\rho, \lambda_1, \lambda_2, U] = \int_M\left(-\rho \log \rho +\lambda_1\rho +\lambda_2(H\rho)\right)d\mu -\lambda_1-\lambda_2 U. \]

Note that $\partial S / \partial U = -\lambda_2$, and this is conventionally identified with the inverse temperature $\beta$.

Taking the variation with respect to $\rho$, we find
\[ 0= \frac{\delta S}{\delta \rho} = -\log \rho-1+\lambda_1+H\lambda_2\]
Therefore, $\rho$ is proportional to $e^{-\beta H}$ where we have set $\beta=-\lambda_2$. Define the partition function $Z$ to be
\[ Z = \int_M e^{-\beta H} d\mu. \]
We therefore have proved (formally and heuristically only!):

Theorem. The probability distribution $\rho$ assumed by the system at thermodynamic equilibrium is given by
\[  \rho = \frac{e^{-\beta H}}{Z} \]
where $\beta > 0$ is a real parameter, called the inverse temperature.

Corollary. At thermodynamic equilibrium, the average energy is given by
\[ U = -\frac{\partial \log Z}{\partial \beta} , \]
and the entropy is given by
\[ S = \beta U + \log Z.\]
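These formulas are easy to check symbolically for the classical harmonic oscillator $H = (x^2+p^2)/2$ on $M = \mathbb{R}^2$. A sketch (one degree of freedom; the entropy integral is done by expanding $\log \rho$ first):

```python
import sympy as sp

x, p = sp.symbols('x p', real=True)
b = sp.symbols('beta', positive=True)  # inverse temperature

H = (x**2 + p**2) / 2
Z = sp.integrate(sp.exp(-b * H), (x, -sp.oo, sp.oo), (p, -sp.oo, sp.oo))  # 2*pi/beta
rho = sp.exp(-b * H) / Z

# Average energy two ways: directly, and as -d(log Z)/d(beta).
U = sp.integrate(H * rho, (x, -sp.oo, sp.oo), (p, -sp.oo, sp.oo))
assert sp.simplify(U + sp.diff(sp.log(Z), b)) == 0

# Entropy: S = -int rho log rho should equal beta*U + log Z.
log_rho = sp.expand_log(sp.log(rho), force=True)  # = -beta*H - log(Z)
S = -sp.integrate(rho * log_rho, (x, -sp.oo, sp.oo), (p, -sp.oo, sp.oo))
assert sp.simplify(sp.expand_log(S - (b * U + sp.log(Z)), force=True)) == 0
```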

Thursday, January 29, 2015

What is generalized geometry?

The following are my notes for a short introductory talk.

What is geometry?

Before trying to define generalized geometry, we should first decide what we mean by ordinary geometry. Of course, this question doesn't have a unique answer, so there are many ways to generalize the classical notions of manifolds and varieties. The viewpoint taken in generalized geometry is the following: the distinguishing feature of smooth manifolds is the existence of a tangent bundle
\[ TM \to M \]
which satisfies some nice axioms. The basic idea of generalized geometry is to replace the tangent bundle with some other vector bundle $L \to M$, again satisfying some nice axioms. Different generalized geometries on $M$ will correspond to different choices of bundle $L \to M$, as well as auxiliary data compatible with $L$ in some appropriate sense.

Definition. A Lie algebroid over $M$ is a smooth vector bundle $L \to M$ together with a vector bundle map $a: L \to TM$ called the anchor map and a bracket $[\cdot, \cdot]: H^0(M, L) \otimes H^0(M, L) \to H^0(M, L)$ satisfying the following axioms:
  • $[\cdot,\cdot]$ is  a Lie bracket on $H^0(M, L)$
  • $[X, fY] = f[X,Y] + a(X)f \cdot Y$ for $X,Y \in H^0(M,L)$ and $f \in H^0(M, \mathcal{O}_M)$
Note that we can take $L$ to be either a real or complex vector bundle. In the latter case the anchor map should map to the complexified tangent bundle.

Example 1. We can take $L$ to be $TM$ with anchor map the identity.

Example 2. Let $\sigma$ be a Poisson tensor on $M$. Then $T^\ast M$ becomes a Lie algebroid, with anchor $\alpha \mapsto \sigma(\alpha, \cdot)$ and bracket the Koszul bracket $[\alpha, \beta] = L_{\sigma(\alpha)} \beta - L_{\sigma(\beta)} \alpha - d(\sigma(\alpha, \beta))$.

Example 3. Let $M$ be a complex manifold and let $L \subset TM \otimes \mathbf{C}$ be the sub-bundle of vectors spanned by $\{\partial / \partial z_1, \dots, \partial / \partial z_n\}$ in local holomorphic coordinates. Then $L \to M$ is a (complex) Lie algebroid.


Courant Bracket

We'd like to try to fit the preceding examples into a common framework. Let $\mathbf{T}M = TM \oplus T^\ast M$. This bundle has a natural symmetric bilinear pairing given by
\[ \langle X \oplus \alpha, Y \oplus \beta \rangle = \frac{1}{2} \alpha(Y) + \frac{1}{2} \beta(X) \]
Note that this bilinear form is of split signature $(n,n)$. We define a bracket on sections of $\mathbf{T}M$ by
\[ [X\oplus \alpha, Y\oplus \beta] = [X,Y] \oplus \left(L_X \beta - L_Y \alpha + \frac{1}{2} d(\alpha(Y)) - \frac{1}{2} d(\beta(X)) \right) \]
Note that this bracket is not a Lie bracket. We also have an anchor map $a: \mathbf{T}M \to TM$ which is just the projection.

 Let $B$ be a 2-form on $M$. Define an action of $B$ on sections of $\mathbf TM$ by
\[ X + \alpha \mapsto X + \alpha + i_X B \]

Proposition. This action preserves the Courant bracket if and only if $B$ is closed.

This shows that the symmetries of $M$ as a generalized manifold are larger than the ordinary diffeomorphisms of $M$. In fact the symmetry group is the semidirect product of the diffeomorphism group of $M$ with the vector space of closed 2-forms.
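The proposition can be verified in coordinates. A sketch on $M = \mathbb{R}^3$, with a section $X \oplus \alpha$ encoded as a pair of component lists and a 2-form as an antisymmetric matrix (the helper names and the sample sections are ad hoc choices):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = (x, y, z)

def lie_vec(X, Y):
    # Lie bracket of vector fields, given as component lists.
    return [sum(X[j] * sp.diff(Y[i], coords[j]) - Y[j] * sp.diff(X[i], coords[j])
                for j in range(3)) for i in range(3)]

def lie_form(X, a):
    # Lie derivative of a 1-form along X: (L_X a)_i = X^j d_j a_i + a_j d_i X^j.
    return [sum(X[j] * sp.diff(a[i], coords[j]) + a[j] * sp.diff(X[j], coords[i])
                for j in range(3)) for i in range(3)]

def d(f):
    # Exterior derivative of a function.
    return [sp.diff(f, c) for c in coords]

def ip(X, a):
    # Contraction i_X alpha of a vector with a 1-form.
    return sum(X[i] * a[i] for i in range(3))

def courant(u, v):
    # Courant bracket on sections (X, alpha) of TM + T*M.
    X, a = u
    Y, b = v
    dd = d(ip(X, b) - ip(Y, a))
    form = [lie_form(X, b)[i] - lie_form(Y, a)[i] - sp.Rational(1, 2) * dd[i]
            for i in range(3)]
    return (lie_vec(X, Y), form)

def eB(u, B):
    # B-field action X + alpha -> X + alpha + i_X B (B an antisymmetric matrix).
    X, a = u
    return (X, [a[i] + sum(X[j] * B[j][i] for j in range(3)) for i in range(3)])

u = ([y, 0, 0], [0, 0, x])
v = ([0, z, 0], [x, 0, 0])

B_closed = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]       # dx ^ dy, closed
B_nonclosed = [[0, z, 0], [-z, 0, 0], [0, 0, 0]]    # z dx ^ dy, not closed

def defect(B):
    lhs = courant(eB(u, B), eB(v, B))
    rhs = eB(courant(u, v), B)
    return [sp.simplify(l - r) for l, r in zip(lhs[1], rhs[1])]

print(defect(B_closed))     # [0, 0, 0]
print(defect(B_nonclosed))  # nonzero: the anomaly term is i_Y i_X dB
```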

Dirac Structures

Definition. A Dirac structure on $M$ is a Lagrangian sub-bundle $L \subset \mathbf{T}M$ (with respect to the pairing $\langle \cdot, \cdot \rangle$) which is closed under the Courant bracket.

Theorem (Courant). A Lagrangian sub-bundle $L \subset \mathbf{T} M$ is a Dirac structure if and only if $L \to M$ is a Lie algebroid over $M$, with bracket induced by the Courant bracket and anchor given by projection.

Example 1. $TM \subset \mathbf{T}M$.

Example 2. Take $L$ to be the graph of a Poisson tensor.

Example 3. Take $L$ to be the graph of a closed 2-form.

Admissible Functions

We now let $L \to M$ be a Dirac structure on $M$.

Definition. A smooth function $f$ on $M$ is called admissible if there exists a vector field $X_f$ such that $(X_f, df)$ is a section of $L$.

The Poisson bracket is defined as follows. If $f,g$ are admissible, then define
\[ \{f, g\} = X_f g. \]
It is easy to check from the definitions that the bracket on admissible functions is well-defined (independent of choice of $X_f$) and skew-symmetric. With a little bit of calculation, we find the following.

Proposition. The vector space of admissible functions is naturally a Poisson algebra, and moreover the natural bracket satisfies the Leibniz rule.


Generalized Complex Structures

Definition. A generalized complex structure is a skew endomorphism $J$ of $\mathbf T M$ such that $J^2 = -1$ and such that the $+i$-eigenbundle is involutive under the Courant bracket.

Equivalently: A generalized complex structure is a (complex) Dirac structure $L \subset \mathbf TM$ satisfying the condition $L \cap \overline L = 0$.

Example 1. Let $J$ be an ordinary complex structure on $M$. Then the endomorphism
\[ \begin{bmatrix} -J & 0 \\ 0 & J^\ast \end{bmatrix} \]
defines a generalized complex structure on $M$.

Example 2. Let $\omega$ be a symplectic form on $M$. Then the endomorphism
\[ \begin{bmatrix} 0 & -\omega^{-1} \\ \omega & 0 \end{bmatrix} \]
defines a generalized complex structure on $M$.

Thus, generalized geometry gives a common framework for both complex geometry and symplectic geometry. Such a connection is exactly what is conjectured by mirror symmetry.

Example 3. Let $J$ be a complex structure on $M$ and let $\sigma$ be a holomorphic Poisson tensor. Consider the subbundle $L \subset \mathbf TM$ defined as the span of
\[ \frac{\partial}{\partial \bar z_1}, \dots, \frac{\partial}{\partial \bar z_n}, dz_1 - \sigma(dz_1), \dots, dz_n - \sigma(dz_n) \]
Then $L$ defines a generalized complex structure on $M$.

The last example shows that deformations of $M$ as a generalized  complex manifold contain non-commutative deformations of the structure sheaf.  We also have the following theorem, which shows that there is an intimate relation between generalized complex geometry and holomorphic Poisson geometry.

Theorem (Bailey). Near any point of a generalized complex manifold, $M$ is locally isomorphic to the product of a holomorphic Poisson manifold with a symplectic manifold.


Generalized Kähler Manifolds

Let $(g, J, \omega)$ be a Kähler triple. The Kähler property requires that
\[ \omega = g J. \]
Let $I_1$ denote the generalized complex structure induced by $J$, and let $I_2$ denote the generalized complex structure induced by the symplectic form $\omega$. We have
\[ I_1 I_2 = \begin{bmatrix} - J & 0 \\ 0 & J^\ast \end{bmatrix} \begin{bmatrix} 0 & -\omega^{-1} \\ \omega & 0 \end{bmatrix} = \begin{bmatrix} 0 & g^{-1} \\ g & 0 \end{bmatrix} = I_2 I_1 \]
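This computation, and $I_1^2 = I_2^2 = -1$, can be checked on the flat model $M = \mathbb{R}^2$ with $g$ the identity. A minimal sketch:

```python
import sympy as sp

# Flat R^2: g = identity, J = standard complex structure, omega = g*J.
J = sp.Matrix([[0, -1], [1, 0]])
g = sp.eye(2)
w = g * J
Zero = sp.zeros(2, 2)

def block(A, B, C, D):
    return sp.Matrix(sp.BlockMatrix([[A, B], [C, D]]))

I1 = block(-J, Zero, Zero, J.T)        # from the complex structure (J^* = J^T)
I2 = block(Zero, -w.inv(), w, Zero)    # from the symplectic form

assert I1**2 == -sp.eye(4) and I2**2 == -sp.eye(4)
assert I1 * I2 == I2 * I1 == block(Zero, g.inv(), g, Zero)
```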

Definition. A generalized Kähler manifold is a manifold with two commuting generalized complex structures $I_1, I_2$ such that the bilinear pairing $(I_1 I_2 u, v)$ is positive definite.

Theorem (Gualtieri). A generalized Kähler structure on $M$ induces a Riemannian metric $g$, two integrable almost complex structures $J_\pm$ Hermitian with respect to $g$, and two affine connections $\nabla_\pm$ with skew-torsion $\pm H$ which preserve the metric and complex structure $J_\pm$. Conversely, these data determine a generalized Kähler structure which is unique up to a B-field transformation.

Thus the notion of generalized Kähler manifold recovers the bihermitian geometry investigated by physicists in the context of susy non-linear $\sigma$-models.



Generalized Calabi-Yau Manifolds

Definition. A generalized Calabi-Yau manifold is a manifold $M$ together with a complex-valued differential form $\phi$, which is either purely even or purely odd, which is a pure spinor for the action of $Cl(\mathbf TM)$ and satisfies the non-degeneracy condition $(\phi, \bar \phi) \neq 0$.

Note that (by definition) $\phi$ is pure if its annihilator is a maximal isotropic subspace. Let $L \subset \mathbf TM$ be its annihilator. Then it is not hard to see that $L$ defines a generalized complex structure on $M$, so indeed a generalized Calabi-Yau manifold is in particular a generalized complex manifold.

Example. If $M$ is a complex manifold with a nowhere vanishing holomorphic $(n,0)$ form, then it is generalized Calabi-Yau.

Example. If $M$ is symplectic with symplectic form $\omega$, then $\phi = \exp(i\omega)$ gives $M$ the structure of a generalized Calabi-Yau manifold.

If $(M, \phi)$ is generalized Calabi-Yau, then so is $(M, \exp(B) \phi)$ for any closed real 2-form $B$. In the symplectic case, we obtain
\[ \phi = \exp(B+i\omega) \]
This explains the appearance of the $B$-field (or "complexified Kähler form") in discussions of mirror symmetry.

Thursday, December 13, 2012

The Weyl and Wigner Transforms

Today I'd like to try to understand better how deformation quantization is related to the usual canonical quantization, and especially how the latter might be used to deduce the former, i.e., given an honest quantization (in the sense of operators), how might we reproduce the formula for the Moyal star product?

We'll fix our symplectic manifold once and for all to be \(\mathbb{R}^2\) with its standard symplectic structure, with Darboux coordinates \(x\) and \(p\). Let \(\mathcal{A}\) be the algebra of observables on \(\mathbb{R}^2\). For technical reasons, we'll restrict to those smooth functions that are polynomially bounded in the momentum coordinate (but of course the star product makes sense in general). Let \(\mathcal{D}\) be the algebra of pseudodifferential operators on \(\mathbb{R}\). We want to define a quantization map
\[ \Psi: \mathcal{A} \to \mathcal{D} \]
such that
\[ \Psi(x) = x \in \mathcal{D} \]
\[ \Psi(p) = -i\hbar \partial \]
Out of thin air, let us define
\[ \langle q| \Psi(f) |q' \rangle = \int e^{ik(q-q')} f(\frac{q+q'}{2}, k) dk \]
This is the Weyl transform. Its inverse is the Wigner transform, given by
\[ \Phi(A, q, k) = \int e^{-ikq'} \left\langle q+\frac{q'}{2} \right| A \left| q - \frac{q'}{2} \right\rangle dq' \]
Note: I am (intentionally) ignoring all factors of \(2\pi\) involved. It's not hard to work out what they are, but annoying to keep track of them in calculations, so I won't.

Theorem For suitably well-behaved \(f\), we have \( \Phi(\Psi(f)) = f\).

Proof Using the "ignore \(2\pi\)" conventions, we have the formal identities
\[ \int e^{ikx} dx = \delta(k), \ \ \int e^{ikx} dk = \delta(x). \]
The theorem is a formal result of these:
\begin{align} \Phi(\Psi(f))(q, k) &= \int e^{-ikq'} \left\langle q + \frac{q'}{2} \right| \Psi(f) \left| q - \frac{q'}{2} \right\rangle dq' \\
&= \int e^{-ikq'} e^{ik'q'} f(q, k') dk' dq' \\
&= f(q,k).
\end{align}
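With the $2\pi$'s restored (a factor of $1/2\pi$ on the $k$-integral in the Weyl transform), the composition can be checked symbolically for a concrete symbol, here $f(q,k) = q^2 k$. A sketch, with the formal identity $\int e^{iax}\,dx = 2\pi\,\delta(a)$ inserted by hand for the $q'$-integral:

```python
import sympy as sp

q, k, kp = sp.symbols('q k kprime', real=True)

f = q**2 * k  # a concrete symbol f(q, k)

# The q' integral produces a delta via  int exp(i(k'-k)q') dq' = 2*pi*delta(k'-k);
# the remaining k' integral then collapses onto f(q, k).
delta = 2 * sp.pi * sp.DiracDelta(kp - k)
result = sp.integrate(delta * f.subs(k, kp) / (2 * sp.pi), (kp, -sp.oo, sp.oo))
assert sp.simplify(result - f) == 0
```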

One may easily check that \(\Psi(x) = x\) and \(\Psi(p) = -i\partial\) (setting \(\hbar = 1\)), so this certainly gives a quantization. But why is it particularly natural? To see this, let \(Q\) be the operator of multiplication by \(x\), and let \(P\) be the operator \(-i\partial\). We'd like to take \(f(q,p)\) and replace it by \(f(Q, P)\), but we can't literally substitute like this due to order ambiguity. However, we could work formally as follows:
\begin{align}
f(Q, P) &= \int \delta(Q-q) \delta(P - p) f(q,p) dq dp \\
&= \int e^{ik(Q-q) + iq'(P-p)} f(q,p) dq dq' dp dk.
\end{align}
In this last expression, there is no order ambiguity in the argument of the exponential (since it is a sum and not a product), and furthermore the expression itself make sense since it is the exponential of a skew-adjoint operator. So let's check that this agrees with the Weyl transform. Using a special case of the Baker-Campbell-Hausdorff formula for the Heisenberg algebra, we have
\[ e^{ik(Q-q) + iq'(P-p)} = e^{ik(Q-q)} e^{iq'(P-p)} e^{ikq'/2} \]
Let us compute the matrix element:
\begin{align}
\langle q_1 | P | q_2 \rangle &= \int \langle q_1 | p_1 \rangle
\langle p_1 | P | p_2 \rangle \langle p_2 | q_2 \rangle dp_1 dp_2 \\
&= \int e^{iq_1p_1 - iq_2 p_2} p_2 \delta(p_2 - p_1) dp_1 dp_2 \\
&= \int e^{i p(q_1-q_2)} p dp.
\end{align}
Hence we find that the matrix element for the exponential is
\begin{align} \langle q_1 |e^{ik(Q-q) + iq'(P-p)} | q_2 \rangle
&= e^{ikq'/2 + ik(q_1-q)} \langle q_1 | e^{iq'(P-p)} | q_2 \rangle \\
&=  \int e^{ikq'/2 + ik(q_1-q) -iq'p} e^{iq'p'' + ip''(q_1-q_2)} dp'' \\
&= \delta(q' + q_1 - q_2)  e^{ikq'/2 + ik(q_1-q) -iq'p}
\end{align}
Plugging this back into the expression for \(f(Q, P)\) we find
\begin{align}
 \langle q_1 | f(Q, P) | q_2 \rangle &= \int \delta(q' + q_1 - q_2)  e^{ikq'/2 + ik(q_1-q) -iq'p}
f(q,p) dq dq' dp dk \\
&= \int  e^{ ik(q_1/2 +q_2/2-q) + ip(q_1-q_2)} f(q,p) dq dp dk \\
&= \int e^{ip(q_1-q_2)} f\left(\frac{q_1+q_2}{2}, p\right) dp,
\end{align}
which is the original expression we gave for the Weyl transform.

Saturday, November 24, 2012

The Moyal Product

Today I want to understand the Moyal product, as we will need to understand it in order to construct quantizations of symplectic quotients. (More precisely, to incorporate stability conditions.)



Let \(A\) be the algebra of polynomial functions on \(T^\ast \mathbb{C}^n\). This algebra has a natural Poisson bracket, given by
\[ \{p_i, x_j\} = \delta_{ij}. \]
We would like to define a new associative product \(\ast\) on \(A((\hbar))\) satisfying:

  1. \(f  \ast g = fg + O(\hbar) \)
  2. \(f \ast g - g \ast f = \hbar \{f, g\} + O(\hbar^2)\)
  3. \(1 \ast f = f \ast 1 = f\)
  4. \((f \ast g)^\ast = g^\ast \ast f^\ast\)
In the last line, the map \((\cdot)^\ast\) takes \(x_i \mapsto x_i\) and \(p_i \mapsto -p_i\). To figure out what this new product should be, let's take \(f,g \in A\) and expand \(f \ast g\) in power series:
\[ f \ast g = \sum_{n=0}^\infty c_n(f,g) \hbar^n \]
Now, equations (1) and (2) will be satisfied by taking \(c_0(f,g) = fg\) and \(c_1(f,g) = \{f,g\}/2\). Let \(\sigma\) be the Poisson bivector defining the Poisson bracket. This defines a differential operator \(\Pi\) on \(A \otimes A\) by
\[ \Pi = \sigma^{ij} (\partial_i \otimes \partial_j) \]
Let \(B = \sum_{n=0}^\infty B_n \hbar^n\) and write the product as

\[ f \ast g = m \circ B(f \otimes g). \]
Now, conditions (1) and (2) tell us that \(B(0) = 1\) and that
\[ \left. \frac{dB}{d\hbar} \right|_{\hbar=0} = \frac{\Pi}{2} \]
So
\[ B = 1 + \frac{\hbar \Pi}{2} + O(\hbar^2) \]
It is natural to guess that \(B\) should be built out of powers of \(\Pi\), and a natural guess is
\[ B = \exp(\frac{\hbar \Pi}{2}), \]
which certainly reproduces the first two terms of our expansion. Let's see that this choice actually works, i.e. defines an associative \(\ast\)-product. Let \(m: A \otimes A \to A\) be the multiplication, and
\(m_{12}, m_{23}: A \otimes A \otimes A \to A \otimes A\), \(m_{123}: A \otimes A \otimes A \to A\) the induced multiplication maps. Then
\begin{align}
f \ast (g \ast h) &= m \circ(B( f \otimes m \circ B(g \otimes h) ) ) \\
&= m \circ B( m_{23} \circ (1 \otimes B)(f \otimes g \otimes h) ) \\
&= m_{123} (B \otimes 1)(1 \otimes B)(f \otimes g \otimes h)
\end{align}
On the other hand, we have

\begin{align}
(f \ast g) \ast h &= m \circ B( (m \circ B(f \otimes g)) \otimes h ) \\
&= m \circ B( m_{12} \circ (B \otimes 1)(f \otimes g \otimes h) ) \\
&= m_{123} (1 \otimes B)(B \otimes 1)(f \otimes g \otimes h)
\end{align}

Hence, associativity is the condition
\[ m_{123} \circ [1\otimes B, B \otimes 1] = 0. \]

On \(A \otimes A \otimes A\), write \(\partial_i^1\) for the partial derivative acting on the first factor, \(\partial_i^2\) on the second, etc. Then
\[ 1 \otimes B = \sum_n \frac {\hbar^n}{2^n n!}
 \Pi^{i_1 j_1} \cdots \Pi^{i_n j_n} \partial^2_{i_1} \partial^3_{j_1} \cdots
\partial^2_{i_n} \partial^3_{j_n} \]
and similarly for \(B \otimes 1\), with \(\partial^1, \partial^2\) in place of \(\partial^2, \partial^3\). So we have
\begin{align}
m_{123} (B\otimes 1)(1 \otimes B) &= \sum_n \sum_{k=0}^n \frac {\hbar^n}{2^n k! (n-k)!} m_{123}\,
\Pi^{k_1 l_1} \cdots \Pi^{k_k l_k} \partial^1_{k_1} \partial^2_{l_1} \cdots
\partial^1_{k_k} \partial^2_{l_k} \\
& \ \times  \Pi^{i_1 j_1} \cdots \Pi^{i_{n-k} j_{n-k}} \partial^2_{i_1} \partial^3_{j_1} \cdots
\partial^2_{i_{n-k}} \partial^3_{j_{n-k}} \\
&= m_{123}(1 \otimes B)(B \otimes 1),
\end{align}
since all the constant-coefficient differential operators involved commute.
Hence we obtain an associative \(\ast\)-product. This is called the Moyal product.
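For one degree of freedom this is easy to implement on polynomials, where the exponential series terminates. A minimal sketch (the helper names and the truncation order `N` are ad hoc; conventions as above, with \(\{p, x\} = 1\)):

```python
import sympy as sp
from math import comb, factorial

x, p, hbar = sp.symbols('x p hbar')

def pb(f, g):
    # Poisson bracket with {p, x} = 1:  {f, g} = f_p g_x - f_x g_p.
    return sp.diff(f, p) * sp.diff(g, x) - sp.diff(f, x) * sp.diff(g, p)

def Pi_n(f, g, n):
    # n-th power of the bidifferential operator Pi = d_p (x) d_x - d_x (x) d_p.
    return sum((-1)**j * comb(n, j)
               * sp.diff(f, p, n - j, x, j) * sp.diff(g, x, n - j, p, j)
               for j in range(n + 1))

def star(f, g, N=10):
    # f * g = m o exp(hbar Pi / 2)(f (x) g); the sum terminates on polynomials.
    return sp.expand(sum((hbar / 2)**n / factorial(n) * Pi_n(f, g, n)
                         for n in range(N + 1)))

# Canonical commutator: p * x - x * p = hbar {p, x} = hbar.
assert sp.expand(star(p, x) - star(x, p) - hbar) == 0

# Associativity on a sample of polynomials.
f1, g1, h1 = x**2 * p, p**2, x * p
assert sp.expand(star(f1, star(g1, h1)) - star(star(f1, g1), h1)) == 0

# Semiclassical limit: f * g - g * f = hbar {f, g} + O(hbar^2).
assert sp.expand((star(f1, g1) - star(g1, f1)).coeff(hbar, 1) - pb(f1, g1)) == 0
```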


Sheafifying the Construction


Now suppose that \(U\) is a (Zariski) open subset of \(X = T^\ast \mathbb{C}^n\). Then the star product induces a well-defined map
\[ \ast: O_X(U)((\hbar)) \otimes_\mathbb{C} O_X(U)((\hbar)) \to O_X(U)((\hbar)) \]
In this way we obtain a sheaf \(\mathcal{D}\) of \(O_X\)-modules with a non-commutative \(\ast\)-product defined as above.

Define a \(\mathbb{C}^\ast\) action on \(T^\ast \mathbb{C}^n\) by acting trivially on \(x_i\) and with weight 1 on \(p_i\) (scaling the fibers). Extend this to an action on \(\mathcal{D}\) by acting on \(\hbar\) with weight 1; since the bidifferential operator \(\Pi\) has weight \(-1\), the star product is then equivariant.

Proposition: The algebra of \(\mathbb{C}^\ast\)-invariant global sections of \(\mathcal{D}\) is naturally identified with the algebra of differential operators on \(\mathbb{C}^n\).

Proof: The \(\mathbb{C}^\ast\)-invariant global sections are generated by \(x_i\) and \(\hbar^{-1} p_i\). So define a map \(\Gamma(\mathcal{D})^{\mathbb{C}^\ast} \to \mathbb{D}\) by
\[ x_i \mapsto x_i \]
\[ \hbar^{-1} p_i \mapsto \partial_i \]
From the definition of the star product, it is clear that this is an algebra map (note \(\hbar^{-1} p_i \ast x_j - x_j \ast \hbar^{-1} p_i = \delta_{ij}\)), and that it is both injective and surjective.

Thursday, November 22, 2012

An Exercise in Quantum Hamiltonian Reduction

Semiclassical Setup

Let the group \(GL(2)\) act on \(V = \mathrm{Mat}_{2\times n}\) and consider the induced symplectic action on \(T^\ast V\). If we use variables \((x,p)\) with \(x\) a \(2 \times n\) matrix and \(p\) an \(n \times 2\) matrix, then the classical moment map \(\mu\) is given by
\[ \mu(x,p) = xp \]
This is equivariant with respect to the adjoint action, so we can form the \(GL(2)\)-invariant functions
\[ Z_1 = \mathrm{Tr} \mu \]
\[ Z_2 = \mathrm{Tr} (\mu^2) \]
If we think of \(x\) as being made of column vectors
\[ x = ( x_1 \cdots x_n ) \]
and similarly think of \(p\) as being made of row vectors, then there are actually many more \(GL(2)\) invariants, given by
\[ f_{ij} = \mathrm{Tr} x_i p_j = p_j x_i \]
In terms of the invariants, the \(Z\) functions are
\[ Z_1 = \sum_k f_{kk} \]
\[ Z_2 = \sum_{jk} f_{jk} f_{kj} \]
Let us compute Poisson brackets:
\begin{align}
 \{f_{ij}, f_{kl}\} &= \{p_j^\mu x_i^\mu, p_l^\nu x_k^\nu\} \\
&= x_i^\mu p_l^\nu \delta_{jk} \delta^{\mu\nu} - p_j^\mu x_k^\nu \delta_{il} \delta^{\mu\nu} \\
&= f_{il} \delta_{jk} - f_{kj} \delta_{il}.
\end{align}
So we see that the invariants form a Poisson subalgebra (as they should!). Let's compute:
\begin{align}
\{Z_1, f_{ij}\} &= \sum_k \{ f_{kk}, f_{ij} \} \\
&= \sum_k \left( f_{kj} \delta_{ki} - f_{ik} \delta_{kj} \right) \\
&= f_{ij} - f_{ij} = 0.
\end{align}
Hence \(Z_1\) is central with respect to the invariant functions \(f_{ij}\). Similarly,
\begin{align}
\{Z_2, f_{kl}\} &= \sum_{ij} \{f_{ij} f_{ji}, f_{kl}\} \\
&= \sum_{ij} f_{ij} \left(f_{jl} \delta_{ik} - f_{ki} \delta_{jl} \right) + f_{ji} \left(f_{il} \delta_{jk} - f_{kj} \delta_{il} \right) \\
&= \sum_j f_{kj} f_{jl} - \sum_i f_{il} f_{ki} + \sum_i f_{ki} f_{il} - \sum_j f_{jl} f_{kj} \\
&= 0.
\end{align}
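These bracket identities can be checked symbolically in canonical coordinates. A sketch with \(n = 3\) and \(GL(2)\) (the symbol names are ad hoc):

```python
import sympy as sp

n, m = 3, 2  # n vectors, GL(m) = GL(2) acting

xs = sp.Matrix(m, n, lambda a, i: sp.Symbol(f'x{a}{i}'))  # x: 2 x n
ps = sp.Matrix(n, m, lambda i, a: sp.Symbol(f'p{i}{a}'))  # p: n x 2

def pbr(F, G):
    # Canonical bracket with {p_i^mu, x_j^nu} = delta_ij delta^{mu nu}.
    out = 0
    for i in range(n):
        for a in range(m):
            out += sp.diff(F, ps[i, a]) * sp.diff(G, xs[a, i]) \
                 - sp.diff(F, xs[a, i]) * sp.diff(G, ps[i, a])
    return sp.expand(out)

f = lambda i, j: (ps.row(j) * xs.col(i))[0, 0]   # f_ij = p_j x_i
d = sp.KroneckerDelta

# {f_ij, f_kl} = f_il d_jk - f_kj d_il on a sample of index choices.
for (i, j, k, l) in [(0, 1, 1, 2), (0, 0, 0, 1), (1, 2, 2, 1)]:
    lhs = pbr(f(i, j), f(k, l))
    rhs = sp.expand(f(i, l) * d(j, k) - f(k, j) * d(i, l))
    assert sp.expand(lhs - rhs) == 0

# Z_1 and Z_2 are central among the invariants.
Z1 = sum(f(i, i) for i in range(n))
Z2 = sp.expand(sum(f(i, j) * f(j, i) for i in range(n) for j in range(n)))
assert all(pbr(Z1, f(i, j)) == 0 and pbr(Z2, f(i, j)) == 0
           for i in range(n) for j in range(n))
```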
So we see that the \(Z_i\) are in the center of the invariant algebra. In fact, they generate it, so we'll denote by \(Z\) the algebra generated by \(Z_1, Z_2\). Let \(A\) be the algebra generated by the \(f_{ij}\). The inclusion \(Z \hookrightarrow A\) can be thought of as a purely algebraic version of the moment map. In particular, given any character \(\lambda: Z \to \mathbb{C}\), we can define the Hamiltonian reduction of \(A\) to be
\[ A_\lambda := A / A\langle \ker \lambda \rangle \]
The corresponding space is of course \(\mathrm{Spec}\, A_\lambda\).


The Cartan Algebra and the Center

Define functions

\[ h_1 = Z_1 = \sum_i f_{ii} \]
\[ h_2 = Z_2 = \sum_{ij} f_{ij} f_{ji} \]
\[ h_3 = \sum_{ijk} f_{ij} f_{jk} f_{ki} \]
\[ h_k = \sum_{i_1, i_2, \ldots, i_k} f_{i_1 i_2} f_{i_2 i_3} \cdots f_{i_k i_1} \]

These are just the traces of various powers of the \(n \times n\) matrix \(px\). In particular, \(h_k\) for \(k>n\) may be expressed as a function of the \(h_i\) for \(i \leq n\). The algebra \(H\) generated by the \(h_k\) plays the role of a Cartan subalgebra. So we have inclusions
\[ Z \subset H \subset A \]

Quantization

Now we wish to construct a quantization of \(A\) and \(A_\lambda\). The quantization of \(A\) is obvious: we quantize \(T^\ast V\) by taking the algebraic differential operators on \(V\). Denote this algebra by \(\mathbb{D}\). It is generated by \(x_i\) and \(\partial_i\) satisfying the relation
\[ [\partial_i, x_j] = \delta_{ij} \]
Then we simply take the subalgebra of \(GL(2)\)-invariant differential operators as our quantization of \(A\). Call this subalgebra \(U\). We can define Hamiltonian reduction analogously by taking central quotients. So we need to understand the center \(Z(U)\), but this is just the subalgebra generated by quantizations of \(Z_1\) and \(Z_2\), i.e. the subalgebra of all elements whose associated graded lies in \(Z(A)\).

More to come: stability conditions, \(\mathbb{D}\)-affineness, and maybe proofs of some of my claims.

Thursday, July 26, 2012

Generating Functions

Method of Generating Functions


Let \(X\) and \(Y\) be two smooth manifolds, and let \(M = T^\ast X, N = T^\ast Y\) with corresponding symplectic forms \(\omega_M\) and \(\omega_N\).

Question: How can we produce symplectomorphisms \(\phi: M \to N\)?

The most important construction from classical mechanics is the method of generating functions. I will outline this method, shamelessly stolen from Ana Cannas da Silva's lecture notes.

Suppose we have a smooth function \(f \in C^\infty(X \times Y)\). Then the graph of its differential \(df\) is a submanifold of \(M \times N\): \( \Gamma = \{ (x,y, df_{x,y}) \in M \times N \}\). Since \(M \times N\) is a product, we have projections \(\pi_M, \pi_N\), and this allows us to split \(df\) into its \(X\)- and \(Y\)-components and write the graph as
\[ \Gamma = \{ (x, y, d_x f, d_y f) \}\]
Now there is a not-so-obvious trick: we consider the twisted graph \(\Gamma^\sigma\) given by
\[ \Gamma^\sigma =  \{(x,y, d_x f, -d_y f) \} \]
Note the minus sign.

Proposition If \(\Gamma^\sigma\) is the graph of a diffeomorphism \(\phi: M \to N\), then \(\phi\) is a symplectomorphism.

Proof By construction, \(\Gamma^\sigma\) is a Lagrangian submanifold of \(M \times N\) with respect to the twisted symplectic form \(\pi_M^\ast \omega_M - \pi_N^\ast \omega_N\). It is a standard fact that a diffeomorphism is a symplectomorphism iff its graph is Lagrangian with respect to the twisted symplectic form, so we're done.

Now we have:

Modified question: Given \(f \in C^\infty(X \times Y)\), when is its twisted graph \(\Gamma^\sigma\) the graph of a diffeomorphism \(\phi: M \to N\)?

Pick coordinates \(x\) on \(X\) and \(y\) on \(Y\), with corresponding momenta \(\xi\) and \(\eta\). Then if \(\phi(x,\xi) = (y,\eta)\), we obtain
\[ \xi = d_x f, \ \eta = -d_y f \]
Note the similarity to Hamilton's equations. By the implicit function theorem, we can construct a (local) diffeomorphism \(\phi\) as long as \(f\) is sufficiently non-degenerate.
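For a concrete generating function of the first type this is easy to check symbolically. A sketch with the arbitrary choice \(f(x, y) = xy + y^3/3\), using the fact that in two dimensions a map is a symplectomorphism iff its Jacobian determinant is 1:

```python
import sympy as sp

x, xi, y = sp.symbols('x xi y', real=True)

# Generating function of type f(x, y); the induced map solves xi = f_x, eta = -f_y.
f = x * y + y**3 / 3

y_of = sp.solve(sp.Eq(xi, sp.diff(f, x)), y)[0]   # xi = f_x  =>  y = xi
eta_of = (-sp.diff(f, y)).subs(y, y_of)           # eta = -f_y = -x - xi**2

# Area preservation: Jacobian determinant of (x, xi) -> (y, eta) equals 1.
Jac = sp.Matrix([[sp.diff(y_of, x), sp.diff(y_of, xi)],
                 [sp.diff(eta_of, x), sp.diff(eta_of, xi)]])
assert sp.simplify(Jac.det() - 1) == 0
```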

Different Types of Generating Functions

We now concentrate on the special case of \(M = T^\ast \mathbb{R} = \mathbb{R} \times \mathbb{R}^\ast\). Note that this is a cotangent bundle in two ways: \(T^\ast \mathbb{R} \cong T^\ast \mathbb{R}^\ast\). Hence we can construct local diffeomorphisms \(T^\ast \mathbb{R} \to T^\ast \mathbb{R}\) in four ways, by taking functions of the forms
\[ f(x_1, x_2), \ f(x_1, p_2), \ f(p_1, x_2), \ f(p_1, p_2) \]

Origins from the Action Principle, and Hamilton-Jacobi

Suppose that we have two actions
\[ S_1 = \int (p_1 \dot{q}_1 - H_1) dt, \ S_2 = \int (p_2 \dot{q}_2 - H_2) dt \]
which give rise to the same dynamics. Then the Lagrangians must differ by a total derivative, i.e.
\[ p_1 \dot{q}_1 - H_1 = p_2 \dot{q}_2 - H_2  + \frac{d f}{dt} \]
Suppose that \(f = -q_2 p_2 + g(q_1, p_2, t)\). Then we have
\[ p_1 \dot{q}_1 - H_1 = -q_2 \dot{p}_2 - H_2 + \frac{\partial g}{\partial t} + \frac{\partial g}{\partial q_1}\dot{q}_1 + \frac{\partial g}{\partial p_2} \dot{p_2} \]
Comparing coefficients, we find
\[ p_1 = \frac{\partial g}{\partial q_1}, \ q_2 = \frac{\partial g}{\partial p_2}, \ H_2 = H_1 + \frac{\partial g}{\partial t} \]

Now suppose that the coordinates \((q_2, p_2)\) are chosen so that Hamilton's equations become
\[ \dot{q}_2 = 0, \ \dot{p}_2 = 0 \]
Then \(H_2\) is constant, and we may take \(H_2 = 0\), i.e.
\[ H_1 + \frac{\partial g}{\partial t} = 0 \]
Moreover, since \(p_2\) is constant along the motion, it enters \(g\) only as a parameter, so we may write \(g = g(q_1, t)\). Since \(p_1 = \partial g / \partial q_1\), we obtain
\[ \frac{\partial g}{\partial t} + H_1(q_1, \frac{\partial g}{\partial q_1}) = 0 \]
This is the Hamilton-Jacobi equation, usually written as
\[ \frac{\partial S}{\partial t} + H(x, \frac{\partial S}{\partial x}) = 0 \]
Note the similarity to the Schrödinger equation! In fact, one can derive the Hamilton-Jacobi equation from the Schrödinger equation by taking a wavefunction of the form
\[ \psi(x,t) = A(x,t) \exp({\frac{i}{\hbar} S(x,t)}) \]
and expanding in powers of \(\hbar\). This also helps to motivate the path integral formulation of quantum theory.
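To make the Hamilton-Jacobi equation concrete, here is a quick symbolic check (my own example): for a free particle with \(H = p^2/2\) and basepoint \(x_0 = 0\), the classical action is \(S(x,t) = x^2/2t\), and it satisfies the equation exactly:

```python
import sympy as sp

x, t = sp.symbols('x t', positive=True)
S = x**2 / (2*t)  # free-particle action, H = p^2/2, basepoint x0 = 0

# Hamilton-Jacobi: dS/dt + H(x, dS/dx) = 0
residual = sp.diff(S, t) + sp.Rational(1, 2) * sp.diff(S, x)**2
print(sp.simplify(residual))  # 0
```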

Thursday, September 1, 2011

Classical Mechanics 6: Poisson brackets and the Heisenberg picture

Last time we saw that a classical mechanical system which has a Lagrangian formulation can (under some mild assumptions) be repackaged as a symplectic manifold \((M, \omega)\) together with a smooth function \(H\) called the Hamiltonian. The equations of motion then become
\[ \dot{x} = X_H \]
where \(X_H\) is the Hamiltonian vector field associated with \(H\) (sometimes called the symplectic gradient of \(H\)). This identifies solutions to the equations of motion with certain curves in the manifold \(M\), which together form a 1-parameter group of diffeomorphisms (in fact, symplectomorphisms) of \(M\).

Today, I'd like to discuss a dual formulation. Instead of thinking of the equations of motion as describing evolution of the points of \(M\), we will instead think of the equations of motion as describing evolution of the functions on \(M\). We will see later that this is the classical analog of the Heisenberg picture in quantum mechanics.

Definition An observable on \(M\) is a smooth real-valued function on \(M\).

Suppose we have a classical mechanical system \(M, \omega, H\). By integrating the equations of motion, we obtain a 1-parameter family of symplectomorphisms \(\phi_t\). For any point \(x \in M\), the curve \(x(t)\) defined by
\[ x(t) = \phi_t(x) \]
solves Hamilton's equations.

Now suppose we have some observable \(f\). Its value along any solution to the equations of motion is
\[ f(x(t)) = f(\phi_t(x)) = f_t(x) \]
where \(f_t := f \circ \phi_t\). So, if we only care about the values of observables along any solution to the equations of motion, the transformation
\[ x \mapsto x(t) = \phi_t(x) \]
\[ f \mapsto f \]
which is the Schrodinger picture, is indistinguishable from the transformation
\[ x \mapsto x \]
\[ f \mapsto f_t = f \circ \phi_t \]
which we will call the Heisenberg picture. What is the analog of Hamilton's equations for the Heisenberg picture? Let us compute:
\[ \frac{d}{dt} f_t = \frac{d}{dt}f(\phi_t) = df\left(\frac{d}{dt}\phi_t\right) \]
Since \(\phi_t\) is generated by the vector field \(X_H\), we obtain
\[ \frac{d}{dt} f_t = X_H(f) = df(X_H) = \omega^{-1}(dH, df) \]
For two observables \(f\) and \(g\), let us define the Poisson bracket of \(f\) and \(g\) as
\[ \{f, g\} := \omega^{-1}(df, dg). \]
Then we have
\[ \frac{d}{dt} f_t = \{H, f_t \} \]
which is called the Heisenberg equation of motion.
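In Darboux coordinates \((q, p)\) with \(\omega = dq \wedge dp\), the bracket above works out to \(\{f, g\} = \frac{\partial f}{\partial p}\frac{\partial g}{\partial q} - \frac{\partial f}{\partial q}\frac{\partial g}{\partial p}\) in the sign convention used here (chosen so that \(\dot{f} = \{H, f\}\) reproduces Hamilton's equations). A quick symbolic check, using the harmonic oscillator as an illustrative choice of Hamiltonian:

```python
import sympy as sp

q, p = sp.symbols('q p')

def pb(f, g):
    # Coordinate formula for {f, g} = omega^{-1}(df, dg) with omega = dq ^ dp,
    # in the sign convention for which d/dt f = {H, f} gives Hamilton's equations.
    return sp.diff(f, p) * sp.diff(g, q) - sp.diff(f, q) * sp.diff(g, p)

H = (p**2 + q**2) / 2  # harmonic oscillator (illustrative choice)

print(pb(H, q))  # p   (matches qdot =  dH/dp)
print(pb(H, p))  # -q  (matches pdot = -dH/dq)
```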

Theorem The Poisson bracket \(\{\cdot, \cdot\}\) satisfies the following properties:
1. It is \(\mathbb{R}\)-linear and skew-symmetric.
2. It satisfies the Jacobi identity.
3. It satisfies the Leibniz rule \(\{fg,h\} = f\{g,h\} + \{f,h\}g\).

Proof Too lazy for now! But it's really easy.

Now let \(\mathscr{A} = C^\infty(M, \mathbb{R})\) be the commutative algebra of observables. This has an additional structure: the Poisson bracket \(\{,\}\), so we will call \(\mathscr{A}\) a Poisson algebra.

Now let us consider something completely crazy. Consider the following generalization of mechanical system.

Tentative Definition A mechanical system is an algebra \(\mathscr{A}\) together with a Poisson bracket \(\{\cdot,\cdot\}\) on \(\mathscr{A}\) and an element \(H \in \mathscr{A}\) called the Hamiltonian. The Heisenberg equations of motion are
\[ \frac{d}{dt} f_t = \{H, f_t\} \]
for any \(f \in \mathscr{A}\).

This definition is a little too vague at the moment, since without specifying a topology on \(\mathscr{A}\) we have no way of making sense of the Heisenberg equation. However, up to this caveat, this definition of mechanical system captures the essence of all of classical mechanics, classical field theory, quantum mechanics, and quantum field theory!

Thursday, August 25, 2011

Classical Mechanics 5: Symplectic structures

As we saw in the previous post, the equations of motion for a mechanical system can be cast into a 1st order form called Hamilton's equations, which are naturally interpreted as describing a path in the phase space \(T^\ast M\) associated to the configuration space \(M\). Let us investigate the geometry of \(T^\ast M\) to see why Hamilton's equations are so nice.

Definition The canonical (or sometimes tautological) 1-form on the cotangent bundle \(T^\ast M\) is the 1-form \(\theta\) defined by
\[ \theta_{(q,p)}(X) = p(\pi_\ast X), \]
where \(\pi_\ast\) is the pushforward induced by the natural projection \(\pi: T^\ast M \to M\). In other words, the form is defined by
\[ \theta_{(q,p)} = \pi^\ast p. \]

Definition The canonical symplectic form on the cotangent bundle \(T^\ast M\) is the 2-form \(\omega\) defined by
\[ \omega = -d\theta. \]

Let \(\omega_\flat: T M \to T^\ast M\) be the map given by \(X \mapsto \iota(X)\omega\).

Proposition The canonical symplectic form satisfies the following two conditions:
1. It is closed, i.e. \(d\omega = 0\).
2. It is nondegenerate, i.e. the map \(\omega_\flat\) is invertible with inverse \(\omega^\sharp: T^\ast M \to TM\).

Proof The first property follows from \(d^2 = 0\). To prove the second, suppose we have local coordinates \(q^i\) on \(M\) with cotangent coordinates \(p^i\). Then it is easily seen that
\[ \theta = p^i dq^i, \]
so that
\[ \omega = dq^i \wedge dp^i, \]
from which nondegeneracy is obvious.

Definition Any 2-form on a manifold \(N\) (not necessarily a cotangent bundle) which satisfies the above two properties will be called symplectic. A pair \((N, \omega)\) will be called a symplectic manifold if \(\omega\) is a symplectic 2-form on \(N\).

Definition Given a function \(H\) on a symplectic manifold \((N, \omega)\), the Hamiltonian vector field associated to \(H\) is the vector field \(X_H\) uniquely defined by
\[ dH = \omega_\flat X_H. \]

Proposition For \(N = T^\ast M\) a cotangent bundle with the canonical symplectic form, Hamilton's equations with respect to a Hamiltonian function \(H\) describe the flow of the vector field \(X_H\).

Proof Again pick local coordinates \(q\) and \(p\). Then the inverse map \(\omega^\sharp\) is given by
\[ dq \mapsto -\frac{\partial}{\partial p} \]
\[ dp \mapsto \frac{\partial}{\partial q} \]
Since
\[ dH = \frac{\partial H}{\partial q} dq + \frac{\partial H}{\partial p} dp, \]
we see that
\[ X_H = \frac{\partial H}{\partial p} \frac{\partial}{\partial q} - \frac{\partial H}{\partial q} \frac{\partial}{\partial p} \]
But then the equation describing the flow of \(X_H\) is (in components)
\[ \dot{q} = \frac{\partial H}{\partial p} \]
\[ \dot{p} = -\frac{\partial H}{\partial q} \]
which are exactly Hamilton's equations.
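The proof above can be checked symbolically. In the basis \((\partial/\partial q, \partial/\partial p)\), write \(\Omega_{ij} = \omega(e_i, e_j)\); then \(\omega_\flat X\) has components \(X^i \Omega_{ij}\), and the claimed \(X_H\) indeed satisfies \(dH = \omega_\flat X_H\) (my own verification sketch, for an arbitrary Hamiltonian):

```python
import sympy as sp

q, p = sp.symbols('q p')
H = sp.Function('H')(q, p)

# omega = dq ^ dp: Omega[i, j] = omega(e_i, e_j) in the basis (d/dq, d/dp)
Omega = sp.Matrix([[0, 1], [-1, 0]])

# Components of X_H as derived in the proof
X_H = sp.Matrix([sp.diff(H, p), -sp.diff(H, q)])

# (omega_flat X)_j = X^i Omega_ij, i.e. Omega.T * X as covector components
flat_XH = Omega.T * X_H
dH = sp.Matrix([sp.diff(H, q), sp.diff(H, p)])
print(sp.simplify(flat_XH - dH))  # Matrix([[0], [0]])
```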

Tuesday, August 23, 2011

Classical Mechanics 4: Hamilton's Equations

Recall from last time that a classical mechanical system consists of a manifold \(M\) (the configuration space) and a function \(L\) on the tangent bundle \(TM\). The equations of motion for a path \(x(t)\) in \(M\) are the 2nd order Euler-Lagrange equations:
\[ \frac{d}{dt}\left( \frac{\partial L}{\partial v}(x, \dot{x}) \right) = \frac{\partial L}{\partial x}(x, \dot{x}) \]

Hamilton discovered a way of recasting these equations as a first order system for a path in a related manifold, the cotangent bundle \(T^\ast M\). The benefits are twofold: in addition to reducing the equations to a first order system (at the cost of introducing new variables), it turns out that this framework makes it much easier to find conserved quantities and prove theorems about mechanical systems. So let's see what he did.

Theorem Under mild assumptions on \(L\), there is a function \(H\) on \(T^\ast M\) constructed canonically out of \(L\) such that solutions of the Euler-Lagrange equations are in 1-1 correspondence with solutions \(q(t), p(t)\) on \(T^\ast M\) of Hamilton's equations
\[ \dot{q} = \frac{\partial H}{\partial p}(q, p) \]
\[ \dot{p} = -\frac{\partial H}{\partial q}(q,p) \]
Furthermore, if the original Lagrangian function \(L\) is not explicitly time-dependent, then the function \(H\) is constant for any solution of the equations of motion.

To start with, introduce coordinates \(x, v\) on \(TM\) as before. We will define new coordinates \(q,p\) as follows:
\[ q(x,v) = x \]
\[ p(x,v) = \frac{\partial L}{\partial v}(x,v) \]
Our assumption on \(L\) will be the following: the above formulas can be inverted to obtain \(x\) and \(v\) as functions of \(q\) and \(p\). It is easily seen from the definition of \(p\) that under a coordinate transformation (of \(x\)), it behaves as a 1-form, so the coordinates \(q\) and \(p\) can be interpreted as local coordinates on the cotangent bundle \(T^\ast M\). We construct the Hamiltonian as
\[ H(q,p) = p\cdot v(q,p) - L(x(q,p), v(q,p)) \]
(this is the Legendre transform--more later). Of course, the above formula is not well-defined if we cannot solve for \(x\) and \(v\) in terms of \(p\) and \(q\)--hence the assumption. Now we check:
\[ \frac{\partial H}{\partial p} = v + p \frac{\partial v}{\partial p} - \frac{\partial L}{\partial x} \frac{\partial x}{\partial p} -\frac{\partial L}{\partial v}\frac{\partial v}{\partial p} = v + p\frac{\partial v}{\partial p} - p\frac{\partial v}{\partial p} = v\]
(here \(\partial x / \partial p = 0\) since \(x(q,p) = q\), and \(\partial L/\partial v = p\) by definition).
Since \(\dot{x} = v\), this is the first of Hamilton's equations.

For the second, we perform a similar computation:
\[ \frac{\partial H}{\partial q} = p\frac{\partial v}{\partial q} - \frac{\partial L}{\partial x} \frac{\partial x}{\partial q} - \frac{\partial L}{\partial v} \frac{\partial v}{\partial q} = -\frac{\partial L}{\partial x} \]
But the Euler-Lagrange equations say that
\[ \dot{p} = \frac{d}{dt}\frac{\partial L}{\partial v} = \frac{\partial L}{\partial x} = -\frac{\partial H}{\partial q}, \]
so we've obtained the second of Hamilton's equations.

For the last part, suppose that the Lagrangian does not depend explicitly on time, i.e.
\[ \frac{\partial L}{\partial t} = 0. \]
Then we compute:
\[ \frac{d}{dt}H = \frac{\partial H}{\partial q}\dot{q} + \frac{\partial H}{\partial p}\dot{p} = \frac{\partial H}{\partial q} \frac{\partial H}{\partial p} - \frac{\partial H}{\partial p} \frac{\partial H}{\partial q} = 0. \]
Hence \(H\) is automatically conserved. For this reason, \(H\) is often called the energy of the system.
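Energy conservation can also be observed numerically. Below is a small sketch of my own (not from the post) integrating the harmonic oscillator \(H = (p^2 + q^2)/2\) with the leapfrog scheme, which respects the symplectic structure and so keeps \(H\) nearly constant over long times:

```python
# Leapfrog (kick-drift-kick) integration of Hamilton's equations for the
# harmonic oscillator H = (p^2 + q^2)/2 -- an illustrative choice of system.
def leapfrog(q, p, dt, steps):
    for _ in range(steps):
        p -= 0.5 * dt * q   # half kick:  pdot = -dH/dq = -q
        q += dt * p         # full drift: qdot =  dH/dp =  p
        p -= 0.5 * dt * q   # half kick
    return q, p

q0, p0 = 1.0, 0.0
H0 = 0.5 * (p0**2 + q0**2)
q1, p1 = leapfrog(q0, p0, dt=1e-3, steps=10_000)
H1 = 0.5 * (p1**2 + q1**2)
print(abs(H1 - H0) < 1e-6)  # True: H is conserved along the flow
```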

Later, we will see that conserved quantities correspond to symmetries, and conservation of energy is a statement about the symmetry corresponding to time translation.

Friday, August 19, 2011

Classical Mechanics 3: Hamilton's action principle.

We saw before that Newton's 2nd law can be written in a more general form as
\[ \frac{d}{dt} \frac{\partial L}{\partial v}(x, \dot{x}) = \frac{\partial L}{\partial x}(x, \dot{x}), \]
known as the Euler-Lagrange equations. Hamilton discovered a principle that explains the origin of these equations. Consider some path of the system given by a curve \(\gamma\), i.e.
\[ x(t) = \gamma(t) \]
\[ \dot{x}(t) = \frac{d}{dt}\gamma(t) \]
Then we may define a quantity associated with the path \(\gamma\):
\[ S = \int L(\gamma, \dot{\gamma})dt \]
called the action. Hamilton discovered the following.

Theorem The path taken by a mechanical system is one which extremizes the action.

To prove this, suppose we perturb the path a small amount, while leaving the endpoints fixed, i.e. \(\gamma \mapsto \gamma + \epsilon (\delta\gamma)\) with \(\epsilon > 0\) small and \(\delta\gamma\) a path that is \(0\) at its endpoints. Then
\[ L(\gamma + \epsilon\delta\gamma, \dot\gamma + \epsilon\delta\dot{\gamma}) = L(\gamma, \dot{\gamma}) + \epsilon \frac{\partial L}{\partial x}\delta\gamma + \epsilon \frac{\partial L}{\partial v} \delta\dot\gamma + O(\epsilon^2) \]
Thus
\[ S[\gamma + \epsilon\delta\gamma] = S[\gamma] + \epsilon \int \frac{\partial L}{\partial x} \delta \gamma dt + \epsilon \int \frac{\partial L}{\partial v} \delta \dot\gamma dt + O(\epsilon^2) \]
Integrating by parts, and using the fact that \(\delta\gamma\) is \(0\) on the endpoints, we have
\[ \int \frac{\partial L}{\partial v}\delta\dot\gamma dt = -\int \frac{d}{dt} \frac{\partial L}{\partial v} \delta\gamma dt \]

Combining the above, we have
\[ \frac{\delta S}{\delta \gamma}(\delta \gamma) = \int \left(\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial v} \right) \delta\gamma dt \]
Thus the variational derivative of \(S\) is
\[ \frac{\delta S}{\delta \gamma} = \frac{\partial L}{\partial x} - \frac{d}{dt} \frac{\partial L}{\partial v} \]
So a path \(\gamma\) is a critical point of \(S\) (i.e. it extremizes \(S\)) if and only if the Euler-Lagrange equations are satisfied.
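One can see the extremal property numerically with a discretized action (my own sketch, using the free particle \(L = v^2/2\) as an illustrative choice): the straight-line path between fixed endpoints has smaller action than a wiggled path with the same endpoints.

```python
import math

# Discretized action S = sum_i (1/2) * ((x[i+1] - x[i]) / dt)^2 * dt
# for the free-particle Lagrangian L = v^2 / 2.
def action(path, dt):
    return sum(0.5 * ((b - a) / dt) ** 2 * dt for a, b in zip(path, path[1:]))

N = 1000
dt = 1.0 / N
ts = [i * dt for i in range(N + 1)]

straight = ts                                            # x(t) = t: the true path
wiggled = [t + 0.1 * math.sin(math.pi * t) for t in ts]  # same endpoints, perturbed

print(action(straight, dt) < action(wiggled, dt))  # True
```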

Classical Mechanics 2: The Euler-Lagrange equations from Newton's 2nd law.

After the previous post, we are now familiar with Newton's 2nd law
\[ \mathbf{F} = m\mathbf{a}, \]
which (suitably interpreted) holds for any system of \(N\) particles. However, this equation requires the use of cartesian coordinates, which for many systems may not be the most convenient choice. Suppose we have some other coordinates \(q^i = q^i(x^j)\). What is the correct analogue of Newton's 2nd law for the \(q\)-coordinates?

To make life easier, we will assume for now that the force \(\mathbf{F}\) is conservative; i.e. 
\[ \mathbf{F} = -\nabla V(x) \]
for some potential function \(V(x)\). Under this assumption, Newton's 2nd law is
\[ m\mathbf{a} + \nabla V(x) = 0. \]
Let us define the function \(T\) by
\[ T(x,v) = \frac{1}{2}m |v|^2, \]
and define the function \(L\) as
\[ L(x,v) = T(x,v) - V(x). \]
Then we have immediately that Newton's 2nd law is equivalent to
\[ \frac{d}{dt}\frac{\partial L}{\partial v}(x, \dot{x}) - \frac{\partial L}{\partial x}(x,\dot{x})  = 0. \]
Why go through the trouble of introducing these auxiliary functions and rewriting Newton's 2nd law in this way? The answer lies in the following theorem.

Theorem For any choice of coordinates \(y = y(x)\), Newton's 2nd law is equivalent to the equations
\[ \frac{d}{dt}\frac{\partial \tilde{L}}{\partial w}(y, \dot{y}) - \frac{\partial \tilde{L}}{\partial y}(y,\dot{y})  = 0, \]
where \(w = dY_x(v)\) and \(\tilde{L}(y,w) = L(x(y,w), v(y,w))\). These equations are called the Euler-Lagrange equations.

The proof of this theorem is a straightforward calculation using the chain rule. Let \(M\) denote the manifold \(\mathbb{R}^{3N}\) (or some open subset thereof). The coordinate change \(y = y(x) \) can be thought of as a diffeomorphism \(Y: M \to M\) given by \(x \mapsto y(x) \). The differential \( dY: TM \to TM\) is also a diffeomorphism. In coordinates, we have
\[ y = y(x) \]
\[ w = dY_x(v)  = Jv \]
where \(y,w\) are coordinates on the target \(TM\). 

Now we need to compute the derivatives of \(\tilde{L}\). 

\[ \frac{\partial L}{\partial x} = \frac{\partial \tilde{L}}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial \tilde{L}}{\partial w}\frac{\partial w}{\partial x} = \tilde{L}_y J + \tilde{L}_w H v \]
where \(J = \partial y / \partial x\) is the Jacobian matrix of the coordinate change and \(H = \partial^2 y / \partial x^2\) is its Hessian.

\[ \frac{\partial L}{\partial v} = \frac{\partial \tilde{L}}{\partial y} \frac{\partial y}{\partial v} + \frac{\partial \tilde{L}}{\partial w} \frac{\partial w}{\partial v} = \tilde{L}_w J \]

Then we have
\[ \frac{d}{dt}\left( \tilde{L}_w J \right) = \left(\frac{d}{dt}\tilde{L}_w\right) J + \tilde{L}_w H \dot{x}  \]
Subtracting \(\frac{\partial L}{\partial x}\) from this and using the calculation above, we obtain
\[ \left( \frac{d}{dt} \tilde{L}_w - \tilde{L}_y \right) J \]
and since \(J\) is invertible, this is \(0\) if and only if \(\frac{d}{dt} \tilde{L}_w - \tilde{L}_y\) is.
But this is exactly what we wanted to prove!
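Here is a symbolic spot-check of the theorem (my own, for a specific Lagrangian and coordinate change, not a proof): take \(L = \frac{1}{2}\dot{x}^2 - \frac{1}{2}x^2\) and the change \(y = e^x\), so \(w = e^x \dot{x}\) and \(\tilde{L}(y,w) = \frac{w^2}{2y^2} - \frac{1}{2}(\log y)^2\). The two Euler-Lagrange expressions then differ exactly by the Jacobian factor \(J = e^x\):

```python
import sympy as sp

t = sp.symbols('t')
x = sp.Function('x', real=True)(t)

# Lagrangian L = v^2/2 - x^2/2 (harmonic oscillator, an illustrative choice)
L = sp.Rational(1, 2) * x.diff(t)**2 - sp.Rational(1, 2) * x**2
EL_x = sp.diff(L, x.diff(t)).diff(t) - sp.diff(L, x)

# Coordinate change y = exp(x), so x = log(y) and v = w/y:
# Ltilde(y, w) = w^2/(2 y^2) - (log y)^2/2
ys, ws = sp.symbols('y w', positive=True)
Lt = ws**2 / (2 * ys**2) - sp.log(ys)**2 / 2

yt, wt = sp.exp(x), sp.exp(x) * x.diff(t)  # the path in the new coordinates
EL_y = (sp.diff(Lt, ws).subs({ys: yt, ws: wt}).diff(t)
        - sp.diff(Lt, ys).subs({ys: yt, ws: wt}))

# The theorem asserts EL_x = EL_y * J with J = dy/dx = exp(x)
print(sp.simplify(EL_x - sp.exp(x) * EL_y))  # 0
```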

Wednesday, January 27, 2010

The Legendre transform

Yesterday, I gave an introductory talk on Hamiltonian mechanics and symplectic geometry. The starting point is the Legendre transform. First, begin with a configuration space \(Q\). The Lagrangian \(\mathcal{L}\) is a smooth function on \(TQ\). In local coordinates \(q^i\) on \(Q\), we have coordinates \((q^i, v^i)\) on \(TQ\), where the \(v^i\) are the components of the tangent vector
\(v = v^i \partial_i \in T_q Q\). Typically, the Lagrangian will be of the form
\[ \mathcal{L}(q,v) = \frac{1}{2} g(v,v) - V(q), \]
where \(g\) is some metric on \(Q\). Now we introduce new coordinates \(p_i\) defined by
\[ p_i = \frac{\partial \mathcal{L}}{\partial v^i}. \]
If \(\mathcal{L}\) is strictly convex in \(v\) (more precisely, if the fiberwise Hessian \(\partial^2 \mathcal{L} / \partial v^i \partial v^j\) is invertible) then we can solve for \(v^i\) as a function of \((q^i, p_j)\). It is easy to check that the \(p_i\) transform as covectors, and so this gives a diffeomorphism \(TQ \to T^\ast Q\) (which depends on \(\mathcal{L}\)). For example, in the above Lagrangian,
\[ \frac{\partial \mathcal{L}}{\partial v} = g(v, -), \]
which is just the dual of \(v\) with respect to the metric \(g\). So for Lagrangians of this form, the map \(TQ \to T^\ast Q\) is just the one given by the metric.

Now comes the interesting part. There is a natural way to turn \(\mathcal{L}\), which is a function on \(TQ\), into a function \(H\) on \(T^\ast Q\), in such a way that if we repeat this process, we will get back the original function \(\mathcal{L}\) on \(TQ\). This is the Legendre transform:
\[ H = p_i v^i - \mathcal{L}. \]

Now suppose we have a curve \((q(t), \dot{q}(t)) \in TQ\) that satisfies the Euler-Lagrange equations. Then under the identification \(TQ \cong T^\ast Q\), this gives a curve \((q(t), p(t)) \in T^\ast Q\). What equation does it satisfy? We have
\[ \frac{d}{dt} p = \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial v} = \frac{\partial \mathcal{L}}{\partial q} = -\frac{\partial H}{\partial q}, \]
and
\[ \frac{d}{dt}q = v = \frac{\partial H}{\partial p}. \]
These are Hamilton's equations, and they say that the curve \(\gamma = (q(t), p(t)) \in T^\ast Q\) is just an integral curve of the symplectic gradient of \(H\)! So classical mechanics is really just about flows of Hamiltonian vector fields on symplectic manifolds.
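The Legendre transform is easy to carry out symbolically. A sketch of my own, for the standard kinetic-minus-potential Lagrangian in one dimension (with an abstract potential \(V\)):

```python
import sympy as sp

q, v, p, m = sp.symbols('q v p m', positive=True)
V = sp.Function('V')

# Kinetic-minus-potential Lagrangian in 1-D
L = sp.Rational(1, 2) * m * v**2 - V(q)

p_of_v = sp.diff(L, v)                       # p = dL/dv = m*v
v_of_p = sp.solve(sp.Eq(p, p_of_v), v)[0]    # invert: v = p/m

# Legendre transform H = p*v - L, expressed in (q, p)
H = sp.simplify(p * v_of_p - L.subs(v, v_of_p))
print(sp.simplify(H - (p**2 / (2 * m) + V(q))))  # 0, i.e. H = p^2/(2m) + V(q)
```

As expected, the Hamiltonian is kinetic plus potential energy, matching the claim that \(H\) is the energy.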

Monday, January 4, 2010

First day of 2010

It's the first (academic) day of 2010. I figured that the best use of this blog is as a log--that is, to log and plan my work for the semester/year/life. If you want to accomplish goals, the first thing to do is to write them down! So here we go, crude outline for the next semester:

1. Work through Milnor's Morse theory book, cover to cover. This should be easy since I'm taking a class in Morse theory anyway.

2. Work through Kirwan's thesis cover to cover. I've already been through quite a bit of it, and the only things that caused me any trouble last summer have since been cleared up.

3. Work through Guillemin and Sternberg's equivariant cohomology book. Again, quite a bit of the material I already know, so this should be doable.

4. Finish working through HKLR. Really the only remaining part is supersymmetric nonlinear sigma models.

5. The details of the ADHM construction, once and for all. I should know this already.

6. Hilbert schemes of points on a surface. Really, the hyperkähler metric on the Hilbert scheme of points on \(\mathbb{C}^2\). Again, I should know this already.

We'll see how these go--this is probably ambitious, and many of these will get extended into the summer.