Monday, August 31, 2015

Hamilton-Jacobi equation and Riemannian distance

Consider the cotangent bundle $T^\ast X$ as a symplectic manifold with canonical symplectic form $\omega$. Consider the Hamilton-Jacobi equation
\[ \frac{\partial S}{\partial t} + H(x, \nabla S) = 0, \]
for Hamilton's principal function $S(x,t)$. Setting $x = x(t)$ and $p(t) = (\nabla S)(x(t), t)$, one sees immediately from the method of characteristics that this PDE is solved by the classical action
\[ S(x,t) = \int_0^t (p \dot{x} - H) ds, \]
where the integral is taken over the solution $(x(s),p(s))$ of Hamilton's equations with $x(0)=x_0$ and $x(t) = x$. The choice of basepoint $x_0$ introduces an overall additive constant in $S$, and this solution is really only valid in some neighbourhood $U$ of $x_0$. (Reason: $S$ is in general multivalued, as the differential "$dS$" is closed but not necessarily exact.)
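For a concrete check, take the one-dimensional harmonic oscillator $H(x,p) = \frac{1}{2}(p^2 + x^2)$, whose classical action from $x_0$ at time $0$ to $x$ at time $t$ is the standard expression $\frac{1}{2\sin t}\left[(x_0^2 + x^2)\cos t - 2 x_0 x\right]$ (valid for $0 < t < \pi$). A minimal symbolic sanity check in SymPy:

```python
import sympy as sp

x, x0, t = sp.symbols('x x0 t')

# Classical action for H = (p^2 + x^2)/2 along the trajectory
# from x0 at time 0 to x at time t (standard formula, 0 < t < pi)
S = ((x0**2 + x**2)*sp.cos(t) - 2*x0*x) / (2*sp.sin(t))

# Hamilton-Jacobi equation: S_t + H(x, S_x) = 0
hj = sp.diff(S, t) + (sp.diff(S, x)**2 + x**2) / 2
print(sp.simplify(hj))  # -> 0
```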

Now consider the case where $X$ is Riemannian, with Hamiltonian $H(x,p) = \frac{1}{2} |p|^2$. The solutions to Hamilton's equations are affinely parametrized geodesics, and by a simple Legendre transform we have
\[ S(x, t) = \frac{1}{2} \int_0^t |\dot x|^2 ds \]
where the integral is along the affine geodesic with $x(0) = x_0$ and $x(t) = x$. Since $x(s)$ is a geodesic, $|\dot x(s)|$ is a constant (in $s$) and therefore
\[ S(x, t) = \frac{t}{2} |\dot x(0)|^2. \]
Now consider the path $\gamma(s) = x(|\dot x(0)|^{-1} s)$. This is an affine geodesic with $\gamma(0) = x_0$, $\gamma(|\dot x(0)| t) = x$, and $|\dot \gamma| = 1$. Therefore, the Riemannian distance between $x_0$ and $x$ (provided $x$ is sufficiently close to $x_0$) is
\[ d(x_0, x) = |\dot x(0)| t. \]
Combining this with the previous calculation, we see that
\[ S(x, t) = \frac{1}{2t} d(x_0, x)^2. \]
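As a sanity check of this formula, in Euclidean space $\mathbf{R}^n$ the distance is $d(x_0, x) = |x - x_0|$, and one can verify symbolically that $S = |x - x_0|^2/2t$ solves the Hamilton-Jacobi equation with $H = \frac{1}{2}|p|^2$ (a minimal SymPy sketch, with $n = 3$ for concreteness):

```python
import sympy as sp

t = sp.symbols('t', positive=True)
n = 3  # dimension (any n works the same way)
xs = sp.symbols(f'x1:{n+1}')
ys = sp.symbols(f'y1:{n+1}')  # coordinates of the basepoint x0

# S = d(x0, x)^2 / (2t), with d the Euclidean distance
S = sum((xi - yi)**2 for xi, yi in zip(xs, ys)) / (2*t)

# Hamilton-Jacobi: S_t + |grad S|^2 / 2 = 0
hj = sp.diff(S, t) + sum(sp.diff(S, xi)**2 for xi in xs) / 2
print(sp.simplify(hj))  # -> 0
```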
Now insert the formula $S(x,t) = d(x_0,x)^2/2t$ back into the Hamilton-Jacobi equation: the terms $\partial S/\partial t = -d(x_0,x)^2/2t^2$ and $\frac{1}{2}|\nabla S|^2 = d(x_0,x)^2 |\nabla_x d(x_0,x)|^2/2t^2$ must cancel, and we obtain the following.

Theorem. Let $x_0$ denote a fixed basepoint of $X$. Then for all $x$ in a sufficiently small neighborhood $U$ of $x_0$, the Riemannian distance function satisfies the Eikonal equation
\[ |\nabla_x d(x_0, x)|^2 = 1. \]
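Here is a nontrivial instance, checked symbolically: in the hyperbolic upper half-plane with metric $(dx^2 + dy^2)/y^2$, the distance from $(x_0, y_0)$ is given by the standard formula $d = \operatorname{arccosh}\left(1 + \frac{(x-x_0)^2 + (y-y_0)^2}{2 y y_0}\right)$, and the eikonal equation holds everywhere (here $U$ can be taken to be the whole half-plane, since hyperbolic space has no cut locus):

```python
import sympy as sp

x, x0 = sp.symbols('x x0', real=True)
y, y0 = sp.symbols('y y0', positive=True)

# Upper half-plane model of hyperbolic space, metric (dx^2 + dy^2)/y^2,
# with its standard distance function from the point (x0, y0)
r = sp.acosh(1 + ((x - x0)**2 + (y - y0)**2) / (2*y*y0))

# |grad r|^2 = g^{ij} (d_i r)(d_j r) = y^2 (r_x^2 + r_y^2)
grad2 = y**2 * (sp.diff(r, x)**2 + sp.diff(r, y)**2)
print(sp.simplify(grad2))  # -> 1
```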

Now, for convenience set $r(x) = d(x_0, x)$. Then $|\nabla r|^2 = 1$, from which we obtain (by differentiating twice and contracting)
\[ g^{ij} g^{kl}\left(\nabla_{lki} r \nabla_j r + \nabla_{ki}r \nabla_{lj} r\right) = 0.\]
A quick calculation (commuting covariant derivatives) shows that
\[ \nabla_{lki} r = \nabla_{ilk} r - R_{li}{}^{b}{}_{k} \nabla_b r. \]
Therefore, tracing over $l$ and $k$, we obtain
\[ g^{lk} \nabla_{lki} r = \nabla_i ( \Delta r) + Rc_{ib} \nabla^b r. \]
Plugging this back into the equation derived above, we have
\[ \nabla r \cdot \nabla(\Delta r) + Rc(\nabla r, \nabla r) + |Hr|^2 = 0, \]
where $Hr$ denotes the Hessian of $r$ regarded as a 2-tensor. Now, using $r$ as a local coordinate, it is easy to see that $\partial_r = \nabla r$ (as vector fields). So we can rewrite this identity as
\[ \partial_r (\Delta r) + Rc(\partial_r, \partial_r) + |Hr|^2 = 0. \]
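As a sanity check, on the round unit sphere $S^n$ the radial quantities about a point are standard: $\Delta r = (n-1)\cot r$, the nonzero Hessian eigenvalues are all $\cot r$, and $Rc(\partial_r, \partial_r) = n-1$. These satisfy the identity exactly:

```python
import sympy as sp

r, n = sp.symbols('r n', positive=True)

# Standard radial quantities on the round unit sphere S^n:
lap_r = (n - 1) * sp.cot(r)       # Delta r
hess2 = (n - 1) * sp.cot(r)**2    # |Hr|^2: (n-1) eigenvalues equal to cot(r)
ricci = n - 1                     # Rc(d_r, d_r) = n - 1 on the unit sphere

# Radial Bochner identity: d_r(Delta r) + Rc(d_r, d_r) + |Hr|^2 = 0
print(sp.simplify(sp.diff(lap_r, r) + ricci + hess2))  # -> 0
```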

Now, we can get a nice result out of this identity. First, note that the Hessian $Hr$ always has at least one eigenvalue equal to zero, because differentiating the Eikonal equation shows that $Hr(\partial_r, -)=0$. Let $\lambda_2, \dots, \lambda_n$ denote the remaining eigenvalues (in the directions orthogonal to $\partial_r$). We have
\[ |Hr|^2 = \lambda_2^2 + \dots + \lambda_n^2, \]
while on the other hand
\[ (\Delta r)^2 = (\lambda_2 + \dots + \lambda_n)^2. \]
By Cauchy-Schwarz, we have
\[ (\Delta r)^2 \leq (n-1)|Hr|^2. \]

Proposition. Suppose that the Ricci curvature of $X$ satisfies $Rc \geq (n-1)\kappa$, where $n = \dim X$, and set $u = (n-1)(\Delta r)^{-1}$. Then, with $'$ denoting the radial derivative $\partial_r$,
\[ u' \geq 1 + \kappa u^2. \]

Proof. By the identity above and Cauchy-Schwarz,
\[ \partial_r (\Delta r) = -Rc(\partial_r, \partial_r) - |Hr|^2 \leq -(n-1)\kappa - \frac{(\Delta r)^2}{n-1}. \]
Since $u = (n-1)(\Delta r)^{-1}$, this gives
\[ u' = -(n-1)\frac{\partial_r(\Delta r)}{(\Delta r)^2} \geq \frac{(n-1)^2 \kappa}{(\Delta r)^2} + 1 = 1 + \kappa u^2. \]
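The inequality is sharp: on the model sphere of constant curvature $\kappa > 0$ we have $\Delta r = (n-1)\sqrt{\kappa}\cot(\sqrt{\kappa}\, r)$, so $u = \tan(\sqrt{\kappa}\, r)/\sqrt{\kappa}$, and equality holds. A one-line symbolic check:

```python
import sympy as sp

r, kappa = sp.symbols('r kappa', positive=True)

# Model value on the sphere of constant curvature kappa:
# Delta r = (n-1) sqrt(kappa) cot(sqrt(kappa) r), so u = (n-1)/(Delta r) is
u = sp.tan(sp.sqrt(kappa) * r) / sp.sqrt(kappa)

# u' - (1 + kappa u^2) vanishes identically: equality in the Riccati inequality
print(sp.simplify(sp.diff(u, r) - 1 - kappa*u**2))  # -> 0
```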

Now, the amazing thing is that this deceptively simple inequality is the main ingredient of the Bishop-Gromov comparison theorem. The Bishop-Gromov comparison theorem, in turn, is the main ingredient of the proof of Gromov(-Cheeger) precompactness. I hope to discuss these topics in a future post.

Tuesday, August 18, 2015

The Classical Partition Function

Let $(M, \omega)$ be a symplectic manifold of dimension $2n$, and let $H: M \to \mathbf{R}$ be a classical Hamiltonian. The symplectic form $\omega$ allows us to define a measure on $M$, given by integration against the top form $\omega^n / n!$. We will denote this measure by $d\mu$.

We imagine that $(M, \omega, H)$ represents some classical mechanical system whose dynamics are very complicated, e.g. a system of $10^{23}$ particles. The system is so complicated that we cannot solve the equations of motion exactly, and even if we could, the solutions would likely be so complicated that we couldn't learn much from them.

So instead, we ask statistical questions. Since we cannot measure the state of the system exactly (e.g. particles in a box), we instead look for a probability density $\rho(x,p,t)$ on $M$, such that at time $t$ the system has probability $\rho(x,p,t)\, d\mu$ of being found in the volume element $d\mu$ about the state $(x,p)$. Obviously, $\rho$ should satisfy the constraint $\int_M \rho \, d\mu = 1$.

How does $\rho$ evolve in time? We know that the system obeys Hamilton's equations,
\[ (\dot x, \dot p) = X_H = (\partial H / \partial p, -\partial H / \partial x) \]
in local Darboux coordinates. Thus a particle located at $(x,p)$ in phase space at time $t$ will be located at $(x,p)+X_H\, dt$ at time $t+dt$. Since the Hamiltonian flow preserves the measure $d\mu$ (Liouville's theorem), the probability density at $(x,p)$ at time $t+dt$ must equal the density at $(x,p)-X_H\, dt$ at time $t$. Therefore, we have
\[ \frac{\partial \rho}{\partial t} = \frac{\partial H}{\partial x} \frac{\partial \rho}{\partial p} - \frac{\partial H}{\partial p} \frac{\partial \rho}{\partial x} = \{H, \rho\}. \]
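One immediate consequence: any density that depends on the state only through the Hamiltonian is stationary, since $\{H, f(H)\} = 0$. (This is why the equilibrium distribution found below can be time-independent.) A minimal symbolic check:

```python
import sympy as sp

x, p = sp.symbols('x p')
H = sp.Function('H')(x, p)   # an arbitrary Hamiltonian
f = sp.Function('f')         # an arbitrary smooth function of one variable
rho = f(H)                   # a density depending on the state only through H

# Poisson bracket {H, rho} = H_x rho_p - H_p rho_x
bracket = sp.diff(H, x) * sp.diff(rho, p) - sp.diff(H, p) * sp.diff(rho, x)
print(sp.simplify(bracket))  # -> 0
```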

Given a probability distribution $\rho$, the entropy is defined to be
\[ S[\rho] = -\int_M  \rho \log \rho d\mu. \]

(A version of) the second law of thermodynamics. For a given average energy $U$, the system assumes a distribution of maximal possible entropy at thermodynamic equilibrium.

The goal now is to determine which distribution $\rho$ maximizes the entropy, subject to the constraints (for fixed $U$)
\begin{align*} \int_M H \rho \, d\mu &= U \\
\int_M \rho \, d\mu &= 1. \end{align*}

Setting aside technical issues of convergence, etc., this variational problem is easily solved using the method of Lagrange multipliers. Introducing parameters $\lambda_1, \lambda_2$, we consider the modified functional
\[ S[\rho, \lambda_1, \lambda_2, U] = \int_M\left(-\rho \log \rho +\lambda_1\rho +\lambda_2(H\rho)\right)d\mu -\lambda_1-\lambda_2 U. \]

Note that $\partial S / \partial U = -\lambda_2$; this quantity will turn out to be the inverse temperature, so that $\lambda_2$ itself is minus the inverse temperature.

Taking the variation with respect to $\rho$, we find
\[ 0 = \frac{\delta S}{\delta \rho} = -\log \rho - 1 + \lambda_1 + \lambda_2 H. \]
Therefore, $\rho$ is proportional to $e^{-\beta H}$, where we have set $\beta=-\lambda_2$. Define the partition function $Z$ to be
\[ Z = \int_M e^{-\beta H} d\mu. \]
We therefore have proved (formally and heuristically only!):

Theorem. The probability distribution $\rho$ assumed by the system at thermodynamic equilibrium is given by
\[  \rho = \frac{e^{-\beta H}}{Z} \]
where $\beta > 0$ is a real parameter, called the inverse temperature, determined implicitly by the constraint $\int_M H \rho \, d\mu = U$.
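The same computation can be replayed in a finite toy model, where the functional derivative becomes an ordinary partial derivative. Here is a SymPy sketch with three states of energies $E_1, E_2, E_3$ (the names are illustrative); stationarity of the modified entropy reproduces the Gibbs weights $p_i \propto e^{-\beta E_i}$:

```python
import sympy as sp

p1, p2, p3 = sp.symbols('p1 p2 p3', positive=True)
l1, l2 = sp.symbols('lambda1 lambda2')
E1, E2, E3 = sp.symbols('E1 E2 E3')
ps, Es = [p1, p2, p3], [E1, E2, E3]

# Discrete analogue of the modified functional (the constant terms
# -lambda1 - lambda2*U do not affect the stationarity conditions)
S = sum(-p*sp.log(p) + l1*p + l2*E*p for p, E in zip(ps, Es))

# dS/dp_i = 0 gives p_i = exp(lambda1 - 1 + lambda2*E_i)
sols = [sp.solve(sp.diff(S, p), p)[0] for p in ps]
print(sols)

# Ratios depend only on energy differences: the Gibbs form with beta = -lambda2
print(sp.simplify(sols[0] / sols[1]))  # -> exp(lambda2*(E1 - E2))
```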

Corollary. At thermodynamic equilibrium, the average energy is given by
\[ U = -\frac{\partial \log Z}{\partial \beta} , \]
and the entropy is given by
\[ S = \beta U + \log Z.\]
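As a worked example, for the classical harmonic oscillator $H = \frac{1}{2}(p^2 + x^2)$ on $M = \mathbf{R}^2$ with $d\mu = dx\, dp$, one finds $Z = 2\pi/\beta$ and hence $U = 1/\beta$ (equipartition, with Boltzmann's constant set to 1). Both formulas of the corollary can be checked symbolically:

```python
import sympy as sp

x, p = sp.symbols('x p', real=True)
beta = sp.symbols('beta', positive=True)

# Classical 1D harmonic oscillator, H = (p^2 + x^2)/2, with measure dx dp
H = (p**2 + x**2) / 2
Z = sp.integrate(sp.exp(-beta*H), (x, -sp.oo, sp.oo), (p, -sp.oo, sp.oo))
print(Z)  # -> 2*pi/beta

U = -sp.diff(sp.log(Z), beta)
print(U)  # -> 1/beta (equipartition)

# Entropy from the definition: -rho*log(rho) = rho*(beta*H + log Z),
# expanding the logarithm by hand
rho = sp.exp(-beta*H) / Z
S_direct = sp.integrate(rho*(beta*H + sp.log(Z)),
                        (x, -sp.oo, sp.oo), (p, -sp.oo, sp.oo))
print(sp.simplify(S_direct - (beta*U + sp.log(Z))))  # -> 0
```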