We saw in previous posts that gauge-fixing is intimately related to BRST cohomology. Today I want to explain the underlying mathematical formalism, as it is actually something very well-known: Lie algebra cohomology. Let \(\mathfrak{g}\) be a Lie algebra and \(M\) a \(\mathfrak{g}\)-module. We will construct a cochain complex that computes the Lie algebra cohomology with values in \(M\), \(H^i(\mathfrak{g}, M)\). Out of thin air, we define
\[ C^\ast(\mathfrak{g}, M) = M \otimes \wedge^\ast \mathfrak{g}^\ast. \]
The grading is just the grading induced by the grading on \(\wedge^\ast \mathfrak{g}^\ast\), which we identify with the BRST ghost number. Let \(e_i\) be a basis for \(M\) and \(T_a\) be a basis for \(\mathfrak{g}\), with canonical dual basis \(S^a\). The differential is defined on generators to be
\[ d e_i = \rho(T_a) e_i \otimes S^a \]
\[ d S^a = \frac{1}{2} f^a_{bc} S^b \wedge S^c \]
where \(\rho: \mathfrak{g} \to \mathrm{End}(M)\) is the representation and \(f^a_{bc}\) are the structure constants of \(\mathfrak{g}\). This differential is then extended to satisfy the graded Leibniz rule, and is easily verified to satisfy \(d^2 = 0\) (on the \(S^a\) this is just the Jacobi identity, and on the \(e_i\) it is the statement that \(\rho\) is a representation). The Lie algebra cohomology is just the cohomology of this cochain complex. Essentially by definition, we see that
\[ H^0(\mathfrak{g}, M) = \{m \in M \ | \ \xi \cdot m = 0 \ \forall \ \xi \in \mathfrak{g} \}, \]
i.e. \(H^0(\cdot) = (\cdot)^\mathfrak{g}\) is the invariants functor. In fact, this can be taken to be the defining property of Lie algebra cohomology:
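Theorem \(H^k(\mathfrak{g}, M) = R^k (M)^\mathfrak{g}\), i.e., \(H^k(\mathfrak{g}, \cdot)\) is the \(k\)-th right derived functor of the invariants functor.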
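As a sanity check on the claim that \(d^2 = 0\) reduces to the Jacobi identity, here is a minimal Python sketch. Applying \(d\) twice to \(S^a\) produces a 3-form whose coefficients are the total antisymmetrization of \(f^a_{bc} f^b_{de}\) over \((d, e, c)\), and the code computes exactly that antisymmetrization. The example algebra \(\mathfrak{so}(3)\) (with \(f^a_{bc} = \epsilon_{abc}\)), the deliberately broken bracket `bad`, and all function names are illustrative assumptions, not part of the construction above.

```python
from itertools import combinations, permutations

def levi_civita(i, j, k):
    """Sign of (i, j, k) as a permutation of (0, 1, 2), or 0 if an index repeats."""
    return (i - j) * (j - k) * (k - i) // 2

def perm_sign(p):
    """Sign of a tuple of distinct integers relative to its sorted order."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign

def jacobi_defect(f):
    """Largest |coefficient| of the 3-form obtained by antisymmetrizing
    f^a_{bc} f^b_{de} over (d, e, c); f[a][b][c] stands for f^a_{bc}."""
    n = len(f)
    worst = 0
    for a in range(n):
        for triple in combinations(range(n), 3):
            total = 0
            for p in permutations(triple):
                d, e, c = p
                total += perm_sign(p) * sum(f[a][b][c] * f[b][d][e] for b in range(n))
            worst = max(worst, abs(total))
    return worst

# so(3): [T_b, T_c] = epsilon_{abc} T_a, a genuine Lie algebra.
so3 = [[[levi_civita(a, b, c) for c in range(3)] for b in range(3)] for a in range(3)]

# An antisymmetric bracket that is NOT a Lie bracket: [T_0, T_1] = T_1, [T_1, T_2] = T_0.
bad = [[[0] * 3 for _ in range(3)] for _ in range(3)]
bad[1][0][1], bad[1][1][0] = 1, -1
bad[0][1][2], bad[0][2][1] = 1, -1

print(jacobi_defect(so3), jacobi_defect(bad))
```

The defect vanishes for \(\mathfrak{so}(3)\) and not for the broken bracket, which is the sense in which \(d^2 S^a = 0\) is equivalent to the Jacobi identity.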
Returning to field theory, we see (modulo some hard technicalities!) that, roughly, \(\mathfrak{g}\) is the Lie algebra of infinitesimal gauge transformations, and \(M\) is the algebra of functions on the space of all connections. The ghost and anti-ghost fields can then be seen to be the multiplication and contraction operators. To wit, we can take \(c^a\) to be the operator
\[ c^a: f \mapsto S^a \wedge f \]
and take \(\bar{c}^a\) to be the operator
\[ \bar{c}^a: f \mapsto \frac{\partial}{\partial S^a} f = T_a \lrcorner f.\]
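Then, with \([\cdot, \cdot]\) denoting the graded commutator (here an anticommutator, since both operators are odd), we have
\[ [c^a, \bar{c}^b] = \delta^{ab} \]
so that \(\bar{c}\) is indeed the anti-ghost conjugate to \(c\).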
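The relation above can be checked directly on a small exterior algebra. The following is a minimal sketch, assuming a 3-dimensional \(\mathfrak{g}\) and representing a form as a dictionary from sorted index tuples to coefficients; the names `wedge`, `contract` and `anticommutator` are hypothetical helpers introduced only for this illustration.

```python
from itertools import combinations

def wedge(a, form):
    """Ghost operator c^a: left exterior multiplication by S^a."""
    out = {}
    for idx, coef in form.items():
        if a in idx:
            continue  # S^a wedge S^a = 0
        pos = sum(1 for i in idx if i < a)  # S^a hops over `pos` generators
        new = tuple(sorted(idx + (a,)))
        out[new] = out.get(new, 0) + (-1) ** pos * coef
    return out

def contract(b, form):
    """Anti-ghost operator: contraction with T_b (left derivative d/dS^b)."""
    out = {}
    for idx, coef in form.items():
        if b not in idx:
            continue
        pos = idx.index(b)  # S^b is pulled to the front past `pos` generators
        new = idx[:pos] + idx[pos + 1:]
        out[new] = out.get(new, 0) + (-1) ** pos * coef
    return out

def anticommutator(a, b, form):
    """(c^a cbar_b + cbar_b c^a) applied to `form`, with zero terms dropped."""
    out = dict(wedge(a, contract(b, form)))
    for idx, coef in contract(b, wedge(a, form)).items():
        out[idx] = out.get(idx, 0) + coef
    return {idx: coef for idx, coef in out.items() if coef != 0}

n = 3  # assumed dimension of the Lie algebra
basis = [idx for r in range(n + 1) for idx in combinations(range(n), r)]
ok = all(
    anticommutator(a, b, {idx: 1}) == ({idx: 1} if a == b else {})
    for idx in basis for a in range(n) for b in range(n)
)
print(ok)
```

The check runs over every basis monomial, so it verifies the relation on the whole of \(\wedge^\ast \mathfrak{g}^\ast\) for this small example.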
Friday, December 28, 2012
Sunday, December 23, 2012
BRST
Finally, I want to discuss gauge-invariance of the gauge-fixed theory (!?). We saw in the previous posts that if we have a gauge theory with connection \(A\) and matter fields \(\psi\), in order to derive sensible Feynman rules we have to introduce a gauge-fixing function \(G\) as well as Fermionic fields \(c, \bar{c}\), the ghosts. (Note: last time I used \(\eta, \bar{\eta}\) for the ghosts but I want to match the more standard notation, so I've switched to \(c, \bar{c}\).)
Usually it is convenient to use the gauge-fixing function \(G(A) = \partial^\mu A_\mu\). Under an infinitesimal gauge-transformation \(\lambda\), \(A\) transforms as
\[ A \mapsto A - \nabla \lambda, \]
so \(G(A)\) transforms as
\[ G(A) \mapsto G(A) - \partial^\mu \nabla_\mu \lambda. \]
Hence the term in the Lagrangian involving the ghosts is
\[ -\bar{c}^a \partial^\mu \nabla_\mu^{ab} c^b, \]
and our gauge-fixed Lagrangian is
\[ \mathcal{L} = -\frac{1}{4} |F|^2 + \bar{\psi}(iD\!\!\!/-m)\psi - \frac{|\partial^\mu A_\mu|^2}{2\xi}
- \bar{c}^a \partial^\mu \nabla_\mu^{ab} c^b. \]
Introducing an auxiliary field \(B^a\), this is of course equivalent to
\[ \mathcal{L} = -\frac{1}{4} |F|^2 + \bar{\psi}(iD\!\!\!/-m)\psi + \frac{\xi}{2} B^a B_a
+ B^a \partial^\mu A_{\mu a} - \bar{c}^a \partial^\mu \nabla_\mu^{ab} c^b. \]
Now, there are two questions one might ask: (1) how can we tell that this is a gauge-theory? i.e., what remains of the original gauge symmetry? and (2) does the resulting theory depend in any way on the choice of gauge-fixing function?
The answer to both of these questions is BRST symmetry. The field \(c\) is Lie-algebra valued, so we could think of it as being an infinitesimal gauge transformation. Rather, for \(\epsilon\) a constant odd variable, \(\epsilon c\) is even and an honest infinitesimal gauge transformation. Under this transformation, we have
\[ \delta_\epsilon A = -\nabla (\epsilon c) = -\epsilon \nabla c. \]
Then we define a graded derivation \(\delta\) by
\[ \delta A = - \nabla c. \]
We have a grading by ghost number, where \(\mathrm{gh}(A) = 0, \mathrm{gh}(\psi) = 0, \mathrm{gh}(c) = 1, \mathrm{gh}(\bar{c}) = -1\). We would like to extend \(\delta\) to a derivation of degree \(+1\) that squares to 0. First, we should figure out what \(\delta c\) is. We compute:
\begin{align}
0 &= \delta^2 A \\
&= \delta(-\nabla c) \\
&= -\partial \delta c - (\delta A) c - A (\delta c) + (\delta c) A - c (\delta A) \\
&= -\partial \delta c + (\nabla c) c + c (\nabla c) - [A, \delta c] \\
&= -\nabla(\delta c) + \nabla(c^2).
\end{align}
From this, we see that \(\nabla(\delta c) = \nabla(c^2)\), so we can set
\[ \delta c = c^2 = \frac{1}{2}[c, c]. \]
Then \(\delta^2 c = 0\) is just the Jacobi identity for the group's Lie algebra! Finally, we would like to extend \(\delta\) to act on \(\psi\), \(B\), and \(\bar{c}\) so that \(\delta \mathcal{L} = 0\), and \(\delta^2 = 0\). Since the action on \(A\) is by infinitesimal gauge transformation, this leaves the curvature term of \(\mathcal{L}\) invariant. Similarly, the \(\psi\) term is invariant if we simply take
\[ \delta \psi = c \cdot \psi \]
where dot denotes the infinitesimal gauge transformation. Using the known rules for \(\delta\), we find that
\[ \delta \mathcal{L} = \frac{\xi}{2} \left(\delta B B + B \delta B \right) + \delta B \cdot \partial^\mu A_\mu
- B \cdot \partial^\mu \nabla_\mu c - \delta\bar{c} \cdot \partial^\mu \nabla_\mu c \]
By comparing coefficients, we find (together with what we've already computed)
\begin{align}
\delta A &= -\nabla c \\
\delta \psi &= c \cdot \psi \\
\delta c &= \frac{1}{2}[c,c] \\
\delta \bar{c} &= B \\
\delta B &= 0.
\end{align}
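As a quick consistency check on these rules, here is a sketch of \(\delta^2 = 0\) on the remaining fields. It assumes, as in the computation of \(\delta c\) above, that \(c\) is matrix-valued so that products such as \(c^2\) make sense, and that \(\delta\) is an odd derivation, so it picks up a sign when passing the odd field \(c\):
\begin{align}
\delta^2 \psi &= \delta(c \psi) = (\delta c)\psi - c\,(\delta \psi) = c^2 \psi - c\,(c \psi) = 0 \\
\delta^2 c &= \delta(c^2) = (\delta c)\,c - c\,(\delta c) = c^3 - c^3 = 0 \\
\delta^2 \bar{c} &= \delta B = 0 \\
\delta^2 B &= 0.
\end{align}
In this matrix form, the second line is just associativity of the product, which is what the Jacobi identity becomes for a matrix Lie algebra.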
The derivation \(\delta\) defined by these rules is the BRST differential. Now, suppose that \(\mathcal{O}(A, \psi)\) is a local operator involving the physical fields \(A\) and \(\psi\). Then by construction, \(\delta \mathcal{O}\) is the change of \(\mathcal{O}\) under an infinitesimal gauge transformation. Hence, we find:
An operator \(\mathcal{O}(A, \psi)\) is gauge invariant \(\iff \delta\mathcal{O} = 0\).
Now, suppose the functional measure \(\mathcal{D}A \mathcal{D}\psi \mathcal{D}B \mathcal{D}c \mathcal{D}\bar{c}\) is gauge-invariant, i.e. is BRST closed. (This assumption is equivalent to the absence of anomalies, but we'll completely ignore this in today's post.) Then we have
\[ \langle \delta \mathcal{O} \rangle = 0 \]
for any local observable \(\mathcal{O}\). This just follows from integration by parts (this is where we have to assume the measure is \(\delta\)-closed). Now, why is this significant? First, this tells us that the space of physical observables is
\[ H^0(C^\ast_{\mathrm{BRST}}, \delta) \]
where \(C^\ast_{\mathrm{BRST}}\) is the cochain complex of local observables, graded by ghost number.
Now, the real power of the BRST formalism is the following. We find that the gauge-fixed Lagrangian can be written as
\[ \mathcal{L}_{gf} = \mathcal{L}_0 +\delta \left(\frac{\xi}{2} \bar{c} B + \bar{c}\Lambda\right) \]
where \( \Lambda = \partial^\mu A_\mu \) is our gauge-fixing function, and \(\mathcal{L}_0\) is the original Lagrangian without gauge-fixing (the first term in the bracket reproduces \(\frac{\xi}{2} B^a B_a\), since \(\delta \bar{c} = B\) and \(\delta B = 0\)). Now the point is, any two choices of gauge fixing differ by terms which are BRST exact, and hence give the same expectation values on the physical observables \(H^0\). So we have restored gauge invariance, while obtaining a gauge-fixed perturbation theory!
Faddeev-Popov Ghosts, continued
Last time I sketched how we can represent an integral over a submanifold \(M \subset \mathbb{R}^n\) by an integral of the form
\[ \int_{\mathbb{R}^n} f(x) \delta(G(x)) \exp\left(\bar{\eta}G(x+\eta) \right) d\eta d\bar{\eta} dx. \]
Here, \(\eta, \bar{\eta}\) are Fermionic variables called Faddeev-Popov ghosts, which are introduced to cancel an unwanted determinant factor. The function \(G(x)\) singles out the submanifold \(M\) as \(M = G^{-1}(0)\).
Now suppose that we start with a vector (or affine) space \(V\), which is acted on by a group \(H\). We would like to understand integrals over the quotient \(V / H\) in terms of integrals over \(V\). Suppose there is some function \(G(x)\) on \(V\) satisfying the following property:
For each level \(w\) of \(G\), the subspace \(M_w := G^{-1}(w)\) intersects the orbits transversely, and furthermore every \(H\)-orbit intersects \(M_w\) exactly once. (*)
We call such a function a gauge-fixing function, and a level \(w\) a gauge-fixing. By assumption, we have \(V/H \cong M_w\) for any \(w\). Hence using the integral we derived last time, we can integrate over \(M_w\) for any particular choice of \(w\), and this ought to be the same as integrating over \(V/H\). The problem, however, is that in the QFT setting it's not clear what the Feynman rules should be for such a path integral. The final trick is that, since the answer should be independent of \(w\), we may integrate over all possible \(w\); in doing so we obtain a Lagrangian from which we can derive sensible Feynman rules.
We have some integral
\[Z = \int \delta(G(x) - w) \exp\left\{\frac{i}{\hbar}S(x) + \bar{\eta}dG \eta\right\} dx d\eta d\bar{\eta} \]
which is independent of \(w\). So we add a Gaussian weight and integrate over \(w\):
\begin{align}
Z' &= \int \delta(G(x) -w) \exp\left\{\frac{i}{\hbar}S(x) + \bar{\eta}dG \eta - \frac{1}{2\xi} |w|^2\right\}
dx dw d\eta d\bar{\eta}\\
&= \int \exp\left\{\frac{i}{\hbar}S(x) + \bar{\eta}dG \eta - \frac{1}{2\xi} |G(x)|^2 \right\} dx d\eta d\bar{\eta}.
\end{align}
Here, \(\xi\) is an arbitrary real positive constant, and we denote the new integral by \(Z'\) to indicate that it differs from the old path integral \(Z\) by (at most) an overall constant. Now the important thing is that the new action appearing in the integrand of \(Z'\) is gauge-fixed and hence there is no problem whatsoever in deriving sensible, meaningful Feynman rules. The gauge-fixing term \(|G(x)|^2\) serves to make the action non-degenerate, so that propagators are well-defined, while the term involving the Fermions \(\eta, \bar{\eta}\) generates new Feynman rules that "cancel" the superfluous degrees of freedom due to gauge redundancy.
The question remains, what if we choose some other gauge-fixing function? i.e., what happens if we perturb \(G(x)\) to some new function satisfying property (*)? We'll answer this using the BRST formalism.
Friday, December 21, 2012
Faddeev-Popov Ghosts
Today I want to review the Faddeev-Popov procedure, with a view toward BRST and eventually BV.
Gauge-Invariance and Gauge-Fixing
First we'll review the Faddeev-Popov method, to motivate the introduction of ghosts. Suppose we have a gauge theory involving a \(G\)-connection \(A\) and some field \(\phi\) charged under \(G\). Under a gauge transformation \(g(x)\), \(\phi\) transforms as
\[ \phi \mapsto g \cdot \phi. \]
We would like the covariant derivative to transform in the same way, i.e.
\[ \nabla \phi \mapsto g \nabla \phi. \]
In terms of the connection 1-form \(A\), the covariant derivative is
\[ \nabla = d + A. \]
Let \(\nabla'\) denote the gauge-transformed covariant derivative, and \(\phi'\) the gauge-transformed field. Then we want
\[ \nabla' \phi' = g \nabla \phi. \]
We compute
\begin{align}
\nabla' \phi' &= (d + A')(g \phi) \\
&= dg \phi + g d\phi + A'(g\phi) \\
&= gg^{-1}dg \phi + gd\phi + g g^{-1} A' g\phi \\
&= g(d\phi + g^{-1} A' g \phi + g^{-1} dg \phi) \\
&= g(d\phi + A\phi)
\end{align}
Comparing terms, we see that
\[ A = g^{-1} A' g + g^{-1} dg, \]
so upon re-arranging we have
\[ A' = g A g^{-1} - dg g^{-1} \]
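The transformation law can be sanity-checked numerically in the simplest case. The sketch below assumes an abelian \(U(1)\) example in one dimension, with \(g = e^{i\alpha(x)}\), and uses a central finite difference for \(d\); the particular functions `alpha`, `A` and `phi` are arbitrary illustrative choices, not anything fixed by the discussion above.

```python
import cmath

def deriv(fn, x, h=1e-5):
    """Central finite-difference approximation to d/dx."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

# Hypothetical sample data for the check.
alpha = lambda x: 0.7 * x * x                      # gauge parameter
g = lambda x: cmath.exp(1j * alpha(x))             # U(1) gauge transformation
A = lambda x: 1j * cmath.sin(x)                    # u(1)-valued connection
phi = lambda x: cmath.exp(-x * x) * (1 + 2j * x)   # charged field

# Transformed connection and field: A' = g A g^{-1} - dg g^{-1}, phi' = g phi.
A_prime = lambda x: g(x) * A(x) / g(x) - deriv(g, x) / g(x)
phi_prime = lambda x: g(x) * phi(x)

def covariant_derivative(connection, field, x):
    """(d + A) applied to the field at the point x."""
    return deriv(field, x) + connection(x) * field(x)

x0 = 0.3
lhs = covariant_derivative(A_prime, phi_prime, x0)   # nabla' phi'
rhs = g(x0) * covariant_derivative(A, phi, x0)       # g nabla phi
print(abs(lhs - rhs))
```

The printed difference is at the level of the finite-difference error, consistent with \(\nabla' \phi' = g \nabla \phi\).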
The action is invariant under these gauge transformations, and this causes a problem: at critical points of the action, the Hessian of the action is degenerate in directions tangent to the gauge orbits. This means that the propagator is not well-defined, and there is no obvious way to derive the Feynman rules for perturbation theory. The solution is to take the quotient by gauge-transformations. To do this, we pick some gauge-fixing function \(G(A)\) which ought to be transverse to the orbits. Then we can restrict to the space \(G(A) = 0\), on which the Hessian of the action is non-degenerate, leading to a well-defined propagator. Formally, the path integral is
\[ Z = \int_{\{G(A) = 0\}} \exp\left\{\frac{i}{\hbar} S[A, \phi]\right\} \mathcal{D}A \mathcal{D}\phi. \]
This suggests that the path integral should be something like
\[ Z = \int \delta(G(A)) \exp\left\{\frac{i}{\hbar} S[A, \phi]\right\} \mathcal{D}A \mathcal{D}\phi, \]
but this is not quite right! To understand the source of the problem, we'll first study the finite-dimensional case and then use this to solve the problem in infinite-dimensions.
The Faddeev-Popov Determinant
Suppose we are on \(\mathbb{R}^n\), and we would like to integrate a function \(f(x)\) over a submanifold \(M\) defined by \(M = G^{-1}(0)\) for some smooth function \(G: \mathbb{R}^n \to \mathbb{R}^k\). Naively, we might expect that the answer is
\[ \int_M f(x) \stackrel{?}{=} \int f(x) \delta(G(x)) dx. \]
To see why this is not correct, write the delta function as
\[ \delta(G(x)) = \frac{1}{(2\pi)^k}\int e^{ip\cdot G(x)} d^k p. \]
We can regularize this by taking the limit as \(\epsilon \to 0\) of
\[ \frac{1}{(2\pi)^k}\int \exp\left\{ip\cdot G(x) -\frac{\epsilon}{2} |p|^2\right\} d^k p \]
This integral is Gaussian, so we obtain explicitly
\[ \left(\frac{1}{2\pi\epsilon} \right)^{\frac{k}{2}} \exp\left\{-\frac{1}{2\epsilon} |G(x)|^2 \right\}. \]
So our original guess becomes
\[ \left(\frac{1}{2\pi \epsilon}\right)^\frac{k}{2} \int f(x) \exp\left\{ -\frac{1}{2\epsilon} |G(x)|^2 \right\} dx .\]
As \(\epsilon \to 0\), this integral localizes on the locus \(\{G(x) = 0\}\), as desired, but does not give the right answer! To see this, let \(u\) be a coordinate on \(M = G^{-1}(0)\) and \(v\) coordinates normal to \(M\). Then we have
\[ |G(x)|^2 = |G(u,v)|^2 = v^T H(u) v + O(|v|^3) \]
where \(H(u)\) is (half) the Hessian of \(|G|^2\) in the normal directions at the point \(x = (u, 0)\). So the integral becomes (as \(\epsilon \to 0\))
\begin{align}
I_\epsilon &= \left(\frac{1}{2\pi \epsilon}\right)^\frac{k}{2} \int\int
f(u, v) \exp\left\{ -\frac{1}{2\epsilon} v^T H(u) v \right\} du dv \\
&= \int_M \frac{f(u)}{\sqrt{\det H(u)}} du.
\end{align}
This is not correct. We have to account for the determinant of the Hessian. Now, the Hessian is given by
\begin{align} H_{ij} &= \frac{1}{2} \frac{\partial^2 |G|^2}{\partial v^i \partial v^j} \\
&= \frac{\partial}{\partial v^i} \left( G^a \partial_j G^a \right) \\
&= \left(\partial_i G^a \partial_j G^a + G^a \partial_{ij} G^a \right) \\
&= \partial_i G^a \partial_j G^a
\end{align}
where we have used the fact that \(G = 0\) on \(x = (u, 0)\). Hence we see that
\[ \det H = (\det A)^2 \]
where \(A\) is the \(k \times k\) matrix with entries \(\partial_i G^a\). Hence
\[ \sqrt{\det H} = \det A. \]
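To see the missing determinant concretely, here is a small numerical sketch of the regularized integral \(I_\epsilon\). It assumes the toy case \(n = 2\), \(k = 1\) with \(G(x, y) = a y\) (so \(M\) is the \(x\)-axis and \(\det A = a\)) and \(f(x, y) = e^{-x^2 - y^2}\); these choices and the helper names are illustrative only. Since \(f\) factorizes, the double integral is computed as a product of two one-dimensional midpoint sums.

```python
import math

def midpoint(fn, lo, hi, n):
    """Midpoint-rule approximation to the integral of fn over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(fn(lo + (i + 0.5) * h) for i in range(n))

def naive_delta_integral(a, eps):
    """(2 pi eps)^(-1/2) * double integral of f(x, y) exp(-G^2 / (2 eps)),
    for the assumed toy data G(x, y) = a*y and f(x, y) = exp(-x^2 - y^2)."""
    x_part = midpoint(lambda x: math.exp(-x * x), -8.0, 8.0, 4000)
    width = 10.0 * math.sqrt(eps) / abs(a)  # about ten standard deviations in y
    y_part = midpoint(
        lambda y: math.exp(-y * y - (a * y) ** 2 / (2 * eps)), -width, width, 4000
    )
    return x_part * y_part / math.sqrt(2 * math.pi * eps)

a, eps = 2.0, 1e-4
integral_over_M = midpoint(lambda x: math.exp(-x * x), -8.0, 8.0, 4000)  # integral of f(x, 0)
ratio = naive_delta_integral(a, eps) / integral_over_M
print(ratio, 1 / abs(a))
```

For small \(\epsilon\) the ratio approaches \(1/|a|\) rather than \(1\), which is the factor \(1/\sqrt{\det H}\) that the naive delta-function prescription introduces.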
Now there is a straightforward way to eliminate the determinant. We introduce Fermionic coordinates \(\eta^i, \theta^i\), \(i = 1, \ldots, k\). Then by Berezin integration, we have
\[ \int e^{\eta^i G^i(0, \theta^j)} d\theta d\eta = \det A. \]
So in the end, we find
\[ \int_M f(x) d\mu = \int_{\mathbb{R}^n}
f(x) \delta(G(x)) \exp\left(\eta \cdot G(x+ \theta) \right)
dx d\theta d\eta. \]
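The Berezin integral used above can be checked by brute force for small \(k\). The sketch below implements a tiny Grassmann algebra and extracts the top-degree coefficient of \(\exp(\eta^i A_{ij} \theta^j)\). It assumes the sign convention that the Berezin integral picks out the coefficient of \(\eta^1 \theta^1 \cdots \eta^k \theta^k\) (other orderings of \(d\theta\, d\eta\) change the answer by an overall sign), and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def mono_mul(a, b):
    """Multiply two sorted Grassmann monomials: (sign, sorted monomial); sign 0 if the product vanishes."""
    if set(a) & set(b):
        return 0, ()
    merged = list(a + b)
    sign = 1
    for i in range(len(merged)):  # bubble sort; each swap of odd generators flips the sign
        for j in range(len(merged) - 1 - i):
            if merged[j] > merged[j + 1]:
                merged[j], merged[j + 1] = merged[j + 1], merged[j]
                sign = -sign
    return sign, tuple(merged)

def mul(x, y):
    """Product of two Grassmann-algebra elements stored as {monomial: coefficient}."""
    out = {}
    for ma, ca in x.items():
        for mb, cb in y.items():
            sign, m = mono_mul(ma, mb)
            if sign:
                out[m] = out.get(m, 0) + sign * ca * cb
    return out

def berezin_exp(A):
    """Coefficient of eta_1 theta_1 ... eta_k theta_k in exp(sum_ij eta_i A_ij theta_j)."""
    k = len(A)
    Q = {}
    for i in range(k):
        for j in range(k):
            sign, m = mono_mul((2 * i,), (2 * j + 1,))  # eta_i -> 2i, theta_j -> 2j + 1
            Q[m] = Q.get(m, 0) + sign * Fraction(A[i][j])
    term = {(): Fraction(1)}
    total = dict(term)
    for n in range(1, k + 1):  # exp(Q) = sum_n Q^n / n!, and Q^(k+1) = 0
        term = {m: c / n for m, c in mul(term, Q).items()}
        for m, c in term.items():
            total[m] = total.get(m, 0) + c
    return total.get(tuple(range(2 * k)), Fraction(0))

def det(A):
    """Determinant by the Leibniz formula (fine for small matrices)."""
    k = len(A)
    result = 0
    for p in permutations(range(k)):
        sign = 1
        for i in range(k):
            for j in range(i + 1, k):
                if p[i] > p[j]:
                    sign = -sign
        prod = 1
        for i in range(k):
            prod *= A[i][p[i]]
        result += sign * prod
    return result

A3 = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print(berezin_exp(A3), det(A3))
```

Here the exponent is taken to be exactly bilinear, \(\eta^i A_{ij} \theta^j\), which is the leading term of \(\eta \cdot G(x + \theta)\) at a point of \(M\).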
Thursday, December 13, 2012
The Weyl and Wigner Transforms
Today I'd like to try to understand better how deformation quantization is related to the usual canonical quantization, and especially how the latter might be used to deduce the former, i.e., given an honest quantization (in the sense of operators), how might we reproduce the formula for the Moyal star product?
We'll fix our symplectic manifold once and for all to be \(\mathbb{R}^2\) with its standard symplectic structure, with Darboux coordinates \(x\) and \(p\). Let \(\mathcal{A}\) be the algebra of observables on \(\mathbb{R}^2\). For technical reasons, we'll restrict to those smooth functions that are polynomially bounded in the momentum coordinate (but of course the star product makes sense in general). Let \(\mathcal{D}\) be the algebra of pseudodifferential operators on \(\mathbb{R}\). We want to define a quantization map
\[ \Psi: \mathcal{A} \to \mathcal{D} \]
such that
\[ \Psi(x) = x \in \mathcal{D} \]
\[ \Psi(p) = -i\hbar \partial \]
Out of thin air, let us define
\[ \langle q| \Psi(f) |q' \rangle = \int e^{ik(q-q')} f(\frac{q+q'}{2}, k) dk \]
This is the Weyl transform. Its inverse is the Wigner transform, given by
\[ \Phi(A, q, k) = \int e^{-ikq'} \left\langle q+\frac{q'}{2} \right| A \left| q - \frac{q'}{2} \right\rangle dq' \]
Note: I am (intentionally) ignoring all factors of \(2\pi\) involved. It's not hard to work out what they are, but annoying to keep track of them in calculations, so I won't.
Theorem For suitably well-behaved \(f\), we have \( \Phi(\Psi(f)) = f\).
Proof Using the "ignore \(2\pi\)" conventions, we have the formal identities
\[ \int e^{ikx} dx = \delta(k), \ \ \int e^{ikx} dk = \delta(x). \]
The theorem is a formal result of these:
\begin{align} \Phi(\Psi(f))(q, k) &= \int e^{-ikq'} \left\langle q + \frac{q'}{2} \right| \Psi(f) \left| q - \frac{q'}{2} \right\rangle dq' \\
&= \int e^{-ikq'} e^{ik'q'} f(q, k') dk' dq' \\
&= f(q,k).
\end{align}
One may easily check that \(\Psi(x) = x\) and \(\Psi(k) = -i\partial\), so this certainly gives a quantization. But why is it particularly natural? To see this, let \(Q\) be the operator of multiplication by \(x\), and let \(P\) be the operator \(-i\partial\). We'd like to take \(f(q,p)\) and replace it by \(f(Q, P)\), but we can't literally substitute like this due to the ordering ambiguity. However, we could work formally as follows:
\begin{align}
f(Q, P) &= \int \delta(Q-q) \delta(P - p) f(q,p) dq dp \\
&= \int e^{ik(Q-q) + iq'(P-p)} f(q,p) dq dq' dp dk.
\end{align}
In this last expression, there is no ordering ambiguity in the argument of the exponential (since it is a sum and not a product), and furthermore the expression itself makes sense since it is the exponential of a skew-adjoint operator. So let's check that this agrees with the Weyl transform. Using a special case of the Baker-Campbell-Hausdorff formula for the Heisenberg algebra, \(e^{X+Y} = e^X e^Y e^{-[X,Y]/2}\) for central \([X,Y]\), together with \([Q, P] = i\), we have
\[ e^{ik(Q-q) + iq'(P-p)} = e^{ik(Q-q)} e^{iq'(P-p)} e^{ikq'/2} \]
Let us compute the matrix element:
\begin{align}
\langle q_1 | P | q_2 \rangle &= \int \langle q_1 | p_1 \rangle
\langle p_1 | P | p_2 \rangle \langle p_2 | q_2 \rangle dp_1 dp_2 \\
&= \int e^{iq_1p_1 - iq_2 p_2} p_2 \delta(p_2 - p_1) dp_1 dp_2 \\
&= \int e^{i p(q_1-q_2)} p dp.
\end{align}
Hence we find that the matrix element for the exponential is
\begin{align} \langle q_1 |e^{ik(Q-q) + iq'(P-p)} | q_2 \rangle
&= e^{ikq'/2 + ik(q_1-q)} \langle q_1 | e^{iq'(P-p)} | q_2 \rangle \\
&= \int e^{ikq'/2 + ik(q_1-q) -iq'p} e^{iq'p'' + ip''(q_1-q_2)} dp'' \\
&= \delta(q' + q_1 - q_2) e^{ikq'/2 + ik(q_1-q) -iq'p}
\end{align}
Plugging this back into the expression for \(f(Q, P)\) we find
\begin{align}
\langle q_1 | f(Q, P) | q_2 \rangle &= \int \delta(q' + q_1 - q_2) e^{ikq'/2 + ik(q_1-q) -iq'p}
f(q,p) dq dq' dp dk \\
&= \int e^{ ik(q_1/2 +q_2/2-q) +ip(q_1-q_2)} f(q,p) dq dp dk \\
&= \int e^{ip(q_1-q_2)} f\left(\frac{q_1+q_2}{2}, p\right) dp,
\end{align}
which is the original expression we gave for the Weyl transform.
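The special case of Baker-Campbell-Hausdorff used in this computation, \(e^{X+Y} = e^X e^Y e^{-[X,Y]/2}\) whenever \([X,Y]\) commutes with both \(X\) and \(Y\), can be verified exactly in the standard \(3 \times 3\) matrix model of the Heisenberg algebra. The sketch below is only a finite-dimensional stand-in for the operators \(ik(Q-q)\) and \(iq'(P-p)\), whose commutator \(-kq'[Q,P]\) is likewise central; the matrices and helper names are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matadd(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(s, A):
    return [[s * x for x in row] for row in A]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def exp_nilpotent(N):
    """exp(N) = I + N + N^2/2, exact for strictly upper triangular 3x3 N (N^3 = 0)."""
    return matadd(matadd(identity(3), N), scale(Fraction(1, 2), matmul(N, N)))

def commutator(A, B):
    return matadd(matmul(A, B), scale(-1, matmul(B, A)))

# Heisenberg algebra generators: [X, Y] is proportional to the central element E_13.
a, b = Fraction(2), Fraction(3)
X = [[0, a, 0], [0, 0, 0], [0, 0, 0]]
Y = [[0, 0, 0], [0, 0, b], [0, 0, 0]]

lhs = exp_nilpotent(matadd(X, Y))
rhs = matmul(
    matmul(exp_nilpotent(X), exp_nilpotent(Y)),
    exp_nilpotent(scale(Fraction(-1, 2), commutator(X, Y))),
)
print(lhs == rhs)
```

Because everything is done in exact rational arithmetic, the comparison is an identity check rather than a numerical approximation.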
Out of thin air, let us define
\[ \langle q| \Psi(f) |q' \rangle = \int e^{ik(q-q')} f(\frac{q+q'}{2}, k) dk \]
This is the Weyl transform. Its inverse is the Wigner transform, given by
\[ \Phi(A, q, k) = \int e^{-ikq'} \left\langle q+\frac{q'}{2} \right| A \left| q - \frac{q'}{2} \right\rangle dq' \]
Note: I am (intentionally) ignoring all factors of \(2\pi\) involved. It's not hard to work out what they are, but annoying to keep track of them in calculations, so I won't.
Theorem For suitably well-behaved \(f\), we have \( \Phi(\Psi(f)) = f\).
Proof Using the "ignore \(2\pi\)" conventions, we have the formal identities
\[ \int e^{ikx} dx = \delta(k), \ \ \int e^{ikx} dk = \delta(x). \]
The theorem is a formal result of these:
\begin{align} \Phi(\Psi(f))(q, k) &= \int e^{-ikq'} \left\langle q + \frac{q'}{2} \right| \Psi(f) \left| q - \frac{q'}{2} \right\rangle \\\
&= \int e^{-ikq'} e^{ik'q'} f(q, k) dk' dq' \\\
&= f(q,k).
\end{align}
One may easily check that \(\Psi(x) = x\) and \(Psi(k) = -i\partial\), so this certainly gives a quantization. But why is it particularly natural? To see this, let \(Q\) be the operator of multiplication by \(x\), and let \(P\) be the operator \(-i\partial\). We'd like to take \(f(q,p)\) and replace it by \(f(Q, P)\), but we can't literally substitute like this due to order ambiguity. However, we could work formally as follows:
\begin{align}
f(Q, P) &= \int \delta(Q-q) \delta(P - p) f(q,p) dq dp \\\
&= \int e^{ik(Q-q) + iq'(P-p)} f(q,p) dq dq' dp dk.
\end{align}
In this last expression, there is no order ambiguity in the argument of the exponential (since it is a sum and not a product), and furthermore the expression itself makes sense since it is the exponential of a skew-adjoint operator. So let's check that this agrees with the Weyl transform. Using a special case of the Baker-Campbell-Hausdorff formula for the Heisenberg algebra, namely \(e^{A+B} = e^A e^B e^{-[A,B]/2}\) for central \([A,B]\), together with \([Q,P] = i\), we have
\[ e^{ik(Q-q) + iq'(P-p)} = e^{ik(Q-q)} e^{iq'(P-p)} e^{ikq'/2} \]
Let us compute the matrix element:
\begin{align}
\langle q_1 | P | q_2 \rangle &= \int \langle q_1 | p_1 \rangle
\langle p_1 | P | p_2 \rangle \langle p_2 | q_2 \rangle dp_1 dp_2 \\\
&= \int e^{iq_1p_1 - iq_2 p_2} p_2 \delta(p_2 - p_1) dp_1 dp_2 \\\
&= \int e^{i p(q_1-q_2)} p dp.
\end{align}
Hence we find that the matrix element for the exponential is
\begin{align} \langle q_1 |e^{ik(Q-q) + iq'(P-p)} | q_2 \rangle
&= e^{ikq'/2 + ik(q_1-q)} \langle q_1 | e^{iq'(P-p)} | q_2 \rangle \\\
&= \int e^{ikq'/2 + ik(q_1-q) -iq'p} e^{iq'p'' + ip''(q_1-q_2)} dp'' \\\
&= \delta(q' + q_1 - q_2) e^{ikq'/2 + ik(q_1-q) -iq'p}
\end{align}
Plugging this back into the expression for \(f(Q, P)\) we find
\begin{align}
\langle q_1 | f(Q, P) | q_2 \rangle &= \int \delta(q' + q_1 - q_2) e^{ikq'/2 + ik(q_1-q) -iq'p}
f(q,p) dq dq' dp dk \\\
&= \int e^{ ik(q_1/2 +q_2/2-q) +ip(q_1-q_2)} f(q,p) dq dp dk \\\
&= \int e^{ip(q_1-q_2)} f(\frac{q_1+q_2}{2}, p) dp,
\end{align}
which is the original expression we gave for the Weyl transform.
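As a sanity check on the Baker-Campbell-Hausdorff step, here is a small sketch (not part of the original argument) that verifies \(e^{A+B} = e^A e^B e^{-[A,B]/2}\) exactly in the \(3 \times 3\) upper triangular matrix model of the Heisenberg algebra. The matrix model, the helper names and the sample coefficients `s`, `t` are all assumptions made for illustration; the operators \(Q\) and \(P\) above act on an infinite-dimensional space, so only the algebraic identity is being tested.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def exp_nilpotent(a):
    """Exact exponential of a strictly upper triangular matrix (the series terminates)."""
    n = len(a)
    result, term = identity(n), identity(n)
    for k in range(1, n):
        term = scale(Fraction(1, k), matmul(term, a))
        result = matadd(result, term)
    return result

def heisenberg(x, y, z):
    """The element x*X + y*Y + z*Z, where [X, Y] = Z and Z is central."""
    x, y, z = Fraction(x), Fraction(y), Fraction(z)
    zero = Fraction(0)
    return [[zero, x, z], [zero, zero, y], [zero, zero, zero]]

s, t = Fraction(3, 7), Fraction(-5, 2)   # arbitrary sample coefficients
A = heisenberg(s, 0, 0)
B = heisenberg(0, t, 0)
commutator = matadd(matmul(A, B), scale(-1, matmul(B, A)))   # equals s*t*Z

lhs = exp_nilpotent(matadd(A, B))
rhs = matmul(matmul(exp_nilpotent(A), exp_nilpotent(B)),
             exp_nilpotent(scale(Fraction(-1, 2), commutator)))
bch_holds = lhs == rhs
```

In the text the identity is applied with \(A = ik(Q-q)\) and \(B = iq'(P-p)\), whose commutator is a multiple of the identity, which is exactly the situation modeled by the central element \(Z\) here.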
Thursday, November 29, 2012
Equations of Motion and Noether's Theorem in the Functional Formalism
First, let us recall the derivation of the equations of motion and Noether's theorem in classical field theory. We have some action functional \(S[\phi]\) defined by some local Lagrangian:
\[ S[\phi] = \int L(\phi, \partial \phi) dx. \]
The classical equations of motion are just the Euler-Lagrange equations
\[ \frac{\delta S}{\delta \phi(x)} = 0
\iff \partial_\mu \left( \frac{\partial L}{\partial(\partial_\mu\phi)} \right)
= \frac{\partial L}{\partial \phi} \]
Now suppose that \(S\) is invariant under some infinitesimal transformation \(\phi(x) \mapsto \phi(x) + \epsilon \eta(x)\) with \(\epsilon\) constant, so that \(S[\phi] = S[\phi+\epsilon \eta]\). We treat \(\eta\) as a fixed function, and then allow \(\epsilon = \epsilon(x)\) to be an arbitrary infinitesimal function. Even for constant \(\epsilon\) the Lagrangian is not necessarily invariant, but rather can transform with a total derivative. To first order in \(\epsilon\) we have
\[ L(\phi+\epsilon \eta) = L(\phi)
+ \frac{\partial L}{\partial (\partial_\mu \phi)} \eta \partial_\mu \epsilon
+ \epsilon \partial_\mu f^\mu \]
for some vector field \(f^\mu\) (which we could compute given any particular Lagrangian). So let's compute
\begin{align}\delta_\epsilon S &= \int \delta_\epsilon L \\\
&= \int \frac{\partial L}{\partial (\partial_\mu \phi)}
\eta \partial_\mu \epsilon + \epsilon \partial_\mu f^\mu \\\
&= \int \partial_\mu \left(f^\mu - \frac{\partial L}{\partial (\partial_\mu \phi)} \eta \right) \epsilon
\end{align}
Let us define the Noether current \(J^\mu\) by
\[ J^\mu = \frac{\partial L}{\partial (\partial_\mu \phi)} \eta - f^\mu. \]
Then the previous computation showed that
\[ \frac{\delta S}{\delta \epsilon} = -\partial_\mu J^\mu. \]
If \(\phi\) is a solution to the Euler-Lagrange equations, then the variation \(\delta S\) vanishes for every variation of \(\phi\), in particular for this one, hence we obtain:
Theorem (Noether's theorem) The Noether current is divergence free, i.e.
\[ \partial_\mu J^\mu = 0.\]
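To make this concrete, here is a minimal worked example, assuming a massless free scalar (this choice of Lagrangian is an illustration and not part of the derivation above). Take \(L = \frac{1}{2} \partial_\mu \phi \, \partial^\mu \phi\), which is invariant under the constant shift \(\phi \mapsto \phi + \epsilon\), so that \(\eta = 1\) and \(f^\mu = 0\). Then
\[ J^\mu = \frac{\partial L}{\partial (\partial_\mu \phi)} \eta - f^\mu = \partial^\mu \phi, \qquad \partial_\mu J^\mu = \partial_\mu \partial^\mu \phi, \]
and the right-hand side vanishes precisely when the Euler-Lagrange equation \(\partial_\mu \partial^\mu \phi = 0\) holds, as the theorem asserts.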
Functional Version
First, we derive the functional analogue of the classical equations of motion. Consider an expectation value
\[ \langle \mathcal{O}(\phi) \rangle
= \int \mathcal{O}(\phi) e^{\frac{i}{\hbar} S} \mathcal{D}\phi \]
We'll assume that \(\phi\) takes values in a vector space (or bundle). Then we can perform a change of variables \(\psi = \phi + \epsilon\), and since \(\mathcal{D}\phi = \mathcal{D}\psi\) we find that
\[ \int \mathcal{O}(\phi+\epsilon) \exp\left(\frac{i}{\hbar} S[\phi+\epsilon] \right) \mathcal{D}\phi \]
is independent of \(\epsilon\). Expanding to first order in \(\epsilon\), we have
\[ 0 = \int \left(\frac{\delta\mathcal{O}}{\delta \phi}
+ \frac{i \mathcal{O}}{\hbar} \frac{\delta S}{\delta \phi} \right)
\exp \left( \frac{i}{\hbar} S \right) \mathcal{D}\phi \]
So we find the quantum analogue of the equations of motion:
\[ \left\langle \frac{\delta \mathcal{O}}{\delta \phi} \right\rangle
+ \frac{i}{\hbar} \left\langle \mathcal{O} \frac{\delta S}{\delta \phi} \right\rangle = 0\]
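The identity above can be tested numerically in a toy setting. The following sketch assumes a zero-dimensional "field theory" with a single real variable and a Euclidean weight \(e^{-S}\) (that is, \(\hbar = 1\) and \(i/\hbar\) replaced by \(-1/\hbar\)), so that the statement becomes \(\langle \mathcal{O}' \rangle = \langle \mathcal{O} S' \rangle\). The quartic action, the observable and the integration routine are hypothetical choices made only for illustration.

```python
import math

def integrate(g, a=-10.0, b=10.0, n=20000):
    """Composite trapezoid rule on [a, b]; the integrands below decay very fast."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h

# Hypothetical zero-dimensional model: one real variable phi.
def action(phi):
    return 0.5 * phi ** 2 + 0.25 * phi ** 4

def d_action(phi):
    return phi + phi ** 3

def observable(phi):
    return phi

def d_observable(phi):
    return 1.0

def expectation(g):
    """<g> with Euclidean weight exp(-S), normalized by the partition function."""
    norm = integrate(lambda phi: math.exp(-action(phi)))
    return integrate(lambda phi: g(phi) * math.exp(-action(phi))) / norm

lhs = expectation(d_observable)                                  # <dO/dphi>
rhs = expectation(lambda phi: observable(phi) * d_action(phi))   # <O dS/dphi>
residual = abs(lhs - rhs)
```

The two sides agree because the integrand \(\frac{d}{d\phi}\left(\mathcal{O} e^{-S}\right)\) is a total derivative, which is the finite-dimensional shadow of the change of variables \(\psi = \phi + \epsilon\).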
Next, we move on to the quantum version of Noether's theorem. Suppose there is a transformation \(Q\) of the fields leaving the action invariant. Assuming the path integral measure is invariant, we obtain
\[ \left\langle Q\mathcal{O} \right \rangle + \frac{i}{\hbar} \left\langle \mathcal{O} \, QS \right\rangle = 0\]
To compare with the classical result, consider \(Q\) to be the (singular) operator
\[ Q = \frac{\delta}{\delta \epsilon(x)} \]
Then by the previous calculations,
\[ Q S = -\partial_\mu J^\mu, \]
so we obtain
\[ \left\langle \frac{\delta \mathcal{O}}{\delta \epsilon(x)} \right\rangle
= \frac{i}{\hbar} \left\langle \mathcal{O} \partial_\mu J^\mu \right\rangle. \]
This is the Ward-Takahashi identity, the quantum analogue of Noether's theorem.
Saturday, November 24, 2012
The Moyal Product
Today I want to understand the Moyal product, as we will need to understand it in order to construct quantizations of symplectic quotients. (More precisely, to incorporate stability conditions.)
Let \(A\) be the algebra of polynomial functions on \(T^\ast \mathbb{C}^n\). This algebra has a natural Poisson bracket, given by
\[ \{p_i, x_j\} = \delta_{ij}. \]
We would like to define a new associative product \(\ast\) on \(A((\hbar))\) satisfying:
1. \(f \ast g = fg + O(\hbar) \)
2. \(f \ast g - g \ast f = \hbar \{f, g\} + O(\hbar^2)\)
3. \(1 \ast f = f \ast 1 = f\)
4. \((f \ast g)^\ast = g^\ast \ast f^\ast\)
In the last line, the map \((\cdot)^\ast\) takes \(x_i \mapsto x_i\) and \(p_i \mapsto -p_i\). To figure out what this new product should be, let's take \(f,g \in A\) and expand \(f \ast g\) in power series:
\[ f \ast g = \sum_{n=0}^\infty c_n(f,g) \hbar^n \]
Now, equations (1) and (2) will be satisfied by taking \(c_0(f,g) = fg\) and \(c_1(f,g) = \{f,g\}/2\). Let \(\sigma\) be the Poisson bivector defining the Poisson bracket. This defines a differential operator \(\Pi\) on \(A \otimes A\) by
\[ \Pi = \sigma^{ij} (\partial_i \otimes \partial_j) \]
Let \(B = \sum_{n=0}^\infty B_n \hbar^n\) and write the product as
\[ f \ast g = m \circ B(f \otimes g). \]
Now, conditions (1) and (2) tell us that \(B(0) = 1\) and that
\[ \left. \frac{dB}{d\hbar} \right|_{\hbar=0} = \frac{\Pi}{2} \]
So
\[ B = 1 + \frac{\hbar \Pi}{2} + O(\hbar^2) \]
It is natural to guess that \(B\) should be built out of powers of \(\Pi\), and a natural guess is
\[ B = \exp(\frac{\hbar \Pi}{2}), \]
which certainly reproduces the first two terms of our expansion. Let's see that this choice actually works, i.e. defines an associative \(\ast\)-product. Let \(m: A \otimes A \to A\) be the multiplication, and \(m_{12}, m_{23}: A \otimes A \otimes A \to A \otimes A\), \(m_{123}: A \otimes A \otimes A \to A\) the induced multiplication maps. On \(A \otimes A \otimes A\), write \(\partial_i^1\) for the partial derivative acting on the first factor, \(\partial_i^2\) on the second, etc., and set
\[ \Pi_{ab} = \sigma^{ij} \partial^a_i \partial^b_j, \qquad
B_{ab} = \exp\left(\frac{\hbar \Pi_{ab}}{2}\right) = \sum_n \frac {\hbar^n}{2^n n!}
\sigma^{i_1 j_1} \cdots \sigma^{i_n j_n} \partial^a_{i_1} \partial^b_{j_1} \cdots
\partial^a_{i_n} \partial^b_{j_n}. \]
By the Leibniz rule, \(\Pi \circ m_{23} = m_{23} \circ (\Pi_{12} + \Pi_{13})\) and \(\Pi \circ m_{12} = m_{12} \circ (\Pi_{13} + \Pi_{23})\). Since the \(\sigma^{ij}\) are constant, the operators \(\Pi_{ab}\) commute with one another, so \(B \circ m_{23} = m_{23} \circ B_{12} B_{13}\) and \(B \circ m_{12} = m_{12} \circ B_{13} B_{23}\). Then
\begin{align}
f \ast (g \ast h) &= m \circ B( f \otimes m \circ B(g \otimes h) ) \\\
&= m \circ B \circ m_{23} \circ B_{23} (f \otimes g \otimes h) \\\
&= m_{123} \circ B_{12} B_{13} B_{23} (f \otimes g \otimes h)
\end{align}
On the other hand, we have
\begin{align}
(f \ast g) \ast h &= m \circ B( m \circ B(f \otimes g) \otimes h ) \\\
&= m \circ B \circ m_{12} \circ B_{12} (f \otimes g \otimes h) \\\
&= m_{123} \circ B_{13} B_{23} B_{12} (f \otimes g \otimes h)
\end{align}
Hence, associativity follows from
\[ m_{123} \circ [B_{12}, B_{13} B_{23}] = 0, \]
which holds because the \(B_{ab}\) commute.
Hence we obtain an associative \(\ast\)-product. This is called the Moyal product.
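The construction can also be checked by brute force on examples. Below is a minimal sketch, assuming a single pair of variables \((x, p)\) with \(\{p, x\} = 1\), so that \(\Pi = \partial_p \otimes \partial_x - \partial_x \otimes \partial_p\). Polynomials in \(x\), \(p\), \(\hbar\) are stored as dictionaries mapping exponent triples to exact rational coefficients; this representation, the function names and the test polynomials are all hypothetical choices for illustration.

```python
from fractions import Fraction
from math import factorial

def falling(a, k):
    """Falling factorial a (a-1) ... (a-k+1), the coefficient of d^k/dt^k t^a."""
    out = 1
    for i in range(k):
        out *= a - i
    return out

def star(f, g):
    """Moyal product for one pair (x, p); keys are (deg_x, deg_p, deg_hbar)."""
    result = {}
    for (a, b, m), cf in f.items():
        for (c, d, n), cg in g.items():
            # Expanding exp(hbar*Pi/2): k copies of d_p (x) d_x and
            # j copies of -(d_x (x) d_p), weighted by (1/2)^(j+k) / (j! k!).
            for j in range(min(a, d) + 1):
                for k in range(min(b, c) + 1):
                    coeff = (cf * cg * (-1) ** j
                             * falling(a, j) * falling(d, j)
                             * falling(b, k) * falling(c, k)
                             * Fraction(1, 2 ** (j + k) * factorial(j) * factorial(k)))
                    key = (a + c - j - k, b + d - j - k, m + n + j + k)
                    result[key] = result.get(key, 0) + coeff
    return {key: c for key, c in result.items() if c != 0}

def subtract(f, g):
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) - c
    return {key: c for key, c in out.items() if c != 0}

x = {(1, 0, 0): Fraction(1)}
p = {(0, 1, 0): Fraction(1)}
commutator = subtract(star(p, x), star(x, p))           # should be hbar

f = {(2, 1, 0): Fraction(1)}                            # x^2 p
g = {(1, 2, 0): Fraction(1), (1, 0, 0): Fraction(3)}    # x p^2 + 3 x
h = {(0, 3, 0): Fraction(1), (2, 0, 0): Fraction(-1)}   # p^3 - x^2
associative = star(star(f, g), h) == star(f, star(g, h))
```

Exact rational arithmetic is used so that the associativity comparison is an equality of polynomials rather than a floating point approximation.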
Sheafifying the Construction
Now suppose that \(U\) is a (Zariski) open subset of \(X = T^\ast \mathbb{C}^n\). Then the star product induces a well-defined map
\[ \ast: O_X(U)((\hbar)) \otimes_\mathbb{C} O_X(U)((\hbar)) \to O_X(U)((\hbar)) \]
In this way we obtain a sheaf \(\mathcal{D}\) of \(O_X\) modules with a non-commutative \(\ast\)-product defined as above.
Define a \(\mathbb{C}^\ast\) action on \(T^\ast \mathbb{C}^n\) by acting on \(x_i\) and \(p_i\) with weight 1. Extend this to an action on \(\mathcal{D}\) by acting on \(\hbar\) with weight -1.
Proposition: The algebra of \(\mathbb{C}^\ast\)-invariant global sections of \(\mathcal{D}\) is naturally identified with the algebra \(\mathbb{D}\) of differential operators on \(\mathbb{C}^n\).
Proof: The \(\mathbb{C}^\ast\)-invariant global sections are generated by \(\hbar^{-1} x_i\) and \(\hbar^{-1} p_i\). So define a map \(\Gamma(\mathcal{D})^{\mathbb{C}^\ast} \to \mathbb{D}\) by
\[ \hbar^{-1} x_i \mapsto x_i \]
\[ \hbar^{-1} p_i \mapsto \partial_i \]
From the definition of the star product, it is clear that this is an algebra map, and that it is both injective and surjective.
Thursday, November 22, 2012
An Exercise in Quantum Hamiltonian Reduction
Semiclassical Setup
Let the group \(GL(2)\) act on \(V = \mathrm{Mat}_{2\times n}\) and consider the induced symplectic action on \(T^\ast V\). If we use variables \((x,p)\) with \(x\) a \(2 \times n\) matrix and \(p\) an \(n \times 2\) matrix, then the classical moment map \(\mu\) is given by
\[ \mu(x,p) = xp \]
This is equivariant with respect to the adjoint action, so we can form the \(GL(2)\)-invariant functions
\[ Z_1 = \mathrm{Tr} \mu \]
\[ Z_2 = \mathrm{Tr} (\mu^2) \]
If we think of \(x\) as being made of column vectors
\[ x = ( x_1 \cdots x_n ) \]
and similarly think of \(p\) as being made of row vectors, then there are actually many more \(GL(2)\) invariants, given by
\[ f_{ij} = \mathrm{Tr} x_i p_j = p_j x_i \]
In terms of the invariants, the \(Z\) functions are
\[ Z_1 = \sum_k f_{kk} \]
\[ Z_2 = \sum_{jk} f_{jk} f_{kj} \]
Let us compute Poisson brackets:
\begin{align}
\{f_{ij}, f_{kl}\} &= \{p_j^\mu x_i^\mu, p_l^\nu x_k^\nu\} \\\
&= x_i^\mu p_l^\nu \delta_{jk} \delta^{\mu\nu} - p_j^\mu x_k^\nu \delta_{il} \delta^{\mu\nu} \\\
&= f_{il} \delta_{jk} - f_{kj} \delta_{il}.
\end{align}
So we see that the invariants form a Poisson subalgebra (as they should!). Let's compute:
\begin{align}
\{Z_1, f_{ij}\} &= \sum_k \{ f_{kk}, f_{ij} \} \\\
&= \sum_k \left( f_{kj} \delta_{ki} - f_{ik} \delta_{kj} \right) \\\
&= f_{ij} - f_{ij} = 0.
\end{align}
Hence \(Z_1\) is central with respect to the invariant functions \(f_{ij}\). Similarly,
\begin{align}
\{Z_2, f_{kl}\} &= \sum_{ij} \{f_{ij} f_{ji}, f_{kl}\} \\\
&= \sum_{ij} f_{ij} \left(f_{jl} \delta_{ik} - f_{ki} \delta_{jl} \right) + f_{ji} \left(f_{il} \delta_{jk} - f_{kj} \delta_{il} \right) \\\
&= \sum_j f_{kj} f_{jl} - \sum_i f_{il} f_{ki} + \sum_i f_{ki} f_{il} - \sum_j f_{jl} f_{kj} \\\
&= 0.
\end{align}
So we see that the \(Z_i\) are in the center of the invariant algebra. In fact, they generate it, so we'll denote by \(Z\) the algebra generated by \(Z_1, Z_2\). Let \(A\) be the algebra generated by the \(f_{ij}\). The inclusion \(Z \hookrightarrow A\) can be thought of as a purely algebraic version of the moment map. In particular, given any character \(\lambda: Z \to \mathbb{C}\), we can define the Hamiltonian reduction of \(A\) to be
\[ A_\lambda := A / A\langle \ker \lambda \rangle \]
The corresponding space is of course \(\mathrm{Spec} A_\lambda\).
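The Poisson brackets computed in this section can be verified mechanically. The sketch below assumes \(n = 2\) and represents polynomials in the entries \(x_i^\mu, p_i^\mu\) as dictionaries from exponent tuples to integer coefficients; the variable ordering, the helper names and the choice of \(n\) are hypothetical and only serve the illustration. It checks the relation \(\{f_{ij}, f_{kl}\} = f_{il} \delta_{jk} - f_{kj} \delta_{il}\) and that \(Z_1, Z_2\) Poisson-commute with every \(f_{kl}\).

```python
from itertools import product

N = 2                 # number of columns n of x (a small hypothetical value)
NVARS = 4 * N         # 2n coordinates x_i^mu followed by 2n momenta p_i^mu

def x_index(i, mu):
    return 2 * i + mu

def p_index(i, mu):
    return 2 * N + 2 * i + mu

def monomial(*variables):
    exps = [0] * NVARS
    for v in variables:
        exps[v] += 1
    return {tuple(exps): 1}

def add(f, g, sign=1):
    out = dict(f)
    for e, c in g.items():
        out[e] = out.get(e, 0) + sign * c
    return {e: c for e, c in out.items() if c != 0}

def mul(f, g):
    out = {}
    for (e1, c1), (e2, c2) in product(f.items(), g.items()):
        e = tuple(a + b for a, b in zip(e1, e2))
        out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def diff(f, v):
    """Partial derivative with respect to variable number v."""
    return {e[:v] + (e[v] - 1,) + e[v + 1:]: c * e[v]
            for e, c in f.items() if e[v] > 0}

def bracket(f, g):
    """Poisson bracket normalized so that {p, x} = 1 for each conjugate pair."""
    out = {}
    for i, mu in product(range(N), range(2)):
        x, p = x_index(i, mu), p_index(i, mu)
        out = add(out, mul(diff(f, p), diff(g, x)))
        out = add(out, mul(diff(f, x), diff(g, p)), sign=-1)
    return out

def f_inv(i, j):
    """The invariant f_ij = sum_mu p_j^mu x_i^mu."""
    out = {}
    for mu in range(2):
        out = add(out, monomial(p_index(j, mu), x_index(i, mu)))
    return out

def expected(i, j, k, l):
    """Right-hand side f_il delta_jk - f_kj delta_il."""
    out = f_inv(i, l) if j == k else {}
    return add(out, f_inv(k, j), sign=-1) if i == l else out

idx = range(N)
Z1, Z2 = {}, {}
for j in idx:
    Z1 = add(Z1, f_inv(j, j))
    for k in idx:
        Z2 = add(Z2, mul(f_inv(j, k), f_inv(k, j)))

relations_ok = all(bracket(f_inv(i, j), f_inv(k, l)) == expected(i, j, k, l)
                   for i, j, k, l in product(idx, repeat=4))
central_ok = all(bracket(Z, f_inv(k, l)) == {}
                 for Z in (Z1, Z2) for k, l in product(idx, repeat=2))
```

Changing `N` repeats the same check for other values of \(n\), at the cost of a larger polynomial ring.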
The Cartan Algebra and the Center
Define functions
\[ h_1 = Z_1 = \sum_i f_{ii} \]
\[ h_2 = Z_2 = \sum_{ij} f_{ij} f_{ji} \]
\[ h_3 = \sum_{ijk} f_{ij} f_{jk} f_{ki} \]
\[ h_k = \sum_{i_1, i_2, \ldots, i_k} f_{i_1 i_2} f_{i_2 i_3} \cdots f_{i_k i_1} \]
These are just the traces of various powers of the \(n \times n\) matrix \(px\). In particular, \(h_k\) for \(k>n\) may be expressed as a function of the \(h_i\) for \(i \leq n\). The algebra \(H\) generated by the \(h_k\) plays the role of a Cartan subalgebra. So we have inclusions
\[ Z \subset H \subset A \]
Quantization
Now we wish to construct a quantization of \(A\) and \(A_\lambda\). The quantization of \(A\) is obvious: we quantize \(T^\ast V\) by taking the algebraic differential operators on \(V\). Denote this algebra by \(\mathbb{D}\). It is generated by \(x_i\) and \(\partial_i\) satisfying the relation
\[ [\partial_i, x_j] = \delta_{ij} \]
Then we simply take the subalgebra of \(GL(2)\)-invariant differential operators as our quantization of \(A\). Call this subalgebra \(U\). We can define Hamiltonian reduction analogously by taking central quotients. So we need to understand the center \(Z(U)\), but this is just the subalgebra generated by quantizations of \(Z_1\) and \(Z_2\), i.e. the subalgebra of all elements whose associated graded lies in \(Z(A)\).
More to come: stability conditions, \(\mathbb{D}\)-affineness, and maybe proofs of some of my claims.
Wednesday, November 7, 2012
The 1PI Effective Action
In this post I'd like to try to understand the 1PI effective action that is often of interest. Suppose we have a QFT in some bosonic field \(\phi(x)\) taking values in a vector space (this is important). Then its vev \(\phi_{cl}(x) := \langle \phi(x) \rangle\) is just an ordinary (but possibly distributional) field on spacetime. The question is, what is the field equation satisfied by \(\phi_{cl}\)? I.e., if we average over quantum effects by replacing all fields by their vevs, what is the action that governs this (now completely classical) theory? The 1PI effective action answers exactly this question.
Consider the generating functional
\[ Z[J] = \int e^{-S[\phi]+\langle \phi, J \rangle} \mathcal{D}\phi \]
Then for a given source \(J\), define the \(J\)-vev of \(\phi(x)\) to be
\[ \phi_J(x) = \frac{\partial \log Z[J]}{\partial J}. \]
Now let's take \(\Gamma\) to be the Legendre transform of \(\log Z\) with respect to \(J\):
\[ \Gamma[\phi_J] = \langle J, \phi_J \rangle - \log Z[J] \]
Then we compute:
\[ \frac{\partial \Gamma}{\partial \phi_J} = J + \frac{\partial J}{\partial \phi_J} \phi_J
-\frac{\partial \log Z[J]}{\partial J} \frac{\partial J}{\partial \phi_J} = J. \]
Now consider the situation without a background source, i.e. \(J = 0\). Then \(\phi_0 = \phi_{cl}\) and we find
\[ \frac{\partial \Gamma}{\partial \phi_{cl}} = 0 \]
Hence, \(\phi_{cl}\) satisfies the Euler-Lagrange equations associated to the functional \(\Gamma\). Note that from the Legendre transform, \(\Gamma\) takes quantum effects (i.e. Feynman diagrams with loops) into account, even though the field and the equations are purely classical!
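As a quick check of these formulas, consider a sketch in zero dimensions (an assumption made purely for illustration), where the "field" is a single real variable and \(S[\phi] = \frac{1}{2} m \phi^2\) with \(m > 0\). The Gaussian integral gives
\[ \log Z[J] = \frac{J^2}{2m} + \mathrm{const}, \qquad \phi_J = \frac{\partial \log Z[J]}{\partial J} = \frac{J}{m}, \]
so that
\[ \Gamma[\phi_J] = J \phi_J - \log Z[J] = m \phi_J^2 - \frac{m \phi_J^2}{2} - \mathrm{const} = \frac{1}{2} m \phi_J^2 - \mathrm{const}. \]
Thus \(\partial \Gamma / \partial \phi_J = m \phi_J = J\), and for this free theory \(\Gamma\) agrees with \(S\) up to a constant: without interactions there are no loop corrections to the field equation.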
By studying these equations, we might find instanton solutions (or solitons in Lorentz signature).
Now for the name. Some combinatorics and algebra (which I will skip!) show that \(\Gamma[\phi_{cl}]\) is itself a generating functional for certain correlation functions, the 1PI correlation functions:
\[ \frac{\partial^n \Gamma}{\partial \phi(x_1) \cdots \partial \phi(x_n)} = \langle \phi(x_1) \cdots \phi(x_n) \rangle_{1PI}. \]
The 1PI subscript means that the RHS is computed in perturbation theory by summing over only the connected 1PI (1 particle irreducible) Feynman diagrams.
Warning: As usual, there are regularization issues, both in the UV and IR. UV divergences can be solved by a cutoff (if we only care about effective field theory), but IR divergences are much more technical. For this reason (and others), it is sometimes preferable to try to understand the low energy dynamics by studying the Wilsonian effective action. As the Wilsonian effective action does not take IR modes into account, it can avoid many of the difficulties of the 1PI effective action.
Monday, November 5, 2012
Spontaneous Symmetry Breaking in QFT
In this post I want to try to understand symmetry breaking and the origin of the moduli space of vacua. Most of this can be found in the lectures by Witten in vol. 2 of Quantum Fields and Strings.
Non-Example: Quantum Mechanics
The main point of confusion for me is that my quantum intuition comes from ordinary quantum mechanics. However, this turns out to be incredibly misleading because for most reasonable quantum mechanical systems, spontaneous symmetry breaking cannot occur. In fact, we'll see that even in field theory, the question of whether spontaneous symmetry breaking can occur is intimately related to the geometry of spacetime. Since quantum mechanics is QFT in \(0+1\) dimensions (i.e., the spatial part of spacetime is just a point), spontaneous symmetry breaking is forbidden.
Consider a particle in one spatial dimension, with Hamiltonian
\[ H = -\frac{\hbar^2}{2} \frac{d^2}{dx^2} + (a^2-x^2)^2. \]
The classical ground states are given by the stationary solutions \(x(t) = \pm a\). Hence we might expect that the quantum Hamiltonian has a degenerate ground state, i.e. the eigenspace of the lowest eigenvalue has dimension greater than one. However, this is not the case!
Sketch of proof: Define a function \(E(\phi) = (\phi, H\phi)\) on the unit sphere in \(L^2(\mathbb{R})\). If \(\phi\) is a global minimum, then it necessarily satisfies \(H\phi = E_0 \phi\) where \(E_0\) is the lowest eigenvalue of \(H\). On the other hand, \(E(\phi) = E(|\phi|)\) so \(|\phi|\) is also a global minimum, hence satisfies \(H|\phi| = E_0|\phi|\). This equation is elliptic, so by elliptic regularity \(|\phi|\) must be at least \(C^1\), and hence \(\phi(x)\) has constant phase, so we might as well take \(\phi(x)\) to be real and positive. Any other ground state \(\psi\) would have these properties, and hence \((\phi,\psi) \neq 0\). If the eigenspace had dimension greater than one, we could choose two orthogonal ground states, which is a contradiction. Hence the eigenspace is 1-dimensional and the ground state is not degenerate.
Now by the Stone-von Neumann theorem, there is a unique irreducible representation of the canonical commutation relations on a separable Hilbert space, and by the argument above there is a unique (up to scale) vector \(|\Omega\rangle\) which is the ground state of the Hamiltonian, called the vacuum vector. So for QM, we find a unique representation on \(\mathcal{H}\) together with a unique vacuum vector \(|\Omega\rangle \in \mathcal{H}\). The point I want to stress here is that the Poisson algebra of observables together with the Hamiltonian determine the data \((\mathcal{H}, |\Omega\rangle)\) in an essentially unique way, so there is no ambiguity in quantization and no further choices need to be made.
QFT in Finite Volume
Now we'll argue that a similar phenomenon (a unique vacuum, hence no spontaneous symmetry breaking) should be expected whenever the spatial part of spacetime has finite volume. We'll have to use formal path integral arguments, so of course this won't be totally rigorous. Suppose we have two representations \(\mathcal{H}_\pm\) with vacua \(|\Omega\rangle_\pm\). Then we consider the direct sum \(\mathcal{H} = \mathcal{H}_+ \oplus \mathcal{H}_-\). Then by construction, we should have
\[ (\Omega_+, e^{-tH} \Omega_-) = 0 \]
since \(|\Omega\rangle_\pm\) are orthogonal eigenstates of \(H\). On the other hand, we can compute this inner product using the Feynman-Kac formula. The semi-classical approximation to the path integral yields
\[ (\Omega_+, e^{-tH} \Omega_-) = C \exp\left(- \frac{S(t) V}{\hbar} \right) \]
where \(S(t)\) is the classical least action and \(V\) is the volume of space. If \(V < \infty\), the right hand side is non-zero, contradicting our assumptions! Hence the vacuum is non-degenerate. So we find that QFT with finite spatial volume is much like QM, at least as far as symmetry breaking is concerned.
Note that this argument is essentially just a formal manipulation of the path integral, so you should expect a result of this form independent of the particular regularization scheme used to define the path integral.
QFT in Infinite Volume
Now we consider the case \(V = \infty\), i.e. a non-compact space. Then the preceding argument fails spectacularly, as does the Stone-von Neumann theorem. So there is no guarantee of a unique irreducible representation of the algebra, and no guarantee of a unique vacuum vector. So we see that the situation is significantly more complicated. We can expect a moduli space \(\mathcal{M}\) of vacua of the theory, and the low energy effective theory is described by a \(\sigma\)-model with target \(\mathcal{M}\). I'll try to discuss this in more detail in follow-up posts.
Sunday, November 4, 2012
Seiberg-Witten Theory and the Riemann-Hilbert Problem
The Classical Moduli Space of Vacua
For definiteness, we'll consider just the case of \(SU(2)\) treated by Seiberg and Witten. There is a scalar Higgs field \(\phi\). The classical vacua of the theory are given by the absolute minima of the potential energy, which in this case is proportional to
\[ \mathrm{Tr}[\phi, \phi^\dagger]^2 \]
Hence at the minimum, \([\phi,\phi^\dagger]=0\) and \(\phi\) is diagonalizable. Hence the classical moduli space of vacua \(\mathcal{M}_{cl}\) is just \(\mathbb{C}\), with complex coordinate \(a\), corresponding to the Higgs field
\[ \phi = \left( \begin{array}{rr}a & 0 \\ 0 & -a\end{array} \right) \]
Actually, due to gauge invariance, it is better to introduce another copy of the complex plane \(\mathcal{B}\) with local coordinate \(u = \frac{1}{2} \mathrm{Tr}\, \phi^2 = a^2\). Then we can think of \(\mathcal{M}_{cl}\) as a branched cover of \(\mathcal{B}\), with \(a\) a (local) choice of square root of \(u\).
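As a quick check of the coordinate \(u\), the following sketch evaluates \(\frac{1}{2}\mathrm{Tr}\,\phi^2\) on the diagonal Higgs field and confirms that \(a\) and \(-a\) give the same \(u\). The helper names and the sample value of \(a\) are made up for illustration.

```python
def half_trace_of_square(phi):
    """Return (1/2) Tr(phi^2) for a 2x2 matrix given as nested lists."""
    (p, q), (r, s) = phi
    # For a 2x2 matrix, Tr(phi^2) = p^2 + 2*q*r + s^2.
    return 0.5 * (p * p + 2 * q * r + s * s)

def higgs_vev(a):
    """Diagonal Higgs field diag(a, -a) labelled by the complex coordinate a."""
    return [[a, 0], [0, -a]]

a = 1.5 + 0.5j  # hypothetical sample point on the a-plane
u = half_trace_of_square(higgs_vev(a))
u_reflected = half_trace_of_square(higgs_vev(-a))

if __name__ == "__main__":
    print(u, a * a, u_reflected)
```

Both sign choices of \(a\) land on the same point of \(\mathcal{B}\), which is the two-to-one branched cover described above.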
The goal is to understand the low energy effective theory. We introduce a cutoff \(\Lambda\) to define the quantum theory, and integrate out all degrees of freedom except for the low momentum modes of \(\phi\) (in particular, we integrate out the gauge field d.o.f.). The result is a \(\sigma\)-model with target \(\mathcal{M}_{cl}\). The kinetic term of the \(\sigma\)-model is governed by the metric on \(\mathcal{M}_{cl}\), hence the low energy effective action determines a metric on \(\mathcal{M}_{cl}\).
We'll see that 1-loop calculations introduce monodromy, so that in the quantum theory, "functions" on \(\mathcal{M}_{cl}\) are actually sections of non-trivial bundles over \(\mathcal{M}_{cl}\), and furthermore that the metric receives corrections from instantons (or BPS states). So what we really would like to understand/construct is the quantum moduli space of vacua \(\mathcal{M}\), which will be some non-trivial modification of \(\mathcal{M}_{cl}\). The key to the Seiberg-Witten solution is that susy allows us to reduce the problem to finding a specified set of holomorphic functions (in the \(u\) coordinate) satisfying certain monodromies, and that once we know the monodromies the solution is given to us by the Riemann-Hilbert correspondence.
The Riemann-Hilbert Correspondence
Let \(X\) be \(\mathbb{P}^1\) with punctures at the points \(z_1, \ldots, z_n\). Let \(U\) be the universal cover of \(X\) and let \(G\) be the fundamental group of \(X\) (pick some basepoint away from the punctures). A set of monodromy matrices is exactly what is needed to specify a representation \(V\) of \(G\). Since \(U / G = X\), we can form the associated bundle \(E = U \times_G V\) over \(X\). The (trivial) \(G\)-connection on \(U \to X\) induces a flat connection \(\nabla\) on \(E\). This gives a map from representations of \(G\) to flat connections on \(X\). Conversely, given a flat connection on \(X\), the monodromy about the punctures determines a representation of \(G\). Hence monodromy is a map from flat connections on \(X\) to representations of \(G\). The Riemann-Hilbert correspondence is that these two maps are bijections, modulo the natural notions of equivalence (conjugacy and gauge transformations).
Gross Overview of the Seiberg-Witten Approach
We are now ready to sketch the "big picture" idea of Seiberg and Witten, which applies not only to their \(N=2, d=4\) example but also to certain other compactifications of the \(N=1, d=6\) theory (in particular, the one considered by Gaiotto-Moore-Neitzke). As discussed above, the theory will have a classical moduli space of vacua \(\mathcal{M}\), which turns out to be a complex manifold (or variety, and possibly with singularities). We'll let \(u\) be an abstract local complex coordinate on \(\mathcal{M}\). Supersymmetry then tells us that the main quantities we are interested in (to compute the low energy effective action) are holomorphic in \(u\) (away from the singularities/punctures of \(\mathcal{M}\)!). The general outline is as follows:
- Identify functions \(f_i(u)\) which by susy are holomorphic in \(u\).
- Compute the 1-loop corrections to \(f_i(u)\).
- Compute monodromies of the corrected \(f_i(u)\).
- Find the desired \(f_i(u)\) by solving the Riemann-Hilbert problem for these monodromies.
Now, to be more clear, it is a consequence of susy that the renormalized quantities \(f_i(u)\) are given schematically by
\[ f_{i, \mathrm{ren}}(u) = f_{i, \mathrm{cl}}(u) + f_{i,1}(\frac{u}{\Lambda})
+ \sum_{k=0}^\infty c_{i,k} \left(\frac{\Lambda}{u} \right)^k \]
Here, \(f_{i,\mathrm{cl}}(u)\) is the classical function, \(f_{i,1}(u)\) is the one-loop correction, and the terms in the series are corrections coming from instantons (BPS states). Non-renormalization theorems due to susy guarantee that there are no higher loop corrections. One expects the instanton series to converge, and hence the monodromy is completely determined by the one-loop calculation. This is the key: by Riemann-Hilbert, the monodromy determines the \(f_i(u)\) uniquely--solving the Riemann-Hilbert problem is equivalent to computing the infinitely many instanton corrections!
Now in general, solving the Riemann-Hilbert problem is difficult, so this reduction is of a theoretical but not necessarily practical nature. The second main idea of Seiberg and Witten is that we can solve this Riemann-Hilbert problem explicitly by introducing a family of curves \(\{C_u\}_{u\in\mathcal{B}}\), called Seiberg-Witten curves (or spectral curves).
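The claim that a convergent series in \(\Lambda/u\) cannot contribute to the monodromy at large \(u\) can be illustrated with a toy function. The sketch below assumes, purely for illustration, that the one-loop term is a logarithm \(\log(u/\Lambda)\) and uses made-up coefficients \(c_k\); it continues the function once around a large circle and measures the jump.

```python
import cmath
import math

def toy_f(log_u, scale, coeffs):
    """Toy f(u) = log(u/scale) + sum_k c_k (scale/u)^k, evaluated on a chosen branch.

    The branch is specified by passing log(u) directly, so that continuing
    around a loop simply adds 2*pi*i to log_u.  The logarithmic form of the
    one-loop term and the coefficients are assumptions of this sketch.
    """
    u = cmath.exp(log_u)
    series = sum(c * (scale / u) ** k for k, c in enumerate(coeffs))
    return (log_u - math.log(scale)) + series

def monodromy_shift(radius, scale, coeffs):
    """Change of toy_f after one counterclockwise loop |u| = radius."""
    start = complex(math.log(radius), 0.0)
    end = complex(math.log(radius), 2.0 * math.pi)
    return toy_f(end, scale, coeffs) - toy_f(start, scale, coeffs)

if __name__ == "__main__":
    print(monodromy_shift(5.0, 1.0, [0.3, -1.2, 0.7]))
    print(monodromy_shift(5.0, 1.0, [9.0, 4.0, -2.5]))
```

The jump is \(2\pi i\) (up to rounding) for either choice of coefficients: the series is single-valued, so only the logarithm is seen by the monodromy.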
Electric-Magnetic Duality
An absolutely key requirement of the Seiberg-Witten construction is electric-magnetic duality. Maxwell's equations in vacuum are
\[ dF = 0, \ \ \ d\ast F = 0. \]
Here \(F\) is a 2-form, and \(\ast F\) is its Hodge dual, a \((d-2)\)-form in \(d\)-dimensions. The first equation implies that \(F = dA\) for some 1-form \(A\), and we normally think of the second equation as the Euler-Lagrange equations for the action written in terms of \(A\). However, we could equally well take the starting point to be the second equation, taking \(\ast F = dB\), and take the first equation to be the Euler-Lagrange equations for \(B\). The problem with either of these approaches is that they allow particles of either electric or magnetic charge, but not both.
To put electric and magnetic charge on equal footing, we introduce fields \(F\) and \(F_D\) (a 2-form and a \((d-2)\)-form). Then the Lagrangian is (up to factors that I'm too lazy to care about)
\[ \mathcal{L} = \mathrm{Tr} F \wedge F_D \]
However, to recover Maxwell's equations, we need to impose \(\ast F_D = F\) as a constraint. So to get the right equations of motion, introduce an auxiliary field \(\lambda\) (a Lagrange multiplier), and modify the Lagrangian:
\[ \mathcal{L} = \mathrm{Tr} F \wedge F_D + \lambda(F - \ast F_D) \]
Variation with respect to \((F, F_D, \lambda)\) will reproduce Maxwell's equations exactly, but in this form the EM duality is manifest. Since EM duality exchanges electric and magnetic charges, we should consider how to modify the Lagrangian to couple the field to EM sources. Let \(J_e, J_m\) be the electric and magnetic currents, respectively. Up to conventions, Maxwell's equations read
\[ dF = J_m, \ \ \ d F_D = J_e. \]
Then we take the Lagrangian to be
\[ \mathcal{L} = \mathrm{Tr} F \wedge F_D + \lambda(F - \ast F_D) + F \wedge J_e + J_m \wedge F_D \]
to reproduce the right equations of motion.
In this form, we can consider particles with electric or magnetic charge (or both--dyons). If our gauge group has rank \(r\), then the lattice of electric charges is \(\mathbb{Z}^r\), while the lattice of magnetic charges is \((\mathbb{Z}^\ast)^r\). Hence the lattice of electromagnetic charges is
\[ \Gamma = \mathbb{Z}^r \oplus (\mathbb{Z}^\ast)^r \]
which comes with a natural symplectic pairing
\[\langle \cdot, \cdot \rangle: \Gamma \otimes \Gamma \to \mathbb{Z}.\]
(You might ask why we take the natural symplectic pairing as opposed to the natural symmetric pairing. This is because there is actually a larger \(SL(2,\mathbb{Z})\) symmetry of the theory which preserves the symplectic pairing but not the symmetric pairing.)
Now there is an obvious source of symplectic lattices. Simply let \(C\) be a genus \(r\) compact Riemann surface. Then \(\Gamma = H_1(C, \mathbb{Z})\) is a symplectic lattice of rank \(2r\), where the symplectic pairing is now given by the intersection pairing. In fact, we can say more--if we take \(a\)- and \(b\)-cycles as generators, these form a Darboux (symplectic) basis of \(\Gamma\).
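Here is a small sketch of the two pairings on the charge lattice in rank one, checking that a sample element of \(SL(2,\mathbb{Z})\) preserves the symplectic pairing but not the symmetric one. The sign convention for the pairing, the tuple encoding of charges and the choice of the matrix \(T\) are assumptions made for the example.

```python
def symplectic_pairing(g1, g2):
    """<(e1, m1), (e2, m2)> = e1.m2 - m1.e2 on Z^r + Z^r (one common sign convention)."""
    (e1, m1), (e2, m2) = g1, g2
    return sum(x * y for x, y in zip(e1, m2)) - sum(x * y for x, y in zip(m1, e2))

def symmetric_pairing(g1, g2):
    """The naive symmetric pairing e1.e2 + m1.m2."""
    (e1, m1), (e2, m2) = g1, g2
    return sum(x * y for x, y in zip(e1, e2)) + sum(x * y for x, y in zip(m1, m2))

def act_rank_one(matrix, gamma):
    """Apply a 2x2 integer matrix to a rank-one charge ((e,), (m,))."""
    (a, b), (c, d) = matrix
    (e,), (m,) = gamma
    return ((a * e + b * m,), (c * e + d * m,))

T = ((1, 1), (0, 1))      # a standard generator of SL(2, Z)
electric = ((1,), (0,))   # unit electric charge
magnetic = ((0,), (1,))   # unit magnetic charge

if __name__ == "__main__":
    before = (symplectic_pairing(electric, magnetic), symmetric_pairing(electric, magnetic))
    after = (
        symplectic_pairing(act_rank_one(T, electric), act_rank_one(T, magnetic)),
        symmetric_pairing(act_rank_one(T, electric), act_rank_one(T, magnetic)),
    )
    print(before, after)
```

In this basis the unit electric and magnetic charges play the role of the \(a\)- and \(b\)-cycles: their pairing is 1 and stays 1 under \(T\), while the symmetric pairing changes.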
Back to the gauge theory problem. Recall that the 1-loop calculation and consideration of BPS states leads to a set of monodromy data on \(\mathcal{B}\). Suppose now that we could find a complex surface \(C \to \mathcal{B}\) whose fibers \(C_u\) are (possibly singular) genus \(r\) curves, and such that the monodromies of \(\Gamma_u := H_1(C_u, \mathbb{Z})\) agree with the given monodromies. Then we can solve the Riemann-Hilbert problem by doing geometry on this family, i.e. by finding holomorphic sections of certain associated bundles.
The SU(2) Seiberg-Witten Solution
We will now specialize to the case considered in the original paper of Seiberg and Witten. I will only construct the family--the details of the solution will follow in a subsequent post. Here the group has rank \(r=1\), so we should be looking for a family of elliptic curves, and the solution is almost obvious, given what I've said above. Seiberg and Witten argue that the moduli space \(\mathcal{B}\) must be \(\mathbb{C} \setminus \{\Lambda^2, -\Lambda^2\}\). The punctures at \(\pm \Lambda^2\) come from BPS states whose mass goes to zero at those values of \(u\). So the monodromy consists of three matrices, \(M_\infty, M_\pm\), the monodromies computed around \(\infty\) and \(\pm \Lambda^2\). These generate a certain modular subgroup \(G\) of \(SL(2, \mathbb{Z})\), allowing us to realize \(\mathcal{B}\) as the modular curve \(H / G\) (where \(H\) is the upper half-plane). Now, the space of elliptic curves is just \(H / SL(2, \mathbb{Z})\). So given any \(u \in \mathcal{B}\), we pick a lift \(\tilde{u}\) in \(H\) and let \(C_u\) be the corresponding elliptic curve. This is exactly the family needed to solve the Riemann-Hilbert problem!
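For concreteness, here is a sketch that checks the basic consistency conditions on three such matrices. The explicit entries below are not derived in this post: they are the monodromies as usually quoted for the \(SU(2)\) solution in one common convention (other conventions differ by conjugation and ordering), and should be read as an assumption of the example.

```python
def matmul(a, b):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_identity_mod_2(m):
    """True if the matrix reduces to the identity modulo 2."""
    return all(m[i][j] % 2 == (1 if i == j else 0) for i in range(2) for j in range(2))

# Assumed monodromies around infinity and the two finite punctures.
M_INFINITY = ((-1, 2), (0, -1))
M_PLUS = ((1, 0), (-2, 1))
M_MINUS = ((-1, 2), (-2, 3))

if __name__ == "__main__":
    print(matmul(M_PLUS, M_MINUS) == M_INFINITY)
    print([det(m) for m in (M_INFINITY, M_PLUS, M_MINUS)])
    print([is_identity_mod_2(m) for m in (M_INFINITY, M_PLUS, M_MINUS)])
```

With these entries the two finite monodromies compose to the one at infinity, as they must on a twice-punctured plane, and all three matrices have determinant 1 and reduce to the identity mod 2.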
Next time: details of this construction, including exact formulas, and some words about instanton counting.
Monday, October 29, 2012
BPS States and Wall-Crossing
This is the first in what I hope will become a series of posts on BPS state counting and wall-crossing. I'm participating in gLab, and our most immediate goal is to understand the Kontsevich-Soibelman wall-crossing formula (KSWCF) in the context of quadratic differentials on a (punctured) Riemann surface, following the lectures of Kontsevich and Neitzke at IHES.
The purpose of these posts is to keep a written record of my attempts to understand the physics behind the WCF as well as the work of Gaiotto-Moore-Neitzke.
References:
- Witten's lectures on dynamics of QFT
- Gaiotto-Moore-Neitzke
- Kontsevich-Soibelman
- Seiberg and Witten
- Distler's blogpost, and
- This post on the n-Category cafe (note: Bridgeland's talk relates quadratic differentials to stability conditions).
Video Lectures:
- IHES Lectures by Neitzke and Kontsevich
- Lectures by Moore on the (2,0) d=6 superconformal theory
- PITP 2010 (Gaiotto, Moore, Witten, Seiberg, others!)
- Neitzke: What is a BPS state?
- Gauge fields and strings
Physics Setup:
Warning: I'm still trying to sort this all out, so a lot of this will be fuzzy and/or completely wrong. I will try to point out the points of confusion.
We will start with some kind of family of susy gauge theories (or rather, a single "theory" with a family of vacua, depending on what asymptotic boundary conditions we specify in the path integral). We let \(\mathcal{B}\) be some kind of manifold (or variety, possibly with singularities?), and \(\{\mathcal{H}_u\}_{u \in \mathcal{B}}\) a family (bundle) of Hilbert spaces, depending on \(u \in \mathcal{B}\). Concretely, \(\mathcal{B}\) will parametrize the vacuum expectation values (VEVs) of the scalar fields of the theory. (Note, for non-scalar fields we can typically expect VEVs to vanish, for example by looking at the action of the Lorentz group.) Actually, to be more precise, \(\mathcal{B}\) parametrizes the Coulomb branch--where the VEVs break the gauge symmetry to a maximal torus (as opposed to the Higgs branch, where the VEVs just break the gauge group to a smaller subgroup).
The next ingredient is a lattice \(\Gamma\), the charge lattice, which is supposed to parametrize all possible electric and magnetic charges. Since electric and magnetic charges are dual, this lattice has a pairing \(\Gamma \otimes \Gamma \to \mathbb{Z}\) which is symplectic (or possibly just Poisson?). (Actually, maybe we should think of \(\Gamma\) as being a bundle of lattices over \(\mathcal{B}\), but this isn't completely clear to me.) The lattice gives a grading of \(\mathcal{H}\):
\[ \mathcal{H} = \bigoplus_{\gamma \in \Gamma} \mathcal{H}_\gamma \].
Now, the Hilbert spaces \(\mathcal{H}_u\) are supposed to carry representations of the \(\mathcal{N}=2\) susy algebra, with central charge \(Z\). On any state of charge \(\gamma\) above the point \(u \in \mathcal{B}\), the central charge \(Z\) acts as a scalar, which we denote by \(Z_\gamma(u)\). Manipulations with the susy algebra show the BPS bound \(M \geq |Z_\gamma(u)|\), where \(M\) is the mass of a state with charge \(\gamma\). A state is called BPS if it saturates this bound.
Finally, I'll end this post by attempting to define (or at least motivate) the walls of marginal stability. In all known examples, we have
\[ |Z_{\gamma_1 + \gamma_2}(u)|^2 = |Z_{\gamma_1}(u)|^2 + |Z_{\gamma_2}(u)|^2
+2 \mathrm{Re}(Z_{\gamma_1}(u) \bar{Z}_{\gamma_2}(u) ) \]
If the cross-term is negative, then it is possible to form stable bound states (since the mass of a BPS state of charge \(\gamma_1+\gamma_2\) is strictly less than the sum of the corresponding masses); and it is impossible to form stable bound states if the cross-term is positive. This (naive!) dichotomy tells us that there is something very special about the intermediate case. For a pair of charges \(\gamma_1, \gamma_2\) we define a wall in \(\mathcal{B}\) by
\[ W(\gamma_1, \gamma_2) = \{u \in \mathcal{B} \ | \ \mathrm{Re}(Z_{\gamma_1}(u)\bar{Z}_{\gamma_2}(u)) = 0 \} \]
and we define \(W \subset \mathcal{B}\) to be the union of all the walls.
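The wall condition as defined above is easy to test numerically once the central charge is taken to be linear in the charge. The sketch below assumes \(Z_\gamma(u) = \sum_i \gamma_i Z_i(u)\) for some complex "periods" \(Z_i(u)\) at a fixed \(u\); the period values and the tolerance are made up for illustration.

```python
def central_charge(gamma, periods):
    """Z_gamma = sum_i gamma_i * Z_i, linear in the charge vector gamma."""
    return complex(sum(n * z for n, z in zip(gamma, periods)))

def cross_term(gamma1, gamma2, periods):
    """The cross term 2 Re(Z_1 * conj(Z_2)) in |Z_{gamma1 + gamma2}|^2."""
    z1 = central_charge(gamma1, periods)
    z2 = central_charge(gamma2, periods)
    return 2.0 * (z1 * z2.conjugate()).real

def on_wall(gamma1, gamma2, periods, tol=1e-12):
    """True if this point lies on W(gamma1, gamma2) as defined in the text."""
    return abs(cross_term(gamma1, gamma2, periods)) < tol

if __name__ == "__main__":
    periods = (1.0 + 0.5j, -0.3 + 2.0j)   # hypothetical values of Z_i at some u
    print(cross_term((1, 2), (-1, 1), periods))
    print(on_wall((1, 0), (0, 1), (1.0 + 0.0j, 1.0j)))
```

The second call returns True because the two central charges are perpendicular in the complex plane, so the cross term vanishes.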
The idea of wall-crossing is the following. We define some functions \(\Omega(\gamma; u)\) on \(\mathcal{B} \setminus W\) which are locally constant. These functions are supposed to count the number of BPS states of charge \(\gamma\) (where count really means take the trace of a particular operator over \(\mathcal{H}_{\gamma, \mathrm{BPS}}\)). The wall-crossing formula is an explicit formula that relates \(\Omega(\gamma; u_+)\) and \(\Omega(\gamma; u_-)\) for \(u_+, u_-\) on opposite sides of a wall in \(\mathcal{B}\). There are two applications of WCF:
1. We pick some particular \(u \in \mathcal{B}\) for which \(\Omega\) is particularly easy to calculate ("extreme stability"). Then by KSWCF we actually know how to compute \(\Omega\) on all of \(\mathcal{B} \setminus W\).
2. Gaiotto-Moore-Neitzke study a certain QFT whose low energy effective action is a sigma model with target space \(\mathcal{M}\), the moduli space of Higgs bundles over a Riemann surface. The invariants \(\Omega(\gamma; u)\) together with KSWCF allow them to compute the low energy effective action explicitly, giving an explicit construction of holomorphic Darboux coordinates on \(\mathcal{M}\). This is enough to recover the full hyperkahler metric on \(\mathcal{M}\), in local coordinates!
Tasklist (incomplete!):
- Define susy algebra, derive BPS bound
- Understand/construct the charge lattice and its pairing
- Sketch that 3d sigma model with \(\mathcal{N}=4\) has a hyperkahler target
- Sketch/understand why the low energy effective action has target Higgs
- Understand computation of effective action: Seiberg-Witten curves and all that
- Understand how KSWCF implies consistency of the Darboux coordinates
Monday, October 22, 2012
Seiberg-Witten Theory Video Lectures
I found some lectures by Sara Pasquetti on Seiberg-Witten theory here:
Unfortunately, it seems that the quality is so poor that it is impossible to read the blackboard!
Monday, August 27, 2012
A Toy Model for Effective Field Theory and Extra Dimensions
I wanted to see how the Fourier transform can turn field theory into many-particle mechanics. This is just silly fooling around, so you shouldn't take what follows too seriously (there are much better models of extra dimensions, to be sure!).
Take \(\phi(t, s)\) to be a real field on a cylinder of circumference \(R\). We consider the action
\[ S = \frac{1}{R} \int_{-\infty}^\infty \int_0^{R} |\nabla \phi|^2 ds dt \]
Expand \(\phi(t, s)\) in Fourier series:
\[ \phi(t,s ) = \sum_n \phi_n(t) e^{2 \pi i n s / R} \]
Then in Lorentzian signature, we have
\[ \int_0^{R} |\nabla \phi|^2 ds = R \sum_n \left( |\dot{\phi}_n|^2 - \left(\frac{2\pi n}{R}\right)^2 |\phi_n|^2 \right). \]
Putting this back into the action, we find
\[ S = \sum_n \int_{-\infty}^\infty \left( |\dot{\phi}_n|^2 - \left(\frac{2\pi n}{R}\right)^2 |\phi_n|^2 \right) dt. \]
This is the action for infinitely many harmonic oscillators, with frequencies \(\omega_n = 2\pi |n| / R\). Recall that the energy levels of the harmonic oscillator, measured from the ground state, are \(k\omega\) for \(k = 0, 1, \ldots\). So supposing that only a finite energy \(E\) is accessible in some particular experiment, we can only excite those modes \(\phi_n\) for which
\[ \frac{2\pi |n|}{R} < E. \]
In particular, only finitely many \(\phi_n\) may be excited at energies below \(E\), effectively reducing the field theory on the cylinder to many-particle quantum mechanics.
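The mode count can be made explicit with a few lines of code. This sketch takes \(R\) to be the length of the \(s\)-circle (the range of the \(s\)-integral above) and lists the mode numbers \(n\) with \(2\pi|n|/R < E\); the function names and the sample values are illustrative.

```python
import math

def mode_frequency(n, circumference):
    """Frequency omega_n = 2*pi*|n| / R of the n-th Fourier mode."""
    return 2.0 * math.pi * abs(n) / circumference

def accessible_modes(circumference, energy):
    """Sorted list of mode numbers n with omega_n strictly below the energy."""
    modes = []
    n = 0
    while mode_frequency(n, circumference) < energy:
        # Modes come in pairs +n and -n, except for the zero mode.
        modes.extend([n] if n == 0 else [-n, n])
        n += 1
    return sorted(modes)

if __name__ == "__main__":
    print(accessible_modes(2.0 * math.pi, 3.5))
    print(len(accessible_modes(0.1, 3.5)))
```

Shrinking \(R\) at fixed \(E\) pushes every \(n \neq 0\) mode above the cutoff, leaving only the zero mode: the small extra dimension becomes invisible at low energies.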
Thursday, July 26, 2012
Generating Functions
Method of Generating Functions
Let \(X\) and \(Y\) be two smooth manifolds, and let \(M = T^\ast X, N = T^\ast Y\) with corresponding symplectic forms \(\omega_M\) and \(\omega_N\).
Question: How can we produce symplectomorphisms \(\phi: M \to N\)?
The most important construction from classical mechanics is the method of generating functions. I will outline this method, shamelessly stolen from Ana Cannas da Silva's lecture notes.
Suppose we have a smooth function \(f \in C^\infty(X \times Y)\). Then its graph \(\Gamma\) is a submanifold of \(M \times N\): \( \Gamma = \{ (x,y, df_{x,y}) \in M \times N \}\). Since \(M \times N\) is a product, we have projections \(\pi_M, \pi_N\), and this allows us to write the graph as
\[ \Gamma = \{ (x, y, df_x, df_y) \}\]
Now there is a not-so-obvious trick: we consider the twisted graph \(\Gamma^\sigma\) given by
\[ \Gamma^\sigma = \{(x,y, df_x, -df_y) \} \]
Note the minus sign.
Proposition If \(\Gamma^\sigma\) is the graph of a diffeomorphism \(\phi: M \to N\), then \(\phi\) is a symplectomorphism.
Proof By construction, \(\Gamma^\sigma\) is a Lagrangian submanifold of \(M \times N\) with respect to the twisted symplectic form \(\pi_M^\ast \omega_M - \pi_N^\ast \omega_N\). It is a standard fact that a diffeomorphism is a symplectomorphism iff its graph is Lagrangian with respect to the twisted symplectic form, so we're done.
Now we have:
Modified question: Given \(f \in C^\infty(X \times Y)\), when is its twisted graph \(\Gamma^\sigma\) the graph of a diffeomorphism \(\phi: M \to N\)?
Pick coordinates \(x\) on \(X\) and \(y\) on \(Y\), with corresponding momenta \(\xi\) and \(\eta\). Then if \(\phi(x,\xi) = (y,\eta)\), we obtain
\[ \xi = d_x f, \ \eta = -d_y f \]
Note the similarity to Hamilton's equations. By the implicit function theorem, we can construct a (local) diffeomorphism \(\phi\) as long as \(f\) is sufficiently non-degenerate.
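As a minimal worked example, take \(X = Y = \mathbb{R}\) and the (hypothetical) generating function \(f(x,y) = xy\). The relations \(\xi = d_x f = y\) and \(\eta = -d_y f = -x\) give the map \((x, \xi) \mapsto (\xi, -x)\), a rotation by a quarter turn. The sketch below checks numerically that its Jacobian has determinant 1, which in two dimensions is equivalent to preserving \(dx \wedge d\xi\).

```python
def phi(x, xi):
    """Map generated by f(x, y) = x*y: xi = df/dx = y and eta = -df/dy = -x."""
    return (xi, -x)

def jacobian(mapping, x, xi, h=1e-6):
    """Central-difference Jacobian of a map (x, xi) -> (y, eta)."""
    y_px, eta_px = mapping(x + h, xi)
    y_mx, eta_mx = mapping(x - h, xi)
    y_pxi, eta_pxi = mapping(x, xi + h)
    y_mxi, eta_mxi = mapping(x, xi - h)
    return (
        ((y_px - y_mx) / (2 * h), (y_pxi - y_mxi) / (2 * h)),
        ((eta_px - eta_mx) / (2 * h), (eta_pxi - eta_mxi) / (2 * h)),
    )

def determinant(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

if __name__ == "__main__":
    print(determinant(jacobian(phi, 0.3, -1.2)))
```

Here the non-degeneracy condition is simply \(\partial^2 f / \partial x \partial y = 1 \neq 0\), which is what lets us solve \(\xi = y\) for \(y\).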
Different Types of Generating Functions
We now concentrate on the special case of \(M = T^\ast \mathbb{R} = \mathbb{R} \times \mathbb{R}^\ast\). Note that this is a cotangent bundle in two ways: \(T^\ast \mathbb{R} \cong T^\ast \mathbb{R}^\ast\). Hence we can construct local diffeomorphisms \(T^\ast \mathbb{R} \to T^\ast \mathbb{R}\) in four ways, by taking functions of the forms
\[ f(x_1, x_2), \ f(x_1, p_2), \ f(p_1, x_2), \ f(p_1, p_2) \]
Origins from the Action Principle, and Hamilton-Jacobi
Suppose that we have two actions
\[ S_1 = \int p_1 \dot{q}_1 - H_1 dt, \ S_2 = \int p_2 \dot{q}_2 - H_2 dt \]
which give rise to the same dynamics. Then the Lagrangians must differ by a total derivative, i.e.
\[ p_1 \dot{q}_1 - H_1 = p_2 \dot{q}_2 - H_2 + \frac{d f}{dt} \]
Suppose that \(f = -q_2 p_2 + g(q_1, p_2, t)\). Then we have
\[ p_1 \dot{q}_1 - H_1 = -q_2 \dot{p}_2 - H_2 + \frac{\partial g}{\partial t} + \frac{\partial g}{\partial q_1}\dot{q}_1 + \frac{\partial g}{\partial p_2} \dot{p_2} \]
Comparing coefficients, we find
\[ p_1 = \frac{\partial g}{\partial q_1}, \ q_2 = \frac{\partial g}{\partial p_2}, \ H_2 = H_1 + \frac{\partial g}{\partial t} \]
Now suppose that the coordinates \((q_2, p_2)\) are chosen so that Hamilton's equations become
\[ \dot{q_2} = 0, \ \dot{p}_2 = 0 \]
Then we must have \(H_2 = 0\), i.e.
\[ H_1 + \frac{\partial g}{\partial t} = 0 \]
Since \(p_2\) is now a constant of the motion, it enters \(g\) only as a parameter; suppressing it, we write \(g = g(q_1, t)\). Since \(p_1 = \partial g / \partial q_1\), we obtain
\[ \frac{\partial g}{\partial t} + H_1(q_1, \frac{\partial g}{\partial q_1}) = 0 \]
This is the Hamilton-Jacobi equation, usually written as
\[ \frac{\partial S}{\partial t} + H(x, \frac{\partial S}{\partial x}) = 0 \]
Note the similarity to the Schrödinger equation! In fact, one can derive the Hamilton-Jacobi equation from the Schrödinger equation by taking a wavefunction of the form
\[ \psi(x,t) = A(x,t) \exp({\frac{i}{\hbar} S(x,t)}) \]
and expanding in powers of \(\hbar\). This also helps to motivate the path integral formulation of quantum theory.
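As a sanity check on the Hamilton-Jacobi equation, one can test a known solution numerically. The sketch below assumes the free-particle Hamiltonian \(H = p^2 / 2m\), for which \(S(x,t) = m (x - x_0)^2 / 2t\) is a standard solution, and evaluates the left-hand side by finite differences. The sample point and step size are arbitrary choices.

```python
def action(x, t, mass=1.0, x0=0.0):
    """Free-particle action S(x, t) = m (x - x0)^2 / (2 t), valid for t > 0."""
    return mass * (x - x0) ** 2 / (2.0 * t)

def hamilton_jacobi_residual(x, t, mass=1.0, h=1e-5):
    """Evaluate dS/dt + (dS/dx)^2 / (2 m) using central differences."""
    dS_dt = (action(x, t + h, mass) - action(x, t - h, mass)) / (2.0 * h)
    dS_dx = (action(x + h, t, mass) - action(x - h, t, mass)) / (2.0 * h)
    return dS_dt + dS_dx ** 2 / (2.0 * mass)

if __name__ == "__main__":
    print(hamilton_jacobi_residual(1.3, 2.0))
```

The residual vanishes up to discretization error, in line with \(\partial S/\partial t = -m(x-x_0)^2/2t^2\) and \(\partial S/\partial x = m(x-x_0)/t\).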
Monday, July 23, 2012
KAM I
In this post I want to sketch the idea of KAM, following these lecture notes.
Integrable Systems
I don't want to worry too much about details, so for now we'll define an integrable system to be a Hamiltonian system \((M, \omega, H)\) for which we can choose local Darboux coordinates \((I, \phi)\) with \(I \in \mathbb{R}^N\) and \(\phi \in T^N\), such that the Hamiltonian is a function of \(I\) only. Defining \(\omega_j := \partial H / \partial I_j\), Hamilton's equations then read
\begin{align}
\dot{I}_j &= 0, \\
\dot{\phi}_j &= \omega_j(I).
\end{align}
Hence we obtain linear motion on the torus as our dynamics. Note in particular that the sets \(\{I = \mathrm{const}\}\) are tori, and that the dynamics are constrained to these tori. We call these tori "invariant".
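The flow of an integrable system in action-angle coordinates can be written down in closed form. The sketch below does this for a hypothetical Hamiltonian \(H(I) = \frac{1}{2}\sum_j I_j^2\), so that \(\omega_j(I) = I_j\); any other choice of frequency map can be passed in instead.

```python
import math

def frequencies_of_h(actions):
    """Frequency map for the example H(I) = sum_j I_j^2 / 2, i.e. omega_j = I_j."""
    return tuple(actions)

def flow(actions, angles, frequencies, t):
    """Time-t flow: actions stay fixed, angles advance linearly modulo 2*pi."""
    omega = frequencies(actions)
    new_angles = tuple(
        (phi + w * t) % (2.0 * math.pi) for phi, w in zip(angles, omega)
    )
    return actions, new_angles

if __name__ == "__main__":
    state = ((1.0, 2.0), (0.0, 0.0))
    for t in (0.0, 1.0, 10.0):
        print(t, flow(state[0], state[1], frequencies_of_h, t))
```

The actions returned are always the initial ones, so each orbit stays on its torus \(\{I = \mathrm{const}\}\).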
Now suppose that our Hamiltonian \(H\) is of the form
\[ H(I, \phi) = h(I) + f(I, \phi) \]
with \(f\) "small". What can be said of the dynamics? Specifically, do there exist invariant tori? KAM theory lets us formulate this question in a precise way, and gives an explicit quantitative answer (as long as \(f\) is nice enough, and small enough).
I want to sketch the idea of the KAM theorem, completely ignoring analytical details.
Constructing the Symplectomorphism
Suppose we could find a symplectomorphism \(\Phi: (I, \phi) \mapsto (\tilde{I}, \tilde{\phi})\) such that \(H(I, \phi) = \tilde{h}(\tilde{I})\), i.e. such that the Hamiltonian depends only on \(\tilde{I}\) in the new coordinates. Then our system would still be integrable (just in new action-angle coordinates), and we'd be done. There are two relatively easy ways of constructing symplectomorphisms: integrating symplectic vector fields, and generating functions. In the lecture notes, generating functions are used, so let's take a minute to discuss them.
Proposition Let \(\Sigma(\tilde{I}, \phi)\) be a smooth function and suppose that the transformation
\[ I = \frac{\partial \Sigma}{\partial \phi}, \
\tilde{\phi} = \frac{\partial \Sigma}{\partial \tilde{I}}\]
can be inverted to produce a diffeomorphism \(\Phi: (I, \phi) \mapsto (\tilde{I}, \tilde{\phi})\). Then \(\Phi\) is a symplectomorphism.
Proof
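\[ dI = \frac{\partial^2 \Sigma}{\partial \phi \partial \tilde{I}} d \tilde{I} + \frac{\partial^2 \Sigma}{\partial \phi^2} d\phi \]
\[ d\tilde{\phi} = \frac{\partial^2 \Sigma}{\partial \phi \partial \tilde{I}} d\phi + \frac{\partial^2 \Sigma}{\partial \tilde{I}^2} d\tilde{I} \]
The second term in each line drops out after wedging with \(d\phi\) and \(d\tilde{I}\) respectively, since the Hessians are symmetric. Hence
\[ dI \wedge d\phi = \frac{\partial^2 \Sigma}{\partial \phi \partial \tilde{I}} d \tilde{I} \wedge d\phi = d\tilde{I} \wedge d\tilde{\phi}. \]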
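As a sanity check on the proposition, here is a minimal numerical sketch in one degree of freedom, where being symplectic is the same as having Jacobian determinant 1. The generating function \(\Sigma(\tilde{I}, \phi) = \tilde{I}\phi + \epsilon \tilde{I} \sin\phi\) and the names `transform` and `jacobian_det` are illustrative choices of mine, not something taken from the lecture notes.

```python
import math

EPS = 0.1  # size of the perturbation in the illustrative generating function


def transform(I, phi, eps=EPS):
    """Map (I, phi) -> (I_tilde, phi_tilde) generated by
    Sigma(I_tilde, phi) = I_tilde*phi + eps*I_tilde*sin(phi)."""
    # I = dSigma/dphi = I_tilde * (1 + eps*cos(phi)), solved for I_tilde
    I_tilde = I / (1.0 + eps * math.cos(phi))
    # phi_tilde = dSigma/dI_tilde
    phi_tilde = phi + eps * math.sin(phi)
    return I_tilde, phi_tilde


def jacobian_det(I, phi, step=1e-6):
    """Central finite-difference Jacobian determinant of transform at (I, phi)."""
    It_p, pt_p = transform(I + step, phi)
    It_m, pt_m = transform(I - step, phi)
    dIt_dI = (It_p - It_m) / (2 * step)
    dpt_dI = (pt_p - pt_m) / (2 * step)
    It_p, pt_p = transform(I, phi + step)
    It_m, pt_m = transform(I, phi - step)
    dIt_dphi = (It_p - It_m) / (2 * step)
    dpt_dphi = (pt_p - pt_m) / (2 * step)
    return dIt_dI * dpt_dphi - dIt_dphi * dpt_dI


if __name__ == "__main__":
    for I, phi in [(1.0, 0.3), (2.5, 1.7), (0.4, 4.0)]:
        print(I, phi, jacobian_det(I, phi))
```

The determinant should come out equal to 1 up to finite-difference error at every sample point, which is the one-dimensional content of \(dI \wedge d\phi = d\tilde{I} \wedge d\tilde{\phi}\).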
We want a symplectomorphism \(\Phi\) such that
\[ H \circ \Phi^{-1}(\tilde{I}, \tilde{\phi}) = \tilde{h}(\tilde{I}). \]
If \(\Phi\) came from a generating function \(\Sigma\), then we have
\[ H(\frac{\partial \Sigma}{\partial \phi}, \phi) = \tilde{h}(\tilde{I}) \]
Expanding things, we have
\[ h(\frac{\partial \Sigma}{\partial \phi}) + f(\frac{\partial \Sigma}{\partial \phi}, \phi) = \tilde{h}(\tilde{I}). \]
If \(f\) is small, then we might expect \(\Phi\) to be close to the identity, and hence \(\Sigma\) ought to be close to the generating function for the identity (which is \(\langle \tilde{I}, \phi \rangle\)). So we take
\[ \Sigma(\tilde{I}, \phi) = \langle \tilde{I}, \phi \rangle + S(\tilde{I}, \phi) \]
where \(S\) should be "small". So we linearize the equation in \(S\):
\[ \langle \omega(\tilde{I}), \frac{\partial S}{\partial \phi} \rangle
+ f(\tilde{I}, \phi)
= \tilde{h}(\tilde{I}) - h(\tilde{I}) \]
Now we can expand \(S\) and \(f\) in Fourier series and solve coefficient-wise. This gives a formal solution \(S(\tilde{I}, \phi)\) of the equation
\[ \langle \omega, \frac{\partial S}{\partial \phi} \rangle + f(\tilde{I}, \phi) = 0. \]
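To spell out what "solve coefficient-wise" means, here is a sketch. The coefficient names \(f_k\) and \(S_k\) are my notation, and the dependence on \(\tilde{I}\) is suppressed. Write
\[ f = \sum_{k \in \mathbb{Z}^N} f_k e^{i \langle k, \phi \rangle}, \qquad S = \sum_{k \neq 0} S_k e^{i \langle k, \phi \rangle}. \]
Then the \(k\)-th Fourier mode of the equation reads
\[ i \langle \omega, k \rangle S_k + f_k = 0, \qquad \text{so} \qquad S_k = \frac{i f_k}{\langle \omega, k \rangle} \quad (k \neq 0). \]
The \(k = 0\) mode cannot be solved this way; the mean \(f_0\) is exactly what gets absorbed into \(\tilde{h} - h\). Note that the denominators \(\langle \omega, k \rangle\) can be arbitrarily small (or vanish at resonances), which is the source of the convergence trouble.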
Getting it to Work
Unfortunately, the Fourier series for \(S\) has no chance to converge, so instead we take a finite truncation. If we assume \(f\) is analytic, its Fourier coefficients decay exponentially fast, so this provides a very good approximate solution to the linearized equation (and we can give an explicit bound in terms of a certain norm of \(f\)). Call this function \(S_1\). We then use \(S_1\) to construct a symplectomorphism \(\Phi_1\).
Now we take
\[ H_1(I, \phi) = H \circ \Phi_1(I, \phi) = h_1(I) + f_1(I, \phi). \]
Some hard analysis then shows that \(h - h_1\) is small, and \(f_1\) is much smaller than \(f\).
The Induction Step
The above arguments sketch a method to put the system "closer" to an integrable form. By carefully controlling \(\epsilon\)'s and \(\delta\)'s, one then shows that the iterated sequence \(\Phi_1, \Phi_2 \circ \Phi_1, \ldots\) converges to some limiting symplectomorphism \(\Phi_\infty\).
Friday, July 13, 2012
Circle Diffeomorphisms I
This is the first of a series of posts based on these lecture notes on KAM theory. For now I just want to outline section 2, which is a toy model of KAM theory.
Circle Diffeomorphisms
We consider a map \(\phi: \mathbb{R} \to \mathbb{R}\) defined by
\[ \phi(x) = x + \rho + \eta(x) \]
where \(\rho\) is its rotation number and \(\eta(x)\) is "small".
Define \(S_\sigma\) to be the strip \(\{ |\mathrm{Im} z|<\sigma\} \subset \mathbb{C}\) and let \(B_\sigma\) be the space of holomorphic functions bounded on \(S_\sigma\) with sup norm \(\|\cdot\|_\sigma\).
Goal: Show that if \(\|\eta\|_\sigma\) is sufficiently small, then there exists some diffeomorphism \(H(x)\) such that
\[ H^{-1} \circ \phi \circ H (x) = x + \rho \]
i.e. that \(\phi\) is conjugate to a pure rotation.
Linearization
The idea is that if \(\eta\) is small, then \(H\) should be close to the identity, so we suppose that
\[ H(x) = x + h(x) \]
where \(h(x)\) is small. Plugging this into the equation above and discarding higher order terms yields
\[ h(x+\rho) - h(x) = \eta(x) \]
Since \(\eta\) is periodic, we Fourier transform both sides to obtain an explicit formula for the Fourier coefficients of \(h(x)\). We have to show several things:
1. The Fourier series defining \(h(x)\) converges in some appropriate sense.
2. The function \(H(x) = x + h(x)\) is a diffeomorphism.
3. The composition \(\tilde{\phi} = H^{-1} \circ \phi \circ H\) is closer to a pure rotation than \(\phi\), in the sense that
\[ \tilde{\phi}(x) = x + \rho + \tilde{\eta}(x) \]
where \(\|\tilde{\eta}\| \ll \|\eta\|\).
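The linearized equation \(h(x+\rho) - h(x) = \eta(x)\) can be solved mode by mode: if \(\eta(x) = \sum_k \eta_k e^{2\pi i k x}\), then \(h_k = \eta_k / (e^{2\pi i k \rho} - 1)\) for \(k \neq 0\). Below is a minimal sketch of this for a trigonometric polynomial \(\eta\) with zero mean. The golden-mean rotation number, the sample \(\eta\), and the function names are illustrative assumptions of mine, not taken from the lecture notes.

```python
import cmath
import math

RHO = (math.sqrt(5) - 1) / 2  # golden-mean rotation number (illustrative choice)


def solve_linearized(eta_coeffs, rho=RHO):
    """Given Fourier coefficients {k: eta_k} (k != 0) of eta, return those of h
    solving h(x + rho) - h(x) = eta(x)."""
    h_coeffs = {}
    for k, eta_k in eta_coeffs.items():
        if k == 0:
            raise ValueError("eta must have zero mean for the equation to be solvable")
        # Small divisor: this vanishes whenever k*rho is an integer.
        divisor = cmath.exp(2j * math.pi * k * rho) - 1
        h_coeffs[k] = eta_k / divisor
    return h_coeffs


def evaluate(coeffs, x):
    """Evaluate the trigonometric polynomial sum_k c_k exp(2 pi i k x)."""
    return sum(c * cmath.exp(2j * math.pi * k * x) for k, c in coeffs.items())


if __name__ == "__main__":
    # eta(x) = 0.01 * sin(2 pi x), written with complex exponentials
    eta = {1: 0.01 / 2j, -1: -0.01 / 2j}
    h = solve_linearized(eta)
    x = 0.3
    residual = evaluate(h, x + RHO) - evaluate(h, x) - evaluate(eta, x)
    print(abs(residual))
```

For a finite sum there is nothing to prove, but the divisor \(e^{2\pi i k \rho} - 1\) already shows the difficulty: for rational \(\rho\) it vanishes for some \(k\), and for irrational \(\rho\) it can still get very small, which is why item 1 in the list above requires an arithmetic condition on \(\rho\) and some loss of analyticity width.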
Newton's Method
Carrying out the analysis, one finds that for appropriate epsilons and deltas, if \(\eta \in B_\sigma\) then \(H \in B_{\sigma - \delta}\) and that \(\|\tilde{\eta}\|_{\sigma-\delta} \leq C \|\eta\|_\sigma^2\). By carefully choosing the deltas, we can iterate this procedure (composing the \(H\)'s) to obtain a well-defined limit \(H_\infty \in B_{\sigma/2}\) such that
\[ H_\infty^{-1} \circ \phi \circ H_\infty (x) = x + \rho, \]
as desired.
So in fact the idea of the proof is extremely simple, and all of the hard work is in proving some estimates.
Saturday, March 3, 2012
Gaussian Integrals: Wick's Theorem
We saw in the last update that the generating function \(Z[J]\) can be expressed as
\[ Z[J] = e^{\frac{1}{2} J \cdot A^{-1} J} \]
(at least as long as we've normalized things so that \(Z[0] = 1\)). Now the wonderful thing is that this is something we can compute explicitly:
\[ Z[J] = \sum_{n = 0}^{\infty} \frac{(\frac{1}{2} A^{-1}_{ij} J^i J^j)^n}{n!}
= \sum_{n=0}^\infty \frac{(A^{-1}_{ij} J^i J^j)^n}{2^n n!} \]
For example, in the one-dimensional case (taking \(A = 1\)) we get
\[ Z[J] = \sum_{n=0}^\infty \frac{J^{2n}}{2^n n!} \]
On the other hand, by the definition of the generating function we have
\[ Z[J] = \sum_{n=0}^\infty \frac{\langle x^n \rangle}{n!} J^n \]
Comparing coefficients, we find
\[ \frac{\langle x^{2n} \rangle}{(2n)!} = \frac{1}{2^n n!} \]
so that
\[ \langle x^{2n} \rangle = \frac{(2n)!}{2^n n!}. \]
Let's give a combinatorial description. Given \(2n\) objects, in how many ways can we divide them into pairs? If we care about the order in which we pick the pairs, then we have
\[ {2n \choose 2}{2n - 2 \choose 2} \cdots {2n-(2n-2) \choose 2} = \frac{(2n)!}{2^n} \]
Of course, there are \(n!\) ways of ordering the \(n\) pairs, so after dividing by this (to account for the overcounting) we get exactly the expression for \(\langle x^{2n} \rangle\). This is the first case of Wick's theorem.
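The counting argument is easy to check by brute force. Here is a small sketch that enumerates all ways of splitting \(2n\) objects into unordered pairs and compares the count with \((2n)!/(2^n n!)\). The function names `pairings` and `moment_formula` are mine.

```python
from math import factorial


def pairings(items):
    """Yield every way of splitting `items` (a list of even length) into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    # Pair the first element with each possible partner, then recurse on what is left.
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def moment_formula(n):
    """<x^{2n}> = (2n)! / (2^n n!) for the one-dimensional case with A = 1."""
    return factorial(2 * n) // (2 ** n * factorial(n))


if __name__ == "__main__":
    for n in range(1, 5):
        count = sum(1 for _ in pairings(list(range(2 * n))))
        print(n, count, moment_formula(n))
```

Fixing the partner of the first element and recursing counts each pairing exactly once, so no division by \(n!\) is needed in the enumeration.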
Now consider the general multidimensional case. Given \(I = (i_1, \cdots, i_{2n})\), we define a contraction to be
\[ \langle x^{j_1} x^{k_1} \rangle \cdots \langle x^{j_n} x^{k_n} \rangle \]
where \(j_1, k_1, \cdots, j_n, k_n\) is a choice of partition of \(I\) into pairs.
Theorem (Wick's theorem, Isserlis' theorem) The expectation value
\[ \langle x^{i_1} \cdots x^{i_{2n}} \rangle \]
is the sum over all full contractions. There are \((2n)!/(2^n n!)\) terms in the sum.
Proof This follows from our formula for the power series of the generating function. The reason is that the coefficient of \(J^I\) in \((\frac{1}{2} A^{-1}_{ij} J^i J^j)^n\) is exactly given by summing products of \(A^{-1}_{ij}\) over partitions of \(I\) into pairs, and the \(n!\) in the denominator takes care of the overcounting.
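To see the theorem in action, here is a minimal sketch that evaluates a Gaussian moment as a sum over full contractions, given the matrix of two-point functions \(\langle x^i x^j \rangle = A^{-1}_{ij}\). The function names and the sample covariance matrix are illustrative assumptions.

```python
def pairings(items):
    """Yield every way of splitting `items` (a list of even length) into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def wick_expectation(indices, cov):
    """<x^{i_1} ... x^{i_m}> as a sum over full contractions.

    cov[i][j] plays the role of <x^i x^j> = (A^{-1})_{ij}.
    Odd moments vanish because no full contraction exists.
    """
    if len(indices) % 2:
        return 0.0
    total = 0.0
    for pairing in pairings(list(indices)):
        term = 1.0
        for j, k in pairing:
            term *= cov[j][k]
        total += term
    return total


if __name__ == "__main__":
    cov = [[2.0, 0.5],
           [0.5, 1.0]]  # hypothetical A^{-1}
    print(wick_expectation((0, 0, 1, 1), cov))  # C00*C11 + 2*C01^2
    print(wick_expectation((0, 0, 0, 0), cov))  # 3*C00^2
```

Repeated indices are handled positionally, so \(\langle x^0 x^0 x^1 x^1 \rangle\) picks up the contraction \(\langle x^0 x^1 \rangle^2\) twice, as it should.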
Next up: perturbation theory and Feynman diagrams.
Introduction to Gaussian Integrals
As a warm-up for more serious stuff, I'd like to discuss Gaussian integrals over \(\mathbb{R}^d\). Gaussian integrals are the main tool for perturbative quantum field theory, and I find that understanding Gaussian integrals in finite dimensions is an immense aid to understanding how perturbative QFT works. So let's get started.
The Basics
Let \(A\) be some \(d \times d\) symmetric positive definite matrix. We are interested in the integral
\[ \int_{-\infty}^\infty \exp(-\frac{x \cdot Ax}{2}) dx. \]
Out of laziness, I will suppress the limits of integration and just write this as
\[ \int e^{-S(x)} dx, \]
where \(S(x) = x \cdot Ax / 2\). Now for a function \(f(x)\), we define the (unnormalized) expectation value \(\langle f(x) \rangle_0\) to be
\[ \langle f(x) \rangle_0 = \int f(x) e^{-S(x)} dx \]
Occasionally, we might care about the normalized expectation value
\[ \langle f(x) \rangle = \frac{\langle f(x) \rangle_0}{\langle 1 \rangle_0} = \frac{1}{\langle 1 \rangle_0} \int f(x) e^{-S(x)} dx. \]
We mostly care about asymptotics, so we will typically think of a function \(f(x)\) as being a polynomial (or Taylor series). So what we're really interested in is
\[ \langle x^I \rangle = c\int x^I e^{-S(x)} dx, \]
where \(I\) is a multi-index and \(c = 1/\langle 1 \rangle_0\) is the normalization constant.
The Partition Function
Let us define \(Z[J]\) by
\[ Z[J] = \int e^{-S(x) + J \cdot x} dx. \]
Now the great thing is that
\[ \langle x^I \rangle = \left. \frac{d^I}{dJ^I} \right|_{J = 0} Z[J], \]
so that once we know \(Z[J]\), we can calculate anything. So let's try to compute it. We have
\begin{align}
(Ax - J) \cdot A^{-1} (Ax - J) &= (Ax - J) \cdot (x - A^{-1} J) \\\
&= x \cdot Ax - x \cdot J - J \cdot x + J \cdot A^{-1} J \\\
&= x \cdot Ax - 2 x \cdot J + J \cdot A^{-1} J.
\end{align}
So we see that
\[ -\frac{1}{2} x \cdot A x + J \cdot x = \frac{1}{2} J \cdot A^{-1} J -\frac{1}{2} (x-A^{-1}J) \cdot A(x - A^{-1} J). \]
So, after a change of variables \(x \mapsto x - A^{-1} J\) we find
\[ Z[J] = e^{\frac{1}{2} J \cdot A^{-1} J} Z[0]. \]
Now the argument in the exponential is
\[ \frac{1}{2} A^{-1}_{ij} J^i J^j \]
So, normalizing so that \(Z[0] = 1\), we find that
\[ \langle x^i x^j \rangle = \left. \frac{d^2}{dJ^i dJ^j} Z[J] \right|_{J = 0} = A^{-1}_{ij}. \]
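The formula \(Z[J] = e^{\frac{1}{2} J \cdot A^{-1} J} Z[0]\) is easy to test numerically in one dimension. The sketch below approximates \(Z[J]\) with a midpoint rule and compares it with the closed form. The value \(A = 2\), the integration window, and the function name `Z` are illustrative assumptions.

```python
import math


def Z(J, A=2.0, half_width=12.0, n_steps=24000):
    """Midpoint-rule approximation of the integral of exp(-A x^2 / 2 + J x) over the real line.

    The integrand is negligible outside [-half_width, half_width] for the
    modest values of J used here.
    """
    dx = 2 * half_width / n_steps
    total = 0.0
    for i in range(n_steps):
        x = -half_width + (i + 0.5) * dx
        total += math.exp(-0.5 * A * x * x + J * x)
    return total * dx


if __name__ == "__main__":
    A = 2.0
    for J in (0.5, 1.0, 1.5):
        print(J, Z(J, A) / Z(0.0, A), math.exp(J * J / (2 * A)))
    # Second derivative at J = 0 by finite differences should approximate 1/A.
    step = 1e-2
    second = (Z(step, A) - 2 * Z(0.0, A) + Z(-step, A)) / (step * step * Z(0.0, A))
    print(second, 1 / A)
```

The last two numbers printed illustrate the two-point function: the normalized second derivative of \(Z\) at \(J = 0\) comes out close to \(A^{-1}\).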
Now we are ready to prove Wick's theorem and discuss Feynman diagrams, which we'll do in the next post.
Saturday, February 25, 2012
Geometry of Curved Spacetime 5: Bianchi Identity and Einstein Equations
Background
Following last time, we are almost ready to write down the Einstein equations. Before doing any math, let's understand what we're trying to do. Minkowski realized that Einstein's special relativity was best understood by combining space and time into 4-dimensional spacetime, with Lorentzian metric
\[ ds^2 = -dt^2 + dx^2 + dy^2 + dz^2. \]
The spacetime approach works wonderfully and even explains the Lorentz invariance of Maxwell's equations (indeed, it was Maxwell's equations that motivated Einstein to postulate his principle of relativity). However, (for reasons that I may discuss later) gravity is not a "force" but rather the geometry of spacetime itself.
By mass-energy equivalence (which is one of the most basic consequences of relativity), the gravitational field, whatever it is, must couple to the stress-energy tensor \(T_{ij}\). I won't get into details, but the stress-energy tensor is a familiar object from physics that roughly tells you what the energy-momentum density/flux is in each direction at every point in spacetime. If the matter is completely static, then it is ok to think of this as measuring the mass density, but for nonstatic matter it also takes things like pressure into account.
Now, as I said above, the gravitational field is just the geometry of spacetime, which is measured by the metric tensor \(g_{ij}\). Mass-energy equivalence says that it must couple to the stress-energy tensor \(T_{ij}\). The simplest field equation then would be
\[ G_{ij} = c T_{ij} \]
where \(G_{ij}\) is some tensor built out of \(g_{ij}\) and its derivatives, and \(c\) is some constant. The equations of Newtonian gravity are 2nd order in the gravitational field, so if we want these equations to reduce to Newton's in the appropriate limit, \(G_{ij}\) should only depend on the metric and its first two derivatives. Now there is an obvious 2nd rank tensor satisfying these constraints: \(R_{ij}\), the Ricci tensor. However, this turns out to be completely wrong (except in the vacuum).
Any reasonable matter will satisfy local energy-momentum conservation,
\[ \nabla_j T^{ij} = 0. \]
It turns out that the Ricci tensor does not satisfy this condition in general. So to look for the right tensor \(G_{ij}\), we turn to the Bianchi identity.
The Bianchi Identity
As discussed in the previous post, the curvature of a connection is the endomorphism-valued 2-form
\[ F = d\Omega - \Omega \wedge \Omega \]
where \(\Omega\) is the matrix of 1-forms telling us how to take the covariant derivative of a frame, i.e.
\[ \nabla e_i = \Omega_{ij} \otimes e_j. \]
Since a connection can be extended to all tensor powers in a natural way, we can consider the covariant derivative of the curvature \(F\) (thought of as a section of the appropriate bundle). Quick calculation:
\begin{align}
\nabla F &= \nabla(d\Omega - \Omega \wedge \Omega) \\
&= d^2 \Omega - d\Omega \wedge \Omega + \Omega \wedge d\Omega \\
& \ \ + d\Omega \wedge \Omega - \Omega \wedge \Omega \wedge \Omega \\
& \ \ - \Omega \wedge d\Omega + \Omega \wedge \Omega \wedge \Omega \\
&= 0.
\end{align}
Thus the endomorphism-valued 3-form \(\nabla F\) is identically 0. Writing \(F\) in components as \(R_{ijkl}\), this is equivalent to
\[ R_{ijkl|m} + R_{ijlm|k} + R_{ijmk|l} = 0. \]
Now let's contract:
\begin{align}
0 &= g^{ik} g^{jl} R_{ijkl|m} + g^{ik} g^{jl} R_{ijlm|k} + g^{ik} g^{jl} R_{ijmk|l}\\
&= g^{ik} R_{ik|m} - g^{ik}R_{im|k} - g^{jl} R_{jm|l} \\
&= \nabla_m S - 2 \nabla^k R_{mk}
\end{align}
So we see that the tensor
\[ G_{ij} = R_{ij} - \frac{S}{2} g_{ij} \]
is divergence free. This yields the Einstein field equations:
\[ R_{ij} - \frac{S}{2} g_{ij} = c T_{ij}. \]
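To spell out the step from the contracted Bianchi identity to the divergence-free claim (a short sketch, using only that the Levi-Civita connection is compatible with the metric, \(\nabla g = 0\)):
\[ \nabla^k G_{mk} = \nabla^k R_{mk} - \frac{1}{2} g_{mk} \nabla^k S = \nabla^k R_{mk} - \frac{1}{2} \nabla_m S = 0, \]
where the last equality is exactly the contracted identity \(0 = \nabla_m S - 2 \nabla^k R_{mk}\) divided by \(-2\). This matches the conservation law \(\nabla_j T^{ij} = 0\) on the matter side, which is what the Ricci tensor alone fails to do.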
Actually, there is another obvious divergence free tensor: \(g_{ij}\) itself. So a more general form is
\[ G_{ij} + \Lambda g_{ij} = c T_{ij} \]
where \(\Lambda\) is a constant called the cosmological constant.
Monday, February 6, 2012
Geometry of Curved Spacetime 4
Today I had to try to explain connections and curvature in local frames (as opposed to coordinates), and I really feel that Wald's treatment of this is just awful (this is one of the few complaints I have with an otherwise classic textbook). It is particularly baffling since the treatment in Misner, Thorne, and Wheeler is just perfect. What follows is the modern math (as opposed to physics) point of view. This is more abstract than any introductory GR (or even Riemannian geometry) text I've seen, but in this case the abstraction absolutely clarifies and simplifies things.
Let \(M\) be a smooth manifold and suppose \(E\) is a smooth vector bundle over \(M\). A connection on \(E\) is a map \(\nabla\) taking sections of \(E\) to sections of \(T^\ast M \otimes E\), \(\mathbb{R}\)-linear and satisfying the Leibniz rule
\[ \nabla(f\sigma) = df \otimes \sigma + f \nabla \sigma. \]
Now consider the sheaf of \(E\)-valued \(p\)-forms on \(M\). Call it \(\Omega^p(E)\). Then we can extend the connection to a map
\[ \nabla: \Omega^p(E) \to \Omega^{p+1}(E) \]
via the Leibniz rule:
\[ \nabla(\eta \otimes \sigma) = d\eta \otimes \sigma + (-1)^p \eta \wedge \nabla \sigma. \]
Let us define the curvature \(F\) associated to a connection \(\nabla\) by the composition
\[ F = \nabla^2: \Omega^p(E) \to \Omega^{p+2}(E). \]
Claim \(F\) is \(C^\infty\)-linear, i.e. it is tensorial.
Proof
\begin{align}
\nabla(\nabla(f \sigma)) &= \nabla( df \otimes \sigma + f \nabla \sigma) \\\
&= d^2 f \otimes \sigma - df \wedge \nabla \sigma + df \wedge \nabla \sigma + f \nabla^2 \sigma \\\
&= f \nabla^2 \sigma.
\end{align}
So far we have not made any additional choices (beyond \(\nabla\)). In order to actually compute something locally, we have to make some choices. Let \(\hat{e}_a\) be a frame, i.e. a local basis of sections of \(E\). Then \(\nabla \hat{e}_a\) is an \(E\)-valued 1-form, hence it can be expressed as a sum
\[ \nabla \hat{e}_a = \sum_{b} \omega_a^b \otimes \hat{e}_b \]
where the coefficients \(\omega_a^b\) are 1-forms, often called the connection 1-forms. Let \(\Omega\) denote the matrix of 1-forms whose entries are exactly \(\omega_a^b\).
Claim Let \(\sigma = \sigma^a \hat{e}_a\). Then we have
\[ \nabla \sigma = d\sigma + \Omega \sigma. \]
Proof The coefficients \(\sigma^a\) are functions (i.e. scalars), so \(\nabla \sigma^a = d\sigma^a\). Using the Leibniz rule we have
\begin{align}
\nabla(\sigma^a \hat{e}_a) &= (\nabla \sigma^a) \hat{e}_a + \sigma^a \nabla \hat{e}_a \\\
&= d\sigma^a \hat{e}_a + \sigma^a \omega_a^b \hat{e}_b \\\
&= d\sigma^a \hat{e}_a + \omega_c^a \sigma^c \hat{e}_a \\\
&= (d\sigma + \Omega \sigma)^a \hat{e}_a.
\end{align}
Claim The curvature satisfies \(F = d\Omega - \Omega \wedge \Omega\).
Proof Just apply the above formula twice using Leibniz.
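Here is the computation written out on a frame element, as a sketch. The one assumption I am making explicit is the ordering convention: \(\Omega \wedge \Omega\) denotes the matrix product with entries \((\Omega \wedge \Omega)_a^c = \omega_a^b \wedge \omega_b^c\). Using the extended Leibniz rule with \(p = 1\),
\begin{align}
\nabla^2 \hat{e}_a &= \nabla(\omega_a^b \otimes \hat{e}_b) \\\
&= d\omega_a^b \otimes \hat{e}_b - \omega_a^b \wedge \nabla \hat{e}_b \\\
&= (d\omega_a^c - \omega_a^b \wedge \omega_b^c) \otimes \hat{e}_c.
\end{align}
The minus sign is the \((-1)^p\) from the Leibniz rule. With the opposite ordering convention for the matrix product the same formula would read \(d\Omega + \Omega \wedge \Omega\), so the sign is a matter of bookkeeping rather than content.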
Connection 1-forms from Christoffel symbols. Suppose now that we are in the Riemannian setting and we already know the Christoffel symbols in some coordinates. Then we can express our frame \(\hat{e}_a\) in terms of coordinate vector fields, i.e.
\[ \hat{e}_a = \hat{e}_a^i \frac{\partial}{\partial x^i} \]
Then we have that
\[ \nabla_j \hat{e}_a^i = \frac{\partial \hat{e}_a^i}{\partial x^j} + \Gamma^i_{jk} \hat{e}_a^k \]
So, as a vector-valued 1-form, we have
\[ \nabla \hat{e}_a = \frac{\partial \hat{e}_a^i}{\partial x^j} dx^j \otimes \frac{\partial}{\partial x^i}
+ \Gamma^i_{jk} \hat{e}_a^k dx^j \otimes \frac{\partial}{\partial x^i}. \]
Juggling things a bit using the metric, we find
\[ \nabla \hat{e}_a = \frac{\partial \hat{e}_a^i}{\partial x^j} \hat{e}^b_i dx^j \otimes \hat{e}_b
+ \Gamma^i_{jk} \hat{e}_a^k \hat{e}_i^b dx^j \otimes \hat{e}_b. \]
So the connection 1-forms are given by
\[ \omega_a^b = \frac{\partial \hat{e}_a^i}{\partial x^j} \hat{e}^b_i dx^j
+ \Gamma^i_{jk} \hat{e}_a^k \hat{e}_i^b dx^j. \]
To come later (if I ever get around to it): some explicit computations.
Saturday, February 4, 2012
Geometry of Curved Spacetime 3
Today, some numerology. The Riemann curvature tensor is a tensor \(R_{abcd}\) satisfying the identities:
1. \(R_{abcd} = -R_{bacd}.\)
2. \(R_{abcd} = R_{cdab}. \)
3. \(R_{abcd} + R_{acdb} + R_{adbc} = 0. \) (First Bianchi)
4. \(R_{abcd|e} + R_{abec|d} + R_{abde|c} = 0. \) (Second Bianchi)
By 1, the number of independent \(ab\) indices is \(N = n(n-1)/2\), and similarly for \(cd\). By 2, the number of independent pairs of indices is \(N(N+1)/2\). Now the cyclic constraint 3 can be written as
\[ R_{[abcd]} = 0, \]
and thus constitutes \({n \choose 4}\) equations. So the number of independent components is
\begin{align}
N(N+1)/2 - {n \choose 4} &= \frac{n(n-1)(n(n-1)/2+1)}{4} - \frac{n(n-1)(n-2)(n-3)}{24} \\
&= \frac{(n^2-n)(n^2-n+2)}{8} - \frac{(n^2-n)(n^2-5n+6)}{24} \\
&= \frac{n^4-2n^3+3n^2-2n}{8} - \frac{n^4-6n^3+11n^2-6n}{24} \\
&= \frac{2n^4-2n^2}{24} \\
&= \frac{n^4-n^2}{12} \\
&= \frac{n^2(n^2-1)}{12}
\end{align}
Now consider the Weyl tensor \(C_{abcd}\), which is defined as the completely trace-free part of the Riemann tensor. The trace part is determined by the Ricci tensor \(R_{ab}\), which is symmetric and so has \(n(n+1)/2\) independent components. For \(n \geq 3\) the Weyl tensor therefore has
\[ \frac{n^2(n^2-1)}{12} - \frac{n^2+n}{2} = \frac{n^4-7n^2-6n}{12} \]
independent components. Now, for \(n = 1\) we see that \(R_{abcd}\) has no independent components, i.e. it vanishes identically. In \(n=2\), it has only 1 independent component, and so the scalar curvature determines everything. In \(n=3\), it has 6 independent components, the same number as the Ricci tensor. Note that in this case, the Weyl tensor has no independent components, i.e. it is identically 0. In \(n = 2\) every Riemannian manifold is locally conformally flat; in \(n = 3\) the vanishing of the Weyl tensor carries no information, and conformal flatness is governed instead by the Cotton tensor. So things only start to get really interesting in \(n=4\), where the Riemann tensor has 20 independent components, and the Weyl tensor has 10.
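The counting above is easy to tabulate. The following is a minimal sketch (the function names are made up for illustration) that counts Riemann components as \(N(N+1)/2 - {n \choose 4}\) and Weyl components as Riemann minus the \(n(n+1)/2\) Ricci components, with the subtraction applied only for \(n \geq 3\).

```python
from math import comb


def riemann_components(n: int) -> int:
    """Independent components of R_abcd in dimension n."""
    pairs = n * (n - 1) // 2  # independent antisymmetric index pairs
    # Symmetric in the two pairs, minus the cyclic (first Bianchi) constraints.
    return pairs * (pairs + 1) // 2 - comb(n, 4)


def weyl_components(n: int) -> int:
    """Independent components of the Weyl tensor.

    Subtracting the n(n+1)/2 Ricci components is only valid for n >= 3;
    in lower dimensions the Weyl tensor vanishes identically.
    """
    if n < 3:
        return 0
    return riemann_components(n) - n * (n + 1) // 2


for n in range(1, 7):
    print(n, riemann_components(n), weyl_components(n))
```

The loop prints one row per dimension, which can be compared directly against the closed form \(n^2(n^2-1)/12\).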
Path Integrals 3: Recovering the Spectrum from Asymptotics
In my previous posts on path integrals, I described (rather tersely) how the path integral, suitably defined and interpreted, can be used to compute the Schwartz kernels of the operators \(e^{-iHt}\) (Lorentzian signature) and \(e^{-Ht}\) (Euclidean signature).
Suppose that we understand the spectrum of \(H\) completely (nb: for a given system described by \(H\), this is the goal). For example, suppose we know that the spectrum of \(H\) consists of discrete eigenvalues \(E_n, n = 0, \cdots\) with corresponding eigenvectors \(|n\rangle\),
\[ H|n\rangle = E_n|n\rangle. \]
(For simplicity, I assume there is no continuous spectrum and that the eigenvalues are nondegenerate.) Then we have
\[ e^{-iHt} = \sum_n e^{-i E_n t} |n\rangle\langle n| \]
and
\[ e^{-Ht} = \sum_n e^{-E_n t} |n\rangle\langle n| \]
Now the second expression turns out to be very useful. Assume the eigenvalues are ordered so that
\[ E_0 < E_1 < \cdots \]
Then we can write
\[ e^{-Ht} = e^{-E_0 t} \left( |0\rangle\langle 0| + \sum_{n \geq 1} e^{-(E_n-E_0)t}|n\rangle\langle n| \right) \]
Now suppose that \(v\) is some vector which has nonzero overlap with the ground state, in the sense that
\[ \langle v|0\rangle \neq 0 \]
(This is obviously a generic condition, so if we just pick \(v\) randomly we can expect this to be true.) Then, writing \(v_n = \langle n|v\rangle\), we can consider
\[ e^{-Ht} v = e^{-E_0 t} \left( v_0 |0\rangle + \sum_{n \geq 1} e^{-(E_n-E_0)t} v_n|n\rangle \right) \]
Now for \(n \geq 1\), \(E_n-E_0\) is strictly positive, and so for large \(t\) all of the higher terms are exponentially damped. So, we have the asymptotic
\[ e^{-Ht} v \sim e^{-E_0 t }v_0|0\rangle \]
Next comes the really interesting part. Multiply on the left by a position-representation eigenbra \(\langle x|\):
\[ \langle x|e^{-Ht} v \sim e^{-E_0 t} v_0 \langle x|0\rangle \]
Now \(v_0\) is an irrelevant constant, so we might as well take it to be 1 (rescale \(v\) as necessary). The expression \(\langle x|0\rangle\) is exactly the ground state wavefunction in the position representation! Call it \(\psi_0(x)\). So to conclude: the large-t asymptotic of the expression \(\langle x|e^{-Ht}v\) is (up to an overall constant) given by \(e^{-E_0 t} \psi_0(x)\), hence we can recover both the ground state energy and the ground state wavefunction. But the value of this expression is exactly given by the Euclidean path integral. So we have a correspondence:
Asymptotics of Euclidean path integral \(\longleftrightarrow\) The spectrum of \(H\).
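To see the projection onto the ground state at work in a toy setting, here is a minimal sketch in finite dimensions. Everything specific in it is an assumption made for illustration: the 3x3 matrix `H` is a hypothetical Hamiltonian (a small discrete Laplacian), and \(e^{-Ht}\) is approximated by many Euler steps \(v \mapsto (1 - \Delta t\, H)v\) rather than computed exactly. The vector is renormalized at each step, which discards the overall factor \(e^{-E_0 t}\); the energy is then read off from the Rayleigh quotient \(\langle v|H|v\rangle\).

```python
import math

# Hypothetical 3x3 symmetric "Hamiltonian" (a small discrete Laplacian), not from the post.
H = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]


def matvec(A, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def normalize(v):
    """Rescale v to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def ground_state(H, v, dt=0.1, steps=2000):
    """Approximate the large-t behaviour of e^{-Ht} v by Euler steps v -> (1 - dt H) v.

    The vector is renormalized after every step, so only the direction
    (the ground state) survives, not the decaying prefactor.
    """
    v = normalize(v)
    for _ in range(steps):
        Hv = matvec(H, v)
        v = normalize([x - dt * hx for x, hx in zip(v, Hv)])
    energy = sum(x * hx for x, hx in zip(v, matvec(H, v)))  # Rayleigh quotient
    return energy, v


# A generic starting vector: it has nonzero overlap with the ground state.
E0, psi0 = ground_state(H, [1.0, 0.3, -0.2])
print(E0, 2 - math.sqrt(2))  # estimate next to the exact lowest eigenvalue
print(psi0)
```

For this particular matrix the exact lowest eigenvalue is \(2 - \sqrt{2}\), with eigenvector proportional to \((1, \sqrt{2}, 1)\), so the two printed numbers can be compared directly.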
Coming next: instantons.
Path Integrals 2: Euclidean Path Integrals and Heat Kernels
Consider the heat equation
\[ \frac{\partial \psi}{\partial t} = -\hat{H} \psi \]
We find similarly
\[ \langle x_N|U_t|x_0\rangle = \int \exp\left( \sum_{j=0}^{N-1} \left[ ik_j(x_j - x_{j+1}) -\Delta t\, H(x_j, k_j) \right] \right) dx \, dk. \]
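Now take \(H(x,k) = k^2/2m + V(x)\). Two consecutive terms of the sum in the exponent are
\[ ik_j(x_j - x_{j+1}) - \Delta t k_j^2/2m - \Delta tV(x_j) + ik_{j+1}(x_{j+1} - x_{j+2}) - \Delta t k_{j+1}^2/2m - \Delta t V(x_{j+1}) \]
Writing \(a_j = x_j - x_{j+1}\), we complete the square in each \(k_j\):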
\begin{align}
ik_j a_j - \Delta t k_j^2/2m &= -(-2mi k_j a_j / \Delta t + k_j^2) \Delta t/2m \\
&= -(k_j^2 -2mi k_j a_j/\Delta t -m^2 a_j^2/(\Delta t)^2 + m^2 a_j^2/(\Delta t)^2) \Delta t/2m \\
&= -(k_j -mia_j/\Delta t)^2 \Delta t/2m -ma_j^2/2\Delta t
\end{align}
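What remains is the Gaussian integral over each \(k_j\). As a sketch (assuming the shift of contour \(k_j \to k_j + i m a_j/\Delta t\) is harmless, and absorbing the resulting constants into the measure; the explicit form of \(S_{\textrm{euc}}\) written here is the standard one, filled in rather than quoted from the notes):
\begin{align}
\int \exp\left( -\frac{\Delta t}{2m} \left(k_j - \frac{i m a_j}{\Delta t}\right)^2 \right) dk_j &= \sqrt{\frac{2\pi m}{\Delta t}}, \\
\sum_{j=0}^{N-1} \left[ \frac{m a_j^2}{2 \Delta t} + \Delta t\, V(x_j) \right] &= \sum_{j=0}^{N-1} \Delta t \left[ \frac{m}{2} \left( \frac{x_{j+1} - x_j}{\Delta t} \right)^2 + V(x_j) \right] \longrightarrow \int_0^t \left( \frac{m}{2} \dot{x}^2 + V(x) \right) ds = S_{\textrm{euc}}.
\end{align}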
Combining things together, we have that the heat kernel is given by
\[ \langle y|e^{-t \hat{H}}|x\rangle = \int e^{-S_{\textrm{euc}}} \mathcal{D}x \]
That is, the Schwartz kernel of the time evolution operator is given by the oscillatory Lorentzian-signature path integral, whereas the heat kernel is given by the exponentially decaying Euclidean path integral (which has a better chance of being well-defined). Most importantly, the heat kernel contains most of the essential information about the spectrum of \(\hat{H}\), which is really all we need in order to understand the dynamics.
See ABC of Instantons. (I never understood the title of Nekrasov's "ABCD of Instantons" until I found this classic).
Thursday, January 26, 2012
Geometry of Curved Space, Part 2
Disclaimer: as before, these are (incredibly) rough notes intended for a tutorial. I may clean them up a bit later but for now it will seem like a lot of unmotivated equations (with typos!!).
The Energy Functional
\[ S = \int_0^T |\dot{\gamma}|^2 dt \]
Letting \(V^i = \dot{\gamma}^i\), this is
\[ S = \int_0^T g_{ij}(\gamma(t)) V^i V^j dt = \int_0^T L dt \]
where the Lagrangian \(L\) is
\[ L = g_{ij} V^i V^j \]
Now,
\[ \frac{\partial L}{\partial x^k} = (\partial_k g_{ij}) V^i V^j \]
and
\[ \frac{\partial L}{\partial V^k} = g_{ij} \delta^i_k V^j + g_{ij} V^i \delta^j_k = 2 g_{jk} V^j \]
Now,
\[ \frac{d}{dt} \frac{\partial L}{\partial V^k} = 2 (\partial_i g_{jk}) V^i V^j + 2 g_{jk} \dot{V}^j \]
Plugging these expressions into the Euler-Lagrange equations, we have
\[ 2 g_{jk} \dot{V}^j + \left(\partial_i g_{jk} + \partial_j g_{ik}- \partial_k g_{ij}\right) V^i V^j = 0 \]
Multiplying by the inverse metric, dividing by 2 and relabeling indices, we have
\[ \dot{V}^k + \frac{g^{kl}}{2} \left( \partial_i g_{jl} + \partial_j g_{il} - \partial_l g_{ij} \right) V^i V^j = 0 \]
This is the geodesic equation \(\dot{V}^k + \Gamma^k_{ij} V^i V^j = 0\) (recall the formula for the Christoffel symbols).
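The coefficient of \(V^i V^j\) in the equation above can be evaluated numerically for any metric. Here is a minimal sketch, restricted to 2 dimensions and using central finite differences for \(\partial_k g_{ij}\); the helper names and the choice of the unit sphere as a test metric are assumptions made for illustration.

```python
import math


def sphere_metric(x):
    """Round metric on the unit sphere in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def metric_derivative(metric, x, k, h=1e-5):
    """Central-difference approximation to d g_ij / d x^k, as a matrix in (i, j)."""
    x_plus = list(x)
    x_minus = list(x)
    x_plus[k] += h
    x_minus[k] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    n = len(x)
    return [[(g_plus[i][j] - g_minus[i][j]) / (2 * h) for j in range(n)]
            for i in range(n)]


def christoffel(metric, x):
    """Gamma[k][i][j] = (1/2) g^{kl} (d_i g_jl + d_j g_il - d_l g_ij), for a 2d metric."""
    n = 2
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][i][j] = d_k g_ij
    return [[[0.5 * sum(g_inv[k][l] * (dg[i][j][l] + dg[j][i][l] - dg[l][i][j])
                        for l in range(n))
              for j in range(n)]
             for i in range(n)]
            for k in range(n)]


theta = math.pi / 3
Gamma = christoffel(sphere_metric, (theta, 0.0))
# Compare with the closed forms -sin(theta)cos(theta) and cot(theta).
print(Gamma[0][1][1], -math.sin(theta) * math.cos(theta))
print(Gamma[1][0][1], math.cos(theta) / math.sin(theta))
```

Index 0 is \(\theta\) and index 1 is \(\phi\), so `Gamma[0][1][1]` is \(\Gamma^\theta_{\phi\phi}\) and `Gamma[1][0][1]` is \(\Gamma^\phi_{\theta\phi}\).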
Orthonormal Frames (Lorentzian and Riemannian) (tetrads, vielbeins, vierbeins, ...)
Locally, we can find an orthonormal basis of vector fields \(e^\mu_i\). Greek indicates coordinates, whereas Latin indicates label in the basis. These necessarily satisfy
\[ g_{\mu\nu} e^\mu_i e^\nu_j = \eta_{ij} \]
where \(\eta_{ij}\) is the flat/constant metric (of whatever signature we are working in).
Methods for Computing Curvature (from Wald)
0. Getting the Christoffel symbols from the geodesic equation.
See e.g. sphere or spherical coordinates.
1. Coordinates. By definition,
\[ \nabla_a \nabla_b \omega_c = \nabla_b \nabla_a \omega_c + {R_{abc}}^d \omega_d \]
Writing things explicitly, this gives
\begin{align}
{R_{abc}}^d &= \partial_b \Gamma^d_{ac} - \partial_a \Gamma^d_{bc} \\
&\quad + \Gamma^e_{ac}\Gamma^d_{be} - \Gamma^e_{bc}\Gamma^d_{ae}
\end{align}
Do this for eg unit sphere in \(\mathbb{R}^3\).
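A sketch of that computation, for the unit sphere with \(ds^2 = d\theta^2 + \sin^2\theta \, d\phi^2\) (the Christoffel symbols are the standard ones, and the Ricci contraction \(R_{ac} = {R_{abc}}^b\) is assumed here to follow Wald's convention). The nonzero Christoffel symbols are \(\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta\) and \(\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta\), so
\begin{align}
{R_{\theta\phi\theta}}^\phi &= \partial_\phi \Gamma^\phi_{\theta\theta} - \partial_\theta \Gamma^\phi_{\phi\theta} + \Gamma^e_{\theta\theta}\Gamma^\phi_{\phi e} - \Gamma^e_{\phi\theta}\Gamma^\phi_{\theta e} \\
&= 0 + \csc^2\theta + 0 - \cot^2\theta = 1, \\
{R_{\phi\theta\phi}}^\theta &= \partial_\theta \Gamma^\theta_{\phi\phi} - \partial_\phi\Gamma^\theta_{\theta\phi} + \Gamma^e_{\phi\phi}\Gamma^\theta_{\theta e} - \Gamma^e_{\theta\phi}\Gamma^\theta_{\phi e} \\
&= (\sin^2\theta - \cos^2\theta) - 0 + 0 + \cos^2\theta = \sin^2\theta.
\end{align}
Contracting, \(R_{\theta\theta} = 1\) and \(R_{\phi\phi} = \sin^2\theta\), i.e. \(R_{ab} = g_{ab}\), and the scalar curvature is \(R = 2\).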
2. Curvature in Frames (equivalent to coordinates but totally different flavor)
(note: Misner-Thorne-Wheeler seems much better than Wald for this stuff).
Using MTW notation. Fix a frame \(\mathbf{e_\mu}\) and a dual frame \(\omega^\mu\). The connection 1-forms are defined by
\[ 0 = d\omega^\mu + \alpha^\mu_\nu \wedge \omega^\nu \]
We also have
\[ dg_{\mu\nu} = \alpha_{\mu\nu} + \alpha_{\nu\mu}\]
In an orthonormal frame the components \(g_{\mu\nu}\) are constant, so metric compatibility yields
\[ \alpha_{\mu\nu} = -\alpha_{\nu\mu}\]
Antisymmetry means fewer independent components. In this language, the curvature 2-form is given by
\[ R^\mu_\nu = d\alpha^\mu_\nu + \alpha^\mu_\sigma \wedge \alpha^\sigma_\nu \]
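As an illustration, here is a sketch for the unit sphere, with the orthonormal coframe \(\omega^1 = d\theta\), \(\omega^2 = \sin\theta\, d\phi\) chosen here for convenience. We have \(d\omega^1 = 0\) and \(d\omega^2 = \cos\theta \, d\theta \wedge d\phi\), so the first equation together with antisymmetry is solved by
\[ \alpha^2_1 = -\alpha^1_2 = \cos\theta \, d\phi, \]
since \(\alpha^2_1 \wedge \omega^1 = \cos\theta\, d\phi \wedge d\theta = -d\omega^2\) and \(\alpha^1_2 \wedge \omega^2 = 0\). The curvature 2-form is then
\[ R^1_2 = d\alpha^1_2 + \alpha^1_\sigma \wedge \alpha^\sigma_2 = d(-\cos\theta\, d\phi) = \sin\theta\, d\theta \wedge d\phi = \omega^1 \wedge \omega^2, \]
which is consistent with the sphere having constant curvature 1, and took a single exterior derivative with no Christoffel symbols.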
Gaussian Coordinates
Via Wald. Suppose \(S \subset M\) is a codimension 1 submanifold. If \(S\) is not null, we can find a normal vector field \(n^a\) which is everywhere orthogonal to \(S\) and has unit length. (Probably also need orientation to make it unique!) We can pick any coordinates \(x^1, \cdots, x^{n-1}\) on \(S\), and we pick the last coordinate to be the distance to \(S\), measured along a geodesic with initial tangent vector \(n^a\) (i.e. we use exponential coordinates in the normal direction).
Once we pick these coordinates, we obtain a family of hypersurfaces \(S_t\) given by \(x^n = t\). These have the property that they are orthogonal to the normal geodesics through \(S\). Proof: (X are vector fields which are tangent to \(S_t\))
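\begin{align}
n^b \nabla_b (n_a X^a) &= n_a n^b \nabla_b X^a \\
&= n_a X^b \nabla_b n^a \\
&= \frac{1}{2}X^b \nabla_b (n^a n_a) = 0
\end{align}
(First equality: the normal curves are geodesics. Second: \(n^a\) and \(X^a\) Lie-commute since they are coordinate vector fields. Third: \(n^a n_a = \pm 1\) is constant.) So \(n_a X^a\) is constant along each normal geodesic, and since it vanishes on \(S\) it vanishes on every \(S_t\).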
Jacobi Fields, Focusing and Growth, Conjugate Points
Geodesic deviation. Suppose we have a 1-parameter family of geodesics \(\gamma_s\) with tangent \(T^a\) and deviation \(X^a\). (draw pictures!) By the geodesic equation, we have
\[ T^a \nabla_a T^b = 0 \]
What can we say about \(X^a\)? By change of affine parameter if necessary, we can assume that \(T^a\) and \(X^a\) are coordinate vector fields, and in particular they commute. So
\[ X^a \nabla_a T^b = T^a \nabla_a X^b \]
Then it is easy to see that \(X^a T_a\) is constant, and so (again by change of parameter if necessary) we can assume that it is 0. Now set \(v^a = T^b \nabla_b X^a\). We interpret this as the relative velocity of nearby geodesics. Similarly, we have the acceleration
\[ a^a = T^c \nabla_c v^a = T^b \nabla_b (T^c \nabla_c X^a) \]
Some manipulation shows that
\[ a^a = -R_{cbd}^a X^b T^c T^d \]
This is the geodesic deviation equation. (Positive curvature \(\to\) focusing, negative curvature \(\to\) growth.)
Now we can work this in reverse. Suppose I have a single geodesic with tangent \(T^a\). If I have some vector field \(X^a\) on the geodesic, under what conditions will it integrate to give me a family of geodesics? The above shows that we must have
\[ T^a \nabla_a (T^b \nabla_b X^c) = -R_{abd}^c X^b T^a T^d \]
Solutions to this equation are called Jacobi vector fields.
Definition Points p, q on a geodesic are said to be conjugate if there exists a Jacobi field along the geodesic, not identically zero, which vanishes at p and q. (Picture time!)
Definition (Cut Locus in Riemannian Signature) For \(p \in M\), we define the cut locus in \(T_p M\) to be those vectors \(v \in T_p M\) for which \(\exp(tv)\) is length-minimizing on \([0,1]\) but fails to be length-minimizing on \([0,1+\epsilon]\) for any \(\epsilon > 0\). The cut locus in M is the image of the cut locus in \(T_p M\) under the exponential map.
E.g. the sphere: antipodal points are conjugate, and the cut locus of \(p\) is its antipode.
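To spell out the sphere example, here is a sketch assuming the unit sphere and a unit-speed great circle with parameter \(t\); \(E^a\) denotes a parallel unit vector field orthogonal to \(T^a\), introduced here only for illustration. Writing \(X^a = f(t) E^a\) and using that the sphere has constant curvature 1, the Jacobi equation reduces to
\[ \ddot{f} + f = 0, \qquad f(0) = 0 \implies f(t) = A \sin t, \]
which vanishes again at \(t = \pi\). So a point and its antipode are conjugate: all great circles through \(p\) refocus at \(-p\). On a surface of constant curvature \(-1\) the same reduction gives \(\ddot{f} - f = 0\), so \(f(t) = A \sinh t\), which never returns to zero: there are no conjugate points, and neighboring geodesics spread apart.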