
Friday, December 28, 2012

BRST and Lie Algebra Cohomology

We saw in previous posts that gauge-fixing is intimately related to BRST cohomology. Today I want to explain the underlying mathematical formalism, as it is actually something very well-known: Lie algebra cohomology. Let \mathfrak{g} be a Lie algebra and M a \mathfrak{g}-module. We will construct a cochain complex that computes the Lie algebra cohomology with values in M, H^i(\mathfrak{g}, M). Out of thin air, we define
C^\bullet(\mathfrak{g},M) = M \otimes \Lambda^\bullet \mathfrak{g}^\ast.
The grading is the one induced by exterior degree on \Lambda^\bullet \mathfrak{g}^\ast, which we identify with the BRST ghost number. Let e_i be a basis for M and T_a be a basis for \mathfrak{g}, with canonical dual basis S^a. The differential is defined on generators to be
d e_i = \rho(T_a) e_i \otimes S^a
d S^a = -\frac{1}{2} f^a_{bc} S^b S^c
where \rho: \mathfrak{g} \to \mathrm{End}(M) is the representation and f^a_{bc} are the structure constants of \mathfrak{g}. This differential is then extended to satisfy the graded Leibniz rule, and is easily verified to satisfy d^2=0 (this is just the Jacobi identity). The Lie algebra cohomology is just the cohomology of this cochain complex. Essentially by definition, we see that
H^0(\mathfrak{g},M) = \{m \in M \ | \ \xi \cdot m = 0 \ \ \forall \xi \in \mathfrak{g}\},
i.e. H^0(-) = (-)^{\mathfrak{g}} is the invariants functor. In fact, this can be taken to be the defining property of Lie algebra cohomology:

Theorem H^k(\mathfrak{g},M) = \left(R^k(-)^{\mathfrak{g}}\right)(M), the k-th right derived functor of the invariants functor applied to M.
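For concreteness, the differential above can be coded up directly. The sketch below (plain Python, my own toy example) takes \mathfrak{g} = \mathfrak{su}(2) with f^a_{bc} = \epsilon_{abc} acting on the trivial module, uses the sign convention dS^a = -\frac{1}{2}f^a_{bc}S^bS^c, and verifies d^2 = 0, i.e. the Jacobi identity:

```python
def levi_civita(a, b, c):
    # structure constants of su(2): f^a_{bc} = epsilon_{abc}
    return ((a - b) * (b - c) * (c - a)) // 2

def wedge_front(idx, mono):
    """Left-wedge S^idx onto a sorted monomial; return (sign, monomial) or None."""
    if idx in mono:
        return None
    pos = sum(1 for m in mono if m < idx)
    return (-1) ** pos, tuple(sorted(mono + (idx,)))

def d(cochain):
    """dS^a = -(1/2) f^a_{bc} S^b S^c, extended by the graded Leibniz rule.
    A cochain is a dict {sorted tuple of generator indices: coefficient}."""
    out = {}
    for mono, coeff in cochain.items():
        for j, a in enumerate(mono):
            rest = mono[:j] + mono[j + 1:]
            for b in range(3):
                for c in range(b + 1, 3):
                    f = levi_civita(a, b, c)
                    if f == 0:
                        continue
                    # dS^a has even degree, so it moves to the front at the
                    # cost of only the Leibniz sign (-1)^j; then wedge on
                    # S^c first and S^b second to represent S^b S^c
                    rc = wedge_front(c, rest)
                    if rc is None:
                        continue
                    sc, m1 = rc
                    rb = wedge_front(b, m1)
                    if rb is None:
                        continue
                    sb, m2 = rb
                    out[m2] = out.get(m2, 0) + coeff * (-1) ** j * (-f) * sb * sc
    return {m: v for m, v in out.items() if v != 0}

assert d({(0,): 1}) == {(1, 2): -1}   # dS^0 = -S^1 S^2
for a in range(3):
    assert d(d({(a,): 1})) == {}      # d^2 = 0: the Jacobi identity
```

Running d twice on each generator returns the empty cochain exactly because \epsilon_{abc} satisfies the Jacobi identity.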

Returning to field theory, we see (modulo some hard technicalities!) that, roughly, \mathfrak{g} is the Lie algebra of infinitesimal gauge transformations, and M is the algebra of functions on the space of all connections. The ghost and anti-ghost fields can then be seen to be the multiplication and contraction operators. To wit, we can take c^a to be the multiplication operator
c^a : f \mapsto S^a \wedge f
and take \bar{c}^a to be the contraction operator
\bar{c}^a : f \mapsto \iota_{T_a} f.
Then we have
[c^a, \bar{c}^b] = \delta^{ab} 
so that \bar{c} is indeed the antifield of c.

Sunday, December 23, 2012

BRST

Finally, I want to discuss gauge-invariance of the gauge-fixed theory. (!?) We saw in the previous posts that if we have a gauge theory with connection A and matter fields \psi, in order to derive sensible Feynman rules we have to introduce a gauge-fixing function G as well as Fermionic fields c, \bar{c}, the ghosts. (Note: last time I used \eta, \bar{\eta} for the ghosts, but I want to match the more standard notation, so I've switched to c, \bar{c}.)

Usually it is convenient to use the gauge-fixing function G(A) = \partial^\mu A_\mu. Under an infinitesimal gauge-transformation \lambda, A transforms as
A \mapsto A - \nabla \lambda,
so G(A) transforms as
G(A) \mapsto G(A) - \partial^\mu \nabla_\mu \lambda.
Hence the term in the Lagrangian involving the ghosts is
-\bar{c}^a \partial^\mu \nabla_\mu^{ab} c^b,
and our gauge-fixed Lagrangian is
\mathcal{L} = -\frac{1}{4} |F|^2 + \bar{\psi}(iD\!\!\!/-m)\psi - \frac{|\partial^\mu A_\mu|^2}{2\xi}  - \bar{c}^a \partial^\mu \nabla_\mu^{ab} c^b
Introducing an auxiliary field B^a, this is of course equivalent to
\mathcal{L} = -\frac{1}{4} |F|^2 + \bar{\psi}(iD\!\!\!/-m)\psi + \frac{\xi}{2} B^a B_a  + B^a \partial^\mu A_{\mu a} - \bar{c}^a \partial^\mu \nabla_\mu^{ab} c^b. 
Now, there are two questions one might ask: (1) how can we tell that this is a gauge-theory? i.e., what remains of the original gauge symmetry? and (2) does the resulting theory depend in any way on the choice of gauge-fixing function?

The answer to both of these questions is BRST symmetry. The field c is Lie-algebra valued, so we could think of it as being an infinitesimal gauge transformation. More precisely, for \epsilon a constant odd variable, \epsilon c is even and is an honest infinitesimal gauge transformation. Under this transformation, we have
\delta_\epsilon A = -\nabla (\epsilon c) = -\epsilon \nabla c.
Then we define a graded derivation \delta by
\delta A = - \nabla c.
We have a grading by ghost number, where \mathrm{gh}(A) = 0, \mathrm{gh}(\psi) = 0, \mathrm{gh}(c) = 1, \mathrm{gh}(\bar{c}) = -1. We would like to extend \delta to a derivation of degree +1 that squares to 0. First, we should figure out what \delta c is. We compute:
\begin{align} 0 &= \delta^2 A \\ &= \delta(-\nabla c) \\ &= -\partial \delta c - (\delta A) c - A (\delta c) + (\delta c) A - c (\delta A) \\ &= -\partial \delta c + (\nabla c) c + c (\nabla c) - [A, \delta c] \\ &= -\nabla(\delta c) + \nabla(c^2). \end{align}
From this, we see that \nabla(\delta c) = \nabla(c^2), so we can set
\delta c = c^2 = \frac{1}{2}[c, c].
Then \delta^2 c = 0 is just the Jacobi identity for the group's Lie algebra! Finally, we would like to extend \delta to act on \psi, B, and \bar{c} so that \delta \mathcal{L} = 0, and \delta^2 = 0. Since the action on A is by infinitesimal gauge transformation, this leaves the curvature term of \mathcal{L} invariant. Similarly, the \psi term is invariant if we simply take
\delta \psi = c \cdot \psi
where dot denotes the infinitesimal gauge transformation. Using the known rules for \delta, we find that
\delta \mathcal{L} = \frac{\xi}{2} \left(\delta B B + B \delta B \right) + \delta B \cdot \partial^\mu A_\mu  - B \cdot \partial^\mu \nabla_\mu c - \delta\bar{c} \cdot \partial^\mu \nabla_\mu c 
By comparing coefficients, we find (together with what we've already computed)
\begin{align} \delta A &= -\nabla c \\ \delta \psi &= c \cdot \psi \\ \delta c &= \frac{1}{2}[c,c] \\ \delta \bar{c} &= B \\ \delta B &= 0. \end{align}
This is the BRST differential. Now, suppose that \mathcal{O}(A, \psi) is a local operator involving only the physical fields A and \psi. Then by construction, \delta \mathcal{O} is the change of \mathcal{O} under an infinitesimal gauge transformation. Hence, we find

An operator \mathcal{O} is gauge invariant \iff \delta\mathcal{O} = 0.
Now, suppose the functional measure \mathcal{D}A \mathcal{D}\psi \mathcal{D}B \mathcal{D}c \mathcal{D}\bar{c} is gauge-invariant, i.e. is BRST closed. (This assumption is equivalent to the absence of anomalies, but we'll completely ignore this in today's post.) Then we have
\langle \delta \mathcal{O} \rangle = 0
for any local observable \mathcal{O}. This just follows from integration by parts (this is where we have to assume the measure is \delta-closed). Now, why is this significant? First, this tells us that the space of physical observables is
H^0(C^\ast_{\mathrm{BRST}}, \delta)
where C^\ast_{\mathrm{BRST}} is the cochain complex of local observables, graded by ghost number.

Now, the real power of the BRST formalism is the following. We find that the gauge-fixed Lagrangian can be written as
\mathcal{L}_{gf} = \mathcal{L}_0 + \delta \left(\bar{c}\left(\frac{\xi B}{2} + \Lambda\right)\right)
where \Lambda = \partial^\mu A_\mu is our gauge-fixing function, and \mathcal{L}_0 is the original Lagrangian without gauge-fixing. Now the point is, any two choices of gauge-fixing differ by terms which are BRST exact, and hence give the same expectation values on the physical observables H^0. So we have restored gauge invariance, while obtaining a gauge-fixed perturbation theory!


Faddeev-Popov Ghosts, continued

Last time I sketched how we can represent an integral over a submanifold M \subset \mathbb{R}^n by an integral of the form
\int_{\mathbb{R}^n} f(x) \delta(G(x)) \exp\left(\bar{\eta}G(x+\eta) \right) d\eta d\bar{\eta} dx.
Here, \eta, \bar{\eta} are Fermionic variables called Faddeev-Popov ghosts, which are introduced to cancel an unwanted determinant factor. The function G(x) singles out the submanifold M as M = G^{-1}(0).

Now suppose that we start with a vector (or affine) space V, which is acted on by a group H. We would like to understand integrals over the quotient V / H in terms of integrals over V. Suppose there is some function G(x) on V satisfying the following property:

For each level w of G, the subspace M_w := G^{-1}(w) intersects the orbits transversely, and furthermore every H-orbit intersects M_w exactly once. (*)
We call such a function a gauge-fixing function, and a level w a gauge-fixing. By assumption, we have V/H \cong M_w for any w. Hence, using the integral we derived last time, we can integrate over M_w for any particular choice of w, and this ought to be the same as integrating over V/H. The problem, however, is that in the QFT setting it's not clear what the Feynman rules should be for such a path integral. The final trick is this: since the answer is independent of w, we can integrate over all possible w, and doing so produces a Lagrangian from which we can derive sensible Feynman rules.

We have some integral

Z = \int \delta(G(x) - w) \exp\left\{\frac{i}{\hbar}S(x) + \bar{\eta}dG \eta\right\} dx d\eta d\bar{\eta}
which is independent of w. So we add a Gaussian weight and integrate over w:
\begin{align} Z' &= \int \delta(G(x) -w) \exp\left\{\frac{i}{\hbar}S(x) + \bar{\eta}dG \eta - \frac{1}{2\xi} |w|^2\right\}  dx dw d\eta d\bar{\eta}\\ &= \int \exp\left\{\frac{i}{\hbar}S(x) + \bar{\eta}dG \eta - \frac{1}{2\xi} |G(x)|^2 \right\} dx d\eta d\bar{\eta}. \end{align}
Here, \xi is an arbitrary positive real constant, and we denote the new integral by Z' to indicate that it differs from the old path integral Z by (at most) an overall constant. Now the important thing is that the new action appearing in the integrand of Z' is gauge-fixed, and hence there is no problem whatsoever in deriving sensible, meaningful Feynman rules. The gauge-fixing term |G(x)|^2 serves to make the action non-degenerate, so that propagators are well-defined, while the term involving the Fermions \eta, \bar{\eta} generates new Feynman rules that "cancel" the superfluous degrees of freedom due to gauge redundancy.

The question remains, what if we choose some other gauge-fixing function? i.e., what happens if we perturb G(x) to some new function satisfying property (*)? We'll answer this using the BRST formalism.


Friday, December 21, 2012

Faddeev-Popov Ghosts

Today I want to review the Faddeev-Popov procedure, with a view toward BRST and eventually BV.

Gauge-Invariance and Gauge-Fixing


First we'll review the Faddeev-Popov method, to motivate the introduction of ghosts. Suppose we have a gauge theory involving a G-connection A and some field \phi charged under G. Under a gauge transformation g(x), \phi transforms as
\phi \mapsto g \cdot \phi.
We would like that the covariant derivative transforms in the same way, i.e.
\nabla \phi \mapsto g \nabla \phi.
In terms of the connection 1-form A, the covariant derivative is
\nabla = d + A.
Let \nabla' denote the gauge-transformed covariant derivative, and \phi' the gauge-transformed field. Then we want
\nabla' \phi' = g \nabla \phi.
We compute
\begin{align} \nabla' \phi' &= (d + A')(g \phi) \\ &= dg \phi + g d\phi + A'(g\phi) \\ &= gg^{-1}dg \phi + gd\phi + g g^{-1} A' g\phi \\ &= g(d\phi + g^{-1} A' g \phi + g^{-1} dg \phi) \\ &= g(d\phi + A\phi) \end{align}
Comparing terms, we see that
A = g^{-1} A' g + g^{-1} dg,
so upon re-arranging we have
A' = g A g^{-1} - dg g^{-1}

This gauge symmetry causes a problem: at critical points of the action, the Hessian of the action is degenerate in directions tangent to the gauge orbits. This means that the propagator is not well-defined, and there is no obvious way to derive the Feynman rules for perturbation theory. The solution is to take the quotient by gauge-transformations. To do this, we pick some gauge-fixing function G(A) which ought to be transverse to the orbits. Then we can restrict to the space G(A) = 0, on which the Hessian of the action is non-degenerate, leading to a well-defined propagator. Formally, the path integral is
Z = \int_{\{G(A) = 0\}} \exp{\frac{i}{\hbar} S[A, \phi]} \mathcal{D}A \mathcal{D}\phi  
This suggests that the path integral should be something like
Z = \int \delta(G(A)) \exp{\frac{i}{\hbar} S[A, \phi]} \mathcal{D}A \mathcal{D}\phi,
but this is not quite right! To understand the source of the problem, we'll first study the finite-dimensional case and then use this to solve the problem in infinite-dimensions.


The Faddeev-Popov Determinant


Suppose we are on \mathbb{R}^n, and we would like to integrate a function f(x) over a submanifold M defined by M = G^{-1}(0) for some smooth function G: \mathbb{R}^n \to \mathbb{R}^k. Naively, we might expect that the answer is
\int_M f(x) \stackrel{?}{=} \int f(x) \delta(G(x)) dx.
To see why this is not correct, write the delta function as
\delta(G(x)) = \frac{1}{(2\pi)^k}\int e^{ip\cdot G(x)} d^k p.
We can regularize this by taking the limit as \epsilon \to 0 of
\frac{1}{(2\pi)^k}\int \exp\left\{ip\cdot G(x) -\frac{\epsilon}{2} |p|^2\right\} d^k p
This integral is Gaussian, so we obtain explicitly
\left(\frac{2\pi}{\epsilon} \right)^{\frac{k}{2}} \exp\left\{-\frac{1}{2\epsilon} |G(x)|^2 \right\}.
So our original guess becomes
\left(\frac{1}{2\pi \epsilon}\right)^\frac{k}{2} \int f(x) \exp\left\{ -\frac{1}{2\epsilon} |G(x)|^2 \right\} dx .
As \epsilon \to 0, this integral localizes on the locus \{G(x) = 0\}, as desired, but does not give the right answer! To see this, let u be a coordinate on M = G^{-1}(0) and v coordinates normal to M. Then we have
|G(u,v)|^2 = v^T H(u) v + O(|v|^3)
where H(u) is one half the Hessian of |G|^2 at the point x = (u, 0). So the integral becomes (as \epsilon \to 0)
\begin{align} I_\epsilon &= \left(\frac{1}{2\pi \epsilon}\right)^\frac{k}{2} \int\int  f(u, v) \exp\left\{ -\frac{1}{2\epsilon} v^T H(u) v \right\} du dv \\ &= \int_M \frac{f(u)}{\sqrt{\det H(u)}} du. \end{align}
This is not correct. We have to account for the determinant of the Hessian. Now, the Hessian is given by
\begin{align} H_{ij} &= \frac{1}{2} \frac{\partial^2 |G|^2}{\partial v^i \partial v^j} \\ &= \frac{\partial}{\partial v^i} \left( G^a \partial_j G^a \right) \\ &=  \left(\partial_i G^a \partial_j G^a + G^a \partial_{ij} G^a \right) \\ &=  \partial_i G^a \partial_j G^a \end{align}
where we have used the fact that G = 0 on x = (u, 0). Hence we see that
\det H = (\det A)^2
where A is the k \times k matrix with entries \partial_i G^a. Hence
\sqrt{\det H} = \det A.
Now there is a straightforward way to trade this determinant for Fermions. We introduce Fermionic coordinates \eta^a, \theta^i, with a, i = 1, \ldots, k. Then by Berezin integration (since G^a(u,0) = 0, only the term of the exponent linear in each \theta^i can saturate the integral), we have
\int e^{\eta^a G^a(u, \theta)} d\theta d\eta = \det A.
So in the end, we find
\int_M f(x) d\mu = \int_{\mathbb{R}^n} f(x) \delta(G(x)) \exp\left(\eta \cdot G(x+ \theta) \right)  dx d\theta d\eta.
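As a quick numerical illustration in \mathbb{R}^2 (a toy example of my own, not from the field theory): take G(x,y) = x^2 + y^2 - 1, so M is the unit circle and \det A = |\nabla G| = 2r. The naive delta-function integral of f = 1 gives \pi = 2\pi/\det A rather than the arc length 2\pi, and inserting the determinant factor by hand fixes it:

```python
import numpy as np

# Regularized delta(G) for G(x, y) = x^2 + y^2 - 1 on a grid
eps = 1e-3
h = 0.002
xs = np.arange(-2, 2, h)
X, Y = np.meshgrid(xs, xs)
G = X**2 + Y**2 - 1
delta_reg = np.exp(-G**2 / (2 * eps)) / np.sqrt(2 * np.pi * eps)

naive = delta_reg.sum() * h * h          # naive guess: gives pi, not 2*pi
det_A = 2 * np.sqrt(X**2 + Y**2)         # |grad G| = 2r near the circle
fp = (delta_reg * det_A).sum() * h * h   # with the determinant: arc length

print(naive, fp)  # approximately pi and 2*pi
```

In the Faddeev-Popov construction, the Berezin integral over the ghosts supplies exactly this determinant factor.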

Thursday, December 13, 2012

The Weyl and Wigner Transforms

Today I'd like to try to understand better how deformation quantization is related to the usual canonical quantization, and especially how the latter might be used to deduce the former, i.e., given an honest quantization (in the sense of operators), how might we reproduce the formula for the Moyal star product?

We'll fix our symplectic manifold once and for all to be \mathbb{R}^2 with its standard symplectic structure, with Darboux coordinates x and p. Let \mathcal{A} be the algebra of observables on \mathbb{R}^2. For technical reasons, we'll restrict to those smooth functions that are polynomially bounded in the momentum coordinate (but of course the star product makes sense in general). Let \mathcal{D} be the algebra of pseudodifferential operators on \mathbb{R}. We want to define a quantization map
\Psi: \mathcal{A} \to \mathcal{D}
such that
\Psi(x) = x \in \mathcal{D}
\Psi(p) = -i\hbar \partial
Out of thin air, let us define
\langle q| \Psi(f) |q' \rangle = \int e^{ik(q-q')} f(\frac{q+q'}{2}, k) dk
This is the Weyl transform. Its inverse is the Wigner transform, given by
\Phi(A, q, k) = \int e^{-ikq'} \left\langle q+\frac{q'}{2} \right| A \left| q - \frac{q'}{2} \right\rangle dq'
Note: I am (intentionally) ignoring all factors of 2\pi involved. It's not hard to work out what they are, but annoying to keep track of them in calculations, so I won't.

Theorem For suitably well-behaved f, we have \Phi(\Psi(f)) = f.

Proof Using the "ignore 2\pi" conventions, we have the formal identities
\int e^{ikx} dx = \delta(k), \ \ \int e^{ikx} dk = \delta(x).
The theorem is a formal result of these:
\begin{align} \Phi(\Psi(f))(q, k) &= \int e^{-ikq'} \left\langle q + \frac{q'}{2} \right| \Psi(f) \left| q - \frac{q'}{2} \right\rangle dq' \\\ &= \int e^{-ikq'} e^{ik'q'} f(q, k') dk' dq' \\\ &= f(q,k). \end{align}

One may easily check that \Psi(x) = x and \Psi(p) = -i\hbar\partial, so this certainly gives a quantization. But why is it particularly natural? To see this, let Q be the operator of multiplication by x, and let P be the operator -i\partial (we set \hbar = 1 from now on). We'd like to take f(q,p) and replace it by f(Q, P), but we can't literally substitute like this due to order ambiguity. However, we could work formally as follows:
\begin{align} f(Q, P) &= \int \delta(Q-q) \delta(P - p) f(q,p) dq dp \\\ &= \int e^{ik(Q-q) + iq'(P-p)} f(q,p) dq dq' dp dk. \end{align}
In this last expression, there is no order ambiguity in the argument of the exponential (since it is a sum and not a product), and furthermore the expression itself makes sense since it is the exponential of a skew-adjoint operator. So let's check that this agrees with the Weyl transform. Using a special case of the Baker-Campbell-Hausdorff formula for the Heisenberg algebra, we have
e^{ik(Q-q) + iq'(P-p)} = e^{ik(Q-q)} e^{iq'(P-p)} e^{ikq'/2}
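This Heisenberg-algebra identity (with [Q, P] = i, the scalar factor is e^{ikq'/2}) is easy to check numerically in a truncated harmonic-oscillator basis; the truncation is a harmless approximation when the operators act on low-lying states with small k, q':

```python
import numpy as np

N = 60
a = np.diag(np.sqrt(np.arange(1, N)), 1)   # annihilation operator, truncated
Q = (a + a.T) / np.sqrt(2)                 # position
P = 1j * (a.T - a) / np.sqrt(2)            # momentum; [Q, P] = i up to truncation

def expm_i(H):
    """exp(iH) for Hermitian H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(1j * w)) @ V.conj().T

k, qp = 0.3, 0.4                           # small parameters k and q'
lhs = expm_i(k * Q + qp * P)
rhs = expm_i(k * Q) @ expm_i(qp * P) * np.exp(1j * k * qp / 2)

e0 = np.zeros(N); e0[0] = 1.0              # lowest basis state
assert np.allclose(lhs @ e0, rhs @ e0, atol=1e-8)
```

Flipping the sign of the scalar factor makes the assertion fail, which pins down the convention.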
Let us compute the matrix element:
\begin{align} \langle q_1 | P | q_2 \rangle &= \int \langle q_1 | p_1 \rangle \langle p_1 | P | p_2 \rangle \langle p_2 | q_2 \rangle dp_1 dp_2 \\\ &= \int e^{iq_1p_1 - iq_2 p_2} p_2 \delta(p_2 - p_1) dp_1 dp_2 \\\ &= \int e^{i p(q_1-q_2)} p dp. \end{align}
Hence we find that the matrix element for the exponential is
\begin{align} \langle q_1 |e^{ik(Q-q) + iq'(P-p)} | q_2 \rangle &= e^{ikq'/2 + ik(q_1-q)} \langle q_1 | e^{iq'(P-p)} | q_2 \rangle \\\ &=  \int e^{ikq'/2 + ik(q_1-q) -iq'p} e^{iq'p'' + ip''(q_1-q_2)} dp'' \\\ &= \delta(q' + q_1 - q_2)  e^{ikq'/2 + ik(q_1-q) -iq'p} \end{align}
Plugging this back into the expression for f(Q, P) we find
\begin{align}  \langle q_1 | f(Q, P) | q_2 \rangle &= \int \delta(q' + q_1 - q_2)  e^{ikq'/2 + ik(q_1-q) -iq'p} f(q,p) dq dq' dp dk \\\ &= \int  e^{ ik(q_1/2 +q_2/2-q) +ip(q_1-q_2)} f(q,p) dq dp dk \\\ &= \int e^{ip(q_1-q_2)} f(\frac{q_1+q_2}{2}, p) dp, \end{align}
which is the original expression we gave for the Weyl transform.

Thursday, November 29, 2012

Equations of Motion and Noether's Theorem in the Functional Formalism

First, let us recall the derivation of the equations of motion and Noether's theorem in classical field theory. We have some action functional S[\phi] defined by some local Lagrangian:
S[\phi] = \int L(\phi, \partial \phi) dx.
The classical equations of motion are just the Euler-Lagrange equations
\frac{\delta S}{\delta \phi(x)} = 0  \iff \partial_\mu \left( \frac{\partial L}{\partial(\partial_\mu\phi)} \right) = \frac{\partial L}{\partial \phi}

Now suppose that S is invariant under some transformation \phi(x) \mapsto \phi(x) + \epsilon(x) \eta(x), so that S[\phi] = S[\phi+\epsilon \eta]. Here we treat \eta as a fixed function but \epsilon may be an arbitrary infinitesimal function. The Lagrangian is not necessarily invariant, but rather can transform with a total derivative:
L(\phi+\epsilon \eta) = L(\phi) + \frac{\partial L}{\partial (\partial_\mu \phi)} \eta \partial_\mu \epsilon + \epsilon \partial_\mu f^\mu
for some unknown vector field f^\mu (which we could compute given any particular Lagrangian). So let's compute
\begin{align}\delta_\epsilon S &= \int \delta_\epsilon L \\\ &= \int \frac{\partial L}{\partial (\partial_\mu \phi)} \eta \partial_\mu \epsilon + \epsilon \partial_\mu f^\mu \\\ &= \int \partial_\mu \left(f^\mu - \frac{\partial L}{\partial (\partial_\mu \phi)} \eta \right) \epsilon \end{align}
Let us define the Noether current J^\mu by
J^\mu = \frac{\partial L}{\partial (\partial_\mu \phi)} \eta - f^\mu.
Then the previous computation showed that
\frac{\delta S}{\delta \epsilon} = -\partial_\mu J^\mu.
If \phi is a solution to the Euler-Lagrange equations, then the variation \delta_\epsilon S vanishes for arbitrary \epsilon, hence we obtain:

Theorem (Noether's theorem) The Noether current is divergence free, i.e.
\partial_\mu J^\mu = 0.
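As a sanity check, here is the 0+1-dimensional (mechanics) version of the theorem in sympy — a toy example of my own, not the field-theoretic statement itself. The rotation-invariant Lagrangian L = (\dot x^2 + \dot y^2)/2 - V(x^2+y^2) has Noether charge J = x\dot y - y\dot x, and dJ/dt is a combination of the Euler-Lagrange expressions, hence vanishes on-shell:

```python
import sympy as sp

t = sp.symbols('t')
x, y = sp.Function('x')(t), sp.Function('y')(t)
V = sp.Function('V')

# Rotation-invariant Lagrangian and its Euler-Lagrange expressions
L = (x.diff(t)**2 + y.diff(t)**2) / 2 - V(x**2 + y**2)
el_x = sp.diff(L, x) - sp.diff(sp.diff(L, x.diff(t)), t)
el_y = sp.diff(L, y) - sp.diff(sp.diff(L, y.diff(t)), t)

# Noether charge for rotations (angular momentum)
J = x * y.diff(t) - y * x.diff(t)
dJdt = sp.diff(J, t)

# dJ/dt = -(x * EL_y - y * EL_x), so J is conserved on-shell
assert sp.simplify(dJdt + (x * el_y - y * el_x)) == 0
```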

Functional Version


First, we derive the functional analogue of the classical equations of motion. Consider an expectation value
\langle \mathcal{O}(\phi) \rangle = \int \mathcal{O}(\phi) e^{\frac{i}{\hbar} S} \mathcal{D}\phi
We'll assume that \phi takes values in a vector space (or bundle). Then we can perform a change of variables \psi = \phi + \epsilon, and since \mathcal{D}\phi = \mathcal{D}\psi we find that
\int \mathcal{O}(\phi+\epsilon) \exp\left(\frac{i}{\hbar} S[\phi] \right) \mathcal{D}\phi
is independent of \epsilon. Expanding to first order in \epsilon, we have
0 = \int \left(\frac{\delta\mathcal{O}}{\delta \phi}  + \frac{i \mathcal{O}}{\hbar} \frac{\delta S}{\delta \phi} \right)  \exp \left( \frac{i}{\hbar} S \right) \mathcal{D}\phi  
So we find the quantum analogue of the equations of motion:
\left\langle \frac{\delta \mathcal{O}}{\delta \phi} \right\rangle + \frac{i}{\hbar} \left\langle \mathcal{O} \frac{\delta S}{\delta \phi} \right\rangle = 0
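The zero-dimensional, Euclidean (e^{-S}) analogue of this identity is just the statement that the integral of a total derivative vanishes, \int \frac{d}{d\phi}\left(\mathcal{O}\, e^{-S}\right) d\phi = 0, which gives \langle d\mathcal{O}/d\phi \rangle = \langle \mathcal{O}\, dS/d\phi \rangle. It is easy to verify numerically for a toy action and observable (arbitrary choices of mine):

```python
import numpy as np

# One "field" phi; toy Euclidean action S and observable O
phi = np.linspace(-6, 6, 20001)
h = phi[1] - phi[0]
S = phi**2 / 2 + phi**4 / 4
O = phi**3

w = np.exp(-S)
Z = w.sum() * h

lhs = np.sum(3 * phi**2 * w) * h / Z          # < dO/dphi >
rhs = np.sum(O * (phi + phi**3) * w) * h / Z  # < O dS/dphi >
assert np.isclose(lhs, rhs, rtol=1e-5)
```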

Next, we move on to the quantum version of Noether's theorem. Suppose Q is an infinitesimal transformation of the fields. Assuming the path integral measure is invariant under Q, the same change-of-variables argument gives
\left\langle Q\mathcal{O} \right \rangle + \frac{i}{\hbar} \left\langle \mathcal{O}\, QS \right\rangle = 0
To compare with the classical result, consider Q to be the (singular) operator
Q = \frac{\delta}{\delta \epsilon(x)}
Then by the previous calculations,
Q S = -\partial_\mu J^\mu,
so we obtain
\left\langle \frac{\delta \mathcal{O}}{\delta \epsilon(x)} \right\rangle = \frac{i}{\hbar} \left\langle \mathcal{O} \partial_\mu J^\mu \right\rangle.  
This is the Ward-Takahashi identity, the quantum analogue of Noether's theorem.

Saturday, November 24, 2012

The Moyal Product

Today I want to understand the Moyal product, as we will need it in order to construct quantizations of symplectic quotients. (More precisely, to incorporate stability conditions.)



Let A be the algebra of polynomial functions on T^\ast \mathbb{C}^n. This algebra has a natural Poisson bracket, given by
\{p_i, x_j\} = \delta_{ij}.
We would like to define a new associative product \ast on A((\hbar)) satisfying:

  1. f  \ast g = fg + O(\hbar)
  2. f \ast g - g \ast f = \hbar \{f, g\} + O(\hbar^2)
  3. 1 \ast f = f \ast 1 = f
  4. (f \ast g)^\ast = g^\ast \ast f^\ast
In the last line, the map (\cdot)^\ast takes x_i \mapsto x_i and p_i \mapsto -p_i. To figure out what this new product should be, let's take f,g \in A and expand f \ast g in power series:
f \ast g = \sum_{n=0}^\infty c_n(f,g) \hbar^n
Now, equations (1) and (2) will be satisfied by taking c_0(f,g) = fg and c_1(f,g) = \{f,g\}/2. Let \sigma be the Poisson bivector defining the Poisson bracket. This defines a differential operator \Pi on A \otimes A by
\Pi = \sigma^{ij} (\partial_i \otimes \partial_j)
Let B = \sum_{n=0}^\infty B_n \hbar^n and write the product as

f \ast g = m \circ B(f \otimes g).
Now, conditions (1) and (2) tell us that B|_{\hbar = 0} = 1 and that
\left. \frac{dB}{d\hbar} \right|_{\hbar=0} = \frac{\Pi}{2}
So
B = 1 + \frac{\hbar \Pi}{2} + O(\hbar^2)
It is natural to guess that B should be built out of powers of \Pi, and a natural guess is
B = \exp(\frac{\hbar \Pi}{2}),
which certainly reproduces the first two terms of our expansion. Let's see that this choice actually works, i.e. defines an associative \ast-product. Let m: A \otimes A \to A be the multiplication, and
m_{12}, m_{23}: A \otimes A \otimes A \to A \otimes A, and m_{123}: A \otimes A \otimes A \to A the induced multiplication maps. Then
\begin{align} f \ast (g \ast h) &= m \circ(B( f \otimes m \circ B(g \otimes h) ) ) \\\ &= m \circ B( m_{23} \circ (1 \otimes B)(f \otimes g \otimes h) ) \\\ &= m_{123} (B \otimes 1)(1 \otimes B)(f \otimes g \otimes h) \end{align}
On the other hand, we have

\begin{align} (f \ast g) \ast h &= m \circ B( (m \circ B(f \otimes g)) \otimes h ) \\\ &= m \circ B( m_{12} \circ (B \otimes 1)(f \otimes g \otimes h) ) \\\ &= m_{123} (1 \otimes B)(B \otimes 1)(f \otimes g \otimes h) \end{align}

Hence, associativity is the condition
m_{123} \circ [1\otimes B, B \otimes 1] = 0.

On A \otimes A \otimes A, write \partial_i^1 for the partial derivative acting on the first factor, \partial_i^2 on the second, etc. Then
1 \otimes B = \sum_n \frac {\hbar^n}{2^n n!}  \Pi^{i_1 j_1} \cdots \Pi^{i_n j_n} \partial^2_{i_1} \partial^3_{j_1} \cdots \partial^2_{i_n} \partial^3_{j_n}
and similarly for B \otimes 1. So we have
\begin{align} m_{123} (B\otimes 1)(1 \otimes B) &= m_{123} \circ \sum_n \sum_{k=0}^n \frac {\hbar^n}{2^n k! (n-k)!}  \Pi^{k_1 l_1} \cdots \Pi^{k_k l_k} \partial^1_{k_1} \partial^2_{l_1} \cdots \partial^1_{k_k} \partial^2_{l_k} \\\  & \ \ \ \times  \Pi^{i_1 j_1} \cdots \Pi^{i_{n-k} j_{n-k}} \partial^2_{i_1} \partial^3_{j_1} \cdots \partial^2_{i_{n-k}} \partial^3_{j_{n-k}} \\\ &= m_{123}(1 \otimes B)(B \otimes 1) \end{align}
Since the coefficients \Pi^{ij} are constant, all of the partial derivatives commute, so the two orderings agree. Hence we obtain an associative \ast-product. This is called the Moyal product.
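On polynomials the series B = \exp(\hbar\Pi/2) terminates, so the Moyal product can be implemented directly; here is a sympy sketch in one pair of variables, using the conventions above (\{p, x\} = 1, so \Pi = \partial_p \otimes \partial_x - \partial_x \otimes \partial_p), which checks the commutator condition and spot-checks associativity:

```python
import sympy as sp

x, p, hbar = sp.symbols('x p hbar')

def star(f, g, order=8):
    """Moyal product f * g = m(exp(hbar*Pi/2)(f (x) g)) for polynomials,
    with Pi = d_p (x) d_x - d_x (x) d_p, so that {p, x} = 1."""
    total = 0
    for n in range(order + 1):
        term = 0
        for k in range(n + 1):
            # binomial expansion of Pi^n acting on f (x) g
            term += (sp.binomial(n, k) * (-1)**(n - k)
                     * sp.diff(f, p, k, x, n - k)
                     * sp.diff(g, x, k, p, n - k))
        total += (hbar / 2)**n / sp.factorial(n) * term
    return sp.expand(total)

# commutator condition: x*p - p*x = hbar*{x, p} = -hbar (since {p, x} = 1)
assert sp.expand(star(x, p) - star(p, x) + hbar) == 0

# associativity on a sample of polynomials
lhs = star(star(x**2, p**2), x * p)
rhs = star(x**2, star(p**2, x * p))
assert sp.expand(lhs - rhs) == 0
```

The `order=8` cutoff is safe here because every extra power of \Pi differentiates both arguments once more, so the sum terminates on low-degree polynomials.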


Sheafifying the Construction


Now suppose that U is a (Zariski) open subset of X = T^\ast \mathbb{C}^n. Then the star product induces a well-defined map
\ast: O_X(U)((\hbar)) \otimes_\mathbb{C} O_X(U)((\hbar)) \to O_X(U)((\hbar))
In this way we obtain a sheaf \mathcal{D} of \mathcal{O}_X-modules with a non-commutative \ast-product defined as above.

Define a \mathbb{C}^\ast action on T^\ast \mathbb{C}^n by acting on x_i and p_i with weight 1. Extend this to an action on \mathcal{D} by acting on \hbar with weight -1.

Proposition: The algebra of \mathbb{C}^\ast-invariant global sections of \mathcal{D} is naturally identified with the algebra of differential operators on \mathbb{C}^n.

Proof: The \mathbb{C}^\ast-invariant global sections are generated by \hbar^{-1} x_i and \hbar^{-1} p_i. So define a map \Gamma(\mathcal{D})^{\mathbb{C}^\ast} \to \mathbb{D} by
\hbar^{-1} x_i \mapsto x_i
\hbar^{-1} p_i \mapsto \partial_i
From the definition of the star product, it is clear that this is an algebra map, and that it is both injective and surjective.

Thursday, November 22, 2012

An Exercise in Quantum Hamiltonian Reduction

Semiclassical Setup

Let the group GL(2) act on V = \mathrm{Mat}_{2\times n} and consider the induced symplectic action on T^\ast V. If we use variables (x,p) with x a 2 \times n matrix and p an n \times 2 matrix, then the classical moment map \mu is given by
\mu(x,p) = xp
This is equivariant with respect to the adjoint action, so we can form the GL(2)-invariant functions
Z_1 = \mathrm{Tr} \mu
Z_2 = \mathrm{Tr} (\mu^2)
If we think of x as being made of column vectors
x = ( x_1 \cdots x_n )
and similarly think of p as being made of row vectors, then there are actually many more GL(2) invariants, given by
f_{ij} = \mathrm{Tr} x_i p_j = p_j x_i
In terms of the invariants, the Z functions are
Z_1 = \sum_k f_{kk}
Z_2 = \sum_{jk} f_{jk} f_{kj}
Let us compute Poisson brackets:
\begin{align}  \{f_{ij}, f_{kl}\} &= \{p_j^\mu x_i^\mu, p_l^\nu x_k^\nu\} \\\ &= x_i^\mu p_l^\nu \delta_{jk} \delta^{\mu\nu} - p_j^\mu x_k^\nu \delta_{il} \delta^{\mu\nu} \\\ &= f_{il} \delta_{jk} - f_{kj} \delta_{il}. \end{align}
So we see that the invariants form a Poisson subalgebra (as they should!). Let's compute:
\begin{align} \{Z_1, f_{ij}\} &= \sum_k \{ f_{kk}, f_{ij} \} \\\ &= \sum_k \left( f_{kj} \delta_{ki} - f_{ik} \delta_{kj} \right) \\\ &= f_{ij} - f_{ij} = 0. \end{align}
Hence Z_1 is central with respect to the invariant functions f_{ij}. Similarly,
\begin{align} \{Z_2, f_{kl}\} &= \sum_{ij} \{f_{ij} f_{ji}, f_{kl}\} \\\ &= \sum_{ij} f_{ij} \left(f_{jl} \delta_{ik} - f_{ki} \delta_{jl} \right) + f_{ji} \left(f_{il} \delta_{jk} - f_{kj} \delta_{il} \right) \\\ &= \sum_j f_{kj} f_{jl} - \sum_i f_{il} f_{ki} + \sum_i f_{ki} f_{il} - \sum_j f_{jl} f_{kj} \\\ &= 0. \end{align}
So we see that the Z_i are in the center of the invariant algebra. In fact, they generate it, so we'll denote by Z the algebra generated by Z_1, Z_2. Let A be the algebra generated by the f_{ij}. The inclusion Z \hookrightarrow A can be thought of as a purely algebraic version of the moment map. In particular, given any character \lambda: Z \to \mathbb{C}, we can define the Hamiltonian reduction of A to be
A_\lambda := A / A\langle \ker \lambda \rangle
The corresponding space is of course \mathrm{Spec}\, A_\lambda.
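The bracket computations above can be verified mechanically; here is a sympy sketch for n = 2 that checks the relation \{f_{ij}, f_{kl}\} = f_{il}\delta_{jk} - f_{kj}\delta_{il} and the centrality of Z_2:

```python
import sympy as sp

n, dim = 2, 2
xs = [[sp.Symbol(f'x_{i}_{mu}') for mu in range(dim)] for i in range(n)]
ps = [[sp.Symbol(f'p_{j}_{mu}') for mu in range(dim)] for j in range(n)]

def pb(F, G):
    """Poisson bracket with {p_j^mu, x_i^nu} = delta_ij delta^{mu nu}."""
    return sp.expand(sum(
        sp.diff(F, ps[a][mu]) * sp.diff(G, xs[a][mu])
        - sp.diff(F, xs[a][mu]) * sp.diff(G, ps[a][mu])
        for a in range(n) for mu in range(dim)))

# f[i][j] = p_j . x_i, the invariants
f = [[sum(ps[j][mu] * xs[i][mu] for mu in range(dim)) for j in range(n)]
     for i in range(n)]

delta = lambda i, j: 1 if i == j else 0
for i in range(n):
    for j in range(n):
        for k in range(n):
            for l in range(n):
                rhs = f[i][l] * delta(j, k) - f[k][j] * delta(i, l)
                assert sp.expand(pb(f[i][j], f[k][l]) - rhs) == 0

# Z_2 is central among the invariants
Z2 = sum(f[i][j] * f[j][i] for i in range(n) for j in range(n))
assert all(sp.expand(pb(Z2, f[k][l])) == 0 for k in range(n) for l in range(n))
```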


The Cartan Algebra and the Center

Define functions

h_1 = Z_1 = \sum_i f_{ii}
h_2 = Z_2 = \sum_{ij} f_{ij} f_{ji}
h_3 = \sum_{ijk} f_{ij} f_{jk} f_{ki}
h_k = \sum_{i_1, i_2, \ldots, i_k} f_{i_1 i_2} f_{i_2 i_3} \cdots f_{i_k i_1}

These are just the traces of powers of the n \times n matrix px. In particular, h_k for k > n may be expressed as a function of the h_i for i \leq n. The algebra H generated by the h_k plays the role of a Cartan subalgebra. So we have inclusions
Z \subset H \subset A
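For instance, when n = 2 Cayley-Hamilton gives h_3 = (3 h_1 h_2 - h_1^3)/2, which can be spot-checked numerically on a random px:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 2))   # x is 2 x n with n = 2
p = rng.standard_normal((2, 2))   # p is n x 2
M = p @ x                         # the n x n matrix px

h = lambda k: np.trace(np.linalg.matrix_power(M, k))
# Cayley-Hamilton for a 2 x 2 matrix: h_3 = (3 h_1 h_2 - h_1^3) / 2
assert np.isclose(h(3), (3 * h(1) * h(2) - h(1)**3) / 2)
```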

Quantization

Now we wish to construct a quantization of A and A_\lambda. The quantization of A is obvious: we quantize T^\ast V by taking the algebraic differential operators on V. Denote this algebra by \mathbb{D}. It is generated by x_i and \partial_i satisfying the relation
[\partial_i, x_j] = \delta_{ij}
Then we simply take the subalgebra of GL(2)-invariant differential operators as our quantization of A. Call this subalgebra U. We can define Hamiltonian reduction analogously by taking central quotients. So we need to understand the center Z(U), but this is just the subalgebra generated by the quantizations of Z_1 and Z_2, i.e. the subalgebra of all elements whose associated graded lies in Z(A).

More to come: stability conditions, \mathbb{D}-affineness, and maybe proofs of some of my claims.

Wednesday, November 7, 2012

The 1PI Effective Action

In this post I'd like to try to understand the 1PI effective action that is often of interest. Suppose we have a QFT in some bosonic field \phi(x) taking values in a vector space (this is important). Then its vev \phi_{cl}(x) := \langle \phi(x) \rangle is just an ordinary (but possibly distributional) field on spacetime. The question is, what is the field equation satisfied by \phi_{cl}? I.e., if we average over quantum effects by replacing all fields by their vevs, what is the action that governs this (now completely classical) theory? The 1PI effective action answers exactly this question.

Consider the generating functional
Z[J] = \int e^{-S[\phi]+\langle \phi, J \rangle} \mathcal{D}\phi
Then for a given source J, define the J-vev of \phi(x) to be
\phi_J(x) = \frac{\partial \log Z[J]}{\partial J}.
Now let's take \Gamma to be the Legendre transform of \log Z with respect to J:
\Gamma[\phi_J] = \langle J, \phi_J \rangle - \log Z[J]
Then we compute:
\frac{\partial \Gamma}{\partial \phi_J} = J + \frac{\partial J}{\partial \phi_J} \phi_J -\frac{\partial \log Z[J]}{\partial J} \frac{\partial J}{\partial \phi_J} = J.

Now consider the situation without a background source, i.e. J = 0. Then \phi_0 = \phi_{cl} and we find
\frac{\partial \Gamma}{\partial \phi_{cl}} = 0
Hence, \phi_{cl} satisfies the Euler-Lagrange equations associated to the functional \Gamma. Note that from the Legendre transform, \Gamma takes quantum effects (i.e. Feynman diagrams with loops) into account, even though the field and the equations are purely classical!
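The Legendre-transform mechanics here can be seen concretely in a zero-dimensional Euclidean toy model (one integration variable, with an arbitrary quartic action of my choosing): computing \log Z[J] numerically, Legendre transforming, and differentiating recovers the source, i.e. \partial\Gamma/\partial\phi_J = J:

```python
import numpy as np

phi = np.linspace(-8, 8, 4001)
Js = np.linspace(-1, 1, 401)
S = phi**2 / 2 + phi**4 / 10          # toy Euclidean action (arbitrary choice)
h = phi[1] - phi[0]

logZ = np.array([np.log(np.sum(np.exp(-S + J * phi)) * h) for J in Js])

phi_J = np.gradient(logZ, Js)         # phi_J = d log Z / dJ
Gamma = Js * phi_J - logZ             # Legendre transform
dGamma = np.gradient(Gamma, phi_J)    # should recover the source J

interior = slice(5, -5)               # drop the less accurate boundary stencils
assert np.allclose(dGamma[interior], Js[interior], atol=0.01)
```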

By studying these equations, we might find instanton solutions (or solitons in Lorentz signature).

Now for the name. Some combinatorics and algebra (which I will skip!) show that \Gamma[\phi_{cl}] is itself a generating functional for certain correlation functions, the 1PI correlation functions:
\frac{\partial^n \Gamma}{\partial \phi(x_1) \cdots \partial \phi(x_n)} = \langle \phi(x_1) \cdots \phi(x_n) \rangle_{1PI}.
The 1PI subscript means that the RHS is computed in perturbation theory by summing over only the connected, one-particle-irreducible (1PI) Feynman diagrams.

Warning: As usual, there are regularization issues, both in the UV and IR. UV divergences can be solved by a cutoff (if we only care about effective field theory), but IR divergences are much more technical. For this reason (and others), it is sometimes preferable to try to understand the low energy dynamics by studying the Wilsonian effective action. As the Wilsonian effective action does not take IR modes into account, it can avoid many of the difficulties of the 1PI effective action.

Monday, November 5, 2012

Spontaneous Symmetry Breaking in QFT

In this post I want to try to understand symmetry breaking and the origin of the moduli space of vacua. Most of this can be found in the lectures by Witten in vol. 2 of Quantum Fields and Strings.


Non-Example: Quantum Mechanics

The main point of confusion for me is that my quantum intuition comes from ordinary quantum mechanics. However, this turns out to be incredibly misleading because for most reasonable quantum mechanical systems, spontaneous symmetry breaking cannot occur. In fact, we'll see that even in field theory, the question of whether spontaneous symmetry breaking can occur is intimately related to the geometry of spacetime. Since quantum mechanics is QFT in 0+1 dimensions (i.e., the spatial part of spacetime is just a point), spontaneous symmetry breaking is forbidden.

Consider a particle in one spatial dimension, with Hamiltonian
H = -\frac{\hbar^2}{2}\frac{d^2}{dx^2} + (a^2-x^2)^2.
The classical ground states are given by the stationary solutions x(t) = \pm a. Hence we might expect that the quantum Hamiltonian has a degenerate ground state, i.e. the eigenspace of the lowest eigenvalue has dimension greater than one. However, this is not the case!

Sketch of proof: Consider the energy functional E(\phi) = (\phi, H\phi) on the unit sphere in L^2(\mathbb{R}). If \phi is a global minimum, then it necessarily satisfies H\phi = E_0 \phi where E_0 is the lowest eigenvalue of H. On the other hand, E(\phi) = E(|\phi|), so |\phi| is also a global minimum, hence satisfies H|\phi| = E_0|\phi|. This equation is elliptic, so by elliptic regularity |\phi| must be at least C^1, and one finds that \phi has constant phase, so we might as well take \phi(x) to be real and strictly positive. Any other ground state \psi would have the same properties, so (\phi,\psi) > 0. But two linearly independent ground states could be chosen orthogonal, a contradiction. Hence the eigenspace is 1-dimensional and the ground state is non-degenerate.
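The non-degeneracy is also easy to see numerically. Here is a rough finite-difference sketch (units with \hbar = 1 and the value a = 1 are my own choices for illustration):

```python
import numpy as np

# Discretize H = -(1/2) d^2/dx^2 + (a^2 - x^2)^2 with Dirichlet boundaries.
a, N, L = 1.0, 800, 6.0
x = np.linspace(-L, L, N)
dx = x[1] - x[0]
D2 = (np.diag(np.ones(N - 1), -1) - 2.0 * np.eye(N) + np.diag(np.ones(N - 1), 1)) / dx**2
H = -0.5 * D2 + np.diag((a**2 - x**2)**2)
E, psi = np.linalg.eigh(H)
print(E[0], E[1])  # strictly separated eigenvalues: the ground state is non-degenerate

# The ground state is nodeless, so it can be taken real and positive.
g = psi[:, 0] * np.sign(psi[N // 2, 0])
core = np.abs(g) > 1e-6 * np.abs(g).max()
print(bool(np.all(g[core] > 0)))
```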

Now by the Stone-von Neumann theorem, there is a unique irreducible representation of the canonical commutation relations on a separable Hilbert space, and by the argument above there is a unique (up to scale) vector |\Omega\rangle which is the ground state of the Hamiltonian, called the vacuum vector. So for QM, we find a unique representation on \mathcal{H} together with a unique vacuum vector |\Omega\rangle \in \mathcal{H}. The point I want to stress here is that the Poisson algebra of observables together with the Hamiltonian determine the data (\mathcal{H}, |\Omega\rangle) in an essentially unique way, so there is no ambiguity in quantization and no further choices need to be made.


QFT in Finite Volume


Now we'll argue that symmetry breaking is likewise forbidden whenever the spatial part of spacetime has finite volume. We'll have to use formal path integral arguments, so of course this won't be totally rigorous. Suppose, for contradiction, that we have two representations \mathcal{H}_\pm with vacua |\Omega\rangle_\pm. Then we consider the direct sum \mathcal{H} = \mathcal{H}_+ \oplus \mathcal{H}_-. Then by construction, we should have
(\Omega_+, e^{-tH} \Omega_-) = 0
since |\Omega\rangle_\pm are orthogonal eigenstates of H. On the other hand, we can compute this inner product using the Feynman-Kac formula. The semi-classical approximation to the path integral yields
(\Omega_+, e^{-tH} \Omega_-) = C \exp\left(- \frac{S(t) V}{\hbar} \right)
Where S(t) is the classical least action and V is the volume of space. If V < \infty, the right hand side is non-zero, contradicting our assumptions! Hence the vacuum is non-degenerate. So we find that QFT with finite spatial volume is much like QM, at least as far as symmetry breaking is concerned.

Note that this argument is essentially just a formal manipulation of the path integral, so you should expect a result of this form independent of the particular regularization scheme used to define the path integral.

QFT in Infinite Volume

Now we consider the case V = \infty, i.e. a non-compact space. Then the preceding argument fails spectacularly, as does the Stone-von Neumann theorem. So there is no guarantee of a unique irreducible representation of the algebra, and no guarantee of a unique vacuum vector.
So we see that the situation is significantly more complicated. We can expect a moduli space \mathcal{M} of vacua of the theory, and the low energy effective theory is described by a \sigma-model with target \mathcal{M}. I'll try to discuss this in more detail in follow-up posts.

Sunday, November 4, 2012

Seiberg-Witten Theory and the Riemann-Hilbert Problem

References:

The Classical Moduli Space of Vacua

For definiteness, we'll consider just the case of SU(2) considered by Seiberg and Witten. There is a scalar Higgs field \phi. The classical vacua of the theory are given by the absolute minima of the potential energy, which in this case is proportional to
\mathrm{Tr}[\phi, \phi^\dagger]^2
Hence at the minimum, [\phi,\phi^\dagger]=0 and \phi is diagonalizable. The classical moduli space of vacua \mathcal{M}_{cl} is then just \mathbb{C}, with complex coordinate a, corresponding to the Higgs field
\phi = \left( \begin{array}{rr}a & 0 \\ 0 & -a\end{array} \right)
Actually, due to gauge invariance, it is better to introduce another copy of the complex plane \mathcal{B} with local coordinate u = \frac{1}{2} Tr \phi^2 = a^2. Then we can think of \mathcal{M}_{cl} as a branched cover of \mathcal{B}, with a a (local) choice of square root of u.

The goal is to understand the low energy effective theory. We introduce a cutoff \Lambda to define the quantum theory, and integrate out all degrees of freedom except for the low momentum modes of \phi (in particular, we integrate out the gauge field d.o.f.). The result is a \sigma-model with target \mathcal{M}_{cl}. The kinetic term of the \sigma-model is governed by the metric on \mathcal{M}_{cl}, hence the low energy effective action determines a metric on \mathcal{M}_{cl}.

We'll see that 1-loop calculations introduce monodromy, so that in the quantum theory, "functions" on \mathcal{M}_{cl} are actually sections of non-trivial bundles over \mathcal{M}_{cl}, and furthermore that the metric receives corrections from instantons (or BPS states). So what we really would like to understand/construct is the quantum moduli space of vacua \mathcal{M}, which will be some non-trivial modification of \mathcal{M}_{cl}. The key to the Seiberg-Witten solution is that susy allows us to reduce the problem to finding a specified set of holomorphic functions (in the u coordinate) satisfying certain monodromies, and that once we know the monodromies the solution is given to us by the Riemann-Hilbert correspondence.

The Riemann-Hilbert Correspondence

Let X be \mathbb{P}^1 with punctures at the points z_1, \ldots, z_n. Let U be the universal cover of X and let G be the fundamental group of X (pick some basepoint away from the punctures). A set of monodromy matrices is exactly what is needed to specify a representation V of G. Since U / G = X, we can form the associated bundle E = U \times_G V over X. The (trivial) G-connection on U \to X induces a flat connection \nabla on E. This gives a map from representations of G to flat connections on X.

Conversely, given a flat connection on X, the monodromy about the punctures determines a representation of G. Hence monodromy is a map from flat connections on X to representations of G. The Riemann-Hilbert correspondence is that these two maps are bijections, modulo the natural notions of equivalence (conjugacy and gauge transformations).

Gross Overview of the Seiberg-Witten Approach

We are now ready to sketch the "big picture" idea of Seiberg and Witten, which applies not only to their N=2, d=4 example but also to certain other compactifications of the N=1, d=6 theory (in particular, the one considered by Gaiotto-Moore-Neitzke).

As discussed above, the theory will have a classical moduli space of vacua \mathcal{M}, which turns out to be a complex manifold (or variety, and possibly with singularities). We'll let u be an abstract local complex coordinate on \mathcal{M}. Supersymmetry then tells us that the main quantities we are interested in (to compute the low energy effective action) are holomorphic in u (away from the singularities/punctures of \mathcal{M}!). The general outline is as follows:

  1. Identify functions f_i(u) which by susy are holomorphic in u.
  2. Compute the 1-loop corrections to f_i(u).
  3. Compute monodromies of the corrected f_i(u).
  4. Find the desired f_i(u) by solving the Riemann-Hilbert problem for these monodromies.
Now, to be more clear, it is a consequence of susy that the renormalized quantities f_i(u) are given schematically by
f_{i, \mathrm{ren}}(u) = f_{i, \mathrm{cl}}(u) + f_{i,1}(\frac{u}{\Lambda}) + \sum_{k=0}^\infty c_{i,k} \left(\frac{\Lambda}{u} \right)^k

Here, f_{i,\mathrm{cl}}(u) is the classical function, f_{i,1}(u) is the one-loop correction, and the terms in the series are corrections coming from instantons (BPS states). Non-renormalization theorems due to susy guarantee that there are no higher loop corrections. One expects the instanton series to converge, and hence the monodromy is completely determined by the one-loop calculation. This is the key: by Riemann-Hilbert, the monodromy determines the f_i(u) uniquely--solving the Riemann-Hilbert problem is equivalent to computing the infinitely-many instanton corrections!

Now in general, solving the Riemann-Hilbert problem is difficult, so this reduction is of a theoretical but not necessarily practical nature. The second main idea of Seiberg and Witten is that we can solve this Riemann-Hilbert problem explicitly by introducing a family of curves \{C_u\}_{u\in\mathcal{B}}, called Seiberg-Witten curves (or spectral curves).


Electric-Magnetic Duality


An absolutely key requirement of the Seiberg-Witten construction is electric-magnetic duality. Maxwell's equations in vacuum are
dF = 0, \ \ \ d\ast F = 0.
Here F is a 2-form, and \ast F is its Hodge dual, a (d-2)-form in d-dimensions. The first equation implies that F = dA for some 1-form A, and we normally think of the second equation as the Euler-Lagrange equations for the action written in terms of A. However, we could equally well take the starting point to be the second equation, taking \ast F = dB, and take the first equation to be the Euler-Lagrange equations for B. The problem with either of these approaches is that they allow particles of either electric or magnetic charge, but not both.

To put electric and magnetic charge on equal footing, we introduce fields F and F_D (a 2-form and a (d-2)-form). Then the Lagrangian is (up to factors that I'm too lazy to care about)
\mathcal{L} = \mathrm{Tr} F \wedge F_D
However, to recover Maxwell's equations, we need to impose \ast F_D = F as a constraint. So to get the right equations of motion, introduce an auxiliary field \lambda (a Lagrange multiplier), and modify the Lagrangian:
\mathcal{L} = \mathrm{Tr} F \wedge F_D + \lambda(F - \ast F_D)
Variation with respect to (F, F_D, \lambda) will reproduce Maxwell's equations exactly, but in this form the EM duality is manifest. Since EM duality exchanges electric and magnetic charges, we should consider how to modify the Lagrangian to couple the field to EM sources. Let J_e, J_m be the electric and magnetic currents, respectively. Up to conventions, Maxwell's equations read
dF = J_m, \ \ \ d F_D = J_e.
Then we take the Lagrangian to be
\mathcal{L} = \mathrm{Tr} F \wedge F_D + \lambda(F - \ast F_D) + F \wedge J_e + J_m \wedge F_D
to reproduce the right equations of motion.

In this form, we can consider particles with electric or magnetic charge (or both--dyons). If our gauge group has rank r, then the lattice of electric charges is \mathbb{Z}^r, while the lattice of magnetic charges is (\mathbb{Z}^\ast)^r. Hence the lattice of electromagnetic charges is
\Gamma = \mathbb{Z}^r \oplus (\mathbb{Z}^\ast)^r
which comes with a natural symplectic pairing
\langle \cdot, \cdot \rangle: \Gamma \otimes \Gamma \to \mathbb{Z}.

(You might ask why we take the natural symplectic pairing as opposed to the natural symmetric pairing. This is because there is actually a larger SL(2,\mathbb{Z}) symmetry of the theory which preserves the symplectic pairing but not the symmetric pairing.)

Now there is an obvious source of symplectic lattices. Simply let C be a genus r compact Riemann surface. Then \Gamma = H_1(C, \mathbb{Z}) is a symplectic lattice of rank 2r, where the symplectic pairing is now given by the intersection pairing. In fact, we can say more--if we take a- and b-cycles as generators, these form a Darboux (symplectic) basis of \Gamma.
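To make this concrete for r = 1: writing a charge as (q, m) in a Darboux basis, a natural convention (assumed here for illustration) is \langle (q,m), (q',m') \rangle = q m' - m q'. A quick check that the standard SL(2,\mathbb{Z}) generators preserve the symplectic pairing but not the symmetric one:

```python
import numpy as np

# Symplectic pairing matrix Omega and symmetric pairing matrix G in a Darboux
# basis for r = 1 (conventions assumed for illustration).
Omega = np.array([[0, 1], [-1, 0]])
G = np.array([[0, 1], [1, 0]])
S = np.array([[0, -1], [1, 0]])   # the standard SL(2,Z) generators
T = np.array([[1, 1], [0, 1]])

keeps_Omega = [np.array_equal(M.T @ Omega @ M, Omega) for M in (S, T)]
keeps_G = [np.array_equal(M.T @ G @ M, G) for M in (S, T)]
print(keeps_Omega, keeps_G)
```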

Back to the gauge theory problem. Recall that the 1-loop calculation and consideration of BPS states leads to a set of monodromy data on \mathcal{B}. Suppose now that we could find a complex surface C \to \mathcal{B} whose fibers C_u are (possibly singular) genus r curves, and such that the monodromies of \Gamma_u := H_1(C_u, \mathbb{Z}) agree with the given monodromies. Then we can solve the Riemann-Hilbert problem by doing geometry on this family, i.e. by finding holomorphic sections of certain associated bundles.

The SU(2) Seiberg-Witten Solution

We will now specialize to the case considered in the original paper of Seiberg and Witten. I will only construct the family--the details of the solution will follow in a subsequent post.

In this case, the group has rank r=1, so we should be looking for a family of elliptic curves. Given what I've said above, the solution is almost obvious. Seiberg and Witten argue that the moduli space \mathcal{B} must be \mathbb{C} \setminus \{\Lambda^2, -\Lambda^2\}. The punctures at \pm \Lambda^2 come from BPS states whose mass goes to zero at those values of u. So the monodromy consists of three matrices, M_\infty, M_\pm, the monodromies computed around \infty and \pm \Lambda^2. These generate a certain modular subgroup G of SL(2, \mathbb{Z}), allowing us to realize \mathcal{B} as the modular curve H / G (where H is the upper half-plane). Now, the space of elliptic curves is just H / SL(2, \mathbb{Z}). So given any u \in \mathcal{B}, we pick a lift \tilde{u} in H and let C_u be the corresponding elliptic curve. This is exactly the family needed to solve the Riemann-Hilbert problem!

Next time: details of this construction, including exact formulas, and some words about instanton counting.



Monday, October 29, 2012

BPS States and Wall-Crossing

This is the first in what I hope will become a series of posts on BPS state counting and wall-crossing. I'm participating in gLab, and our most immediate goal is to understand the Kontsevich-Soibelman wall-crossing formula (KSWCF) in the context of quadratic differentials on a (punctured) Riemann surface, following the lectures of Kontsevich and Neitzke at IHES.

The purpose of these posts is to keep a written record of my attempts to understand the physics behind the WCF as well as the work of Gaiotto-Moore-Neitzke.

References:



Video Lectures:


Physics Setup:

Warning: I'm still trying to sort this all out, so a lot of this will be fuzzy and/or completely wrong. I will try to point out the points of confusion.

We will start with some kind of family of susy gauge theories (or rather, a single  "theory" with a family of vacua, depending on what asymptotic boundary conditions we specify in the path integral). We let \mathcal{B} be some kind of manifold (or variety, possibly with singularities?), and \{\mathcal{H}_u\}_{u \in \mathcal{B}} a family (bundle) of Hilbert spaces, depending on u \in \mathcal{B}. Concretely, \mathcal{B} will parametrize the vacuum expectation values (VEVs) of the scalar fields of the theory. (Note, for non-scalar fields we can typically expect VEVs to vanish, for example by looking at the action of the Lorentz group.) Actually, to be more precise, \mathcal{B} parametrizes the Coulomb branch--where the VEVs break the gauge symmetry to a maximal torus (as opposed to the Higgs branch, where the VEVs just break the gauge group to a smaller subgroup).

The next ingredient is a lattice \Gamma, the charge lattice, which is supposed to parametrize all possible electric and magnetic charges. Since electric and magnetic charges are dual, this lattice has a pairing \Gamma \otimes \Gamma \to \mathbb{Z} which is symplectic (or possibly just Poisson?). (Actually, maybe we should think of \Gamma as being a bundle of lattices over \mathcal{B}, but this isn't completely clear to me.) The lattice gives a grading of \mathcal{H}:
\mathcal{H} = \bigoplus_{\gamma \in \Gamma} \mathcal{H}_\gamma .

Now, the Hilbert spaces \mathcal{H}_u are supposed to carry representations of the \mathcal{N}=2 susy algebra, with central charge Z. On any state of charge \gamma above the point u \in \mathcal{B}, the central charge Z acts as a scalar, which we denote by Z_\gamma(u). Manipulations with the susy algebra show the BPS bound M \geq |Z_\gamma(u)|, where M is the mass of a state with charge \gamma. A state is called BPS if it saturates this bound.

Finally, I'll end this post by attempting to define (or at least motivate) the walls of marginal stability. In all known examples, we have
|Z_{\gamma_1 + \gamma_2}(u)|^2 = |Z_{\gamma_1}(u)|^2 + |Z_{\gamma_2}(u)|^2 +2 \mathrm{Re}(Z_{\gamma_1}(u) \bar{Z}_{\gamma_2}(u) )
If the cross-term is negative, then it is possible to form stable bound states (since the mass of a BPS state of charge \gamma_1+\gamma_2 is strictly less than the sum of the corresponding masses); and it is impossible to form stable bound states if the cross-term is positive. This (naive!) dichotomy tells us that there is something very special about the intermediate case. For a pair of charges \gamma_1, \gamma_2 we define a wall in \mathcal{B} by
W(\gamma_1, \gamma_2) = \{u \in \mathcal{B} \ | \ \mathrm{Re}(Z_{\gamma_1}(u)\bar{Z}_{\gamma_2}(u)) = 0 \}
and we define W \subset \mathcal{B} to be the union of all the walls.
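A quick numerical illustration of this dichotomy (the central charges below are made up, purely for illustration):

```python
import numpy as np

def cross_term(Z1, Z2):
    # the 2 Re(Z1 conj(Z2)) term that controls the (naive) stability of bound states
    return 2.0 * (Z1 * np.conj(Z2)).real

# check the identity |Z1 + Z2|^2 = |Z1|^2 + |Z2|^2 + 2 Re(Z1 conj(Z2))
Z1, Z2 = 1.0 + 2.0j, -0.5 + 1.5j
lhs = abs(Z1 + Z2)**2
rhs = abs(Z1)**2 + abs(Z2)**2 + cross_term(Z1, Z2)
print(lhs, rhs)

# a sample point on a wall: orthogonal phases make the cross term vanish
Zw1, Zw2 = 1.0 + 0.0j, 0.0 + 1.0j
print(cross_term(Zw1, Zw2))
```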

The idea of wall-crossing is the following. We define some functions \Omega(\gamma; u) on \mathcal{B} \setminus W which are locally constant. These functions are supposed to count the number of BPS states of charge \gamma (where count really means take the trace of a particular operator over \mathcal{H}_{\gamma, \mathrm{BPS}}). The wall-crossing formula is an explicit formula that relates \Omega(\gamma; u_+) and \Omega(\gamma; u_-) for u_+, u_- on opposite sides of a wall in \mathcal{B}. There are two applications of WCF:

1. We pick some particular u \in \mathcal{B} for which \Omega is particularly easy to calculate ("extreme stability"). Then by KSWCF we actually know how to compute \Omega on all of \mathcal{B} \setminus W.

2. Gaiotto-Moore-Neitzke study a certain QFT whose low energy effective action is a sigma model with target space \mathcal{M}, the moduli space of Higgs bundles over a Riemann surface. The invariants \Omega(\gamma; u) together with KSWCF allow them to compute the low energy effective action explicitly, giving an explicit construction of holomorphic Darboux coordinates on \mathcal{M}. This is enough to recover the full hyperkahler metric on \mathcal{M}, in local coordinates!


Tasklist (incomplete!):

  • Define susy algebra, derive BPS bound
  • Understand/construct the charge lattice and its pairing
  • Sketch that 3d sigma model with \mathcal{N}=4 has a hyperkahler target
  • Sketch/understand why the low energy effective action has target Higgs
  • Understand computation of effective action: Seiberg-Witten curves and all that
  • Understand how KSWCF implies consistency of the Darboux coordinates

Monday, October 22, 2012

Seiberg-Witten Theory Video Lectures

I found some lectures by Sara Pasquetti on Seiberg-Witten theory here:

Unfortunately, it seems that the quality is so poor that it is impossible to read the blackboard!


Monday, August 27, 2012

A Toy Model for Effective Field Theory and Extra Dimensions

I wanted to see how the Fourier transform can turn field theory into many-particle mechanics. This is just silly fooling around, so you shouldn't take what follows too seriously (there are much better models of extra dimensions, to be sure!).

Take \phi(t, s) to be a field on a cylinder of circumference R. We consider the action

S = \frac{1}{R} \int_{-\infty}^\infty \int_0^{R} |\nabla \phi|^2 ds dt

Expand \phi(t, s) in Fourier series:
\phi(t,s ) = \sum_n \phi_n(t) e^{2 \pi i n s / R}
Then in Lorentzian signature, we have
 \int_0^{R} |\nabla \phi|^2 ds = R \sum_n \dot{\phi}_n^2 - \left(\frac{2\pi n}{R}\right)^2 \phi_n^2.

Putting this back into the action, we find
S = \sum_n \int_{-\infty}^\infty \dot{\phi}_n^2- \left(\frac{2\pi n}{R}\right)^2 \phi_n^2 dt.

This is the action for infinitely many harmonic oscillators, with frequencies \omega_n = 2\pi |n| / R. Recall that the excitation energies of a harmonic oscillator of frequency \omega are k\omega for k = 0, 1, \ldots above the ground state. So supposing that only a finite energy E is accessible in some particular experiment, we can only excite those modes \phi_n for which
\frac{2\pi |n|}{R} < E.
In particular, only finitely many \phi_n may be excited at energies below E, effectively reducing the field theory on the cylinder to many-particle quantum mechanics.
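The mode count is easy to make explicit (a toy sketch; units with \hbar = 1 assumed):

```python
import math

def accessible_modes(E, R):
    # modes phi_n with omega_n = 2*pi*|n|/R < E, i.e. |n| < E*R/(2*pi)
    n_max = max(math.ceil(E * R / (2.0 * math.pi)) - 1, 0)
    return 2 * n_max + 1  # n = -n_max, ..., n_max (the zero mode always counts)

for E in (1.0, 10.0, 100.0):
    print(E, accessible_modes(E, 1.0))
```

As E grows, more modes become accessible; as E \to \infty we recover the full field theory.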

Thursday, July 26, 2012

Generating Functions

Method of Generating Functions


Let X and Y be two smooth manifolds, and let M = T^\ast X, N = T^\ast Y with corresponding symplectic forms \omega_M and \omega_N.

Question: How can we produce symplectomorphisms \phi: M \to N?

The most important construction from classical mechanics is the method of generating functions. I will outline this method, shamelessly stolen from Ana Cannas da Silva's lecture notes.

Suppose we have a smooth function f \in C^\infty(X \times Y). Then the graph of df is a submanifold of M \times N: \Gamma = \{ (x,y, df_{(x,y)}) \in M \times N \}. Using the splitting T^\ast(X \times Y) \cong T^\ast X \times T^\ast Y, we can decompose df into its X and Y components, and this allows us to write the graph as
\Gamma = \{ (x, y, df_x, df_y) \}
Now there is a not-so-obvious trick: we consider the twisted graph \Gamma^\sigma given by
\Gamma^\sigma =  \{(x,y, df_x, -df_y) \}
Note the minus sign.

Proposition If \Gamma^\sigma is the graph of a diffeomorphism \phi: M \to N, then \phi is a symplectomorphism.

Proof By construction, \Gamma^\sigma is a Lagrangian submanifold of M \times N with respect to the twisted symplectic form \pi_M^\ast \omega_M - \pi_N^\ast \omega_N. It is a standard fact that a diffeomorphism is a symplectomorphism iff its graph is Lagrangian with respect to the twisted symplectic form, so we're done.

Now we have:

Modified question: Given f \in C^\infty(X \times Y), when is the twisted graph \Gamma^\sigma the graph of a diffeomorphism \phi: M \to N?

Pick coordinates x on X and y on Y, with corresponding momenta \xi and \eta. Then if \phi(x,\xi) = (y,\eta), we obtain
\xi = d_x f, \ \eta = -d_y f
Note the similarity to Hamilton's equations. By the implicit function theorem, we can construct a (local) diffeomorphism \phi as long as f is sufficiently non-degenerate.
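Here is a small numerical check of this recipe, using the hypothetical generating function f(x, y) = xy + y^3/3 (any f with \partial^2 f / \partial x \partial y \neq 0 would do). Solving \xi = d_x f, \eta = -d_y f gives the map (x, \xi) \mapsto (\xi, -x - \xi^2), whose Jacobian should be symplectic:

```python
import numpy as np

def phi_map(x, xi):
    # induced map (x, xi) -> (y, eta) for f(x, y) = x*y + y**3/3:
    # xi = d_x f = y, and eta = -d_y f = -(x + y**2)
    y = xi
    eta = -(x + xi**2)
    return np.array([y, eta])

J = np.array([[0.0, 1.0], [-1.0, 0.0]])  # standard symplectic form
results = []
for (x, xi) in [(0.3, -1.2), (2.0, 0.5)]:
    eps = 1e-6
    # finite-difference Jacobian of phi_map at (x, xi)
    M = np.column_stack([(phi_map(x + eps, xi) - phi_map(x - eps, xi)) / (2 * eps),
                         (phi_map(x, xi + eps) - phi_map(x, xi - eps)) / (2 * eps)])
    results.append(bool(np.allclose(M.T @ J @ M, J, atol=1e-5)))
print(results)  # symplectic condition M^T J M = J at each sample point
```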

Different Types of Generating Functions

We now concentrate on the special case of M = T^\ast \mathbb{R} = \mathbb{R} \times \mathbb{R}^\ast. Note that this is a cotangent bundle in two ways: T^\ast \mathbb{R} \cong T^\ast \mathbb{R}^\ast. Hence we can construct local diffeomorphisms T^\ast \mathbb{R} \to T^\ast \mathbb{R} in four ways, by taking functions of the forms
f(x_1, x_2), \ f(x_1, p_2), \ f(p_1, x_2), \ f(p_1, p_2)

Origins from the Action Principle, and Hamilton-Jacobi

Suppose that we have two actions
S_1 = \int p_1 \dot{q}_1 - H_1 dt, \ S_2 = \int p_2 \dot{q}_2 - H_2 dt
which give rise to the same dynamics. Then the Lagrangians must differ by a total derivative, i.e.
p_1 \dot{q}_1 - H_1 = p_2 \dot{q}_2 - H_2  + \frac{d f}{dt}
Suppose that f = -q_2 p_2 + g(q_1, p_2, t). Then we have
p_1 \dot{q}_1 - H_1 = -q_2 \dot{p}_2 - H_2 + \frac{\partial g}{\partial t} + \frac{\partial g}{\partial q_1}\dot{q}_1 + \frac{\partial g}{\partial p_2} \dot{p_2}
Comparing coefficients, we find
p_1 = \frac{\partial g}{\partial q_1}, \ q_2 = \frac{\partial g}{\partial p_2}, \ H_2 = H_1 + \frac{\partial g}{\partial t}

Now suppose that the coordinates (q_2, p_2) are chosen so that Hamilton's equations become
\dot{q_2} = 0, \ \dot{p}_2 = 0
Then we must have H_2 = 0, i.e.
H_1 + \frac{\partial g}{\partial t} = 0
Now we also have \partial H_2 / \partial p_2 = 0, so this tells us that g is independent of p_2, i.e. g = g(q_1, t). Since p_1 = \partial g / \partial q_1, we obtain
\frac{\partial g}{\partial t} + H_1(q_1, \frac{\partial g}{\partial q_1}) = 0
This is the Hamilton-Jacobi equation, usually written as
\frac{\partial S}{\partial t} + H(x, \frac{\partial S}{\partial x}) = 0
Note the similarity to the Schrodinger equation! In fact, one can derive the Hamilton-Jacobi equation from the Schrodinger equation by taking a wavefunction of the form
\psi(x,t) = A(x,t) \exp({\frac{i}{\hbar} S(x,t)})
and expanding in powers of \hbar. This also helps to motivate the path integral formulation of quantum theory.
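As a sanity check, consider a free particle with H(x,p) = p^2/2m. The standard solution S(x,t) = m x^2 / (2t) satisfies
\frac{\partial S}{\partial t} = -\frac{m x^2}{2t^2}, \qquad H\left(x, \frac{\partial S}{\partial x}\right) = \frac{1}{2m}\left(\frac{m x}{t}\right)^2 = \frac{m x^2}{2t^2},
so the Hamilton-Jacobi equation holds identically.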

Monday, July 23, 2012

KAM I

In this post I want to sketch the idea of KAM, following these lecture notes.

Integrable Systems


I don't want to worry too much about details, so for now we'll define an integrable system to be a Hamiltonian system (M, \omega, H) for which we can choose local Darboux coordinates (I, \phi) with I \in \mathbb{R}^N and \phi \in T^N, such that the Hamiltonian is a function of I only. Defining \omega_j := \partial H / \partial I_j, Hamilton's equations then read
\begin{align} \dot{I}_j &= 0, \\ \dot{\phi}_j &= \omega_j(I). \end{align}
Hence we obtain linear motion on the torus as our dynamics. Note in particular that the sets \{I = \mathrm{const}\} are tori, and that the dynamics are constrained to these tori. We call these tori "invariant".


Now suppose that our Hamiltonian H is of the form
H(I, \phi) = h(I) + f(I, \phi)
with f "small". What can be said of the dynamics? Specifically, do there exist invariant tori? KAM theory lets us formulate this question in a precise way, and gives an explicit quantitative answer (as long as f is nice enough, and small enough).

I want to sketch the idea of the KAM theorem, completely ignoring analytical details.



Constructing the Symplectomorphism


Suppose we could find a symplectomorphism \Phi: (I, \phi) \mapsto (\tilde{I}, \tilde{\phi}) such that the transformed Hamiltonian depends only on the new action variables, H(I, \phi) = \tilde{h}(\tilde{I}). Then our system would still be integrable (just in new action-angle coordinates), and we'd be done. There are two relatively easy ways of constructing symplectomorphisms: integrating symplectic vector fields, and generating functions. In the lecture notes, generating functions are used, so let's take a minute to discuss them.

Proposition Let \Sigma(\tilde{I}, \phi) be a smooth function and suppose that the transformation
I = \frac{\partial \Sigma}{\partial \phi}, \ \tilde{\phi} = \frac{\partial \Sigma}{\partial \tilde{I}}
can be inverted to produce a diffeomorphism \Phi: (I, \phi) \mapsto (\tilde{I}, \tilde{\phi}). Then \Phi is a symplectomorphism.

Proof Differentiating, and writing only the terms that survive the wedge product below,
dI = \frac{\partial^2 \Sigma}{\partial \tilde{I} \partial \phi} d \tilde{I} + (\cdots) d\phi
d\tilde{\phi} = \frac{\partial^2 \Sigma}{\partial \tilde{I} \partial \phi} d\phi + (\cdots) d\tilde{I}
Hence
dI \wedge d\phi = \frac{\partial^2 \Sigma}{\partial \tilde{I} \partial \phi} d \tilde{I} \wedge d\phi = d\tilde{I} \wedge d\tilde{\phi}.

We want a symplectomorphism \Phi such that
H \circ \Phi(\tilde{I}, \tilde{\phi}) = \tilde{h}(\tilde{I})
If \Phi came from a generating function \Sigma, then we have
H(\frac{\partial \Sigma}{\partial \phi}, \phi) = \tilde{h}(\tilde{I})
Expanding things, we have
h(\frac{\partial \Sigma}{\partial \phi}) + f(\frac{\partial \Sigma}{\partial \phi}, \phi) = \tilde{h}(\tilde{I}).

If f is small, then we might expect \Phi to be close to the identity, and hence \Sigma ought to be close to the generating function for the identity (which is \langle \tilde{I}, \phi \rangle). So we take
\Sigma(\tilde{I}, \phi) = \langle \tilde{I}, \phi \rangle + S(\tilde{I}, \phi)
where S should be "small". So we linearize the equation in S:
\langle \omega(\tilde{I}), \frac{\partial S}{\partial \phi} \rangle + f(\tilde{I}, \phi) = \tilde{h}(\tilde{I}) - h(\tilde{I})

Now we can expand S and f in Fourier series in \phi and solve coefficient-wise; the zero mode of f is absorbed into the difference \tilde{h}(\tilde{I}) - h(\tilde{I}). This gives a formal solution S(\tilde{I}, \phi) of the equation
\langle \omega, \frac{\partial S}{\partial \phi} \rangle + f(\tilde{I}, \phi) = \bar{f}(\tilde{I}),
where \bar{f} denotes the angular average of f.
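For a single frequency, the coefficient-wise solution is easy to test numerically (the frequency and the perturbation below are made up; f is taken with zero angular average):

```python
import numpy as np

# Solve omega * dS/dphi + f(phi) = 0 mode-by-mode: S_k = i f_k / (k * omega), k != 0.
# The divisors k*omega are the "small divisors" that plague the full KAM iteration.
omega = np.sqrt(2.0)                          # sample irrational frequency
n = 256
phi = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
f = np.cos(3 * phi) + 0.5 * np.sin(7 * phi)   # zero-mean sample perturbation

fk = np.fft.fft(f) / n                        # coefficients of e^{i k phi}
k = np.fft.fftfreq(n, d=1.0 / n)              # integer wavenumbers
Sk = np.zeros_like(fk)
nz = k != 0
Sk[nz] = 1j * fk[nz] / (k[nz] * omega)

S = np.fft.ifft(Sk * n).real
dS = np.fft.ifft(1j * k * Sk * n).real
residual = omega * dS + f
print(np.max(np.abs(residual)))               # essentially machine precision
```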

Getting it to Work


Unfortunately, the Fourier series for S has no chance to converge, so instead we take a finite truncation. If we assume f is analytic, its Fourier coefficients decay exponentially fast, so this provides a very good approximate solution to the linearized equation (and we can give an explicit bound in terms of a certain norm of f). Call this function S_1. We then use S_1 to construct a symplectomorphism \Phi_1.

Now we take
H_1(I, \phi) = H \circ \Phi_1(I, \phi) = h_1(I) + f_1(I, \phi).
Some hard analysis then shows that h - h_1 is small, and f_1 is much smaller than f.


The Induction Step


The above arguments sketch a method to put the system "closer" to an integrable form. By carefully controlling \epsilon's and \delta's, one then shows that the iterated sequence \Phi_1, \Phi_2 \circ \Phi_1, \ldots converges to some limiting symplectomorphism \Phi_\infty.

Friday, July 13, 2012

Circle Diffeomorphisms I

This is the first of a series of posts based on these lecture notes on KAM theory. For now I just want to outline section 2, which is a toy model of KAM theory.

Circle Diffeomorphisms


We consider a map \phi: \mathbb{R} \to \mathbb{R} defined by
\phi(x) = x + \rho + \eta(x)
where \rho is its rotation number and \eta(x) is a "small" periodic function.

Define S_\sigma to be the strip \{ |\mathrm{Im} z|<\sigma\} \subset \mathbb{C} and let B_\sigma be the space of holomorphic functions bounded on S_\sigma with sup norm \|\cdot\|_\sigma.

Goal: Show that if \|\eta\|_\sigma is sufficiently small, then there exists some diffeomorphism H(x) such that
H^{-1} \circ \phi \circ H (x) = x + \rho
i.e. that \phi is conjugate to a pure rotation.


Linearization


The idea is that if \eta is small, then H should be close to the identity, so we suppose that
H(x) = x + h(x)
where h(x) is small. Plugging this into the equation above and discarding higher order terms yields
h(x+\rho) - h(x) = \eta(x)
Since \eta is periodic, we Fourier transform both sides to obtain an explicit formula for the Fourier coefficients of h(x). We have to show several things:

1. The Fourier series defining h(x) converges in some appropriate sense.

2. The function H(x) = x + h(x) is a diffeomorphism.

3. The composition \tilde{\phi} = H^{-1} \circ \phi \circ H is closer to a pure rotation than \phi, in the sense that
\tilde{\phi}(x) = x + \rho + \tilde{\eta}(x)
where \|\tilde{\eta}\| \ll \|\eta\|.
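Item 1 can be previewed numerically. In Fourier modes the linearized equation reads \hat{h}_k (e^{ik\rho} - 1) = \hat{\eta}_k, so \hat{h}_k = \hat{\eta}_k / (e^{ik\rho} - 1) for k \neq 0 (small divisors again). A toy check with made-up data:

```python
import numpy as np

# Solve h(x + rho) - h(x) = eta(x) for a periodic, zero-mean eta.
rho = 2.0 * np.pi * (np.sqrt(5.0) - 1.0) / 2.0   # golden-mean rotation number
n = 256
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
eta = 0.05 * np.sin(2 * x) + 0.02 * np.cos(5 * x)

ek = np.fft.fft(eta) / n                          # Fourier coefficients of eta
k = np.fft.fftfreq(n, d=1.0 / n)                  # integer wavenumbers
hk = np.zeros_like(ek)
nz = k != 0
hk[nz] = ek[nz] / (np.exp(1j * k[nz] * rho) - 1.0)  # divide by the small divisor

h = np.fft.ifft(hk * n).real
# evaluate h(x + rho) via the Fourier shift, and check the linearized equation
h_shift = np.fft.ifft(hk * np.exp(1j * k * rho) * n).real
print(np.max(np.abs(h_shift - h - eta)))
```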


Newton's Method

Carrying out the analysis, one finds that for appropriate epsilons and deltas, if \eta \in B_\sigma then H \in B_{\sigma - \delta} and that \|\tilde{\eta}\|_{\sigma-\delta} \leq C \|\eta\|_\sigma^2. By carefully choosing the deltas, we can iterate this procedure (composing the H's) to obtain a well-defined limit H_\infty \in B_{\sigma/2} such that
H_\infty^{-1} \circ \phi \circ H_\infty (x) = x + \rho,
as desired.

So in fact the idea of the proof is extremely simple, and all of the hard work is in proving some estimates.

Saturday, March 3, 2012

Gaussian Integrals: Wick's Theorem

We saw in the last update that the generating function Z[J] can be expressed as
Z[J] = e^{\frac{1}{2} J \cdot A^{-1} J}
(at least as long as we've normalized things so that Z[0] = 1). Now the wonderful thing is that this is something we can compute explicitly:
Z[J] = \sum_{n = 0}^{\infty} \frac{(\frac{1}{2} A^{-1}_{ij} J^i J^j)^n}{n!} = \sum_{n=0}^\infty \frac{(A^{-1}_{ij} J^i J^j)^n}{2^n n!}

For example, in the one-dimensional case (taking A = 1) we get
Z[J] = \sum_{n=0}^\infty \frac{J^{2n}}{2^n n!}
On the other hand, by the definition of the generating function we have
Z[J] = \sum_{n=0}^\infty \frac{\langle x^n \rangle}{n!} J^n
Comparing coefficients, we find
\frac{\langle x^{2n} \rangle}{(2n)!} = \frac{1}{2^n n!}
so that
\langle x^{2n} \rangle = \frac{(2n)!}{2^n n!}.
Let's give a combinatorial description. Given 2n objects, in how many ways can we divide them into pairs? If we care about the order in which we pick the pairs, then we have
{2n \choose 2}{2n - 2 \choose 2} \cdots {2n-(2n-2) \choose 2} = \frac{(2n)!}{2^n}
Of course, there are n! ways of ordering the n pairs, so after dividing by this (to account for the overcounting) we get exactly the expression for \langle x^{2n} \rangle. This is the first case of Wick's theorem.
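Here is a quick sanity check of this formula (my own illustration), comparing the closed form with a brute-force numerical integral over the standard Gaussian:

```python
import math

def gaussian_moment(n):
    # <x^(2n)> = (2n)! / (2^n n!) for the standard Gaussian, as derived above
    return math.factorial(2 * n) // (2**n * math.factorial(n))

def numeric_moment(n, L=12.0, steps=200001):
    # brute-force Riemann sum of x^(2n) e^{-x^2/2} / sqrt(2 pi)
    dx = 2 * L / (steps - 1)
    total = sum((-L + i * dx)**(2 * n) * math.exp(-(-L + i * dx)**2 / 2)
                for i in range(steps))
    return total * dx / math.sqrt(2 * math.pi)

print(gaussian_moment(3), numeric_moment(3))  # 15 and approximately 15
```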

Now consider the general multidimensional case. Given I = (i_1, \cdots, i_{2n}), we define a contraction to be
\langle x^{j_1} x^{k_1} \rangle \cdots \langle x^{j_n} x^{k_n} \rangle
where j_1, k_1, \cdots, j_n, k_n is a choice of partition of I into pairs.

Theorem (Wick's theorem, Isserlis' theorem) The expectation value
\langle x^{i_1} \cdots x^{i_{2n}} \rangle
is the sum over all full contractions. There are (2n)!/(2^n n!) terms in the sum.

Proof This follows from our formula for the power series of the generating function. The reason is that the coefficient of J^I in (\frac{1}{2} A^{-1}_{ij} J^i J^j)^n is exactly given by summing products of A^{-1}_{ij} over partitions of I into pairs, and the n! in the denominator takes care of the overcounting.
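The theorem is easy to implement directly; here is a sketch (my own, using an arbitrary 2x2 matrix A of my choosing) that enumerates all pairings:

```python
import numpy as np

def pairings(idx):
    # generate all ways of splitting a tuple of indices into unordered pairs
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for i in range(len(rest)):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, rest[i])] + tail

def wick(I, C):
    # <x^{i_1} ... x^{i_2n}> = sum over full contractions of products of C = A^{-1}
    return sum(np.prod([C[a][b] for a, b in p]) for p in pairings(tuple(I)))

A = np.array([[2.0, 0.5], [0.5, 1.0]])
C = np.linalg.inv(A)
val = wick((0, 0, 1, 1), C)              # = C00 C11 + 2 C01^2 (three pairings)
count = len(list(pairings((0, 1, 2, 3, 4, 5))))
print(val, count)                        # count = 6!/(2^3 3!) = 15
```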

Next up: perturbation theory and Feynman diagrams.

Introduction to Gaussian Integrals

As a warm-up for more serious stuff, I'd like to discuss Gaussian integrals over \mathbb{R}^d. Gaussian integrals are the main tool for perturbative quantum field theory, and I find that understanding Gaussian integrals in finite dimensions is an immense aid to understanding how perturbative QFT works. So let's get started.


The Basics

Let A be some d \times d symmetric positive definite matrix. We are interested in the integral
\int_{-\infty}^\infty \exp(-\frac{x \cdot Ax}{2}) dx.
Out of laziness, I will suppress the limits of integration and just write this as
\int e^{-S(x)} dx.
where S(x) = x \cdot Ax / 2. Now for a function f(x), we define the (unnormalized) expectation value \langle f(x) \rangle_0 to be
\langle f(x) \rangle_0 = \int f(x) e^{-S(x)} dx
Occasionally, we might care about the normalized expectation value
\langle f(x) \rangle = \frac{\langle f(x) \rangle_0}{\langle 1 \rangle_0} = \frac{1}{\langle 1 \rangle_0} \int f(x) e^{-S(x)} dx.
We mostly care about asymptotics, so we will typically think of a function f(x) as being a polynomial (or Taylor series). So what we're really interested in is
\langle x^I \rangle = c\int x^I e^{-S(x)} dx,
where I is a multi-index and c = 1/\langle 1 \rangle_0 is the normalization constant.

The Partition Function

Let us define Z[J] by
Z[J] = \int e^{-S(x) + J \cdot x} dx.

Now the great thing is that
\langle x^I \rangle = \left. \frac{d^I}{dJ^I} \right|_{J = 0} Z[J],
so that once we know Z[J], we can calculate anything. So let's try to compute it. We have

\begin{align} (Ax - J) \cdot A^{-1} (Ax - J) &= (Ax - J) \cdot (x - A^{-1} J) \\\ &= x \cdot Ax - x \cdot J - J \cdot x + J \cdot A^{-1} J \\\ &= x \cdot Ax - 2 x \cdot J + J \cdot A^{-1} J. \end{align}
So we see that
-\frac{1}{2} x \cdot A x + J \cdot x = \frac{1}{2} J \cdot A^{-1} J -\frac{1}{2} (x-A^{-1}J) \cdot A(x - A^{-1} J).
So, after a change of variables x \mapsto x - A^{-1} J we find
Z[J] = e^{\frac{1}{2} J \cdot A^{-1} J} Z[0].
Now the argument in the exponential is
\frac{1}{2} A^{-1}_{ij} J^i J^j
So we find that
\langle x^i x^j \rangle = \left. \frac{\partial^2}{\partial J^i \partial J^j} Z[J] \right|_{J = 0} = A^{-1}_{ij}.
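These identities are easy to verify numerically; here is a 1D check of Z[J] = e^{J^2/(2A)} Z[0] (my own sketch, with arbitrary values of A and J):

```python
import math

# check Z[J]/Z[0] = exp(J^2 / (2A)) in one dimension by a Riemann sum
A, J = 1.5, 0.7                     # arbitrary positive A and source J
L, steps = 15.0, 200001
dx = 2 * L / (steps - 1)
Z0 = ZJ = 0.0
for i in range(steps):
    xi = -L + i * dx
    Z0 += math.exp(-A * xi * xi / 2)
    ZJ += math.exp(-A * xi * xi / 2 + J * xi)
ratio, target = ZJ / Z0, math.exp(J * J / (2 * A))
print(ratio, target)  # the two agree to high accuracy
```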

Now we are ready to prove Wick's theorem and discuss Feynman diagrams, which we'll do in the next post.

Saturday, February 25, 2012

Geometry of Curved Spacetime 5: Bianchi Identity and Einstein Equations

Background

Following last time, we are almost ready to write down the Einstein equations. Before doing any math, let's understand what we're trying to do. Minkowski realized that Einstein's special relativity was best understood by combining space and time into 4-dimensional spacetime, with Lorentzian metric
ds^2 = -dt^2 + dx^2 + dy^2 + dz^2.
The spacetime approach works wonderfully and even explains the Lorentz invariance of Maxwell's equations (indeed, it was Maxwell's equations that motivated Einstein to postulate his principle of relativity). However, (for reasons that I may discuss later) gravity is not a "force" but rather the geometry of spacetime itself.

By mass-energy equivalence (which is one of the most basic consequences of relativity), the gravitational field, whatever it is, must couple to the stress-energy tensor T_{ij}. I won't get into details, but the stress-energy tensor is a familiar object from physics that roughly tells you what the energy-momentum density/flux is in each direction at every point in spacetime. If the matter is completely static, then it is ok to think of this as measuring the mass density, but for nonstatic matter it also takes things like pressure into account.

Now, as I said above, the gravitational field is just the geometry of spacetime, which is measured by the metric tensor g_{ij}. Mass-energy equivalence says that it must couple to the stress-energy tensor T_{ij}. The simplest field equation then would be
G_{ij} = c T_{ij}
where G_{ij} is some tensor built out of g_{ij} and its derivatives, and c is some constant. The equations of Newtonian gravity are 2nd order in the gravitational field, so if we want these equations to reduce to Newton's in the appropriate limit, G_{ij} should only depend on the metric and its first two derivatives. Now there is an obvious 2nd rank tensor satisfying these constraints: R_{ij}, the Ricci tensor. However, this turns out to be completely wrong (except in the vacuum).

Any reasonable matter will satisfy local energy-momentum conservation,
\nabla_j T^{ij} = 0.
It turns out that the Ricci tensor does not satisfy this condition in general. So to look for the right tensor G_{ij}, we turn to the Bianchi identity.

The Bianchi Identity

As discussed in the previous post, the curvature of a connection is the endomorphism-valued 2-form
F = d\Omega - \Omega \wedge \Omega
where \Omega is the matrix of 1-forms telling us how to take the covariant derivative of a frame, i.e.
\nabla e_i = \Omega_i^j \otimes e_j.
Since a connection can be extended to all tensor powers in a natural way, we can consider the covariant derivative of the curvature F (thought of as a section of the appropriate bundle). Quick calculation:
\begin{align} \nabla F &= \nabla(d\Omega - \Omega \wedge \Omega) \\ &= d^2 \Omega - d\Omega \wedge \Omega + \Omega \wedge d\Omega \\ & \ \ + d\Omega \wedge \Omega - \Omega \wedge \Omega \wedge \Omega \\ & \ \ - \Omega \wedge d\Omega + \Omega \wedge \Omega \wedge \Omega \\ &= 0. \end{align}
Thus the endomorphism valued 3-form \nabla F is identically 0. Writing F in components as R_{ijkl}, this is equivalent to
R_{ijkl|m} +  R_{ijlm|k} + R_{ijmk|l} = 0.
Now let's contract:
\begin{align} 0 &= g^{ik} g^{jl} R_{ijkl|m} + g^{ik} g^{jl} R_{ijlm|k} + g^{ik} g^{jl} R_{ijmk|l}\\ &= g^{ik} R_{ik|m} - g^{ik}R_{im|k} - g^{jl} R_{jm|l} \\ &= \nabla_m S - 2 \nabla^k R_{mk} \end{align}
So we see that the tensor
G_{ij} = R_{ij} - \frac{S}{2} g_{ij}
is divergence free. This yields the Einstein field equations:
R_{ij} - \frac{S}{2} g_{ij} = c T_{ij}.
Actually, there is another obvious divergence free tensor: g_{ij} itself. So a more general form is
G_{ij} + \Lambda g_{ij} = c T_{ij}
where \Lambda is a constant called the cosmological constant.

Monday, February 6, 2012

Geometry of Curved Spacetime 4

Today I had to try to explain connections and curvature in local frames (as opposed to coordinates), and I really feel that Wald's treatment of this is just awful (this is one of the few complaints I have with an otherwise classic textbook). It is particularly baffling since the treatment in Misner, Thorne, and Wheeler is just perfect. What follows is the modern math (as opposed to physics) point of view. This is more abstract than any introductory GR (or even Riemannian geometry) text I've seen, but in this case the abstraction absolutely clarifies and simplifies things.

Let M be a smooth manifold and suppose E is a smooth vector bundle over M. A connection on E is a map \nabla taking sections of E to sections of T^\ast M \otimes E, \mathbb{R}-linear and satisfying the Leibniz rule
\nabla(f\sigma) = df \otimes \sigma + f \nabla \sigma.

Now consider the sheaf of E-valued p-forms on M. Call it \Omega^p(E). Then we can extend the connection to a map
\nabla: \Omega^p(E) \to \Omega^{p+1}(E)
via the Leibniz rule:
\nabla(\eta \otimes \sigma) = d\eta \otimes \sigma + (-1)^p \eta \wedge \nabla \sigma.
Let us define the curvature F associated to a connection \nabla by the composition
F = \nabla^2: \Omega^p(E) \to \Omega^{p+2}(E).

Claim F is C^\infty-linear, i.e. it is tensorial.

Proof
\begin{align} \nabla(\nabla(f \sigma)) &= \nabla( df \otimes \sigma + f \nabla \sigma) \\\ &= d^2 f \otimes \sigma - df \wedge \nabla \sigma + df \wedge \nabla \sigma + f \nabla^2 \sigma \\\ &= f \nabla^2 \sigma. \end{align}

So far we have not made any additional choices (beyond \nabla). In order to actually compute something locally, we have to make some choices. Let \hat{e}_a be a frame, i.e. a local basis of sections of E. Then \nabla \hat{e}_a is an E-valued 1-form, hence it can be expressed as a sum
\nabla \hat{e}_a = \sum_{b} \omega_a^b \otimes \hat{e}_b
where the coefficients \omega_a^b are 1-forms, often called the connection 1-forms. Let \Omega denote the matrix of 1-forms whose entries are exactly \omega_a^b.

Claim Let \sigma = \sigma^a \hat{e}_a. Then we have
\nabla \sigma = d\sigma + \Omega \sigma.

Proof The coefficients \sigma^a are functions (i.e. scalars), so \nabla \sigma^a = d\sigma^a. Using the Leibniz rule we have
\begin{align} \nabla(\sigma^a \hat{e}_a) &= (\nabla \sigma^a) \hat{e}_a + \sigma^a \nabla \hat{e}_a \\\ &= d\sigma^a \hat{e}_a + \sigma^a \omega_a^b \hat{e}_b \\\ &= d\sigma^a \hat{e}_a + \omega_c^a \sigma^c \hat{e}_a \\\ &= (d\sigma + \Omega \sigma)^a \hat{e}_a. \end{align}

Claim The curvature satisfies F = d\Omega - \Omega \wedge \Omega.

Proof Just apply the above formula twice using Leibniz.

Connection 1-forms from Christoffel symbols. Suppose now that we are in the Riemannian setting and we already know the Christoffel symbols in some coordinates. Then we can express our frame \hat{e}_a in terms of coordinate vector fields, i.e.
\hat{e}_a = \hat{e}_a^i \frac{\partial}{\partial x^i}
Then we have that
\nabla_j \hat{e}_a^i = \frac{\partial \hat{e}_a^i}{\partial x^j} + \Gamma^i_{jk} \hat{e}_a^k
So, as a vector-valued 1-form, we have
\nabla \hat{e}_a = \frac{\partial \hat{e}_a^i}{\partial x^j} dx^j \otimes \frac{\partial}{\partial x^i} + \Gamma^i_{jk} \hat{e}_a^k dx^j \otimes \frac{\partial}{\partial x^i}.
Juggling things a bit using the metric, we find
\nabla \hat{e}_a = \frac{\partial \hat{e}_a^i}{\partial x^j} \hat{e}^b_i dx^j \otimes \hat{e}_b  + \Gamma^i_{jk} \hat{e}_a^k \hat{e}_i^b dx^j \otimes \hat{e}_b.
So the connection 1-forms are given by
\omega_a^b = \frac{\partial \hat{e}_a^i}{\partial x^j} \hat{e}^b_i dx^j  + \Gamma^i_{jk} \hat{e}_a^k \hat{e}_i^b dx^j.

To come later (if I ever get around to it): some explicit computations.

Saturday, February 4, 2012

Geometry of Curved Spacetime 3

Today, some numerology. The Riemann curvature tensor is a tensor R_{abcd} satisfying the identities:

1. R_{abcd} = -R_{bacd}.

2. R_{abcd} = R_{cdab}.

3. R_{abcd} + R_{acdb} + R_{adbc} = 0. (First Bianchi)

4. R_{abcd|e} + R_{abec|d} + R_{abde|c} = 0. (Second Bianchi)

By 1, the number of independent ab indices is N = n(n-1)/2, and similarly for cd. By 2, the number of independent pairs of indices is N(N+1)/2. Now the cyclic constraint 3 can be written as
R_{[abcd]} = 0,
and thus constitutes {n \choose 4} equations. So the number of independent components is
\begin{align}  N(N+1)/2 - {n \choose 4} &= \frac{n(n-1)(n(n-1)/2+1)}{4} - \frac{n(n-1)(n-2)(n-3)}{24} \\ &= \frac{(n^2-n)(n^2-n+2)}{8} - \frac{(n^2-n)(n^2-5n+6)}{24} \\ &= \frac{n^4-2n^3+3n^2-2n}{8} - \frac{n^4-6n^3+11n^2-6n}{24} \\ &= \frac{2n^4-2n^2}{24} \\ &= \frac{n^4-n^2}{12} \\ &= \frac{n^2(n^2-1)}{12} \end{align}

Now consider the Weyl tensor C_{abcd}, which is defined as the completely trace-free part of the Riemann tensor. The trace part is determined by the Ricci tensor R_{ab}, which has n(n+1)/2 independent components, so the Weyl tensor has
\frac{n^2(n^2-1)}{12} - \frac{n^2+n}{2} = \frac{n^4-7n^2-6n}{12}
independent components. Now, for n = 1 we see that R_{abcd} has no independent components, i.e. it vanishes identically. In n=2, it has only 1 independent component, and so the scalar curvature determines everything. In n=3, it has 6 independent components. Note that in this case, the Weyl tensor has no independent components, i.e. it is identically 0. So we see that in n = 2, 3 every Riemannian manifold is conformally flat. So things only start to get really interesting in n=4, where the Riemann tensor has 20 independent components, and the Weyl tensor has 10.
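The counting is easy to tabulate (note the Weyl subtraction only makes sense for n \geq 3):

```python
def riemann_count(n):
    # independent components of the Riemann tensor: n^2 (n^2 - 1) / 12
    return n * n * (n * n - 1) // 12

def weyl_count(n):
    # subtract the n(n+1)/2 components of the Ricci tensor (valid for n >= 3)
    return riemann_count(n) - n * (n + 1) // 2

for n in range(2, 6):
    print(n, riemann_count(n), weyl_count(n) if n >= 3 else None)
```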

Path Integrals 3: Recovering the Spectrum from Asymptotics


In my previous posts on path integrals, I described (rather tersely) how the path integral, suitably defined and interpreted, can be used to compute the Schwartz kernel of the operators e^{-iHt} (Lorentzian signature) and e^{-Ht} (Euclidean signature).

Suppose that we understand the spectrum of H completely (nb: for a given system described by H, this is the goal). For example, suppose we know that the spectrum of H consists of discrete eigenvalues E_n, n = 0, \cdots with corresponding eigenvectors |n\rangle,
H|n\rangle = E_n|n\rangle.
(For simplicity, I assume there is no continuous spectrum and that the eigenvalues are nondegenerate.) Then we have
e^{-iHt} = \sum_n e^{-i E_n t} |n\rangle\langle n|
and
e^{-Ht} = \sum_n e^{-E_n t} |n\rangle\langle n|
Now the second expression turns out to be very useful. Assume the eigenvalues are ordered so that
E_0 < E_1 < \cdots
Then we can write
e^{-Ht} = e^{-E_0 t} \left( |0\rangle\langle 0| + \sum_{n \geq 1} e^{-(E_n-E_0)t}|n\rangle\langle n| \right)
Now suppose that v is some vector which is close to the ground state, in the sense that
\langle v|0\rangle \neq 0
(This is obviously a generic condition, so if we just pick v randomly we can expect this to be true.) Then we can consider
e^{-Ht} v = e^{-E_0 t} \left( v_0 |0\rangle + \sum_{n \geq 1} e^{-(E_n-E_0)t} v_n|n\rangle \right)
Now for n \geq 1, E_n-E_0 is strictly positive, and so for large t all of the higher terms are exponentially damped. So, we have the asymptotic
e^{-Ht} v \sim e^{-E_0 t }v_0|0\rangle
Next comes the really interesting part. Multiply on the left by a position-representation eigenbra \langle x|:
\langle x|e^{-Ht}|v\rangle \sim e^{-E_0 t} v_0 \langle x|0\rangle
Now v_0 is an irrelevant constant, so we might as well take it to be 1 (rescale v as necessary). The expression \langle x|0\rangle is exactly the ground state wavefunction in the position representation! Call it \psi_0(x). So to conclude: the large-t asymptotics of \langle x|e^{-Ht}|v\rangle are (up to an overall constant) given by e^{-E_0 t} \psi_0(x), hence we can recover both the ground state energy and the ground state wavefunction. But the value of this expression is exactly given by the Euclidean path integral. So we have a correspondence:

Asymptotics of Euclidean path integral \leftrightarrow The spectrum of H.
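To illustrate this correspondence concretely (a sketch of my own, not tied to any particular path-integral discretization): discretize the harmonic oscillator H = -\frac{1}{2}\partial_x^2 + \frac{1}{2}x^2, whose exact ground state energy is 1/2, apply e^{-Ht} to a generic vector, and read off E_0 from the decay rate:

```python
import numpy as np

# finite-difference harmonic oscillator H = -1/2 d^2/dx^2 + x^2/2 on [-5, 5]
N, L = 400, 10.0
x = np.linspace(-L / 2, L / 2, N)
dx = x[1] - x[0]
H = (np.diag(1.0 / dx**2 + x**2 / 2)
     + np.diag(np.full(N - 1, -0.5 / dx**2), 1)
     + np.diag(np.full(N - 1, -0.5 / dx**2), -1))

E, U = np.linalg.eigh(H)
v = np.ones(N)                          # generic vector with <v|0> != 0

def evolve(t):
    # e^{-Ht} v computed in the eigenbasis of H
    return U @ (np.exp(-E * t) * (U.T @ v))

# the decay rate of ||e^{-Ht} v|| at large t gives the ground state energy
t1, t2 = 8.0, 10.0
E0 = np.log(np.linalg.norm(evolve(t1)) / np.linalg.norm(evolve(t2))) / (t2 - t1)
print(E0)  # close to the exact value 1/2
```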

Coming next: instantons.

Path Integrals 2: Euclidean Path Integrals and Heat Kernels


Consider the heat equation
\frac{\partial \psi}{\partial t} = -\hat{H} \psi
We find similarly
\langle x_N|U_t|x_0\rangle = \int \exp\left[ \sum_{j=0}^{N-1} ik_j(x_j - x_{j+1}) - \Delta t\, H(x_j, k_j) \right] dx\, dk.

Now take H(x,k) = k^2/2m + V(x) and complete the square:

ik_j(x_j - x_{j+1}) - \Delta t\, k_j^2/2m - \Delta t\, V(x_j) + ik_{j+1}(x_{j+1} - x_{j+2}) - \Delta t\, k_{j+1}^2/2m - \Delta t\, V(x_{j+1})

\begin{align}  ik_j a_j - \Delta t k_j^2/2m &= -(-2mi k_j a_j / \Delta t + k_j^2) \Delta t/2m \\ &= -(k_j^2 -2mi k_j a_j/\Delta t -m^2 a_j^2/(\Delta t)^2 + m^2 a_j^2/(\Delta t)^2) \Delta t/2m \\ &= -(k_j -mia_j/\Delta t)^2 \Delta t/2m -ma_j^2/2\Delta t \end{align}

Combining things together, we have that the heat kernel is given by
\langle y|e^{-t \hat{H}}|x\rangle = \int e^{-S_{\textrm{euc}}} \mathcal{D}x

That is, the Schwartz kernel of the time evolution operator is given by the oscillatory Lorentzian-signature path integral, whereas the heat kernel is given by the exponentially decaying Euclidean path integral (which has a better chance of being well-defined). Most importantly, the heat kernel contains most of the essential information about the spectrum of \hat{H}, which is really all we need in order to understand the dynamics.

See ABC of Instantons. (I never understood the title of Nekrasov's "ABCD of Instantons" until I found this classic).

Thursday, January 26, 2012

Geometry of Curved Space, Part 2

Disclaimer: as before, these are (incredibly) rough notes intended for a tutorial. I may clean them up a bit later but for now it will seem like a lot of unmotivated equations (with typos!!).


The Energy Functional
S = \int_0^T |\dot{\gamma}|^2 dt
Letting V^i = \dot{\gamma}^i, this is 
S = \int_0^T g_{ij}(\gamma(t)) V^i V^j dt = \int_0^T L dt
where the Lagrangian L is
L = g_{ij} V^i V^j
Now, 
\frac{\partial L}{\partial x^k} = (\partial_k g_{ij}) V^i V^j
and 
\frac{\partial L}{\partial V^k} = g_{ij} \delta^i_k V^j + g_{ij} V^i \delta^j_k = 2 g_{jk} V^j
Now, 
\frac{d}{dt} \frac{\partial L}{\partial V^k} = 2 (\partial_i g_{jk}) V^i V^j + 2 g_{jk} \dot{V}^j
Plugging these expressions into the Euler-Lagrange equations, we have
2 g_{jk} \dot{V}^j + \left(\partial_i g_{jk} + \partial_j g_{ik}- \partial_k g_{ij}\right) V^i V^j = 0
Multiplying by the inverse metric, we have
\dot{V}^k + \frac{g^{kl}}{2} \left( \partial_i g_{jl} + \partial_j g_{il} - \partial_l g_{ij} \right) V^i V^j = 0
which is the geodesic equation (recall the formula for the Christoffel symbols).
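To make this concrete, here is a sketch (mine, not part of the tutorial) that integrates the geodesic equation on the unit sphere, with metric ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2, and checks that the speed g_{ij} V^i V^j is conserved along the flow:

```python
import numpy as np

# geodesic equation on the unit sphere in coordinates (theta, phi):
#   theta'' = sin(theta) cos(theta) (phi')^2
#   phi''   = -2 cot(theta) theta' phi'
def accel(q, v):
    th = q[0]
    return np.array([np.sin(th) * np.cos(th) * v[1]**2,
                     -2.0 * np.cos(th) / np.sin(th) * v[0] * v[1]])

def rk4(q, v, dt):
    # one RK4 step for the first-order system (q', v') = (v, accel(q, v))
    k1q, k1v = v, accel(q, v)
    k2q, k2v = v + dt / 2 * k1v, accel(q + dt / 2 * k1q, v + dt / 2 * k1v)
    k3q, k3v = v + dt / 2 * k2v, accel(q + dt / 2 * k2q, v + dt / 2 * k2v)
    k4q, k4v = v + dt * k3v, accel(q + dt * k3q, v + dt * k3v)
    return (q + dt / 6 * (k1q + 2 * k2q + 2 * k3q + k4q),
            v + dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v))

def speed2(q, v):
    # the conserved quantity g_ij V^i V^j = (theta')^2 + sin^2(theta) (phi')^2
    return v[0]**2 + np.sin(q[0])**2 * v[1]**2

q, v = np.array([np.pi / 3, 0.0]), np.array([0.2, 1.0])
s0 = speed2(q, v)
for _ in range(2000):
    q, v = rk4(q, v, 1e-3)
drift = abs(speed2(q, v) - s0)
print(drift)  # conserved up to integrator error
```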


Orthonormal Frames (Lorentzian and Riemannian) (tetrads, vielbeins, vierbeins, ...)
Locally, we can find an orthonormal basis of vector fields e^\mu_i. Greek indices indicate coordinates, whereas Latin indices label the basis elements. These necessarily satisfy
g_{\mu\nu} e^\mu_i e^\nu_j = \eta_{ij}
where \eta_{ij} is the flat/constant metric (of whatever signature we are working in).


Methods for Computing Curvature (from Wald)
0. Getting the Christoffel symbols from the geodesic equation.
See e.g. sphere or spherical coordinates.


1. Coordinates. By definition,
\nabla_a \nabla_b \omega_c = \nabla_b \nabla_a \omega_c + {R_{abc}}^d \omega_d
Writing things explicitly, this gives
{R_{abc}}^d = \partial_b \Gamma^d_{ac} - \partial_a \Gamma^d_{bc} + \Gamma^e_{ac}\Gamma^d_{be} - \Gamma^e_{bc}\Gamma^d_{ae}


Do this for eg unit sphere in \mathbb{R}^3.
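Here is one way to do the check numerically (a sketch of mine): plug the sphere's Christoffel symbols into the coordinate formula, differentiate by central differences, and compare with the known answer R_{\theta\phi\theta\phi} = \sin^2\theta:

```python
import numpy as np

# Christoffel symbols of the unit sphere in coordinates (theta, phi):
#   Gamma^theta_{phi phi} = -sin(theta) cos(theta),
#   Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta).
def Gamma(q):
    th = q[0]
    G = np.zeros((2, 2, 2))                      # G[b, c, d] = Gamma^d_{bc}
    G[1, 1, 0] = -np.sin(th) * np.cos(th)
    G[0, 1, 1] = G[1, 0, 1] = np.cos(th) / np.sin(th)
    return G

def riemann(q, eps=1e-6):
    # R_{abc}^d from the coordinate formula, derivatives by central differences
    dG = np.zeros((2, 2, 2, 2))                  # dG[a, b, c, d] = d_a Gamma^d_{bc}
    for a in range(2):
        dq = np.zeros(2)
        dq[a] = eps
        dG[a] = (Gamma(q + dq) - Gamma(q - dq)) / (2 * eps)
    G = Gamma(q)
    R = np.zeros((2, 2, 2, 2))                   # R[a, b, c, d] = R_{abc}^d
    for a, b, c, d in np.ndindex(2, 2, 2, 2):
        R[a, b, c, d] = (dG[b, a, c, d] - dG[a, b, c, d]
                         + sum(G[a, c, e] * G[b, e, d] - G[b, c, e] * G[a, e, d]
                               for e in range(2)))
    return R

q0 = np.array([np.pi / 3, 0.0])
R = riemann(q0)
g = np.diag([1.0, np.sin(q0[0])**2])             # metric diag(1, sin^2 theta)
Rdown = np.einsum('abcd,de->abce', R, g)         # lower the last index
val, target = Rdown[0, 1, 0, 1], np.sin(q0[0])**2
print(val, target)
```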


2. Curvature in Frames (equivalent to coordinates but totally different flavor)
(note: Misner-Thorne-Wheeler seems much better than Wald for this stuff).
 Using MTW notation. Fix a frame \mathbf{e_\mu} and a dual frame \omega^\mu. The connection 1-forms are defined by
0 = d\omega^\mu + \alpha^\mu_\nu \wedge \omega^\nu
We also have
dg_{\mu\nu} = \omega_{\mu\nu} + \omega_{\nu\mu}
So metric compatibility yields
\omega_{\mu\nu} = -\omega_{\nu\mu}
Antisymmetry means fewer independent components. In this language, the curvature 2-form is given by
R^\mu_\nu = d\alpha^\mu_\nu + \alpha^\mu_\sigma \wedge \alpha^\sigma_\nu




Gaussian Coordinates
Via Wald. Suppose S \subset M is a codimension 1 submanifold. If S is not null, we can find a normal vector field n^a which is everywhere orthogonal to S and has unit length. (Probably also need orientation to make it unique!) We can pick any coordinates x^1, \cdots, x^{n-1} on S, and we pick the last coordinate to be the distance to S, measured along a geodesic with initial tangent vector n^a (i.e. we use exponential coordinates in the normal direction). 


Once we pick these coordinates, we obtain a family of hypersurfaces S_t given by
x^n = t. These have the property that they are orthogonal to the normal geodesics through S. Proof (here X is a vector field tangent to the S_t):
\begin{align} n^b \nabla_b (n_a X^a) &= n_a n^b \nabla_b X^a \\ &= n_a X^b \nabla_b n^a \\ &= \frac{1}{2}X^b \nabla_b (n^a n_a) = 0. \end{align}
(The first equality uses the geodesic equation; the second uses that n and X Lie-commute, since they are coordinate vector fields.)


Jacobi Fields, Focusing and Growth, Conjugate Points
Geodesic deviation. Suppose we have a 1-parameter family of geodesics \gamma_s with tangent T^a and deviation X^a. (draw pictures!) By the geodesic equation, we have
T^a \nabla_a T^b = 0
What can we say about X^a? By change of affine parameter if necessary, we can assume that T^a and X^a are coordinate vector fields, and in particular they commute. So
X^a \nabla_a T^b = T^a \nabla_a X^b
Then it is easy to see that X^a T_a is constant, and so (again by change of parameter if necessary) we can assume that it is 0. Now set v^a = T^b \nabla_b X^a. We interpret this as the relative velocity of nearby geodesics. Similarly, we have the acceleration
a^a = T^c \nabla_c v^a = T^b \nabla_b (T^c \nabla_c X^a)
Some manipulation shows that
a^a = -R_{cbd}^a X^b T^c T^d
This is the geodesic deviation equation. (Positive curvature -> focus, negative curvature ->growth.)


Now we can work this in reverse. Suppose I have a single geodesic with tangent T^a. If I have some vector field X^a on the geodesic, under what conditions will it integrate to give me a family of geodesics? The above shows that we must have
T^a \nabla_a (T^b \nabla_b X^c) = -R_{abd}^c X^b T^a T^d
Solutions to this equation are called Jacobi vector fields.


Definition Points p, q on a geodesic are said to be conjugate if there exists a Jacobi field on the geodesics which vanishes at p and q. (Picture time!)
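On the unit sphere, a Jacobi field orthogonal to a unit-speed geodesic satisfies J'' = -J (the sectional curvature is 1), so J(t) = \sin t: a field vanishing at t = 0 vanishes again at t = \pi, i.e. antipodal points are conjugate. A quick numerical check (my own sketch):

```python
import math

# integrate the Jacobi equation J'' = -J with J(0) = 0, J'(0) = 1
# (semi-implicit Euler) and locate the next zero of J
dt, t = 1e-4, 0.0
J, dJ = 0.0, 1.0
first_zero = None
while t < 4.0 and first_zero is None:
    dJ += dt * (-J)              # update velocity first (symplectic Euler)
    J_new = J + dt * dJ
    if J > 0 and J_new <= 0:
        first_zero = t
    J, t = J_new, t + dt
print(first_zero)  # close to pi: the antipodal point is conjugate
```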


Definition (Cut Locus in Riemannian Signature) For p \in M, we define the cut locus in T_p M to be those vectors v \in T_p M for which \exp(tv) is length-minimizing on [0,1] but fails to be length-minimizing on [0,1+\epsilon] for any \epsilon > 0. The cut locus in M is the image of the cut locus in T_p M under the exponential map.


eg. Sphere, antipodes.