The Stochastic Variational Method

\(
\def\bra#1{\langle #1 \rvert}
\def\ket#1{\lvert #1 \rangle}
\def\braket#1#2{\langle #1 \lvert #2 \rangle}
\)It is an unfortunate fact of quantum mechanics, and physics in general, that there is no known way to analyse the behavior of systems with more than two interacting pieces without resorting to approximation methods. We can solve analytically for a great many properties of the hydrogen atom, but the moment you add another electron and create \(\mathrm{H}^{-}\), the problem becomes intractable. The Stochastic Variational Method (SVM) is a way to approximate the eigenstates of few-body systems.

Suppose we have a system of \(N\) particles interacting with each other. The problem we are interested in is finding the eigenstates and eigenenergies of this system. If the interaction potentials depend only on the distance between pairs of particles, as the Coulomb potential does, then we can write the time-independent Schrödinger equation for the system as

$$\hat{H} \, \Psi(r_1,r_2,...,r_N) = E \, \Psi(r_1,r_2,...,r_N),$$

with the Hamiltonian

$$\hat{H} = -\sum_{i=1}^N\frac{\hbar^2}{2m_i} \nabla_i^2  + \sum_{i < j}V(r_i,r_j).$$

The notation \(i<j\) on the second sum is shorthand for a sum over all distinct pairs of particles. For the sake of generality, we'll replace the wavefunction \(\Psi(...)\) with the state vector \(\ket{\Psi}\) except where using the wavefunction makes more sense. We can expand \(\ket{\Psi}\) in a complete basis to obtain

$$\hat{H} \sum_{n=1}^{\infty} a_n \ket{\phi_n} = E \sum_{n=1}^{\infty} a_n \ket{\phi_n}. $$

Since the Hamiltonian is a linear operator, we can move it inside the sum. In the same step we'll multiply on the left by \(\bra{\phi_m}\).

$$\sum_{n=1}^{\infty} a_n \bra{\phi_m}\hat{H}\ket{\phi_n} = E \sum_{n=1}^{\infty} a_n \braket{\phi_m}{\phi_n}.$$

The trick now is to interpret \(\bra{\phi_m}\hat{H}\ket{\phi_n}\) as a matrix element \(H_{mn}\) and the overlap \(\braket{\phi_m}{\phi_n}\) as \(O_{mn}\). Each sum in the above equation can then be read as a dot product between the \(m\)'th row of a matrix and the column vector of expansion coefficients, so the entire problem can be rewritten as the matrix equation

$$\boldsymbol{H}\boldsymbol{a} = E\,\boldsymbol{O}\boldsymbol{a},$$

where \(\boldsymbol{a}\) is the vector of expansion coefficients. Provided the choice of basis functions \(\{\phi_n\}\) is easy to work with, we can compute \(\boldsymbol{H}\) and \(\boldsymbol{O}\) analytically; and if the basis is orthonormal, \(\boldsymbol{O}\) is the identity! Thus we have reduced the problem to solving a generalized eigenvalue problem, a comparatively simple proposition. Except for one thing: \(\boldsymbol{H}\), \(\boldsymbol{O}\) and \(\boldsymbol{a}\) are in principle infinite-dimensional.
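
If you just want to see what solving \(\boldsymbol{H}\boldsymbol{a} = E\,\boldsymbol{O}\boldsymbol{a}\) looks like numerically, here is a minimal sketch in Python using scipy.linalg.eigh, which handles the generalized symmetric eigenvalue problem directly. The matrices below are made up purely for illustration; in a real calculation they would come from the analytic matrix elements of your basis.

import numpy as np
from scipy.linalg import eigh

# Made-up 3x3 example: H is symmetric and O is symmetric positive definite,
# as they would be for real basis functions with real matrix elements.
H = np.array([[1.0, 0.2, 0.0],
              [0.2, 2.0, 0.3],
              [0.0, 0.3, 3.0]])
O = np.array([[1.0, 0.1, 0.0],
              [0.1, 1.0, 0.1],
              [0.0, 0.1, 1.0]])

# eigh(H, O) solves H a = E O a and returns eigenvalues in ascending order.
energies, coefficients = eigh(H, O)
print(energies)            # approximate eigenenergies, lowest first
print(coefficients[:, 0])  # expansion coefficients a_n of the lowest state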

Up until now, everything we've done has been exact. But we simply can't find the eigenvalues and eigenvectors of infinite-dimensional matrices! So we have to make an approximation that we hope will be a good one: truncate the basis \(\{\phi_n\}\) to a finite set. Of course, this may mean that the eigenstate we're looking for cannot be expanded exactly in terms of the truncated basis, but maybe it will be "close enough." There are two main reasons to think that this will be the case.

The first is the variational principle. Because the ground state of a system has the lowest energy available to the system, any trial wavefunction has an energy expectation value greater than or equal to the ground-state energy. Therefore, we can make variations on a trial wavefunction in order to minimize the energy. For us, this amounts to choosing the right wavefunctions to put in our basis set such that the lowest eigenvalue is minimized. For the same reason, the \(n\)'th lowest eigenvalue of the truncated problem is greater than or equal to the \(n\)'th lowest exact energy.
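
As a quick illustration of the principle (not part of the SVM itself), take a single particle in a 1D harmonic oscillator, \(\hat{H} = -\tfrac{1}{2}\tfrac{d^2}{dx^2} + \tfrac{1}{2}x^2\) in units where \(\hbar = m = \omega = 1\), and a Gaussian trial wavefunction \(\psi_\alpha(x) = e^{-\alpha x^2}\). The energy expectation value works out to

$$E(\alpha) = \frac{\bra{\psi_\alpha}\hat{H}\ket{\psi_\alpha}}{\braket{\psi_\alpha}{\psi_\alpha}} = \frac{\alpha}{2} + \frac{1}{8\alpha} \geq \frac{1}{2},$$

with equality only at \(\alpha = 1/2\), which happens to be the exact ground state. Every other choice of \(\alpha\) overshoots the true ground-state energy of \(1/2\), and minimizing over \(\alpha\) is exactly the kind of variation the method relies on.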

The second reason is that expanding our finite basis can only improve the approximation. In the worst case, the eigenvalues are unchanged, which corresponds to the coefficient on the new basis function being zero in every eigenvector. If adding the new basis function lets any weight shift from the previous functions onto it, then the associated eigenvalue can only be lowered, again because of the variational principle.
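
Here is a small numerical sanity check of that claim, with a random symmetric matrix standing in for \(\boldsymbol{H}\) and a random positive-definite matrix standing in for \(\boldsymbol{O}\) (they are not real matrix elements, just placeholders with the right symmetry properties):

import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
H = (A + A.T) / 2                 # random symmetric stand-in for the Hamiltonian matrix
B = rng.standard_normal((8, 8))
O = B @ B.T + 8 * np.eye(8)       # random symmetric positive-definite stand-in for the overlap

# Growing the basis one function at a time corresponds to taking ever larger
# leading sub-blocks of H and O; the lowest generalized eigenvalue never increases.
for k in range(2, 9):
    E0 = eigh(H[:k, :k], O[:k, :k], eigvals_only=True)[0]
    print(k, E0)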

The question then is: how do we generate this finite "basis" set in a useful way? The popular method is to define a suitable prototype function (the "correlated Gaussians" in my work) with a number of free parameters, then generate a large set of candidate functions by randomly choosing values for those parameters (this is where the "stochastic" in SVM comes from). Whichever of these randomly generated functions lowers the energy the most gets added to a growing basis set. In pseudocode:

vector basis;

for (n from 0 to max_basis_size) {
    vector trialfunctions = generate_a_bunch_randomly();
    best_energy = infinity;
    foreach trial in trialfunctions {
        vector test = basis + trial;               // candidate basis with this trial appended
        H = computeH(test);                        // Hamiltonian matrix elements in the candidate basis
        O = computeO(test);                        // overlap matrix elements
        eigenvalues = solve_eigenvalue_problem(H, O);
        if (lowest(eigenvalues) < best_energy) {   // remember the trial that lowers the energy most
            best_energy = lowest(eigenvalues);
            bestfunction = trial;
        }
    }
    basis.push_back(bestfunction);
}
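
To make that concrete, here is a minimal runnable sketch of the same loop in Python, applied to a deliberately trivial problem: one particle in a 1D harmonic oscillator (\(\hbar = m = \omega = 1\)) with plain Gaussians \(e^{-ax^2}\) as basis functions, so that \(\boldsymbol{H}\) and \(\boldsymbol{O}\) have simple closed forms. The potential, the matrix-element formulas, and the parameter ranges are my own toy choices for illustration, not the correlated-Gaussian machinery of a real few-body code; the point is only the shape of the algorithm and the fact that the lowest eigenvalue marches down toward the exact value of 1/2.

import numpy as np
from scipy.linalg import eigh

def overlap(a, b):
    # <phi_a|phi_b> for phi_a(x) = exp(-a x^2)
    return np.sqrt(np.pi / (a + b))

def hamiltonian(a, b):
    # kinetic plus harmonic-potential matrix element, both analytic for Gaussians
    return (a * b + 0.25) * np.sqrt(np.pi) / (a + b) ** 1.5

def matrices(widths):
    a = np.asarray(widths)
    return hamiltonian(a[:, None], a[None, :]), overlap(a[:, None], a[None, :])

rng = np.random.default_rng(42)
basis = []

for n in range(6):                               # max basis size
    trials = rng.uniform(0.05, 5.0, size=50)     # randomly generated candidate widths
    best_energy, best_width = np.inf, None
    for a in trials:
        H, O = matrices(basis + [a])
        if np.linalg.cond(O) > 1e12:             # skip nearly linearly dependent candidates
            continue
        E0 = eigh(H, O, eigvals_only=True)[0]
        if E0 < best_energy:
            best_energy, best_width = E0, a
    basis.append(best_width)
    print(f"basis size {n + 1}: E0 = {best_energy:.6f}")   # approaches the exact 0.5

The condition-number check is there because two randomly chosen widths can come out nearly identical, which makes \(\boldsymbol{O}\) close to singular; guarding against linear dependence is a practical issue any implementation of this loop has to deal with.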

In this manner, one can generate a pretty good finite basis which minimizes the energy of the state of interest. The meat of the method really depends on the kind of basis functions generated. The set of functions to choose from must A) span the Hilbert space, B) be simple enough that \(H\) and \(O\) can be easily calculated, and C) be qualitatively similar to what you expect the wavefunction to look like (e.g. bound states have exponentially decaying tails, so your basis functions should too). In my current project, we are using the correlated Gaussians as a basis, but that's a post for another time.