
Version 0.1.0

@wehs7661 released this 04 Aug 23:42
· 560 commits to master since this release

In this version, we have implemented a preliminary version of ensemble_md that allows users to run an ensemble of expanded ensemble simulations in GROMACS. We are still finalizing our choice of the algorithms used within the method, including the methods for swapping multiple pairs of simulations and for combining weights. These methods have been implemented in a basic but valid way, and they are subject to change in later versions.

Multiple swaps

Here is how we perform multiple swaps in version 0.1.0: from the $N_\text{swap}$ swappable pairs, we randomly choose one pair and repeat this $N_\text{ex}$ times. Notably, we choose pairs without replacement: after every draw, we update the swappable list so that no replica is involved in more than one pair. Therefore, the maximum value of $N_\text{ex}$ is half the number of replicas. Also note that we may not be able to complete $N_\text{ex}$ draws due to a lack of choices, even if $N_\text{ex}$ is specified. For example, given 4 replicas, the swappable pairs might only include (0, 1), (1, 2), and (2, 3), because the other pairs of simulations are in states not present in the alchemical range of their exchanging partner. Then, if we choose (1, 2) in the first round, there is no pair left to draw in the second round, even if $N_\text{ex}$ is set to 2.
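The pair-drawing procedure above can be sketched as follows (the function name and the list-of-tuples representation of swappable pairs are illustrative, not the actual ensemble_md API):

```python
import random

def draw_swap_pairs(swappables, n_ex):
    """Draw up to n_ex swap pairs without replacement.

    swappables: list of (i, j) replica-index tuples that are allowed to
    exchange states. After each draw, any pair sharing a replica with the
    chosen pair is removed, so no replica appears in more than one swap.
    """
    swappables = list(swappables)
    chosen = []
    for _ in range(n_ex):
        if not swappables:  # ran out of choices before reaching n_ex
            break
        pair = random.choice(swappables)
        chosen.append(pair)
        swappables = [p for p in swappables if not set(p) & set(pair)]
    return chosen
```

For the 4-replica example above, if the swappable pairs are [(0, 1), (1, 2), (2, 3)] and (1, 2) happens to be drawn first, the updated list is empty and only one pair is returned even with n_ex = 2.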

Weight-combining methods

Weight-shifting method

In the GROMACS log file of an expanded ensemble simulation, all weights are shifted such that the weight of the first state is always 0. That is, simulations with different $\lambda$ ranges have different references for their weights. To account for this, we shift the weights of one of the simulations such that the first state in the overlapping range has the same weight in both simulations. After shifting, we then combine the weights of the overlapping states across the simulations. Below is a schematic example of adjusting the weights of simulations $i$ and $j$:

State            0        1        2        3        4        5        6
Original i       0.0      2.1      4.0      3.7      4.8      X        X         
Original j       X        X        0.0      -0.4     0.7      1.5      2.4
Shifted j        X        X        4.0      3.6      4.7      5.5      6.4
Shifted i        -4.0     -1.9     0.0      -0.3     0.8      X        X    
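This shifting step can be sketched as follows, representing each simulation's weights as a dict mapping global state index to weight (the function name and data layout are illustrative, not the actual ensemble_md implementation):

```python
def shift_weights(w_i, w_j):
    """Shift the weights of simulation j so that the first state of the
    overlapping range has the same weight in both simulations."""
    overlap = sorted(set(w_i) & set(w_j))
    offset = w_i[overlap[0]] - w_j[overlap[0]]
    return {k: v + offset for k, v in w_j.items()}

# Values from the schematic example above
w_i = {0: 0.0, 1: 2.1, 2: 4.0, 3: 3.7, 4: 4.8}
w_j = {2: 0.0, 3: -0.4, 4: 0.7, 5: 1.5, 6: 2.4}
shifted_j = shift_weights(w_i, w_j)  # matches the "Shifted j" row above
```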

For simplicity, here we just take simple averages (although this doesn't really make sense physically), so that the weights of simulations $i$ and $j$ are as follows after adjustment:

State            0        1        2        3        4        5        6
Modified i       0.0      2.1      4.0      3.65     4.75     X        X         
Modified j       X        X        0.0      -0.35    0.75     1.5      2.4

Therefore, instead of having [0.0, 2.1, 4.0, 3.7, 4.8] specified as init-lambda-weights in the next iteration of simulation $i$, we use [0.0, 2.1, 4.0, 3.65, 4.75]. Notably, with simple averages, the choice of reference does not influence the calculation of the acceptance ratio (even though the values of the individual weights change). However, with exponential averaging, the choice of reference does influence the value of the acceptance ratio, and it is hard to justify which reference is best. This is a problem that we want to solve in the next version.
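The averaging and re-referencing steps can be sketched as follows (simple averaging only; the function is illustrative and operates on the dict representation of per-state weights, where simulation $j$'s weights have already been shifted):

```python
def combine_simple_avg(w_i, w_j_shifted):
    """Average the weights of the overlapping states, then re-reference
    each simulation so that its first state has weight 0, as GROMACS
    expects for init-lambda-weights."""
    mod_i, mod_j = dict(w_i), dict(w_j_shifted)
    for k in set(w_i) & set(w_j_shifted):
        avg = 0.5 * (w_i[k] + w_j_shifted[k])
        mod_i[k] = avg
        mod_j[k] = avg
    for w in (mod_i, mod_j):
        ref = w[min(w)]  # weight of the simulation's first state
        for k in w:
            w[k] -= ref
    return mod_i, mod_j
```

With the shifted weights from the example, this reproduces the modified weights above: [0.0, 2.1, 4.0, 3.65, 4.75] for simulation $i$ and [0.0, -0.35, 0.75, 1.5, 2.4] for simulation $j$.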

Weight combination

In version 0.1.0, weights are combined across the pair of simulations to be exchanged after the swap is either accepted or rejected. Currently, the following three methods have been implemented.

  • none: In this case, the simulations in the new iterations will just inherit the final weights of the previous iteration. No weights will be modified.
  • exp-avg: In the limit that all $\lambda$ states are equally sampled, the $\lambda$ weight of a state is equal to the dimensionless free energy of that state. That is, we can write $g(\lambda)=-\ln p(\lambda)$, or $p(\lambda)=\exp(-g(\lambda))$ (in units of kT). Given this, one intuitive approach is to average the probabilities of the two simulations, i.e. $p=\frac{1}{2}(p_1 + p_2)$, or, in terms of free energies, $\exp(-g)=\frac{1}{2}(\exp(-g_1) + \exp(-g_2))$. Rearranging, we have $$g = -\ln\left(\frac{e^{-g_1} + e^{-g_2}}{2}\right)$$.
  • histogram-exp-avg: In the course of the simulation, the histogram is generally not flat, so $g$ as an estimate of the free energy in the exp-avg expression could be off. To account for the flatness of the histogram, we propose introducing a simple correction term of $\ln(N_{k-1}/N_k)$. That is, $$g_k'=g_k + \ln\left(\frac{N_{k-1}}{N_k}\right)$$ where $g_k'$ is the corrected $\lambda$ weight and $N_{k-1}$ and $N_k$ are the histogram counts of states $k-1$ and $k$. Combining this with exp-avg, we have $$g = -\ln\left(\frac{e^{-g_1'} + e^{-g_2'}}{2}\right)$$ Note that zero histogram counts need special handling in this case.
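The two averaging rules can be written compactly as follows (a sketch, not the actual ensemble_md code; zero counts are handled by simply skipping the correction, which is just one possible fallback):

```python
import math

def exp_avg(g1, g2):
    """g = -ln((exp(-g1) + exp(-g2)) / 2), i.e. averaging probabilities."""
    return -math.log(0.5 * (math.exp(-g1) + math.exp(-g2)))

def corrected_weight(g_k, counts, k):
    """g_k' = g_k + ln(N_{k-1} / N_k); skip the correction when state k
    has no predecessor or either count is zero."""
    if k == 0 or counts[k] == 0 or counts[k - 1] == 0:
        return g_k
    return g_k + math.log(counts[k - 1] / counts[k])
```

histogram-exp-avg then amounts to exp_avg(corrected_weight(g1, counts1, k), corrected_weight(g2, counts2, k)) for each overlapping state k.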

Future work

In the next version, we expect the following changes:

  • Change how multiple swaps are carried out, using the method used in replica exchange in GROMACS.
  • Combine weights across all simulations that share overlapping states, instead of only across the 2 simulations to be exchanged.
  • Solve or circumvent the problem incurred by different choices of references in weight-shifting.
  • Add more unit tests to improve the code coverage.
  • Pass continuous integration, at least for Ubuntu systems.
  • Add descriptions of the implemented methods to the documentation page.