A Quasi Time-Reversible scheme based on density matrix extrapolation on the Grassmann manifold for Born-Oppenheimer Molecular Dynamics

Federica Pes Dipartimento di Chimica e Chimica Industriale, Università di Pisa, Via G. Moruzzi 13, 56124 Pisa, Italy Étienne Polack CERMICS, École des Ponts and Inria Paris, 6 & 8 avenue Blaise Pascal, 77455 Marne-la-Vallé, France Patrizia Mazzeo Dipartimento di Chimica e Chimica Industriale, Università di Pisa, Via G. Moruzzi 13, 56124 Pisa, Italy Geneviève Dusson Laboratoire de Mathématiques de Besançon, UMR CNRS 6623, Université de Franche-Comté, 16 route de Gray, 25030 Besançon, France Benjamin Stamm Institute of Applied Analysis and Numerical Simulation, University of Stuttgart, 70569 Stuttgart, Germany Filippo Lipparini Dipartimento di Chimica e Chimica Industriale, Università di Pisa, Via G. Moruzzi 13, 56124 Pisa, Italy
(July 13, 2023)
Abstract

This article proposes a so-called Quasi Time-Reversible (QTR G-Ext) scheme based on Grassmann extrapolation of density matrices for an accurate calculation of initial guesses in Born-Oppenheimer Molecular Dynamics simulations. The method shows excellent results on four large molecular systems, ranging from 21 to 94 atoms simulated with Kohn-Sham density functional theory surrounded with a classical environment with 6k to 16k atoms. Namely, it clearly reduces the number of self-consistent field iterations, while keeping a similar energy drift as in the extended Lagrangian Born-Oppenheimer method.

Ab-initio, Born-Oppenheimer molecular dynamics (BOMD) is a very powerful and versatile tool to simulate molecular processes where the quantum nature of the system is not negligible. Unfortunately, this comes at a high computational price, which stems from the necessity of solving the quantum mechanical (QM) equations, typically Kohn-Sham Density Functional Theory (KS-DFT) equations, to compute the energy and forces at every time-step. Such equations are nonlinear and are solved using a fixed-point iterative method known as Self-Consistent Field [29] (SCF). BOMD simulations, that require one to perform tens of thousands of SCF calculations, rely thus heavily on extrapolation techniques [7, 1, 2, 27, 11, 12, 31, 32, 20, 21, 19, 34] that, by using converged solutions from previous iterations, compute an accurate guess for the SCF, limiting thus the number of iterations required to achieve convergence. A significant contribution to this field was given by Niklasson and co-workers in 2006, with their work on the time-reversible extrapolation for Born-Oppenheimer Molecular Dynamics (BOMD) [20]. The core concept involves generating a guess density matrix by combining the density matrices from previous steps in a symmetric and time-reversible manner. However, numerical applications showed that enforcing an exact time-reversibility can lead to errors accumulating in long-time simulations, spoiling thus the convergence properties of the algorithm in the long run. This led to the development of the Extended Lagrangian Born-Oppenheimer approach (XLBO) in 2008 [21, 19]. In this particular case, the time-reversible extrapolation is augmented by the inclusion of a dissipative term, which serves to reduce numerical fluctuations. In the last few years, the XLBO method has also been proposed in a SCF-less formulation, [18, 22, 34] where the density computed using the XLBO procedure is used directly without further SCF iterations.

In Niklasson’s XLBO scheme, the guess density is propagated in time subject to a potential that forces it to be close to the converged density. However, the guess density obtained with XLBO is not exactly idempotent [21], which is in practice a problem that can be either ignored, or easily addressed by, for instance, McWeeny purification [17, 25].

Recently, we proposed a different strategy to compute a guess density using linear extrapolation. This is non-trivial, because in general a linear combination of density matrices does not preserve idempotency or, in other words, density matrices belong to a differentiable manifold called Grassmann manifold and not to a vector space. Our approach uses tools from differential geometry to map the Grassmann manifold onto its tangent space, which is a vector space. It then performs a linear extrapolation on the tangent space, and then maps back the extrapolated density to the manifold. We named such a method Grassmann extrapolation (G-Ext) [25, 24]. G-Ext has been successfully adopted in the Pisa-group for both ground- and excited-state SCF-based BOMD simulations in a polarizable multiscale framework [16]. While G-Ext is very effective, indeed outperforming XLBO in terms of the average number of SCF iterations required to achieve convergence along a MD trajectory [25], numerical experiments have shown that the extrapolation introduces a bias causing a drift in the total energy for NVE simulations. Such an energy drift is modest (few kcal/mol in 10 ps), but non negligible, at least for not-too-tightly-converged calculations [25]. In this contribution, we address such a limitation by introducing a new strategy to perform the extrapolation, that we name Quasi Time-Reversible Grassmann extrapolation method (QTR G-Ext). This approach leverages the principles of differential geometry, similarly to the previous method, but offers enhanced accuracy and speed in extrapolating the density matrix during a BOMD simulation, as well as excellent energy conservation properties.

Given a 𝒩-dimensional basis, the SCF solves the following nonlinear eigenvalue problem which consists to find a matrix C and a diagonal matrix E such that

{F(D)C=SCECTSC=IND=CCT,

where C𝒩×N contains the 𝒩 coefficients of the N occupied molecular orbitals, D𝒩×𝒩 is the density matrix, EN×N is a diagonal matrix which entries are the energy levels, F denotes the DFT operator, S𝒩×𝒩 is the overlap matrix, and IN denotes the identity matrix of order N.

We assume that the density matrix is orthogonal. In any case, it can be transformed into such matrix by considering the Löwdin factorization of the overlap matrix S and consequently the modified coefficient matrix C~=S1/2C. Then the normalized density matrix D~=C~C~T=S1/2DS1/2 belongs to the manifold

𝒢r(N,𝒩)={D𝒩×𝒩|D2=D=DT,Tr(D)=N},

which is isomorphic to the so-called “Grassmann manifold”, therefore we identify 𝒢r by this name. From now on, we assume that the density matrix has been orthonormalized and we denote it by D.

Since 𝒢r is a differential manifold, given a point D0𝒢r, there exists a tangent space 𝒯D0𝒩×N, such that tangent vectors Γ(D)𝒯D0 can be associated to nearby points D𝒢r.

In MD, t𝑹(t) represents the trajectory of the nuclei. The transformation of the electronic structure can be interpreted as a trajectory denoted by tD𝑹(t) on the manifold. In order not to burden the notation, we simply indicate D in place of D𝑹(t). The objective is to determine a suitable approximation for the density matrix at the next step of the molecular dynamics trajectory by extrapolating the densities from previous steps. Since the tangent space 𝒯D0 is a vector space, we approximate the density matrix on 𝒯D0. In order to solve the extrapolation problem, we decompose the mapping 𝑹D as a composition of several maps

3M 𝒟𝒯D0𝒢r (1)
𝑹 dΓD,

where the first function 𝑹d is a map from atomic positions to molecular descriptors. Here, as a descriptor, we use the Coulomb matrix [30] dNQM×NQM,

(d)ij={0.5zi2.4i=j,zizj𝑹(ti)𝑹(tj)ij, (2)

where NQM is the number of atoms treated quantum mechanically and zi denotes the nuclear charge of the ith atom. Note that other descriptors can also be considered. We will detail the crucial mapping dΓ below. The mapping ΓExp(Γ)=D is the so-called Grassmann exponential which maps tangent vectors on 𝒯D0 to 𝒢r, and it is a locally bijective function in a neighborhood of D0. Its inverse DLog(D)=Γ(D) is the Grassmann logarithm. These mappings are computed by means of the singular value decomposition (SVD). For mathematical details, the interested reader is referred to [24, 35, 6]. In our method, during the MD, we use a fixed reference point D0 to construct the tangent space 𝒯D0.

Let n be the current time step of the MD. Given previous q snapshots Γni=Log(Dni), for i=1,,q, the approximation of the density matrix on the tangent space is written as

Γ~n=Γnq+i=1q~αi(Γni+Γnq+i), (3)

where q~=q/2 if q is even, while q~=(q1)/2 if q is odd. We remark that if in Eq. (3), the term Γnq is substituted by Γ~nq, a “fully” time-reversible approach (instead of quasi time-reversible) is obtained. Numerical experiments with the fully time-reversible approach, that are reported in the Supporting Information (SI), showed good behavior for total energy conservation, but unfortunately a strong increase in the number of performed SCF iterations.

The descriptors are involved in the computation of the coefficients 𝜶=[α1,,αq~]T appearing in Eq. (3). Indeed, they are computed by solving the least-squares problem with Tikhonov regularization

min𝜶q~{dn+dnqi=1q~αi(dni+dnq+i)2+ε2𝜶2},

where denotes the 2-norm and ε>0 is the regularization parameter. Since the Coulomb matrix (2) is symmetric, in the above formula dj represents the vectorized Coulomb matrix considering the lower triangle. In matrix form, it corresponds to solving the following least-squares problem

min𝜶q~[d^𝟎][D^εIq~]𝜶2,

where the vector d^=dn+dnq is padded with q~ zeroes, D^Nd×q~ is the matrix which columns are defined as D^,i=dni+dnq+i, and Iq~ is the identity matrix of order q~. Then the initial guess for the density matrix is obtained as the composition of the three maps in (1), where the second map dΓ is given by (3). Note that if this second map denoted by f was linear, then the guess would be close to exact, namely

Γn =f(dn)f(dnq+i=1q~αi(dni+dnq+i))
=f(dnq)+i=1q~αi[f(dni)+f(dnq+i)]
=Γnq+i=1q~αi(Γni+Γnq+i)=Γ~n.

The number q of density matrices taken at previous steps and the value of the regularization parameter ε are chosen in a heuristic manner: we computed the error ΓnΓ~n for different values of q and ε, specifically q=3,4,,20 and ε=0.001,0.002,0.005,0.01,0.02,0.05, and we selected the combination (q,ε) corresponding to the minimal error. When the SCF convergence threshold is 105, we found that good values are q=5 and ε=0.005, while if it is fixed to 107, we found q=4 and ε=0.001,0.002. Additional details on the selection of q and ε values can be found in Section S1 of the SI.

The QTR G-Ext approach is tested on four different systems. The first system is dimethylaminobenzonitrile (DMABN) in methanol. The second system is 3-hydroxyflavone (3HF) in acetonitrile. The last two systems (OCP and AppA) are chromophores embedded in a biological matrix-namely, a carotenoid in the orange carotenoid protein (OCP) and a flavin in the AppA Blue-Light Using Flavin photoreceptor [5, 4, 9]. Some information on the systems is reported in Table 1.

Table 1: Summary of systems’ size: number of QM atoms NQM, number of MM atoms NMM, number of QM basis functions 𝒩, number of occupied orbitals N, and size of descriptors Nd.
System NQM NMM 𝒩 N Nd
DMABN 21 6843 185 39 234
3HF 28 15046 290 62 409
AppA 31 16449 309 67 468
OCP 94 6058 734 154 4468

KS-DFT has been adopted to describe the QM subsystem, with the B3LYP hybrid functional [3] and the 6-31G(d) Pople’s basis set [10]. This is coupled with a polarizable description of the environment, using the AMOEBA forcefield [26]. For each system, we performed a QM/AMOEBA geometry optimization until a root-mean-square norm on the forces of 4 kcal/mol/Å is found and finally a 2 ps QM/AMOEBA NVT equilibration to obtain the starting point of the simulations presented in this work.

All simulations have been performed using the Gaussian-Tinker interface [23, 14, 15, 8]. We implemented the QTR G-Ext extrapolation approach in Tinker [28, 13].

To assess the quality of the guess density obtained by the QTR G-Ext extrapolation, we performed 10 ps BOMD simulations, with 0.5 fs time step, in the NVE ensemble, using the velocity Verlet integrator [33]. All systems were tested with an SCF convergence threshold fixed to 105 and 107 with respect to the RMS variation of density. We compare our approach in terms of energy stability and number of iterations required to reach convergence with other two extrapolation schemes, which are the G-Ext scheme [25]

Γ~n=i=1qαiΓni,q=6,

where the αi are computed by solving

min𝜶q{dni=1qαidni2+ε2𝜶2},ε=0.01,

and XLBO without McWeeny purification [21, 19]

D~n=2D~n1D~n2+κ(Dn1D~n1)+ci=18αiD~ni,

with fixed parameters κ=1.86, c=0.0016, and 𝜶=(36,99,88,11,32,25,8,1).

Refer to caption
Figure 1: Total energy as a function of simulation time for DMABN, using a 105 convergence threshold for the SCF.
Refer to caption
Figure 2: Total energy as a function of simulation time for DMABN, using a convergence threshold for the SCF of 107.

Figure 1 provides the plot of the total energy along the DMABN simulation, with a 105 SCF convergence threshold. Despite the non-fully time-reversible formulation of our newly implemented approach, we observe a great improvement with respect to the G-Ext scheme. In particular, the QTR G-Ext method resembles the fully time-reversible scheme XLBO. The same behaviour is almost imperceptible when the SCF convergence is set to 107 (Figure 2), since the accumulation of errors that generates the energy drift when G-Ext is used is lower, so we can appreciate the same trend with all the extrapolation schemes. Analogous figures are reported in Section S2 of the SI for all tested systems. To better evaluate the energy stability, we consider the average short-time fluctuation (STF) of the energy, which is computed by getting the RMS of the energy fluctuation every 50 fs and averaging over the trajectory, and the long-time drift (LTD) for a long-time analysis, that is the slope of the linear regression line of the energy. Tables 2 and 3 disclose STF and LTD for convergence thresholds 105 and 107, respectively. QTR G-Ext, G-Ext, and XLBO show comparable STF, which is specific for the system and is related to the time step for the integration. On the other hand, the absolute value of LTD is in general higher for 105 simulations, in particular for G-Ext. We can state that the QTR G-Ext method solves the energy-drift issue of G-Ext, showing an LTD that is always similar to the XLBO one, suggesting again a good time-reversible behaviour.

Table 2: Short- and Long-Time Stability Analysis of the QTR G-Ext, G-Ext, and XLBO methods. SCF convergence threshold 105.
DMABN 3HF AppA OCP
STF LTD STF LTD STF LTD STF LTD
QTR G-Ext 0.33 -0.01 0.62 -0.40 0.57 -0.08 0.36 -0.23
G-Ext 0.35 -0.43 0.61 -0.94 0.56 -0.93 0.38 -1.38
XLBO 0.32 0.01 0.57 -0.42 0.59 0.14 0.39 -0.28
Table 3: Short- and Long-Time Stability Analysis of the QTR G-Ext, G-Ext, and XLBO methods. SCF convergence threshold 107.
DMABN 3HF AppA OCP
STF LTD STF LTD STF LTD STF LTD
QTR G-Ext 0.37 0.01 0.59 -0.30 0.53 0.18 0.38 -0.16
G-Ext 0.33 0.04 0.60 -0.27 0.54 0.06 0.38 -0.20
XLBO 0.32 0.13 0.64 -0.37 0.56 0.06 0.38 -0.08

The gain of our new methodology is not only in terms of accuracy (energy stability), but also in terms of the computational time of the simulation. Tables 4 and 5 report the average number of SCF iterations required to achieve convergence, as well as the standard deviation for 105 and 107 SCF thresholds, respectively. We remark that each strategy requires q previous density matrices, before having them available a standard SCF is performed. Therefore, for the computation of average and standard deviation, we discard the first q points. The two tables show that for all the tested systems, the QTR G-Ext method requires the lowest number of SCF iterations, for both convergence thresholds. Moving averages of SCF iteration numbers during the simulations for all systems and with both SCF convergence thresholds are reported in Section S3 of the SI.

Table 4: Performance of the QTR G-Ext method compared with the G-Ext method and XLBO algorithm. Average k¯ and standard deviation σ of SCF iterations. Convergence threshold 105.
DMABN 3HF AppA OCP
k¯ σ k¯ σ k¯ σ k¯ σ
QTR G-Ext 3.04 0.22 2.98 0.21 3.00 0.02 2.96 0.31
G-Ext 3.55 0.85 3.16 0.69 3.03 0.54 2.91 0.41
XLBO 4.00 0.05 4.00 0.00 4.00 0.07 4.00 0.01
Table 5: Performance of the QTR G-Ext method compared with the G-Ext method and XLBO algorithm. Average k¯ and standard deviation σ of SCF iterations. Convergence threshold 107.
DMABN 3HF AppA OCP
k¯ σ k¯ σ k¯ σ k¯ σ
QTR G-Ext 5.42 0.69 5.42 0.80 5.37 0.84 4.86 0.83
G-Ext 7.33 0.63 6.96 0.79 6.56 0.75 5.83 0.87
XLBO 7.51 0.65 7.45 0.65 7.43 0.80 7.21 0.75

In conclusion, we presented the novel Quasi Time-Reversible Grassmann extrapolation scheme aimed at preserving the energy conservation of Newton’s equations and, at the same time, at keeping low the number of SCF iterations. This scheme is based on the same properties of differential geometry of our previous extrapolation approach, ensuring that our guess density matrices retain all the mathematical properties of a density matrix. The innovation of this contribution lies in the symmetric combination of vectors in the tangent space, which proved to effectively preserve the stability of the total energy during the simulation. To validate its effectiveness, we conducted tests on systems of different sizes, and we obtained excellent results for all of them.

Acknowledgments

This work was supported by the Italian Ministry of University and Research under grant 2020HTSXMA_002 (PSI-MOVIE) and by the French ‘Investissements d’Avenir’ program, project Agence Nationale de la Recherche (ISITE-BFC) (contract ANR-15-IDEX-0003). ÉP also acknowledges support from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No 810367–project EMC2) as well as from the Simons Targeted Grant Award No. 896630. BS acknowledges funding by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2075 – 390740016. FP is member of the GNCS group of INdAM.

References

Determination of optimal q and ε values

The parameters q and ε of the QTR G-Ext method were found by computing the error ΓnΓ~n and averaging it over the whole simulation for different values of q and ε, specifically q=3,4,,20 and ε = 0.001, 0.002, 0.005, 0.01, 0.02, 0.05, and we selected the combination (q,ε) corresponding to the minimal error.

Table 6: Optimal q ed ε for each system and SCF tolerance.
DMABN 3HF AppA OCP
SCF tolerance 105 107 105 107 105 107 105 107
q 5 4 5 4 5 4 5 4
ε 0.005 0.001 0.005 0.001 0.005 0.002 0.005 0.002
Refer to captionRefer to caption
Refer to captionRefer to caption
Figure 3: Error for DMABN (top) and for 3HF (bottom), with SCF convergence threshold of 105 (left panel) and 107 (right panel).
Refer to captionRefer to caption
Refer to captionRefer to caption
Figure 4: Error for AppA (top) and for OCP (bottom), with SCF convergence threshold of 105 (left panel) and 107 (right panel).

Supplementary figures: Energy stability and number of SCF iterations

In the following, we report the total energy profile and number of SCF iterations per step for all the simulations performed in this work using the XLBO, G-Ext, QTR, and TR schemes. The time-reversible scheme is obtained by computing the guess density as follows:

Γ~n=Γ~nq+i=1q~αi(Γni+Γnq+i), (4)

Eq. 4 is manifestly symmetric, and thus fully time reversible. As mentioned in the main text, the fully TR scheme exhibits excellent stability, but poor performance, as the number of SCF iterations tends to quickly increase along the simulation, to the point that the extrapolation is not anymore beneficial.

DMABN

Refer to captionRefer to caption
Refer to captionRefer to caption
Figure 5: Comparison of total energies (left) and SCF iterations (right) along the simulation on DMABN with four extrapolation approaches: Quasi Time-Reversible (QTR G-Ext, blue line), Fully Time-Reversible (TR, green line), Grassmann Extrapolation (G-Ext, red line), and Extended Langrangian (XLBO, yellow line). For energy plots, soft colors are used to plot the actual energy values, whereas the marked line is the moving-average of the energy every 100 steps. The SCF iteration plots represent the moving-average of the number of iterations every 100 steps. SCF convergence threshold: 105 (top panel), 107 (bottom panel).

3HF

Refer to captionRefer to caption
Refer to captionRefer to caption
Figure 6: Comparison of total energies (left) and SCF iterations (right) along the simulation on 3HF with four extrapolation approaches: Quasi Time-Reversible (QTR G-Ext, blue line), Fully Time-Reversible (TR, green line), Grassmann Extrapolation (G-Ext, red line), and Extended Langrangian (XLBO, yellow line). For energy plots, soft colors are used to plot the actual energy values, whereas the marked line is the moving-average of the energy every 100 steps. The SCF iteration plots represent the moving-average of the number of iterations every 100 steps. SCF convergence threshold: 105 (top panel), 107 (bottom panel).

AppA

Refer to captionRefer to caption
Refer to captionRefer to caption
Figure 7: Comparison of total energies (left) and SCF iterations (right) along the simulation on AppA with four extrapolation approaches: Quasi Time-Reversible (QTR G-Ext, blue line), Fully Time-Reversible (TR, green line), Grassmann Extrapolation (G-Ext, red line), and Extended Langrangian (XLBO, yellow line). For energy plots, soft colors are used to plot the actual energy values, whereas the marked line is the moving-average of the energy every 100 steps. The SCF iteration plots represent the moving-average of the number of iterations every 100 steps. SCF convergence threshold: 105 (top panel), 107 (bottom panel).

OCP

Refer to captionRefer to caption
Refer to captionRefer to caption
Figure 8: Comparison of total energies (left) and SCF iterations (right) along the simulation on OCP with four extrapolation approaches: Quasi Time-Reversible (QTR G-Ext, blue line), Fully Time-Reversible (TR, green line), Grassmann Extrapolation (G-Ext, red line), and Extended Langrangian (XLBO, yellow line). For energy plots, soft colors are used to plot the actual energy values, whereas the marked line is the moving-average of the energy every 100 steps. The SCF iteration plots represent the moving-average of the number of iterations every 100 steps. SCF convergence threshold: 105 (top panel), 107 (bottom panel).