microsoft/skala-1.1

@@ -48,9 +48,9 @@ interactions, distorted equilibrium geometries, and elementary
 reactions, as well as a small amount of publicly available high-accuracy
 data.
-We demonstrate departure from the historical trade-off between accuracy
-and efficiency is enabled by learning non-local representations of
-electronic structure directly from data, bypassing the need for
 increasingly costly hand-engineered features. The Skala-1.1 functional
 surpasses state-of-the-art hybrid functionals in accuracy across the
 main-group chemistry benchmark set GMTKN55, which covers general
@@ -158,45 +158,45 @@ The following data is included in our training set:
     Collection](https://arxiv.org/abs/2506.14492v5) (MSR-ACC).
     Additionally the MSR-ACC subsets for larger TAEs (up to 9
     non-hydrogen atoms), conformers, ionization potentials, electron
-    affinities, proton affinities, reaction paths and distorted
-    equilibrium structures were included. The labels for this data sets
     are obtained with the W1w method and are part of the currently
     unpublished subsets of the MSR-ACC.
 **Atomic Data**
-:   Total energies, electron affinities and ionization potentials (up to
-    triple ionization) for atoms, from H to Ar (excluding Li and Be
-    because of basis set constraints).This data was produced in-house
-    with CCSD(T) by extrapolating to the complete basis set limit from
-    quadruple zeta (QZ) and pentuple zeta (5Z) basis set
-    calculations.The basis sets used for H and He were aug-cc-pV(Q+d)Z,
-    aug-cc-pV(5+d), while for the remaining elements B-Ar the basis sets
-    used were aug-cc-pCVQZ and aug-cc-pCV5Z. All basis sets were obtained
-    from the [Basis Set Exchange
-    (BSE)](https://www.basissetexchange.org/). Extrapolation of the
-    correlation energy was performed by fitting a simple Z\^(-3)
-    expression, while extrapolation of the Hartree-Fock energy was
-    performed using a two-point extrapolation.
 **Transition metal properties**
 :   Additional data for transition metal atoms and dimers, including
-    ionization potentials, spin splittings and dissociation energies.
     The reference energies were obtained from literature.
 **NCI-Atlas**
 :   Five datasets from the [NCI-Atlas collection of non-covalent
     interactions](http://www.nciatlas.org/):
-  - [D442x10](http://www.nciatlas.org/D442x10.html), dissociation
-    curves for dispersion bound van-der-Waals complexes
-  - [SH250x10](http://www.nciatlas.org/SH250.html), dissociation
-    curves for sigma-hole bound van-der-Waals complexes
-  - [R739x5](http://www.nciatlas.org/R739.html), compressed
-    van-der-Waals complexes
-  - [HB300SPXx10](http://www.nciatlas.org/HB300SPX.html), dissociation
-    curves for hydrogen bound van-der-Waals complexes
-  - [IHB100x10](http://www.nciatlas.org/IHB100.html), dissociation
-    curves for ionic hydrogen bound van-der-Waals complexes
 **GDB9**
 :   The graph data base with up to non-hydrogen atoms computed at
@@ -204,7 +204,7 @@ The following data is included in our training set:
 **BH9**
 :   Reactions and barrier heights from [Prasad et. al 2021][prasad2021]
-    The data set was filted for systems with up to ten
     non-hydrogen atoms.
 **NCIBLIND**
@@ -226,10 +226,10 @@ The following data is included in our training set:
 :   Containing atomization energies of carbon
     clusters from [Karton et al. 2009][karton2009].
-For all training data we have created input density and derived meta-GGA
-features using density matrices of converged SCF calculations with the
-B3LYP functional (def2-QZVP and ma-def2-QZVP basis set) using a modified
-version of the PySCF software package.
 ### Training procedure
@@ -237,42 +237,41 @@ version of the PySCF software package.
 The training datapoints are preprocessed as follows.
-- For each molecule the density and derived meta-GGA features are
-  computed based on the density matrix of converged SCF calculations
-  with the B3LYP functional using a def2-QZVP or ma-def2-QZVP basis set
-  using a modified version of the PySCF software package.
-- Density fitting was not applied for the SCF calculation.
-- The density features were evaluated on an atom centered integration
   grid of level 1.
-- The radial integral was performed with the Treutler-Ahlrichs,
-  Gauss-Chebychev, Delley, or Mura-Knowles based on Bragg atomic radii
-  with Treutler based radii adjustment.
 - The space-partitioning was performed with Becke partition and
   Treutler-Ahlrichs radii adjustment, Stratmann-Scuseria-Frisch (SSF)
   partition scheme, and Laqua-Kussmann-Ochsenfeld (LKO) partition
   scheme.
 - The angular grid points were pruned using the NWChem scheme.
-- No density based cutoff was applied and all grid points were retained
-  for training.
 #### Training hyperparameters
 The training hyperparameter settings are detailed in the supplementary
-of [Accurate and scalable exchange-correlation with deep learning, Luise
-et al. 2025](https://arxiv.org/abs/2506.14492v6). This repository only
-includes the code to evaluate the checkpoints provided, not the training
-code.
 #### Speeds, sizes, times
-The training of our functional using the training dataset as detailed in
-the section \"Training data\" took approximately 48h for 1M training
-steps on a [ND A100 v4 series
 VM](https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/gpu-accelerated/ndasra100v4-series?tabs=sizebasic)
-with 8 NVIDIA A100 GPU with 80 GB memory each, 96 CPU cores, 880 GB RAM,
-and a 6 TB disk.
-The model checkpoints have ~385,217 trainable parameters.
 ## Evaluation
@@ -290,33 +289,31 @@ We have evaluated our functional on several different benchmark sets:
     CuAgAu83 from [Chan 2019][chan2019],
     DAPd from [Author et. al 2020][dapd2020],
     3d4dIPSS, TMB11, and TMD10 from [Liang et. al 2025][liang2025]
-3.  GMTKN55. A diverse and highly accurate dataset of general main group
-    thermochemistry, kinetics and noncovalent
     interactions from [Goerigk et. al 2017][goerigk2017]
 4.  Geometry optimization datasets: (a) CCse21, equilibrium structures,
-    bond lengths and bond angles from [Piccardo et. al 2015][piccardo2015]
-    (b) HMGB11, equilibrium structures, bond
-    lengths from [Grimme et. al 2015][grimme2015]
-    (c) LMGB35, equilibrium structures, bond lengths,
-    and from [Grimme et. al 2015][grimme2015]
     (d) W4-11-GEOM, equilibrium structures, bond
-    lengths and bond angles.
-5.  The Dipole benchmark dataset from [Hait et al. 2018][hait2018]
-6.  Conformer search benchmark dataset of 22 molecules spanning
-    molecular size from 24 to 176 atoms for cost scaling from
     [Grimme et al. 2019][grimme2019]
-The evaluation of our model using the 5 different types of benchmarks as
-defined above serve to measure different performance aspects of our
-functional. For example, 1 and 2 focus on the accuracy of predicted
-reaction energies, and 3 focuses on the ability of our functional to
-perform geometry optimization and to converge to the right equilibrium
-molecular structure. Furthermore, 4 measures the dipole moment of the
-molecules in the benchmark set, which provides a measure for the quality
-of the self-consistent electron density that a converged SCF procedure
-produces with our model. Finally, 5 determines the speed of employing
-SCF with our model and compares its scaling behavior with respect to
-system size with the scaling of traditional functionals.
 The metrics for the different benchmark sets are:
@@ -327,13 +324,13 @@ The metrics for the different benchmark sets are:
     of reaction r as calculated by a high-accuracy method from the W4
     family (CCSDT(Q)/CBS to CCSDTQ56/CBS), and $\Delta E_r^\theta$ is
     the prediction of the reaction energy difference using SCF
-    calculations with our functional, and
 2.  Weighted total mean absolute deviations 2 (WTMAD-2) in kcal/mol for
     the GMTKN55 benchmark set
     $\text{WTMAD-2} = \frac1{\sum^{55}_{i=1} N_i} \sum_{i=1}^{55} N_i \frac{56.84\text{ kcal/mol}}{\overline{|\Delta E|}_i} \text{MAE}_i$
     Here $N_i$ is the number of reactions in subset *i*,
     $\overline{|\Delta E|}_i$ is the average energy difference in subset
-    *i* in kcal/mol and $\text{MAE}_i$ is the mean absolute error in
     kcal/mol for subset *i*.
 3.  For the geometry benchmark sets that report bond lengths, we measure
     the absolute error in bond lengths in Angstrom, averaged over the
@@ -341,22 +338,19 @@ The metrics for the different benchmark sets are:
     dataset. For the benchmark that also contains bond angles, we report
     the absolute error of the angles, averaged over the number of bonds
     and equilibrium structures in the dataset.
-4.  We follow the metrics defined in `hait2018`{.interpreted-text
-    role="footcite"}. We measure the Root Mean Squared Error (RMSE) of
-    the dipole moment with respect to reference values provided by the
-    benchmark dataset. For those molecules (indexed with *i*) for which
-    only the reference magnitude of the dipole moment
-    $\mu_i^{\text{ref}} = |{\vec\mu}_i^{\text{ref}}|$ is provided, we
-    measure the RMSE of the predicted magnitude of the dipole moment
-    $\mu_i^{\theta} = |{\vec\mu}_i^{\theta}|$ is available, the error is
-    defined as
-    $\text{Error}_i = \frac{\mu_i^\theta - \mu_i^\text{ref}}{\max(\mu_i^\text{ref}, 1D)} \times 100\%$.
-    Here *D* denotes the unit of Debye. For those molecules for which
-    the reference value of the dipole vector $\vec{\mu}_i^\text{ref}$ is
-    also available we instead compute
-    $\text{Error}_i = \frac{|\vec{\mu}_i^\theta - \vec{\mu}_i^\text{ref}|}{max(\mu_i^\text{ref}, 1D)} \times 100\%$.
-    Using these errors we compute the RMSE as follows:
-    $\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^N \text{Error}_i^2}$
 5.  We fit a power law of the form
     $C(M) = \left(\frac{n(M)}{A}\right)^k$ to the 22 data points of the
     test set where *C(M)* and *n(M)* are the computational cost and
@@ -366,50 +360,32 @@ The metrics for the different benchmark sets are:
 ### Evaluation results
-We demonstrate that the combination of a large-scale high-accuracy
-dataset combined with our deep learning architecture produces the Skala
-functional that predicts atomization energies at chemical accuracy (1
-kcal/mol), as measured on the public benchmark set W4-17. On the public
-benchmark set GMTKN55, which covers general-main group thermochemistry,
-kinetics and noncovalent interactions, our model makes predictions
-around 2.72 kcal/mol. This accuracy is better than state-of-the-art
 range-separated hybrid functionals while only requiring runtimes typical
 of semi-local DFT.
-On the geometry optimization benchmarks we demonstrate that we can
-converge to the reference equilibrium structure with an error that is
-comparable to a range-separated hybrid functional. On the dipole
-prediction benchmark test we demonstrate that the error of our dipole
-moment prediction with respect to reference values is better than
 state-of-the-art range-separated hybrid functionals.
-Finally, our scaling results demonstrate that our functional shows the
-asymptotic scaling behavior of a metaGGA functional, with an approximate
-prefactor of 3 compared to the r2SCAN.
 ## License
-> MIT License
->
-> Copyright (c) Microsoft Corporation.
->
-> Permission is hereby granted, free of charge, to any person obtaining a copy
-> of this software and associated documentation files (the "Software"), to deal
-> in the Software without restriction, including without limitation the rights
-> to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-> copies of the Software, and to permit persons to whom the Software is
-> furnished to do so, subject to the following conditions:
->
-> The above copyright notice and this permission notice shall be included in all
-> copies or substantial portions of the Software.
->
-> THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-> IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-> FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-> AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-> LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-> OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-> SOFTWARE.
 ## Citation
@@ -421,7 +397,7 @@ version number as follows:
 ``` bibtex
 @misc{luise2025,
    title={Accurate and scalable exchange-correlation with deep learning},
-   author={Giulia Luise and Chin-Wei Huang and Thijs Vogels and Derk P. Kooi and Sebastian Ehlert and Stephanie Lanius and Klaas J. H. Giesbertz and Amir Karton and Deniz Gunceler and Megan Stanley and Wessel P. Bruinsma and Lin Huang and Xinran Wei and José Garrido Torres and Abylay Katbashev and Rodrigo Chavez Zavaleta and Bálint Máté and Sékou-Oumar Kaba and Roberto Sordillo and Yingrong Chen and David B. Williams-Young and Christopher M. Bishop and Jan Hermann and Rianne van den Berg and Paola Gori-Giorgi},
    year={2025},
    eprint={2506.14665},
    archivePrefix={arXiv},

 reactions, as well as a small amount of publicly available high-accuracy
 data.
+We demonstrate that departure from the historical trade-off between
+accuracy and efficiency is enabled by learning non-local representations
+of electronic structure directly from data, bypassing the need for
 increasingly costly hand-engineered features. The Skala-1.1 functional
 surpasses state-of-the-art hybrid functionals in accuracy across the
 main-group chemistry benchmark set GMTKN55, which covers general
     Collection](https://arxiv.org/abs/2506.14492v5) (MSR-ACC).
     Additionally the MSR-ACC subsets for larger TAEs (up to 9
     non-hydrogen atoms), conformers, ionization potentials, electron
+    affinities, proton affinities, reaction paths, and distorted
+    equilibrium structures were included. The labels for these data sets
     are obtained with the W1w method and are part of the currently
     unpublished subsets of the MSR-ACC.
 **Atomic Data**
+:   Total energies, electron affinities, and ionization potentials (up
+    to triple ionization) for atoms, from H to Ar (excluding Li and Be
+    due to basis-set constraints). This data was produced in-house with
+    CCSD(T) by extrapolating to the complete basis set limit from
+    quadruple zeta (QZ) and pentuple zeta (5Z) calculations. The basis
+    sets used for H and He were aug-cc-pV(Q+d)Z and aug-cc-pV(5+d)Z,
+    while for the remaining elements B--Ar the basis sets were
+    aug-cc-pCVQZ and aug-cc-pCV5Z. All basis sets were obtained from the
+    [Basis Set Exchange (BSE)](https://www.basissetexchange.org/).
+    Extrapolation of the correlation energy was performed by fitting a
+    $Z^{-3}$ expression, while the Hartree--Fock energy was extrapolated
+    using the two-point scheme of [Karton 2006][karton2006].
 **Transition metal properties**
 :   Additional data for transition metal atoms and dimers, including
+    ionization potentials, spin splittings, and dissociation energies.
     The reference energies were obtained from literature.
 **NCI-Atlas**
 :   Five datasets from the [NCI-Atlas collection of non-covalent
     interactions](http://www.nciatlas.org/):
+    - [D442x10](http://www.nciatlas.org/D442x10.html), dissociation
+      curves for dispersion-bound van der Waals complexes
+    - [SH250x10](http://www.nciatlas.org/SH250.html), dissociation
+      curves for sigma-hole-bound van der Waals complexes
+    - [R739x5](http://www.nciatlas.org/R739.html), compressed van der
+      Waals complexes
+    - [HB300SPXx10](http://www.nciatlas.org/HB300SPX.html), dissociation
+      curves for hydrogen-bound van der Waals complexes
+    - [IHB100x10](http://www.nciatlas.org/IHB100.html), dissociation
+      curves for ionic hydrogen-bound van der Waals complexes
 **GDB9**
 :   The graph data base with up to non-hydrogen atoms computed at
 **BH9**
 :   Reactions and barrier heights from [Prasad et al. 2021][prasad2021].
+    The data set was filtered for systems with up to ten
     non-hydrogen atoms.
 **NCIBLIND**
 :   Containing atomization energies of carbon
     clusters from [Karton et al. 2009][karton2009].
+For all training data, input density and derived meta-GGA features were
+computed from density matrices of converged B3LYP SCF calculations
+(def2-QZVP and ma-def2-QZVP basis sets) using a modified version of
+PySCF.
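
The complete-basis-set extrapolation of the correlation energy described under **Atomic Data** can be sketched as below. This is a minimal illustration, not the in-house workflow: it assumes that "Z" in the $Z^{-3}$ expression denotes the basis-set cardinal number (4 for QZ, 5 for 5Z) and that the fit has the form $E(n) = E_\text{CBS} + A\,n^{-3}$, which two data points determine exactly. The function name and arguments are hypothetical.

```python
def extrapolate_inverse_cubic(e_small, e_large, n_small=4, n_large=5):
    """Return E_CBS from a two-point fit of E(n) = E_CBS + A * n**-3.

    e_small, e_large: correlation energies computed with the smaller and
    larger basis set (cardinal numbers n_small and n_large).
    Assumption: the cardinal number is the fit variable.
    """
    w_small = n_small ** 3
    w_large = n_large ** 3
    # Eliminating A from the two equations gives the closed-form limit.
    return (w_large * e_large - w_small * e_small) / (w_large - w_small)


if __name__ == "__main__":
    # Synthetic energies (hartree) generated from a known limit, for illustration only.
    limit, slope = -0.3, 0.5
    e_qz = limit + slope / 4 ** 3
    e_5z = limit + slope / 5 ** 3
    print(extrapolate_inverse_cubic(e_qz, e_5z))
```

The Hartree-Fock energy is extrapolated separately in the text and is not covered by this sketch.
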
 ### Training procedure
 The training datapoints are preprocessed as follows.
+- For each molecule, the density and derived meta-GGA features are
+  computed from the density matrix of a converged B3LYP SCF calculation
+  using a def2-QZVP or ma-def2-QZVP basis set in a modified version of
+  PySCF.
+- Density fitting was not applied.
+- The density features were evaluated on an atom-centered integration
   grid of level 1.
+- The radial quadrature was performed with Treutler-Ahlrichs,
+  Gauss-Chebyshev, Delley, or Mura-Knowles schemes based on Bragg atomic
+  radii with Treutler-based radii adjustment.
 - The space-partitioning was performed with Becke partition and
   Treutler-Ahlrichs radii adjustment, Stratmann-Scuseria-Frisch (SSF)
   partition scheme, and Laqua-Kussmann-Ochsenfeld (LKO) partition
   scheme.
 - The angular grid points were pruned using the NWChem scheme.
+- No density-based cutoff was applied; all grid points were retained for
+  training.
 #### Training hyperparameters
 The training hyperparameter settings are detailed in the supplementary
+material of [Accurate and scalable exchange-correlation with deep
+learning, Luise et al. 2025](https://arxiv.org/abs/2506.14665). This
+repository only includes the code to evaluate the provided checkpoints,
+not the training code.
 #### Speeds, sizes, times
+The training of the functional on the dataset described above took
+approximately 48 hours for 1M steps on an [ND A100 v4 series
 VM](https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/gpu-accelerated/ndasra100v4-series?tabs=sizebasic)
+with 8 NVIDIA A100 GPUs (80 GB each), 96 CPU cores, 880 GB RAM, and a 6
+TB disk.
+The model checkpoints have ~385k trainable parameters.
 ## Evaluation
     CuAgAu83 from [Chan 2019][chan2019],
     DAPd from [Author et al. 2020][dapd2020],
     3d4dIPSS, TMB11, and TMD10 from [Liang et al. 2025][liang2025]
+3.  GMTKN55. A diverse and highly accurate dataset of general main-group
+    thermochemistry, kinetics, and noncovalent
     interactions from [Goerigk et al. 2017][goerigk2017]
 4.  Geometry optimization datasets: (a) CCse21, equilibrium structures,
+    bond lengths, and bond angles from [Piccardo et al. 2015][piccardo2015];
+    (b) HMGB11, equilibrium structures and bond
+    lengths from [Grimme et al. 2015][grimme2015];
+    (c) LMGB35, equilibrium structures and bond lengths
+    from [Grimme et al. 2015][grimme2015]; and
     (d) W4-11-GEOM, equilibrium structures, bond
+    lengths, and bond angles.
+5.  The dipole benchmark dataset from [Hait et al. 2018][hait2018]
+6.  Conformer search benchmark dataset of 22 molecules spanning 24 to
+    176 atoms, used for cost-scaling analysis, from
     [Grimme et al. 2019][grimme2019]
+These six benchmark types serve to measure different performance aspects
+of the functional. Benchmarks 1 and 2 focus on the accuracy of predicted
+reaction energies. Benchmark 3 evaluates general main-group
+thermochemistry, kinetics, and noncovalent interactions. Benchmark 4
+evaluates geometry optimization and convergence to reference equilibrium
+structures. Benchmark 5 measures dipole moments, providing a proxy for
+the quality of the self-consistent electron density produced by the SCF
+procedure. Finally, benchmark 6 assesses computational cost scaling with
+respect to system size.
 The metrics for the different benchmark sets are:
     of reaction r as calculated by a high-accuracy method from the W4
     family (CCSDT(Q)/CBS to CCSDTQ56/CBS), and $\Delta E_r^\theta$ is
     the prediction of the reaction energy difference using SCF
+    calculations with our functional.
 2.  Weighted total mean absolute deviations 2 (WTMAD-2) in kcal/mol for
     the GMTKN55 benchmark set
     $\text{WTMAD-2} = \frac1{\sum^{55}_{i=1} N_i} \sum_{i=1}^{55} N_i \frac{56.84\text{ kcal/mol}}{\overline{|\Delta E|}_i} \text{MAE}_i$
     Here $N_i$ is the number of reactions in subset *i*,
     $\overline{|\Delta E|}_i$ is the average energy difference in subset
+    *i* in kcal/mol, and $\text{MAE}_i$ is the mean absolute error in
     kcal/mol for subset *i*.
 3.  For the geometry benchmark sets that report bond lengths, we measure
     the absolute error in bond lengths in Angstrom, averaged over the
     dataset. For the benchmark that also contains bond angles, we report
     the absolute error of the angles, averaged over the number of bonds
     and equilibrium structures in the dataset.
+4.  For the dipole benchmark, we follow the metrics defined in
+    [Hait et al. 2018][hait2018]. For molecules
+    (indexed by *i*) for which only the reference magnitude of the
+    dipole moment $\mu_i^{\text{ref}} = |{\vec\mu}_i^{\text{ref}}|$ is
+    provided, the error is defined as
+    $\text{Error}_i = \frac{\mu_i^\theta - \mu_i^\text{ref}}{\max(\mu_i^\text{ref}, 1D)} \times 100\%$,
+    where $\mu_i^{\theta} = |{\vec\mu}_i^{\theta}|$ is the predicted
+    magnitude and *D* denotes the unit of Debye. For molecules for which
+    the reference dipole vector $\vec{\mu}_i^\text{ref}$ is also
+    available, we instead compute
+    $\text{Error}_i = \frac{|\vec{\mu}_i^\theta - \vec{\mu}_i^\text{ref}|}{\max(\mu_i^\text{ref}, 1D)} \times 100\%$.
+    The RMSE is then
+    $\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^N \text{Error}_i^2}$.
 5.  We fit a power law of the form
     $C(M) = \left(\frac{n(M)}{A}\right)^k$ to the 22 data points of the
     test set where *C(M)* and *n(M)* are the computational cost and
 ### Evaluation results
+On W4-17, the Skala-1.1 functional predicts atomization energies at
+chemical accuracy (~1 kcal/mol MAE). On GMTKN55, which covers general
+main-group thermochemistry, kinetics, and noncovalent interactions, it
+achieves a WTMAD-2 of 2.72 kcal/mol, surpassing state-of-the-art
 range-separated hybrid functionals while only requiring runtimes typical
 of semi-local DFT.
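
As an illustration of how the WTMAD-2 value quoted above is assembled from per-subset statistics, the formula from the metrics list can be written as a short function. This is a sketch only: the function name and the tuple layout are hypothetical, and the numbers in the usage example are made up rather than taken from GMTKN55.

```python
def wtmad2(subsets):
    """Compute WTMAD-2 in kcal/mol.

    subsets: list of (n_i, mean_abs_energy_i, mae_i) tuples, where n_i is
    the number of reactions in subset i, mean_abs_energy_i is the average
    absolute reference energy of the subset, and mae_i is the mean absolute
    error of the subset (both in kcal/mol).
    """
    total_reactions = sum(n for n, _, _ in subsets)
    weighted_sum = sum(
        n * (56.84 / mean_abs_energy) * mae
        for n, mean_abs_energy, mae in subsets
    )
    return weighted_sum / total_reactions


if __name__ == "__main__":
    # Two made-up subsets, for illustration only.
    example = [(10, 56.84, 1.0), (30, 28.42, 0.5)]
    print(wtmad2(example))
```

Subsets with small average reaction energies receive a larger weight, so an error of a given size counts for more there than in subsets with large energies.
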
+On the geometry optimization benchmarks, the functional converges to
+reference equilibrium structures with errors comparable to a
+range-separated hybrid functional. On the dipole prediction benchmark,
+the error in dipole moment predictions is better than that of
 state-of-the-art range-separated hybrid functionals.
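
The regularized dipole error from the metrics list can be sketched as follows. The function names are hypothetical and the inputs are assumed to be in Debye; the sketch mirrors the two cases in the text (magnitude-only references and full vector references) and the final RMSE over all molecules.

```python
import math


def magnitude_error(mu_pred, mu_ref):
    """Percent error when only the reference magnitude is known (Debye)."""
    return (mu_pred - mu_ref) / max(mu_ref, 1.0) * 100.0


def vector_error(vec_pred, vec_ref):
    """Percent error when the reference dipole vector is known (Debye)."""
    diff_norm = math.sqrt(sum((p - r) ** 2 for p, r in zip(vec_pred, vec_ref)))
    ref_norm = math.sqrt(sum(r ** 2 for r in vec_ref))
    # The denominator is clamped at 1 D so small dipoles do not inflate the error.
    return diff_norm / max(ref_norm, 1.0) * 100.0


def rmse(errors):
    """Root mean squared error over per-molecule percent errors."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


if __name__ == "__main__":
    # Made-up dipoles, for illustration only.
    errors = [
        magnitude_error(2.0, 1.6),
        vector_error((0.0, 0.0, 0.5), (0.0, 0.0, 0.4)),
    ]
    print(rmse(errors))
```
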
+Finally, the scaling results show that the Skala-1.1 functional exhibits
+the asymptotic scaling behavior of a meta-GGA functional, with an
+approximate prefactor of 3 relative to r2SCAN.
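
The power law $C(M) = \left(\frac{n(M)}{A}\right)^k$ used for the cost-scaling metric can be fitted in several ways; the text does not say which procedure was used. The sketch below assumes an ordinary least-squares fit in log-log space, where $\log C = k \log n - k \log A$ is a straight line. The function name is hypothetical and the data in the usage example are synthetic.

```python
import math


def fit_power_law(sizes, costs):
    """Fit C = (n / A)**k by linear least squares on log-transformed data.

    sizes: system sizes n(M); costs: computational costs C(M).
    Returns (A, k). Assumption: a log-log fit, which may differ from the
    fitting procedure actually used for the reported results.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(c) for c in costs]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    k = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - k * mean_x  # equals -k * log(A)
    a = math.exp(-intercept / k)
    return a, k


if __name__ == "__main__":
    # Synthetic costs generated from a known power law, for illustration only.
    sizes = [24, 48, 96, 176]
    costs = [(n / 8.0) ** 2.5 for n in sizes]
    print(fit_power_law(sizes, costs))
```

Comparing the fitted exponent $k$ between two functionals indicates whether they share the same asymptotic scaling, while the ratio of costs at a fixed size reflects the prefactor.
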
 ## License
+:::: dropdown
+MIT License
+::: {.literalinclude lines="3-"}
+../../LICENSE.txt
+:::
+::::
 ## Citation
 ``` bibtex
 @misc{luise2025,
    title={Accurate and scalable exchange-correlation with deep learning},
+   author={Giulia Luise and Chin-Wei Huang and Thijs Vogels and Derk P. Kooi and Sebastian Ehlert and Stephanie Lanius and Klaas J. H. Giesbertz and Amir Karton and Deniz Gunceler and Stefano Battaglia and Gregor N. C. Simm and P. Bernát Szabó and Megan Stanley and Wessel P. Bruinsma and Lin Huang and Xinran Wei and José Garrido Torres and Abylay Katbashev and Rodrigo Chavez Zavaleta and Bálint Máté and Sékou-Oumar Kaba and Roberto Sordillo and Yingrong Chen and David B. Williams-Young and Christopher M. Bishop and Jan Hermann and Rianne van den Berg and Paola Gori-Giorgi},
    year={2025},
    eprint={2506.14665},
    archivePrefix={arXiv},