new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

May 14

Sampling-based sublinear low-rank matrix arithmetic framework for dequantizing quantum machine learning

We present an algorithmic framework for quantum-inspired classical algorithms on close-to-low-rank matrices, generalizing the series of results started by Tang's breakthrough quantum-inspired algorithm for recommendation systems [STOC'19]. Motivated by quantum linear algebra algorithms and the quantum singular value transformation (SVT) framework of Gilyén, Su, Low, and Wiebe [STOC'19], we develop classical algorithms for SVT that run in time independent of input dimension, under suitable quantum-inspired sampling assumptions. Our results give compelling evidence that in the corresponding QRAM data structure input model, quantum SVT does not yield exponential quantum speedups. Since the quantum SVT framework generalizes essentially all known techniques for quantum linear algebra, our results, combined with sampling lemmas from previous work, suffice to generalize all recent results about dequantizing quantum machine learning algorithms. In particular, our classical SVT framework recovers and often improves the dequantization results on recommendation systems, principal component analysis, supervised clustering, support vector machines, low-rank regression, and semidefinite program solving. We also give additional dequantization results on low-rank Hamiltonian simulation and discriminant analysis. Our improvements come from identifying the key feature of the quantum-inspired input model that is at the core of all prior quantum-inspired results: ell^2-norm sampling can approximate matrix products in time independent of their dimension. We reduce all our main results to this fact, making our exposition concise, self-contained, and intuitive.

  • 6 authors
·
Jul 9, 2023

Improving Long-Range Interactions in Graph Neural Simulators via Hamiltonian Dynamics

Learning to simulate complex physical systems from data has emerged as a promising way to overcome the limitations of traditional numerical solvers, which often require prohibitive computational costs for high-fidelity solutions. Recent Graph Neural Simulators (GNSs) accelerate simulations by learning dynamics on graph-structured data, yet often struggle to capture long-range interactions and suffer from error accumulation under autoregressive rollouts. To address these challenges, we propose Information-preserving Graph Neural Simulators (IGNS), a graph-based neural simulator built on the principles of Hamiltonian dynamics. This structure guarantees preservation of information across the graph, while extending to port-Hamiltonian systems allows the model to capture a broader class of dynamics, including non-conservative effects. IGNS further incorporates a warmup phase to initialize global context, geometric encoding to handle irregular meshes, and a multi-step training objective that facilitates PDE matching, where the trajectory produced by integrating the port-Hamiltonian core aligns with the ground-truth trajectory, thereby reducing rollout error. To evaluate these properties systematically, we introduce new benchmarks that target long-range dependencies and challenging external forcing scenarios. Across all tasks, IGNS consistently outperforms state-of-the-art GNSs, achieving higher accuracy and stability under challenging and complex dynamical systems. Our project page: https://thobotics.github.io/neural_pde_matching.

  • 7 authors
·
Nov 11, 2025

Ground State Preparation via Dynamical Cooling

Quantum algorithms for probing ground-state properties of quantum systems require good initial states. Projection-based methods such as eigenvalue filtering rely on inputs that have a significant overlap with the low-energy subspace, which can be challenging for large, strongly-correlated systems. This issue has motivated the study of physically-inspired dynamical approaches such as thermodynamic cooling. In this work, we introduce a ground-state preparation algorithm based on the simulation of quantum dynamics. Our main insight is to transform the Hamiltonian by a shifted sign function via quantum signal processing, effectively mapping eigenvalues into positive and negative subspaces separated by a large gap. This automatically ensures that all states within each subspace conserve energy with respect to the transformed Hamiltonian. Subsequent time-evolution with a perturbed Hamiltonian induces transitions to lower-energy states while preventing unwanted jumps to higher energy states. The approach does not rely on a priori knowledge of energy gaps and requires no additional qubits to model a bath. Furthermore, it makes mathcal{O}(d^{,3/2}/epsilon) queries to the time-evolution operator of the system and mathcal{O}(d^{,3/2}) queries to a block-encoding of the perturbation, for d cooling steps and an epsilon-accurate energy resolution. Our results provide a framework for combining quantum signal processing and Hamiltonian simulation to design heuristic quantum algorithms for ground-state preparation.

  • 4 authors
·
Apr 8, 2024

Generative Quantum-inspired Kolmogorov-Arnold Eigensolver

High-performance computing (HPC) is increasingly important for scalable quantum chemistry workflows that couple classical generative models, quantum circuit simulation, and selected configuration interaction postprocessing. We present the generative quantum-inspired Kolmogorov-Arnold eigensolver (GQKAE), a parameter-efficient extension of the generative quantum eigensolver (GQE) for quantum chemistry. GQKAE replaces the parameter-heavy feed-forward network components in GPT-style generative eigensolvers with hybrid quantum-inspired Kolmogorov-Arnold network modules, forming a compact HQKANsformer backbone. The method preserves autoregressive operator selection and the quantum-selected configuration interaction evaluation pipeline, while using single-qubit DatA Re-Uploading ActivatioN modules to provide expressive nonlinear mappings. Numerical benchmarks on H4, N2, LiH, C2H6, H2O, and the H2O dimer show that GQKAE achieves chemical accuracy comparable to the GPT-based GQE architecture, while reducing trainable parameters and memory by approximately 66% and improving wall-time performance. For strongly correlated systems such as N2 and LiH, GQKAE also improves convergence behavior and final energy errors. These results indicate that quantum-inspired Kolmogorov-Arnold networks can reduce classical-side overhead while preserving circuit-generation quality, offering a scalable route for HPC-quantum co-design on near-term quantum platforms.

  • 12 authors
·
May 5 2

Towards A Universally Transferable Acceleration Method for Density Functional Theory

Recently, sophisticated deep learning-based approaches have been developed for generating efficient initial guesses to accelerate the convergence of density functional theory (DFT) calculations. While the actual initial guesses are often density matrices (DM), quantities that can convert into density matrices also qualify as alternative forms of initial guesses. Hence, existing works mostly rely on the prediction of the Hamiltonian matrix for obtaining high-quality initial guesses. However, the Hamiltonian matrix is both numerically difficult to predict and intrinsically non-transferable, hindering the application of such models in real scenarios. In light of this, we propose a method that constructs DFT initial guesses by predicting the electron density in a compact auxiliary basis representation using E(3)-equivariant neural networks. Trained on small molecules with up to 20 atoms, our model is able to achieve an average 33.3% self-consistent field (SCF) step reduction on systems up to 60 atoms, substantially outperforming Hamiltonian-centric and DM-centric models. Critically, this acceleration remains nearly constant with increasing system sizes and exhibits strong transferring behaviors across orbital basis sets and exchange-correlation (XC) functionals. To the best of our knowledge, this work represents the first and robust candidate for a universally transferable DFT acceleration method. We are also releasing the SCFbench dataset and its accompanying code to facilitate future research in this promising direction.

  • 6 authors
·
Sep 29, 2025

Curriculum reinforcement learning for quantum architecture search under hardware errors

The key challenge in the noisy intermediate-scale quantum era is finding useful circuits compatible with current device limitations. Variational quantum algorithms (VQAs) offer a potential solution by fixing the circuit architecture and optimizing individual gate parameters in an external loop. However, parameter optimization can become intractable, and the overall performance of the algorithm depends heavily on the initially chosen circuit architecture. Several quantum architecture search (QAS) algorithms have been developed to design useful circuit architectures automatically. In the case of parameter optimization alone, noise effects have been observed to dramatically influence the performance of the optimizer and final outcomes, which is a key line of study. However, the effects of noise on the architecture search, which could be just as critical, are poorly understood. This work addresses this gap by introducing a curriculum-based reinforcement learning QAS (CRLQAS) algorithm designed to tackle challenges in realistic VQA deployment. The algorithm incorporates (i) a 3D architecture encoding and restrictions on environment dynamics to explore the search space of possible circuits efficiently, (ii) an episode halting scheme to steer the agent to find shorter circuits, and (iii) a novel variant of simultaneous perturbation stochastic approximation as an optimizer for faster convergence. To facilitate studies, we developed an optimized simulator for our algorithm, significantly improving computational efficiency in simulating noisy quantum circuits by employing the Pauli-transfer matrix formalism in the Pauli-Liouville basis. Numerical experiments focusing on quantum chemistry tasks demonstrate that CRLQAS outperforms existing QAS algorithms across several metrics in both noiseless and noisy environments.

  • 6 authors
·
Feb 5, 2024

Real-Time Krylov Theory for Quantum Computing Algorithms

Quantum computers provide new avenues to access ground and excited state properties of systems otherwise difficult to simulate on classical hardware. New approaches using subspaces generated by real-time evolution have shown efficiency in extracting eigenstate information, but the full capabilities of such approaches are still not understood. In recent work, we developed the variational quantum phase estimation (VQPE) method, a compact and efficient real-time algorithm to extract eigenvalues on quantum hardware. Here we build on that work by theoretically and numerically exploring a generalized Krylov scheme where the Krylov subspace is constructed through a parametrized real-time evolution, which applies to the VQPE algorithm as well as others. We establish an error bound that justifies the fast convergence of our spectral approximation. We also derive how the overlap with high energy eigenstates becomes suppressed from real-time subspace diagonalization and we visualize the process that shows the signature phase cancellations at specific eigenenergies. We investigate various algorithm implementations and consider performance when stochasticity is added to the target Hamiltonian in the form of spectral statistics. To demonstrate the practicality of such real-time evolution, we discuss its application to fundamental problems in quantum computation such as electronic structure predictions for strongly correlated systems.

  • 6 authors
·
Jun 9, 2023

Efficient and Scalable Density Functional Theory Hamiltonian Prediction through Adaptive Sparsity

Hamiltonian matrix prediction is pivotal in computational chemistry, serving as the foundation for determining a wide range of molecular properties. While SE(3) equivariant graph neural networks have achieved remarkable success in this domain, their substantial computational cost--driven by high-order tensor product (TP) operations--restricts their scalability to large molecular systems with extensive basis sets. To address this challenge, we introduce SPHNet, an efficient and scalable equivariant network, that incorporates adaptive SParsity into Hamiltonian prediction. SPHNet employs two innovative sparse gates to selectively constrain non-critical interaction combinations, significantly reducing tensor product computations while maintaining accuracy. To optimize the sparse representation, we develop a Three-phase Sparsity Scheduler, ensuring stable convergence and achieving high performance at sparsity rates of up to 70%. Extensive evaluations on QH9 and PubchemQH datasets demonstrate that SPHNet achieves state-of-the-art accuracy while providing up to a 7x speedup over existing models. Beyond Hamiltonian prediction, the proposed sparsification techniques also hold significant potential for improving the efficiency and scalability of other SE(3) equivariant networks, further broadening their applicability and impact. Our code can be found at https://github.com/microsoft/SPHNet.

  • 10 authors
·
Feb 3, 2025

Tackling the Curse of Dimensionality with Physics-Informed Neural Networks

The curse-of-dimensionality taxes computational resources heavily with exponentially increasing computational cost as the dimension increases. This poses great challenges in solving high-dimensional PDEs, as Richard E. Bellman first pointed out over 60 years ago. While there has been some recent success in solving numerically partial differential equations (PDEs) in high dimensions, such computations are prohibitively expensive, and true scaling of general nonlinear PDEs to high dimensions has never been achieved. We develop a new method of scaling up physics-informed neural networks (PINNs) to solve arbitrary high-dimensional PDEs. The new method, called Stochastic Dimension Gradient Descent (SDGD), decomposes a gradient of PDEs into pieces corresponding to different dimensions and randomly samples a subset of these dimensional pieces in each iteration of training PINNs. We prove theoretically the convergence and other desired properties of the proposed method. We demonstrate in various diverse tests that the proposed method can solve many notoriously hard high-dimensional PDEs, including the Hamilton-Jacobi-Bellman (HJB) and the Schrödinger equations in tens of thousands of dimensions very fast on a single GPU using the PINNs mesh-free approach. Notably, we solve nonlinear PDEs with nontrivial, anisotropic, and inseparable solutions in 100,000 effective dimensions in 12 hours on a single GPU using SDGD with PINNs. Since SDGD is a general training methodology of PINNs, it can be applied to any current and future variants of PINNs to scale them up for arbitrary high-dimensional PDEs.

  • 4 authors
·
Jul 23, 2023

amangkurat: A Python Library for Symplectic Pseudo-Spectral Solution of the Idealized (1+1)D Nonlinear Klein-Gordon Equation

This study introduces amangkurat, an open-source Python library designed for the robust numerical simulation of relativistic scalar field dynamics governed by the nonlinear Klein-Gordon equation in (1+1)D spacetime. The software implements a hybrid computational strategy that couples Fourier pseudo-spectral spatial discretization with a symplectic Størmer-Verlet temporal integrator, ensuring both exponential spatial convergence for smooth solutions and long-term preservation of Hamiltonian structure. To optimize performance, the solver incorporates adaptive timestepping based on Courant-Friedrichs-Lewy (CFL) stability criteria and utilizes Just-In-Time (JIT) compilation for parallelized force computation. The library's capabilities are validated across four canonical physical regimes: dispersive linear wave propagation, static topological kink preservation in phi-fourth theory, integrable breather dynamics in the sine-Gordon model, and non-integrable kink-antikink collisions. Beyond standard numerical validation, this work establishes a multi-faceted analysis framework employing information-theoretic entropy metrics (Shannon, Rényi, and Tsallis), kernel density estimation, and phase space reconstruction to quantify the distinct phenomenological signatures of these regimes. Statistical hypothesis testing confirms that these scenarios represent statistically distinguishable dynamical populations. Benchmarks on standard workstation hardware demonstrate that the implementation achieves high computational efficiency, making it a viable platform for exploratory research and education in nonlinear field theory.

  • 2 authors
·
Dec 27, 2025

Quantum singular value transformation and beyond: exponential improvements for quantum matrix arithmetics

Quantum computing is powerful because unitary operators describing the time-evolution of a quantum system have exponential size in terms of the number of qubits present in the system. We develop a new "Singular value transformation" algorithm capable of harnessing this exponential advantage, that can apply polynomial transformations to the singular values of a block of a unitary, generalizing the optimal Hamiltonian simulation results of Low and Chuang. The proposed quantum circuits have a very simple structure, often give rise to optimal algorithms and have appealing constant factors, while usually only use a constant number of ancilla qubits. We show that singular value transformation leads to novel algorithms. We give an efficient solution to a certain "non-commutative" measurement problem and propose a new method for singular value estimation. We also show how to exponentially improve the complexity of implementing fractional queries to unitaries with a gapped spectrum. Finally, as a quantum machine learning application we show how to efficiently implement principal component regression. "Singular value transformation" is conceptually simple and efficient, and leads to a unified framework of quantum algorithms incorporating a variety of quantum speed-ups. We illustrate this by showing how it generalizes a number of prominent quantum algorithms, including: optimal Hamiltonian simulation, implementing the Moore-Penrose pseudoinverse with exponential precision, fixed-point amplitude amplification, robust oblivious amplitude amplification, fast QMA amplification, fast quantum OR lemma, certain quantum walk results and several quantum machine learning algorithms. In order to exploit the strengths of the presented method it is useful to know its limitations too, therefore we also prove a lower bound on the efficiency of singular value transformation, which often gives optimal bounds.

  • 4 authors
·
Jun 4, 2018

Hardware-efficient Variational Quantum Eigensolver for Small Molecules and Quantum Magnets

Quantum computers can be used to address molecular structure, materials science and condensed matter physics problems, which currently stretch the limits of existing high-performance computing resources. Finding exact numerical solutions to these interacting fermion problems has exponential cost, while Monte Carlo methods are plagued by the fermionic sign problem. These limitations of classical computational methods have made even few-atom molecular structures problems of practical interest for medium-sized quantum computers. Yet, thus far experimental implementations have been restricted to molecules involving only Period I elements. Here, we demonstrate the experimental optimization of up to six-qubit Hamiltonian problems with over a hundred Pauli terms, determining the ground state energy for molecules of increasing size, up to BeH2. This is enabled by a hardware-efficient variational quantum eigensolver with trial states specifically tailored to the available interactions in our quantum processor, combined with a compact encoding of fermionic Hamiltonians and a robust stochastic optimization routine. We further demonstrate the flexibility of our approach by applying the technique to a problem of quantum magnetism. Across all studied problems, we find agreement between experiment and numerical simulations with a noisy model of the device. These results help elucidate the requirements for scaling the method to larger systems, and aim at bridging the gap between problems at the forefront of high-performance computing and their implementation on quantum hardware.

  • 7 authors
·
Apr 17, 2017

Quantum Krylov subspace algorithms for ground and excited state energy estimation

Quantum Krylov subspace diagonalization (QKSD) algorithms provide a low-cost alternative to the conventional quantum phase estimation algorithm for estimating the ground and excited-state energies of a quantum many-body system. While QKSD algorithms typically rely on using the Hadamard test for estimating Krylov subspace matrix elements of the form, langle ϕ_i|e^{-iHτ}|ϕ_j rangle, the associated quantum circuits require an ancilla qubit with controlled multi-qubit gates that can be quite costly for near-term quantum hardware. In this work, we show that a wide class of Hamiltonians relevant to condensed matter physics and quantum chemistry contain symmetries that can be exploited to avoid the use of the Hadamard test. We propose a multi-fidelity estimation protocol that can be used to compute such quantities showing that our approach, when combined with efficient single-fidelity estimation protocols, provides a substantial reduction in circuit depth. In addition, we develop a unified theory of quantum Krylov subspace algorithms and present three new quantum-classical algorithms for the ground and excited-state energy estimation problems, where each new algorithm provides various advantages and disadvantages in terms of total number of calls to the quantum computer, gate depth, classical complexity, and stability of the generalized eigenvalue problem within the Krylov subspace.

  • 2 authors
·
Oct 13, 2021

Approximate Quantum Compiling for Quantum Simulation: A Tensor Network based approach

We introduce AQCtensor, a novel algorithm to produce short-depth quantum circuits from Matrix Product States (MPS). Our approach is specifically tailored to the preparation of quantum states generated from the time evolution of quantum many-body Hamiltonians. This tailored approach has two clear advantages over previous algorithms that were designed to map a generic MPS to a quantum circuit. First, we optimize all parameters of a parametric circuit at once using Approximate Quantum Compiling (AQC) - this is to be contrasted with other approaches based on locally optimizing a subset of circuit parameters and "sweeping" across the system. We introduce an optimization scheme to avoid the so-called ``orthogonality catastrophe" - i.e. the fact that the fidelity of two arbitrary quantum states decays exponentially with the number of qubits - that would otherwise render a global optimization of the circuit impractical. Second, the depth of our parametric circuit is constant in the number of qubits for a fixed simulation time and fixed error tolerance. This is to be contrasted with the linear circuit Ansatz used in generic algorithms whose depth scales linearly in the number of qubits. For simulation problems on 100 qubits, we show that AQCtensor thus achieves at least an order of magnitude reduction in the depth of the resulting optimized circuit, as compared with the best generic MPS to quantum circuit algorithms. We demonstrate our approach on simulation problems on Heisenberg-like Hamiltonians on up to 100 qubits and find optimized quantum circuits that have significantly reduced depth as compared to standard Trotterized circuits.

  • 4 authors
·
Jan 20, 2023

Momentum Attention: The Physics of In-Context Learning and Spectral Forensics for Mechanistic Interpretability

The Mechanistic Interpretability (MI) program has mapped the Transformer as a precise computational graph. We extend this graph with a conservation law and time-varying AC dynamics, viewing it as a physical circuit. We introduce Momentum Attention, a symplectic augmentation embedding physical priors via the kinematic difference operator p_t = q_t - q_{t-1}, implementing the symplectic shear q_t = q_t + γp_t on queries and keys. We identify a fundamental Symplectic-Filter Duality: the physical shear is mathematically equivalent to a High-Pass Filter. This duality is our cornerstone contribution -- by injecting kinematic momentum, we sidestep the topological depth constraint (L geq 2) for induction head formation. While standard architectures require two layers for induction from static positions, our extension grants direct access to velocity, enabling Single-Layer Induction and Spectral Forensics via Bode Plots. We formalize an Orthogonality Theorem proving that DC (semantic) and AC (mechanistic) signals segregate into orthogonal frequency bands when Low-Pass RoPE interacts with High-Pass Momentum. Validated through 5,100+ controlled experiments (documented in Supplementary Appendices A--R and 27 Jupyter notebooks), our 125M Momentum model exceeds expectations on induction-heavy tasks while tracking a 350M baseline within sim2.9% validation loss. Dedicated associative recall experiments reveal a scaling law γ^* = 4.17 times N^{-0.74} establishing momentum-depth fungibility. We offer this framework as a complementary analytical toolkit connecting Generative AI, Hamiltonian Physics, and Signal Processing.

  • 1 authors
·
Feb 3

An Efficient Graph-Transformer Operator for Learning Physical Dynamics with Manifolds Embedding

Accurate and efficient physical simulations are essential in science and engineering, yet traditional numerical solvers face significant challenges in computational cost when handling simulations across dynamic scenarios involving complex geometries, varying boundary/initial conditions, and diverse physical parameters. While deep learning offers promising alternatives, existing methods often struggle with flexibility and generalization, particularly on unstructured meshes, which significantly limits their practical applicability. To address these challenges, we propose PhysGTO, an efficient Graph-Transformer Operator for learning physical dynamics through explicit manifold embeddings in both physical and latent spaces. In the physical space, the proposed Unified Graph Embedding module aligns node-level conditions and constructs sparse yet structure-preserving graph connectivity to process heterogeneous inputs. In the latent space, PhysGTO integrates a lightweight flux-oriented message-passing scheme with projection-inspired attention to capture local and global dependencies, facilitating multilevel interactions among complex physical correlations. This design ensures linear complexity relative to the number of mesh points, reducing both the number of trainable parameters and computational costs in terms of floating-point operations (FLOPs), and thereby allowing efficient inference in real-time applications. We introduce a comprehensive benchmark spanning eleven datasets, covering problems with unstructured meshes, transient flow dynamics, and large-scale 3D geometries. PhysGTO consistently achieves state-of-the-art accuracy while significantly reducing computational costs, demonstrating superior flexibility, scalability, and generalization in a wide range of simulation tasks.

  • 9 authors
·
Dec 10, 2025 1

JARVIS-Leaderboard: A Large Scale Benchmark of Materials Design Methods

Lack of rigorous reproducibility and validation are major hurdles for scientific development across many fields. Materials science in particular encompasses a variety of experimental and theoretical approaches that require careful benchmarking. Leaderboard efforts have been developed previously to mitigate these issues. However, a comprehensive comparison and benchmarking on an integrated platform with multiple data modalities with both perfect and defect materials data is still lacking. This work introduces JARVIS-Leaderboard, an open-source and community-driven platform that facilitates benchmarking and enhances reproducibility. The platform allows users to set up benchmarks with custom tasks and enables contributions in the form of dataset, code, and meta-data submissions. We cover the following materials design categories: Artificial Intelligence (AI), Electronic Structure (ES), Force-fields (FF), Quantum Computation (QC) and Experiments (EXP). For AI, we cover several types of input data, including atomic structures, atomistic images, spectra, and text. For ES, we consider multiple ES approaches, software packages, pseudopotentials, materials, and properties, comparing results to experiment. For FF, we compare multiple approaches for material property predictions. For QC, we benchmark Hamiltonian simulations using various quantum algorithms and circuits. Finally, for experiments, we use the inter-laboratory approach to establish benchmarks. There are 1281 contributions to 274 benchmarks using 152 methods with more than 8 million data-points, and the leaderboard is continuously expanding. The JARVIS-Leaderboard is available at the website: https://pages.nist.gov/jarvis_leaderboard

  • 38 authors
·
Jun 20, 2023

Approximating the Top Eigenvector in Random Order Streams

When rows of an n times d matrix A are given in a stream, we study algorithms for approximating the top eigenvector of the matrix {A}^TA (equivalently, the top right singular vector of A). We consider worst case inputs A but assume that the rows are presented to the streaming algorithm in a uniformly random order. We show that when the gap parameter R = σ_1(A)^2/σ_2(A)^2 = Ω(1), then there is a randomized algorithm that uses O(h cdot d cdot polylog(d)) bits of space and outputs a unit vector v that has a correlation 1 - O(1/R) with the top eigenvector v_1. Here h denotes the number of heavy rows in the matrix, defined as the rows with Euclidean norm at least |{A}|_F/d cdot operatorname{polylog(d)}. We also provide a lower bound showing that any algorithm using O(hd/R) bits of space can obtain at most 1 - Ω(1/R^2) correlation with the top eigenvector. Thus, parameterizing the space complexity in terms of the number of heavy rows is necessary for high accuracy solutions. Our results improve upon the R = Ω(log n cdot log d) requirement in a recent work of Price and Xun (FOCS 2024). We note that the algorithm of Price and Xun works for arbitrary order streams whereas our algorithm requires a stronger assumption that the rows are presented in a uniformly random order. We additionally show that the gap requirements in their analysis can be brought down to R = Ω(log^2 d) for arbitrary order streams and R = Ω(log d) for random order streams. The requirement of R = Ω(log d) for random order streams is nearly tight for their analysis as we obtain a simple instance with R = Ω(log d/loglog d) for which their algorithm, with any fixed learning rate, cannot output a vector approximating the top eigenvector v_1.

  • 2 authors
·
Dec 16, 2024

Limitations of Quantum Hardware for Molecular Energy Estimation Using VQE

Variational quantum eigensolvers (VQEs) are among the most promising quantum algorithms for solving electronic structure problems in quantum chemistry, particularly during the Noisy Intermediate-Scale Quantum (NISQ) era. In this study, we investigate the capabilities and limitations of VQE algorithms implemented on current quantum hardware for determining molecular ground-state energies, focusing on the adaptive derivative-assembled pseudo-Trotter ansatz VQE (ADAPT-VQE). To address the significant computational challenges posed by molecular Hamiltonians, we explore various strategies to simplify the Hamiltonian, optimize the ansatz, and improve classical parameter optimization through modifications of the COBYLA optimizer. These enhancements are integrated into a tailored quantum computing implementation designed to minimize the circuit depth and computational cost. Using benzene as a benchmark system, we demonstrate the application of these optimizations on an IBM quantum computer. Despite these improvements, our results highlight the limitations imposed by current quantum hardware, particularly the impact of quantum noise on state preparation and energy measurement. The noise levels in today's devices prevent meaningful evaluations of molecular Hamiltonians with sufficient accuracy to produce reliable quantum chemical insights. Finally, we extrapolate the requirements for future quantum hardware to enable practical and scalable quantum chemistry calculations using VQE algorithms. This work provides a roadmap for advancing quantum algorithms and hardware toward achieving quantum advantage in molecular modeling.

  • 3 authors
·
Jun 4, 2025

High-order finite element method for atomic structure calculations

We introduce featom, an open source code that implements a high-order finite element solver for the radial Schr\"odinger, Dirac, and Kohn-Sham equations. The formulation accommodates various mesh types, such as uniform or exponential, and the convergence can be systematically controlled by increasing the number and/or polynomial order of the finite element basis functions. The Dirac equation is solved using a squared Hamiltonian approach to eliminate spurious states. To address the slow convergence of the kappa=pm1 states due to divergent derivatives at the origin, we incorporate known asymptotic forms into the solutions. We achieve a high level of accuracy (10^{-8} Hartree) for total energies and eigenvalues of heavy atoms such as uranium in both Schr\"odinger and Dirac Kohn-Sham solutions. We provide detailed convergence studies and computational parameters required to attain commonly required accuracies. Finally, we compare our results with known analytic results as well as the results of other methods. In particular, we calculate benchmark results for atomic numbers (Z) from 1 to 92, verifying current benchmarks. We demonstrate significant speedup compared to the state-of-the-art shooting solver dftatom. An efficient, modular Fortran 2008 implementation, is provided under an open source, permissive license, including examples and tests, wherein particular emphasis is placed on the independence (no global variables), reusability, and generality of the individual routines.

  • 8 authors
·
Jul 11, 2023

Limits and Powers of Koopman Learning

Dynamical systems provide a comprehensive way to study complex and changing behaviors across various sciences. Many modern systems are too complicated to analyze directly or we do not have access to models, driving significant interest in learning methods. Koopman operators have emerged as a dominant approach because they allow the study of nonlinear dynamics using linear techniques by solving an infinite-dimensional spectral problem. However, current algorithms face challenges such as lack of convergence, hindering practical progress. This paper addresses a fundamental open question: When can we robustly learn the spectral properties of Koopman operators from trajectory data of dynamical systems, and when can we not? Understanding these boundaries is crucial for analysis, applications, and designing algorithms. We establish a foundational approach that combines computational analysis and ergodic theory, revealing the first fundamental barriers -- universal for any algorithm -- associated with system geometry and complexity, regardless of data quality and quantity. For instance, we demonstrate well-behaved smooth dynamical systems on tori where non-trivial eigenfunctions of the Koopman operator cannot be determined by any sequence of (even randomized) algorithms, even with unlimited training data. Additionally, we identify when learning is possible and introduce optimal algorithms with verification that overcome issues in standard methods. These results pave the way for a sharp classification theory of data-driven dynamical systems based on how many limits are needed to solve a problem. These limits characterize all previous methods, presenting a unified view. Our framework systematically determines when and how Koopman spectral properties can be learned.

  • 3 authors
·
Jul 8, 2024

Robustifying State-space Models for Long Sequences via Approximate Diagonalization

State-space models (SSMs) have recently emerged as a framework for learning long-range sequence tasks. An example is the structured state-space sequence (S4) layer, which uses the diagonal-plus-low-rank structure of the HiPPO initialization framework. However, the complicated structure of the S4 layer poses challenges; and, in an effort to address these challenges, models such as S4D and S5 have considered a purely diagonal structure. This choice simplifies the implementation, improves computational efficiency, and allows channel communication. However, diagonalizing the HiPPO framework is itself an ill-posed problem. In this paper, we propose a general solution for this and related ill-posed diagonalization problems in machine learning. We introduce a generic, backward-stable "perturb-then-diagonalize" (PTD) methodology, which is based on the pseudospectral theory of non-normal operators, and which may be interpreted as the approximate diagonalization of the non-normal matrices defining SSMs. Based on this, we introduce the S4-PTD and S5-PTD models. Through theoretical analysis of the transfer functions of different initialization schemes, we demonstrate that the S4-PTD/S5-PTD initialization strongly converges to the HiPPO framework, while the S4D/S5 initialization only achieves weak convergences. As a result, our new models show resilience to Fourier-mode noise-perturbed inputs, a crucial property not achieved by the S4D/S5 models. In addition to improved robustness, our S5-PTD model averages 87.6% accuracy on the Long-Range Arena benchmark, demonstrating that the PTD methodology helps to improve the accuracy of deep learning models.

  • 5 authors
·
Oct 2, 2023

Cutting Slack: Quantum Optimization with Slack-Free Methods for Combinatorial Benchmarks

Constraint handling remains a key bottleneck in quantum combinatorial optimization. While slack-variable-based encodings are straightforward, they significantly increase qubit counts and circuit depth, challenging the scalability of quantum solvers. In this work, we investigate a suite of Lagrangian-based optimization techniques including dual ascent, bundle methods, cutting plane approaches, and augmented Lagrangian formulations for solving constrained combinatorial problems on quantum simulators and hardware. Our framework is applied to three representative NP-hard problems: the Travelling Salesman Problem (TSP), the Multi-Dimensional Knapsack Problem (MDKP), and the Maximum Independent Set (MIS). We demonstrate that MDKP and TSP, with their inequality-based or degree-constrained structures, allow for slack-free reformulations, leading to significant qubit savings without compromising performance. In contrast, MIS does not inherently benefit from slack elimination but still gains in feasibility and objective quality from principled Lagrangian updates. We benchmark these methods across classically hard instances, analyzing trade-offs in qubit usage, feasibility, and optimality gaps. Our results highlight the flexibility of Lagrangian formulations as a scalable alternative to naive QUBO penalization, even when qubit savings are not always achievable. This work provides practical insights for deploying constraint-aware quantum optimization pipelines, with applications in logistics, network design, and resource allocation.

  • 2 authors
·
Jul 16, 2025

Physics-informed Reduced Order Modeling of Time-dependent PDEs via Differentiable Solvers

Reduced-order modeling (ROM) of time-dependent and parameterized differential equations aims to accelerate the simulation of complex high-dimensional systems by learning a compact latent manifold representation that captures the characteristics of the solution fields and their time-dependent dynamics. Although high-fidelity numerical solvers generate the training datasets, they have thus far been excluded from the training process, causing the learned latent dynamics to drift away from the discretized governing physics. This mismatch often limits generalization and forecasting capabilities. In this work, we propose Physics-informed ROM (Φ-ROM) by incorporating differentiable PDE solvers into the training procedure. Specifically, the latent space dynamics and its dependence on PDE parameters are shaped directly by the governing physics encoded in the solver, ensuring a strong correspondence between the full and reduced systems. Our model outperforms state-of-the-art data-driven ROMs and other physics-informed strategies by accurately generalizing to new dynamics arising from unseen parameters, enabling long-term forecasting beyond the training horizon, maintaining continuity in both time and space, and reducing the data cost. Furthermore, Φ-ROM learns to recover and forecast the solution fields even when trained or evaluated with sparse and irregular observations of the fields, providing a flexible framework for field reconstruction and data assimilation. We demonstrate the framework's robustness across various PDE solvers and highlight its broad applicability by providing an open-source JAX implementation that is readily extensible to other PDE systems and differentiable solvers, available at https://phi-rom.github.io.

  • 4 authors
·
May 20, 2025

Solving High-Dimensional PDEs with Latent Spectral Models

Deep models have achieved impressive progress in solving partial differential equations (PDEs). A burgeoning paradigm is learning neural operators to approximate the input-output mappings of PDEs. While previous deep models have explored the multiscale architectures and various operator designs, they are limited to learning the operators as a whole in the coordinate space. In real physical science problems, PDEs are complex coupled equations with numerical solvers relying on discretization into high-dimensional coordinate space, which cannot be precisely approximated by a single operator nor efficiently learned due to the curse of dimensionality. We present Latent Spectral Models (LSM) toward an efficient and precise solver for high-dimensional PDEs. Going beyond the coordinate space, LSM enables an attention-based hierarchical projection network to reduce the high-dimensional data into a compact latent space in linear time. Inspired by classical spectral methods in numerical analysis, we design a neural spectral block to solve PDEs in the latent space that approximates complex input-output mappings via learning multiple basis operators, enjoying nice theoretical guarantees for convergence and approximation. Experimentally, LSM achieves consistent state-of-the-art and yields a relative gain of 11.5% averaged on seven benchmarks covering both solid and fluid physics. Code is available at https://github.com/thuml/Latent-Spectral-Models.

  • 5 authors
·
Jan 29, 2023

Full optimization of Jastrow-Slater wave functions with application to the first-row atoms and homonuclear diatomic molecules

We pursue the development and application of the recently-introduced linear optimization method for determining the optimal linear and nonlinear parameters of Jastrow-Slater wave functions in a variational Monte Carlo framework. In this approach, the optimal parameters are found iteratively by diagonalizing the Hamiltonian matrix in the space spanned by the wave function and its first-order derivatives, making use of a strong zero-variance principle. We extend the method to optimize the exponents of the basis functions, simultaneously with all the other parameters, namely the Jastrow, configuration state function and orbital parameters. We show that the linear optimization method can be thought of as a so-called augmented Hessian approach, which helps explain the robustness of the method and permits us to extend it to minimize a linear combination of the energy and the energy variance. We apply the linear optimization method to obtain the complete ground-state potential energy curve of the C_2 molecule up to the dissociation limit, and discuss size consistency and broken spin-symmetry issues in quantum Monte Carlo calculations. We perform calculations of the first-row atoms and homonuclear diatomic molecules with fully optimized Jastrow-Slater wave functions, and we demonstrate that molecular well depths can be obtained with near chemical accuracy quite systematically at the diffusion Monte Carlo level for these systems.

  • 2 authors
·
Mar 19, 2008

Artificial Entanglement in the Fine-Tuning of Large Language Models

Large language models (LLMs) can be adapted to new tasks using parameter-efficient fine-tuning (PEFT) methods that modify only a small number of trainable parameters, often through low-rank updates. In this work, we adopt a quantum-information-inspired perspective to understand their effectiveness. From this perspective, low-rank parameterizations naturally correspond to low-dimensional Matrix Product States (MPS) representations, which enable entanglement-based characterizations of parameter structure. Thereby, we term and measure "Artificial Entanglement", defined as the entanglement entropy of the parameters in artificial neural networks (in particular the LLMs). We first study the representative low-rank adaptation (LoRA) PEFT method, alongside full fine-tuning (FFT), using LLaMA models at the 1B and 8B scales trained on the Tulu3 and OpenThoughts3 datasets, and uncover: (i) Internal artificial entanglement in the updates of query and value projection matrices in LoRA follows a volume law with a central suppression (termed as the "Entanglement Valley"), which is sensitive to hyper-parameters and is distinct from that in FFT; (ii) External artificial entanglement in attention matrices, corresponding to token-token correlations in representation space, follows an area law with logarithmic corrections and remains robust to LoRA hyper-parameters and training steps. Drawing a parallel to the No-Hair Theorem in black hole physics, we propose that although LoRA and FFT induce distinct internal entanglement signatures, such differences do not manifest in the attention outputs, suggesting a "no-hair" property that results in the effectiveness of low rank updates. We further provide theoretical support based on random matrix theory, and extend our analysis to an MPS Adaptation PEFT method, which exhibits qualitatively similar behaviors.

  • 6 authors
·
Jan 11 2

Autoregressive Transformer Neural Network for Simulating Open Quantum Systems via a Probabilistic Formulation

The theory of open quantum systems lays the foundations for a substantial part of modern research in quantum science and engineering. Rooted in the dimensionality of their extended Hilbert spaces, the high computational complexity of simulating open quantum systems calls for the development of strategies to approximate their dynamics. In this paper, we present an approach for tackling open quantum system dynamics. Using an exact probabilistic formulation of quantum physics based on positive operator-valued measure (POVM), we compactly represent quantum states with autoregressive transformer neural networks; such networks bring significant algorithmic flexibility due to efficient exact sampling and tractable density. We further introduce the concept of String States to partially restore the symmetry of the autoregressive transformer neural network and improve the description of local correlations. Efficient algorithms have been developed to simulate the dynamics of the Liouvillian superoperator using a forward-backward trapezoid method and find the steady state via a variational formulation. Our approach is benchmarked on prototypical one and two-dimensional systems, finding results which closely track the exact solution and achieve higher accuracy than alternative approaches based on using Markov chain Monte Carlo to sample restricted Boltzmann machines. Our work provides general methods for understanding quantum dynamics in various contexts, as well as techniques for solving high-dimensional probabilistic differential equations in classical setups.

  • 4 authors
·
Sep 11, 2020

Practical Benchmarking of Randomized Measurement Methods for Quantum Chemistry Hamiltonians

Many hybrid quantum-classical algorithms for the application of ground state energy estimation in quantum chemistry involve estimating the expectation value of a molecular Hamiltonian with respect to a quantum state through measurements on a quantum device. To guide the selection of measurement methods designed for this observable estimation problem, we propose a benchmark called CSHOREBench (Common States and Hamiltonians for ObseRvable Estimation Benchmark) that assesses the performance of these methods against a set of common molecular Hamiltonians and common states encountered during the runtime of hybrid quantum-classical algorithms. In CSHOREBench, we account for resource utilization of a quantum computer through measurements of a prepared state, and a classical computer through computational runtime spent in proposing measurements and classical post-processing of acquired measurement outcomes. We apply CSHOREBench considering a variety of measurement methods on Hamiltonians of size up to 16 qubits. Our discussion is aided by using the framework of decision diagrams which provides an efficient data structure for various randomized methods and illustrate how to derandomize distributions on decision diagrams. In numerical simulations, we find that the methods of decision diagrams and derandomization are the most preferable. In experiments on IBM quantum devices against small molecules, we observe that decision diagrams reduces the number of measurements made by classical shadows by more than 80%, that made by locally biased classical shadows by around 57%, and consistently require fewer quantum measurements along with lower classical computational runtime than derandomization. Furthermore, CSHOREBench is empirically efficient to run when considering states of random quantum ansatz with fixed depth.

  • 7 authors
·
Dec 12, 2023

AdaPreLoRA: Adafactor Preconditioned Low-Rank Adaptation

Low-Rank Adaptation (LoRA) reparameterizes a weight update as a product of two low-rank factors, but the Jacobian J_{G} of the generator mapping the factors to the weight matrix is rank-deficient, so the factor-space preconditioner J_{G}^* {F}_t J_{G} induced by any {W}-space preconditioner {F}_t is singular, and consequently the standard chain rule cannot be uniquely inverted to map a preconditioned {W}-space direction back to a factor-space update. We cast existing LoRA optimizers in a unified framework parameterized by two choices: (i) which invertible surrogate for J_{G}^* {F}_t J_{G} to use, and (ii) which {F}_t on {W} to use. Existing methods occupy four families along these axes: factor-space adaptive updates, block-diagonal surrogates for J_{G}^* J_{G}, Frobenius-residual pseudoinverse methods, and Riemannian manifold constraint. Within this design space, a gradient-statistics-aware {F}_t paired with a closed-form factor-space solve at {O}((m+n)r) memory remains underexplored. We propose AdaPreLoRA, which fills this gap by adopting the Adafactor diagonal Kronecker preconditioner {H}_t on {W} and selecting from the resulting factor-space solution family the element minimizing an {H}_t-weighted imbalance between the two factor contributions; by construction, the resulting factor update is the closest LoRA approximation to the preconditioned {W}-space direction under the {H}_t-weighted norm. Across GPT-2 (E2E), Mistral-7B and Qwen2-7B (GLUE, ARC, GSM8K), and diffusion-model personalization, AdaPreLoRA is competitive with or improves over a representative set of LoRA optimizers while keeping peak GPU memory at the LoRA optimizer level.

  • 3 authors
·
May 8 1

QR-SPPS: Quantum-Native Retail Supply Chain Risk Simulation via VQE, ADAPT-VQE Counterfactual Policy Ranking, and DOS-QPE Boltzmann Tail Risk Quantification

Classical supply chain risk models treat node failures as statistically independent events, systematically underestimating cascade probabilities when supplier dependencies are strongly correlated. At n=40 nodes, the full correlated failure distribution requires O(2^n) classical samples, a regime where exact simulation demands 17.6 TB of memory and over 369,000 hours of computation on a standard workstation. We present QR-SPPS (Quantum-Native Retail Shock Propagation and Policy Stress Simulator), a three-algorithm quantum pipeline implemented using the Qiskit framework with the Aer statevector_simulator backend. First, a 40-node, 4-tier retail supply network is encoded as a 40-qubit Ising Hamiltonian using OpenFermion QubitOperator, where ZZ coupling terms encode correlated cascade probabilities structurally absent from classical Monte Carlo. Second, a hardware-efficient VQE circuit finds the ground-state stress distribution with zero error, detecting entangled cascade failures in 14/40 nodes with max|ΔP|=0.637 versus classical Monte Carlo. Third, we introduce the first application of ADAPT-VQE gradient screening to counterfactual macroeconomic policy evaluation: six crisis interventions are ranked in O(1) Qiskit operator evaluations per policy, a 287x speedup over sequential VQE re-optimisation. Fourth, Density-of-States QPE (DOS-QPE) reconstructs the full eigenspectrum via 32-step Trotter evolution and introduces a novel mapping of the Boltzmann catastrophe probability P_cat(T) to VIX-equivalent market volatility temperature, enabling direct integration into regulatory Value-at-Risk frameworks. Qiskit Aer scaling benchmarks confirm exponential classical intractability at 40 qubits.

  • 1 authors
·
Mar 20

Light Schrödinger Bridge

Despite the recent advances in the field of computational Schr\"odinger Bridges (SB), most existing SB solvers are still heavy-weighted and require complex optimization of several neural networks. It turns out that there is no principal solver which plays the role of simple-yet-effective baseline for SB just like, e.g., k-means method in clustering, logistic regression in classification or Sinkhorn algorithm in discrete optimal transport. We address this issue and propose a novel fast and simple SB solver. Our development is a smart combination of two ideas which recently appeared in the field: (a) parameterization of the Schr\"odinger potentials with sum-exp quadratic functions and (b) viewing the log-Schr\"odinger potentials as the energy functions. We show that combined together these ideas yield a lightweight, simulation-free and theoretically justified SB solver with a simple straightforward optimization objective. As a result, it allows solving SB in moderate dimensions in a matter of minutes on CPU without a painful hyperparameter selection. Our light solver resembles the Gaussian mixture model which is widely used for density estimation. Inspired by this similarity, we also prove an important theoretical result showing that our light solver is a universal approximator of SBs. Furthemore, we conduct the analysis of the generalization error of our light solver. The code for our solver can be found at https://github.com/ngushchin/LightSB

  • 3 authors
·
Oct 2, 2023

Towards Cross Domain Generalization of Hamiltonian Representation via Meta Learning

Recent advances in deep learning for physics have focused on discovering shared representations of target systems by incorporating physics priors or inductive biases into neural networks. While effective, these methods are limited to the system domain, where the type of system remains consistent and thus cannot ensure the adaptation to new, or unseen physical systems governed by different laws. For instance, a neural network trained on a mass-spring system cannot guarantee accurate predictions for the behavior of a two-body system or any other system with different physical laws. In this work, we take a significant leap forward by targeting cross domain generalization within the field of Hamiltonian dynamics. We model our system with a graph neural network and employ a meta learning algorithm to enable the model to gain experience over a distribution of tasks and make it adapt to new physics. Our approach aims to learn a unified Hamiltonian representation that is generalizable across multiple system domains, thereby overcoming the limitations of system-specific models. Our results demonstrate that the meta-trained model not only adapts effectively to new systems but also captures a generalized Hamiltonian representation that is consistent across different physical domains. Overall, through the use of meta learning, we offer a framework that achieves cross domain generalization, providing a step towards a unified model for understanding a wide array of dynamical systems via deep learning.

  • 2 authors
·
Dec 2, 2022

Unleashing High-Quality Image Generation in Diffusion Sampling Using Second-Order Levenberg-Marquardt-Langevin

The diffusion models (DMs) have demonstrated the remarkable capability of generating images via learning the noised score function of data distribution. Current DM sampling techniques typically rely on first-order Langevin dynamics at each noise level, with efforts concentrated on refining inter-level denoising strategies. While leveraging additional second-order Hessian geometry to enhance the sampling quality of Langevin is a common practice in Markov chain Monte Carlo (MCMC), the naive attempts to utilize Hessian geometry in high-dimensional DMs lead to quadratic-complexity computational costs, rendering them non-scalable. In this work, we introduce a novel Levenberg-Marquardt-Langevin (LML) method that approximates the diffusion Hessian geometry in a training-free manner, drawing inspiration from the celebrated Levenberg-Marquardt optimization algorithm. Our approach introduces two key innovations: (1) A low-rank approximation of the diffusion Hessian, leveraging the DMs' inherent structure and circumventing explicit quadratic-complexity computations; (2) A damping mechanism to stabilize the approximated Hessian. This LML approximated Hessian geometry enables the diffusion sampling to execute more accurate steps and improve the image generation quality. We further conduct a theoretical analysis to substantiate the approximation error bound of low-rank approximation and the convergence property of the damping mechanism. Extensive experiments across multiple pretrained DMs validate that the LML method significantly improves image generation quality, with negligible computational overhead.

  • 12 authors
·
May 30, 2025

Adaptive Graph Shrinking for Quantum Optimization of Constrained Combinatorial Problems

A range of quantum algorithms, especially those leveraging variational parameterization and circuit-based optimization, are being studied as alternatives for solving classically intractable combinatorial optimization problems (COPs). However, their applicability is limited by hardware constraints, including shallow circuit depth, limited qubit counts, and noise. To mitigate these issues, we propose a hybrid classical--quantum framework based on graph shrinking to reduce the number of variables and constraints in QUBO formulations of COPs, while preserving problem structure. Our approach introduces three key ideas: (i) constraint-aware shrinking that prevents merges that will likely violate problem-specific feasibility constraints, (ii) a verification-and-repair pipeline to correct infeasible solutions post-optimization, and (iii) adaptive strategies for recalculating correlations and controlling the graph shrinking process. We apply our approach to three standard benchmark problems: Multidimensional Knapsack (MDKP), Maximum Independent Set (MIS), and the Quadratic Assignment Problem (QAP). Empirical results show that our approach improves solution feasibility, reduces repair complexity, and enhances quantum optimization quality on hardware-limited instances. These findings demonstrate a scalable pathway for applying near-term quantum algorithms to classically challenging constrained optimization problems.

  • 2 authors
·
Jun 17, 2025

A Deep Conjugate Direction Method for Iteratively Solving Linear Systems

We present a novel deep learning approach to approximate the solution of large, sparse, symmetric, positive-definite linear systems of equations. These systems arise from many problems in applied science, e.g., in numerical methods for partial differential equations. Algorithms for approximating the solution to these systems are often the bottleneck in problems that require their solution, particularly for modern applications that require many millions of unknowns. Indeed, numerical linear algebra techniques have been investigated for many decades to alleviate this computational burden. Recently, data-driven techniques have also shown promise for these problems. Motivated by the conjugate gradients algorithm that iteratively selects search directions for minimizing the matrix norm of the approximation error, we design an approach that utilizes a deep neural network to accelerate convergence via data-driven improvement of the search directions. Our method leverages a carefully chosen convolutional network to approximate the action of the inverse of the linear operator up to an arbitrary constant. We train the network using unsupervised learning with a loss function equal to the L^2 difference between an input and the system matrix times the network evaluation, where the unspecified constant in the approximate inverse is accounted for. We demonstrate the efficacy of our approach on spatially discretized Poisson equations with millions of degrees of freedom arising in computational fluid dynamics applications. Unlike state-of-the-art learning approaches, our algorithm is capable of reducing the linear system residual to a given tolerance in a small number of iterations, independent of the problem size. Moreover, our method generalizes effectively to various systems beyond those encountered during training.

  • 6 authors
·
May 22, 2022

Enhancing Quantum Variational Algorithms with Zero Noise Extrapolation via Neural Networks

In the emergent realm of quantum computing, the Variational Quantum Eigensolver (VQE) stands out as a promising algorithm for solving complex quantum problems, especially in the noisy intermediate-scale quantum (NISQ) era. However, the ubiquitous presence of noise in quantum devices often limits the accuracy and reliability of VQE outcomes. This research introduces a novel approach to ameliorate this challenge by utilizing neural networks for zero noise extrapolation (ZNE) in VQE computations. By employing the Qiskit framework, we crafted parameterized quantum circuits using the RY-RZ ansatz and examined their behavior under varying levels of depolarizing noise. Our investigations spanned from determining the expectation values of a Hamiltonian, defined as a tensor product of Z operators, under different noise intensities to extracting the ground state energy. To bridge the observed outcomes under noise with the ideal noise-free scenario, we trained a Feed Forward Neural Network on the error probabilities and their associated expectation values. Remarkably, our model proficiently predicted the VQE outcome under hypothetical noise-free conditions. By juxtaposing the simulation results with real quantum device executions, we unveiled the discrepancies induced by noise and showcased the efficacy of our neural network-based ZNE technique in rectifying them. This integrative approach not only paves the way for enhanced accuracy in VQE computations on NISQ devices but also underlines the immense potential of hybrid quantum-classical paradigms in circumventing the challenges posed by quantum noise. Through this research, we envision a future where quantum algorithms can be reliably executed on noisy devices, bringing us one step closer to realizing the full potential of quantum computing.

  • 4 authors
·
Mar 10, 2024

simple-idealized-1d-nlse: Pseudo-Spectral Solver for the 1D Nonlinear Schrödinger Equation

We present an open-source Python implementation of an idealized high-order pseudo-spectral solver for the one-dimensional nonlinear Schr\"odinger equation (NLSE). The solver combines Fourier spectral spatial discretization with an adaptive eighth-order Dormand-Prince time integration scheme to achieve machine-precision conservation of mass and near-perfect preservation of momentum and energy for smooth solutions. The implementation accurately reproduces fundamental NLSE phenomena including soliton collisions with analytically predicted phase shifts, Akhmediev breather dynamics, and the development of modulation instability from noisy initial conditions. Four canonical test cases validate the numerical scheme: single soliton propagation, two-soliton elastic collision, breather evolution, and noise-seeded modulation instability. The solver employs a 2/3 dealiasing rule with exponential filtering to prevent aliasing errors from the cubic nonlinearity. Statistical analysis using Shannon, R\'enyi, and Tsallis entropies quantifies the spatio-temporal complexity of solutions, while phase space representations reveal the underlying coherence structure. The implementation prioritizes code transparency and educational accessibility over computational performance, providing a valuable pedagogical tool for exploring nonlinear wave dynamics. Complete source code, documentation, and example configurations are freely available, enabling reproducible computational experiments across diverse physical contexts where the NLSE governs wave evolution, including nonlinear optics, Bose-Einstein condensates, and ocean surface waves.

  • 5 authors
·
Sep 6, 2025

PFGM++: Unlocking the Potential of Physics-Inspired Generative Models

We introduce a new family of physics-inspired generative models termed PFGM++ that unifies diffusion models and Poisson Flow Generative Models (PFGM). These models realize generative trajectories for N dimensional data by embedding paths in N{+}D dimensional space while still controlling the progression with a simple scalar norm of the D additional variables. The new models reduce to PFGM when D{=}1 and to diffusion models when D{to}infty. The flexibility of choosing D allows us to trade off robustness against rigidity as increasing D results in more concentrated coupling between the data and the additional variable norms. We dispense with the biased large batch field targets used in PFGM and instead provide an unbiased perturbation-based objective similar to diffusion models. To explore different choices of D, we provide a direct alignment method for transferring well-tuned hyperparameters from diffusion models (D{to} infty) to any finite D values. Our experiments show that models with finite D can be superior to previous state-of-the-art diffusion models on CIFAR-10/FFHQ 64{times}64 datasets, with FID scores of 1.91/2.43 when D{=}2048/128. In class-conditional setting, D{=}2048 yields current state-of-the-art FID of 1.74 on CIFAR-10. In addition, we demonstrate that models with smaller D exhibit improved robustness against modeling errors. Code is available at https://github.com/Newbeeer/pfgmpp

  • 6 authors
·
Feb 8, 2023

Physics-informed cluster analysis and a priori efficiency criterion for the construction of local reduced-order bases

Nonlinear model order reduction has opened the door to parameter optimization and uncertainty quantification in complex physics problems governed by nonlinear equations. In particular, the computational cost of solving these equations can be reduced by means of local reduced-order bases. This article examines the benefits of a physics-informed cluster analysis for the construction of cluster-specific reduced-order bases. We illustrate that the choice of the dissimilarity measure for clustering is fundamental and highly affects the performances of the local reduced-order bases. It is shown that clustering with an angle-based dissimilarity on simulation data efficiently decreases the intra-cluster Kolmogorov N-width. Additionally, an a priori efficiency criterion is introduced to assess the relevance of a ROM-net, a methodology for the reduction of nonlinear physics problems introduced in our previous work in [T. Daniel, F. Casenave, N. Akkari, D. Ryckelynck, Model order reduction assisted by deep neural networks (ROM-net), Advanced Modeling and Simulation in Engineering Sciences 7 (16), 2020]. This criterion also provides engineers with a very practical method for ROM-nets' hyperparameters calibration under constrained computational costs for the training phase. On five different physics problems, our physics-informed clustering strategy significantly outperforms classic strategies for the construction of local reduced-order bases in terms of projection errors.

  • 5 authors
·
Mar 25, 2021

Efficient Implementation of Gaussian Process Regression Accelerated Saddle Point Searches with Application to Molecular Reactions

The task of locating first order saddle points on high-dimensional surfaces describing the variation of energy as a function of atomic coordinates is an essential step for identifying the mechanism and estimating the rate of thermally activated events within the harmonic approximation of transition state theory. When combined directly with electronic structure calculations, the number of energy and atomic force evaluations needed for convergence is a primary issue. Here, we describe an efficient implementation of Gaussian process regression (GPR) acceleration of the minimum mode following method where a dimer is used to estimate the lowest eigenmode of the Hessian. A surrogate energy surface is constructed and updated after each electronic structure calculation. The method is applied to a test set of 500 molecular reactions previously generated by Hermez and coworkers [J. Chem. Theory Comput. 18, 6974 (2022)]. An order of magnitude reduction in the number of electronic structure calculations needed to reach the saddle point configurations is obtained by using the GPR compared to the dimer method. Despite the wide range in stiffness of the molecular degrees of freedom, the calculations are carried out using Cartesian coordinates and are found to require similar number of electronic structure calculations as an elaborate internal coordinate method implemented in the Sella software package. The present implementation of the GPR surrogate model in C++ is efficient enough for the wall time of the saddle point searches to be reduced in 3 out of 4 cases even though the calculations are carried out at a low Hartree-Fock level.

  • 5 authors
·
May 18, 2025

Dynamic Rank Reinforcement Learning for Adaptive Low-Rank Multi-Head Self Attention in Large Language Models

We propose Dynamic Rank Reinforcement Learning (DR-RL), a novel framework that adaptively optimizes the low-rank factorization of Multi-Head Self-Attention (MHSA) in Large Language Models (LLMs) through the integration of reinforcement learning and online matrix perturbation theory. While traditional low-rank approximations often rely on static rank assumptions--limiting their flexibility across diverse input contexts--our method dynamically selects ranks based on real-time sequence dynamics, layer-specific sensitivities, and hardware constraints. The core innovation lies in an RL agent that formulates rank selection as a sequential policy optimization problem, where the reward function strictly balances attention fidelity against computational latency. Crucially, we employ online matrix perturbation bounds to enable incremental rank updates, thereby avoiding the prohibitive cost of full decomposition during inference. Furthermore, the integration of a lightweight Transformer-based policy network and batched Singular Value Decomposition (SVD) operations ensures scalable deployment on modern GPU architectures. Experiments demonstrate that DR-RL maintains downstream accuracy statistically equivalent to full-rank attention while significantly reducing Floating Point Operations (FLOPs), particularly in long-sequence regimes (L > 4096). This work bridges the gap between adaptive efficiency and theoretical rigor in MHSA, offering a principled, mathematically grounded alternative to heuristic rank reduction techniques in resource-constrained deep learning. Source code and experiment logs are available at: https://github.com/canererden/DR_RL_Project

  • 1 authors
·
Dec 17, 2025

A Physics-Informed, Global-in-Time Neural Particle Method for the Spatially Homogeneous Landau Equation

We propose a physics-informed neural particle method (PINN--PM) for the spatially homogeneous Landau equation. The method adopts a Lagrangian interacting-particle formulation and jointly parameterizes the time-dependent score and the characteristic flow map with neural networks. Instead of advancing particles through explicit time stepping, the Landau dynamics is enforced via a continuous-time residual defined along particle trajectories. This design removes time-discretization error and yields a mesh-free solver that can be queried at arbitrary times without sequential integration. We establish a rigorous stability analysis in an L^2_v framework. The deviation between learned and exact characteristics is controlled by three interpretable sources: (i) score approximation error, (ii) empirical particle approximation error, and (iii) the physics residual of the neural flow. This trajectory estimate propagates to density reconstruction, where we derive an L^2_v error bound for kernel density estimators combining classical bias--variance terms with a trajectory-induced contribution. Using Hyvarinen's identity, we further relate the oracle score-matching gap to the L^2_v score error and show that the empirical loss concentrates at the Monte Carlo rate, yielding computable a posteriori accuracy certificates. Numerical experiments on analytical benchmarks, including the two- and three-dimensional BKW solutions, as well as reference-free configurations, demonstrate stable transport, preservation of macroscopic invariants, and competitive or improved accuracy compared with time-stepping score-based particle and blob methods while using significantly fewer particles.

  • 4 authors
·
Mar 11 1

Reservoir Computing via Quantum Recurrent Neural Networks

Recent developments in quantum computing and machine learning have propelled the interdisciplinary study of quantum machine learning. Sequential modeling is an important task with high scientific and commercial value. Existing VQC or QNN-based methods require significant computational resources to perform the gradient-based optimization of a larger number of quantum circuit parameters. The major drawback is that such quantum gradient calculation requires a large amount of circuit evaluation, posing challenges in current near-term quantum hardware and simulation software. In this work, we approach sequential modeling by applying a reservoir computing (RC) framework to quantum recurrent neural networks (QRNN-RC) that are based on classical RNN, LSTM and GRU. The main idea to this RC approach is that the QRNN with randomly initialized weights is treated as a dynamical system and only the final classical linear layer is trained. Our numerical simulations show that the QRNN-RC can reach results comparable to fully trained QRNN models for several function approximation and time series prediction tasks. Since the QRNN training complexity is significantly reduced, the proposed model trains notably faster. In this work we also compare to corresponding classical RNN-based RC implementations and show that the quantum version learns faster by requiring fewer training epochs in most cases. Our results demonstrate a new possibility to utilize quantum neural network for sequential modeling with greater quantum hardware efficiency, an important design consideration for noisy intermediate-scale quantum (NISQ) computers.

  • 5 authors
·
Nov 4, 2022