diff --git "a/data/chunks/2603.10777_semantic.json" "b/data/chunks/2603.10777_semantic.json" new file mode 100644--- /dev/null +++ "b/data/chunks/2603.10777_semantic.json" @@ -0,0 +1,1226 @@ +[ + { + "chunk_id": "03fd3437-e614-427d-b38d-26740a7ad2c1", + "text": "Dynamics-Informed Deep Learning for Predicting Extreme Events Eirini Katsidoniotaki, Themistoklis P.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 0, + "total_chunks": 68, + "char_count": 100, + "word_count": 11, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "cc410bf4-b167-4c50-ad1e-2e68a5ff9b78", + "text": "Sapsis ∗\nDepartment of Mechanical Engineering,\nMassachusetts Institute of Technology,\n77 Massachusetts Ave., Cambridge, MA 02139 Abstract\nPredicting extreme events in high-dimensional chaotic dynamical systems remains a fundamental challenge, as\nsuch events are rare, intermittent, and arise from transient dynamical mechanisms that are difficult to infer from limited observations. Accordingly, real-time forecasting calls for precursors that encode the mechanisms driving\nextremes, rather than relying solely on statistical associations. We propose a fully data-driven framework for long-lead prediction of extreme events that constructs interpretable, mechanism-aware precursors by explicitly tracking\ntransient instabilities preceding event onset. The approach leverages a reduced-order formulation to compute\nfinite-time Lyapunov exponent (FTLE)–like precursors directly from state snapshots, without requiring knowledge\nof the governing equations. 
To avoid the prohibitive computational cost of classical FTLE computation, instability\ngrowth is evaluated in an adaptively evolving low-dimensional subspace spanned by Optimal Time-Dependent\n(OTD) modes, enabling efficient identification of transiently amplifying directions. These precursors are then provided as input to a Transformer-based model, enabling forecast of extreme event observables. We demonstrate the\nframework on Kolmogorov flow, a canonical model of intermittent turbulence. The results show that explicitly\nencoding transient instability mechanisms substantially extends practical prediction horizons compared to baseline\nobservable-based approaches. keywords: Extreme events prediction; Dynamical precursors; Time-series forecasting; Extreme events are rare yet high-impact episodes, associated with abrupt changes in the state and dynamics of a\nsystem, that arise across a variety of natural and engineering systems—including oceanic rogue waves [19], extreme\nweather [23], shocks in power grids [44], and sudden market drawdowns [16], just to mention a few examples. They\noften lead to severe humanitarian, environmental, and financial consequences, motivating the development of real-time forecasting tools that enable timely mitigation.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 1,
    "total_chunks": 68,
    "char_count": 2253,
    "word_count": 283,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "e66db018-9d08-4282-9750-529c6c51c144",
    "text": "To formalize this notion, an extreme event may be defined as a high-amplitude excursion of a system response\nthat markedly exceeds the typical fluctuations generated by the background dynamics [22, 43]. 
In time series,\nsuch excursions appear as intermittent bursts: short-lived peaks separated by long intervals of typical behavior. In practice, extremes are specified with respect to an observable—a scalar quantity of interest derived from the\nsystem state, such as wave crest height in ocean dynamics, or portfolio losses in financial markets. An event is\ndeemed extreme when the observable attains values in the far tail of its empirical (or stationary) distribution, and\nthus occurs with very low probability under nominal conditions. While this definition provides a clear criterion for identifying extremes, it does not by itself make them predictable:\nextreme excursions may develop rapidly and often exhibit weak or ambiguous precursory signatures. As a result,\nreal-time prediction hinges on identifying measurable diagnostic quantities that characterize the evolving system\nand exhibit consistent, detectable changes prior to event onset. We refer to such diagnostics as precursors of\nextremes. For a precursor to be practically useful, it must robustly discriminate impending extremes from typical\nfluctuations, yielding low false-positive and false-negative rates. In this context, even partial understanding of ∗Corresponding author: sapsis@mit.edu, Tel: (617) 324-7508, Fax: (617) 253-8689", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 2, + "total_chunks": 68, + "char_count": 1504, + "word_count": 215, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "b8bbc523-ad76-4427-b886-d78c89ea46f7", + "text": "the dynamics that underlie the extreme events and trigger their formation can guide the discovery and design of\nreliable precursors that facilitate the data-driven prediction of extremes [8,22,43]. 
In the context of extreme events associated with special types of dynamics, such as instabilities, special emphasis\nshould be given to mechanism-linked features rather than purely statistical correlations. For example, in unsteady aerodynamic flows, pressure-based precursors—often constructed to isolate event-relevant dynamics (e.g.,\nvia spectral or wavelet filtering) and informed by sparse-sensing measurements—have been coupled with sequential\ntime-series models to improve prediction of intermittent force excursions [8,29,40]. In turbulent flows with external\nforcing, mechanism-based precursors of extreme dissipation bursts have been identified by linking burst onset to\ncharacteristic nonlinear interactions among Fourier modes and associated depletion of low-wavenumber energy,\nwhile the underlying trigger of this energy exchange remains unresolved [12,21]. A dynamical–statistical approach\nwas proposed for wall-bounded turbulence that identifies precursors of extreme dissipation events by computing\nan attractor-consistent critical state via finite-time energy-growth optimization and using its alignment with the\ninstantaneous flow as an interpretable early-warning indicator. Recent work [48] has further shown that dissipation bursts of comparable magnitude can exhibit markedly different predictability depending on their underlying\ndynamical route, underscoring the need for mechanism-linked, pathway-discriminative precursors. Similarly, in different domains such as medicine, instability-informed scalars, such as data-driven Lyapunov-exponent estimates,\nhave been used in deep-learning models for event detection and forecasting in physiological time series [47]. On the other hand, there are purely data-driven precursors, where predictive signatures of upcoming extremes are\nlearned directly from data and statistical associations [15,26]. 
In the same spirit, relevant efforts have increasingly\nleveraged deep learning methods to predict extreme events directly from time-series data, including recurrent architectures, such as reservoir computing [3,18,37,39] and LSTM-based models [50]. To improve performance, recent\nworks proposed extreme-event-tailored loss functions [41] and optimal sampling approaches [10] to emphasize rare\nevents during training. In addition, attention-based sequence models have been proposed to improve performance\nby explicitly separating \"normal\" from \"extreme\" regimes [2]. In this purely data-driven paradigm, attribution\nand explainability methods can be used to extract the input patterns most responsible for a model's predictions\n(e.g., for heatwave forecasting), yielding \"machine-view\" precursors that support knowledge discovery [51]. While\npowerful, these precursors are typically association-based; they rely on a large number of extremes for training,\nwhereas dynamics-based precursors are constructed to explicitly quantify the underlying dynamical pathways to\nextremes. In this study, we focus on high-dimensional systems in which extremes are generated by internal transient instabilities, and we build dynamical precursors that encode the responsible dynamics. More specifically, we utilize a\ngeometric viewpoint, where the system evolves primarily on a background attractor on which the trajectories lie most\nof the time.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. 
Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 3, + "total_chunks": 68, + "char_count": 3484, + "word_count": 448, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "362442e8-de55-4ba1-b86b-a28afb4f574d", + "text": "Extreme events occur when the system enters localized regions of this attractor where the dynamics become strongly nonlinear and exhibit finite-time instabilities [43]. In this case the state is transiently repelled from\nthe main attractor, causing a rapid amplification of the observable that appears as an intermittent burst in the time\nseries [12,21,33]. Our goal is to formulate dynamic precursors by encoding the local unstable dynamics associated\nwith these transient events. This is done by relying on the observation that, even in high- or infinite-dimensional\nsystems, the emergence of extreme events is typically dominated by a small number of transiently-positive Lyapunov exponents, i.e. by a small number of effective modes. These modes are hard to capture with traditional\nspectral decomposition methods —such as dynamic mode decomposition [45] and Koopman-mode analysis [32]—\ndue to their essentially transient character. To address this challenge we employ the Optimal Time-Dependent (OTD) framework [6], in a fully data-driven\nformulation. OTD modes define an adaptive, trajectory-dependent low-dimensional subspace that continuously\naligns with the directions of maximal transient growth, i.e. directions associated with the largest finite-time\nLyapunov exponents (FTLE), and therefore with the finite-time instabilities that precede extreme excursions [7]. 
The key idea is to discover features that directly track the finite-time instability direction responsible for extremeevent growth and to use those to synthesize a precursor of an upcoming extreme event.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 4, + "total_chunks": 68, + "char_count": 1579, + "word_count": 224, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "c1d80fee-25d0-47f2-a423-fad71f25f8a9", + "text": "This is achieved by\ncomputing the FTLE in the reduced-order subspace spanned by the OTD modes. Previous work [20] relied on\npurely OTD-based precursors capturing local-in-time transient amplification, whereas our FTLE-based approach\nencodes instability growth accumulated over a finite horizon. A key remaining challenge is to translate mechanistic,\ntrajectory-based instability diagnostics, in this case FTLE, into reliable forecasts of event occurrence from timeseries observations. Recent work has shown that the predictive skill for rare extremes depends not only on the use\nof dynamically informative, mechanism-linked features but also on their integration with appropriate deep-learning\narchitectures to improve forecasting performance and extend lead times [5]. In our case, we address this step by\nemploying a Transformer-based architecture that exploits temporal context to forecast extreme-event occurrence\nover a prescribed lead time, using the FTLE obtained in earlier times. 
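The mapping from instability features to forecasts can be posed as ordinary supervised sequence regression: a history window of the leading FTLE is paired with the observable at a prescribed lead time. A minimal sketch of this windowing step (the array names, window length, and lead time are illustrative assumptions, not values from the paper):

```python
import numpy as np

def make_supervised_pairs(ftle, z, history=64, lead=50):
    """Pair FTLE history windows with the observable at a lead time.

    ftle: 1D array of the leading FTLE on a uniform time grid.
    z:    1D array of the observable z(t) on the same grid.
    Returns X of shape (num_samples, history) and targets y, where
    y[i] is z evaluated `lead` steps after the end of the i-th window.
    """
    X, y = [], []
    for end in range(history, len(ftle) - lead):
        X.append(ftle[end - history:end])  # FTLE values up to "time t"
        y.append(z[end + lead])            # observable at t + tau
    return np.asarray(X), np.asarray(y)

# toy usage with synthetic signals on 500 uniform samples
t = np.linspace(0.0, 50.0, 500)
ftle = np.sin(t)
z = np.cos(t)
X, y = make_supervised_pairs(ftle, z, history=64, lead=50)
```

Any sequence model (the paper uses a Transformer) can then be trained on `(X, y)`; an extreme event is flagged when the predicted observable exceeds the threshold z⋆.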
We demonstrate the method on a prototype high-dimensional turbulent system —the Kolmogorov flow— in which\nextremes appear as intermittent bursts of the total energy dissipation rate, showing that the proposed framework\nsubstantially extends the effective prediction horizon relative to baseline precursors while maintaining computational efficiency, thereby enabling practical early warning in settings where extreme events pose significant risk.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 5,
    "total_chunks": 68,
    "char_count": 1435,
    "word_count": 194,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "30f38a58-8877-4c21-8f2d-c8934c302bba",
    "text": "2 Preliminaries and Background We consider a general nonlinear dynamical system whose evolution in the phase space is governed by ˙u = F(u), u(t) ∈U, (1) where U denotes the state space (e.g., R^n for a finite-dimensional ODE, or an appropriate Hilbert space for an\ninfinite-dimensional PDE), and F : U →U is a nonlinear vector field or nonlinear operator governing the system\ndynamics. For any initial condition u(t0) = u0, the system state at time t can be written as u(t; u0) = ψ_{t0}^{t}(u0), (2) where ψ_{t0}^{t} : U →U is the flow map in the phase space. We assume that the long-time dynamics are supported\non a chaotic attractor and that the system exhibits rare, intermittent excursions away from typical fluctuations,\nwhich we refer to as extreme events. To characterize the local, finite-time stability properties of the dynamics along a given trajectory, we examine the\nevolution of infinitesimal perturbations to the state. Let ξ(t) ∈U denote a small perturbation about the reference\ntrajectory u(t). 
Small perturbations\nsuperposed on a reference trajectory can be described as tangent linear evolutions about the trajectory. Linearizing the governing equations about u(t) yields the variational equation\n˙ξ = L ξ, L(t) := ∇_u F(u), (3) where L is the Jacobian (or Fréchet derivative, in the infinite-dimensional case) of the vector field evaluated\nalong the trajectory. Solutions of (3) describe the instantaneous growth or decay of perturbations and provide a\ntrajectory-dependent notion of stability. This is particularly relevant in chaotic systems where transient instabilities—rather than asymptotic behavior—play a central role in the formation of extreme events [33].",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 6,
    "total_chunks": 68,
    "char_count": 1696,
    "word_count": 266,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "d6b8df27-bce6-4f89-bdb0-6dea8ea0c29a",
    "text": "In studies of nonlinear dynamical systems, Lyapunov spectral analysis has been used to characterize chaotic\nbehaviors [25, 35]. The asymptotic stability of the reference trajectory u(t) with respect to infinitesimal perturbations is classically characterized by the Lyapunov spectrum. Lyapunov exponents give a measure of the mean\ndivergence rates of nearby trajectories on a strange attractor of the dynamical system, quantifying the long-time\nexponential growth or decay rates of perturbations in multiple directions. Positive values indicate instability and\nnegative values indicate decay. In practice, computing the Lyapunov spectrum requires long-term integration of\nthe variational equation (3), which is numerically challenging [13,42]. 
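For intuition, the classical (asymptotic) spectrum is typically estimated with a Benettin-style procedure: integrate a set of tangent vectors under the variational equation (3) and repeatedly re-orthonormalize them by QR, accumulating the logarithmic stretching. A toy sketch below uses explicit Euler steps and a constant linear vector field whose exponents are known in closed form (both are simplifying assumptions for illustration only):

```python
import numpy as np

def lyapunov_spectrum(jacobian, x0, step, n_steps, ndim):
    """Benettin-style estimate of the Lyapunov spectrum.

    Evolves tangent vectors with Euler steps of d(xi)/dt = L xi and
    re-orthonormalizes via QR; log|diag(R)| accumulates the stretching.
    `jacobian(x)` returns L along the trajectory (here x is unused
    because the test system is linear).
    """
    Q = np.eye(ndim)
    sums = np.zeros(ndim)
    x = x0
    for _ in range(n_steps):
        L = jacobian(x)
        Q = Q + step * (L @ Q)          # Euler step of the variational eq.
        Q, R = np.linalg.qr(Q)          # re-orthonormalize tangent vectors
        sums += np.log(np.abs(np.diag(R)))
    return sums / (n_steps * step)

# sanity check on a linear saddle: exponents equal the diagonal entries
A = np.diag([0.5, -0.3, -1.0])
exps = lyapunov_spectrum(lambda x: A, np.zeros(3), 1e-3, 20000, 3)
```

The long integration window needed here is exactly why, as noted above, the asymptotic spectrum cannot serve as a precursor of transient events.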
Most importantly, because of the long-time\ncharacter of the Lyapunov spectrum, transient instabilities are not captured, and therefore the Lyapunov spectrum cannot\nbe utilized for any type of prediction. To overcome this limitation, the variational equation can be used in the\ncontext of Optimal Time-Dependent modes, which are discussed in the next section.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 7,
    "total_chunks": 68,
    "char_count": 1094,
    "word_count": 153,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "69b6cd9b-289d-43fa-951c-2e816896ee96",
    "text": "Extreme events observable. Let z(t) ∈R denote an observable — a scalar quantity of interest used to\ncharacterize extreme events along the system evolution. We characterize extreme events as rare, intermittent\nexcursions of the observable to unusually large values. In practice, we fix an extreme event level z⋆ (e.g., chosen\nbased on a high quantile of the observed time series or a physically motivated reference value) and label the system\nas being in an extreme state whenever z(t) ≥z⋆. We assume that z(t) is obtained from the system state; namely,\nthere exists a mapping Z : U →R such that\nz(t) = Z(u(t)). (4)",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 8,
    "total_chunks": 68,
    "char_count": 613,
    "word_count": 105,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "568d6ca8-d03d-41bf-be42-fc9970b4c451",
    "text": "Precursor for upcoming extremes. 
Our objective is to identify a precursor — a low-dimensional, time-dependent indicator derived from the system evolution — that provides early warning of extreme excursions of the\nobservable z(t). Specifically, we seek a precursor signal in the form",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 9,
    "total_chunks": 68,
    "char_count": 280,
    "word_count": 42,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "af690216-647c-4436-8980-4ea4cf484d5b",
    "text": "π(t) = Π(u(t)), or more generally π(t) = Π(u_{[t−∆,t]}), (5) computed from information available up to time t (instantaneous state or a short history window of duration\n∆≥0), such that large values of π(t) indicate that the observable will exceed the extreme-event threshold z⋆ at a\nprescribed lead time τ > 0, i.e., at time t + τ. We assume access to time series data from one (or more) trajectories, sampled at times {t_k}_{k=0}^{N}. In particular, we consider state snapshots {u(t_k)} from which the precursor signal π(t_k) and quantity of interest\nz(t_k) can be evaluated. 2.2 Optimal Time-Dependent Modes The framework of Optimal Time-Dependent (OTD) modes [6] has been developed to construct a time-dependent\northonormal basis that adapts with the evolving dynamics while remaining sensitive to the finite-time dynamic\ninstabilities. The OTD formulation uses the variational equation (3) and provides a time-dependent, orthonormal\nbasis, while still spanning the same flow-invariant subspaces as the solutions of the variational equation. This\nproperty ensures that transient instabilities can be captured in a numerically stable manner. 
Specifically, the first\nr OTD modes {v_i}_{i=1}^{r} are defined through the constrained minimization problem [6]\narg min_{˙v_1,...,˙v_r} Σ_{i=1}^{r} ∥˙v_i − L v_i∥^2 subject to ⟨v_i, v_j⟩ = (I_{r×r})_{ij}, (6)\nwhere ⟨·, ·⟩ is a suitable inner product, ∥·∥ the induced norm, and I_{r×r} the identity matrix of size r (1 ≤ r ≤ n). The\noptimization in eq. (6) is performed with respect to ˙v_i and not v_i; therefore, the OTD modes are by construction\nthe best approximation of the linearized dynamics in the subspace that they span. For the generic dynamical\nsystem of eq. (1) and an r-dimensional OTD subspace, the evolution equation of the ith mode is:\n˙v_i = L v_i − Σ_{k=1}^{r} (⟨L v_i, v_k⟩ v_k − Φ_{ik} v_k), (7) where Φ is an arbitrary skew-symmetric matrix.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 10,
    "total_chunks": 68,
    "char_count": 1820,
    "word_count": 297,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "bd24976f-3aad-4f52-adcd-b74b161e27ec",
    "text": "Following [11], Φ is selected as:\nΦ_{ik} = −⟨L v_k, v_i⟩ for k < i; Φ_{ik} = 0 for k = i; Φ_{ik} = ⟨L v_i, v_k⟩ for k > i. (8) With this choice of Φ, the evolution equation for the ith OTD mode takes the form:\n˙v_i = L v_i − ⟨L v_i, v_i⟩ v_i − Σ_{k=1}^{i−1} (⟨L v_i, v_k⟩ + ⟨L v_k, v_i⟩) v_k. (9) In this formulation we obtain a lower-triangular structure that can be solved sequentially by forward substitution. A key property of the OTD modes is their exponentially fast alignment with the transiently most unstable directions\n[7]. Specifically, the first OTD mode, v_1, aligns with the most unstable direction, whereas the second mode, v_2,\nis constrained to remain orthogonal to v_1, spanning the second most unstable direction; together, they span the\ntwo-dimensional subspace exhibiting the fastest growth. 
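Eq. (9) can be integrated numerically alongside the trajectory. A minimal sketch for a small, steady linear operator (a simplifying assumption made here for illustration; along a chaotic trajectory L varies in time), where the OTD subspace should converge to the span of the two least-stable eigendirections:

```python
import numpy as np

def otd_rhs(V, L):
    """Right-hand side of the OTD evolution equation (9):
    v_i' = L v_i - <L v_i, v_i> v_i
           - sum_{k<i} (<L v_i, v_k> + <L v_k, v_i>) v_k."""
    n, r = V.shape
    LV = L @ V
    dV = np.empty_like(V)
    for i in range(r):
        vi, Lvi = V[:, i], LV[:, i]
        d = Lvi - (vi @ Lvi) * vi
        for k in range(i):
            vk, Lvk = V[:, k], LV[:, k]
            d -= (vk @ Lvi + vi @ Lvk) * vk
        dV[:, i] = d
    return dV

# steady diagonal operator: least-stable directions are e1 and e2
L = np.diag([1.0, 0.4, -0.5, -2.0])
rng = np.random.default_rng(0)
V, _ = np.linalg.qr(rng.standard_normal((4, 2)))  # random orthonormal init
dt = 1e-2
for _ in range(4000):
    V = V + dt * otd_rhs(V, L)   # explicit Euler step of eq. (9)
    V, _ = np.linalg.qr(V)       # re-orthonormalize for robustness
```

After integration, `V` spans the subspace of the two largest eigenvalues of `L`, consistent with the fixed-point result of [6] quoted below.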
The orthonormality of the OTD modes ensures numerical\nstability and provides a rigorous framework for analyzing finite-time instabilities, by continuously tracking the\nmost unstable directions in phase space, even for time-dependent systems. Under mild assumptions, the OTD subspace converges exponentially to the dominant eigenspace of the Cauchy–Green\ntensor, which characterizes transient instabilities [7]. At hyperbolic fixed points, OTD modes converge to the subspace spanned by the r least-stable eigenvectors of the linearized operator L [6]. We also note that OTD modes\ncoincide with Gram–Schmidt vectors, or backward Lyapunov vectors, which are classical tools for identifying unstable\ndirections in phase space [11]. Figure 1 illustrates graphically the geometry of OTD modes for r = 2. The main trajectory around which the OTD\nmodes are computed is shown in green. The two OTD modes are colored according to their stability: blue\nindicates a stable direction and red an unstable direction. A perturbed trajectory (light green) is also shown as\nit undergoes rapid growth towards the unstable (first OTD) direction, resulting in an extreme event.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 11,
    "total_chunks": 68,
    "char_count": 1934,
    "word_count": 308,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "7385e286-83e0-4891-8e0a-dad2f756ec06",
    "text": "Using the\nstability properties along the directions of the OTD modes provides information that can guide the discovery of\neffective precursors for upcoming bursts in chaotic trajectories [20]. Order reduction of the dynamics on OTD modes. 
The OTD modes span flow-invariant subspaces of the\ntangent space, enabling a dynamically consistent reduction of the linear operator L onto the OTD subspaces. In the\ninfinite-dimensional setting, projection of the linearized operator onto an r-dimensional OTD subspace yields\na finite-dimensional reduced operator, i.e., an r × r matrix [20]. We present the derivation in finite-dimensional\nform for clarity. Let ξ ∈R^n denote a solution of the full variational equation (3), and let η ∈R^r denote its\nprojection onto the OTD basis V,\nη = V^T ξ, (10)\nwhere V(t) = [v_1 v_2 · · · v_r] ∈R^{n×r} is the time-dependent matrix whose columns are the OTD modes\nobtained from eq. (9). The perturbation can equivalently be expressed as ξ = Vη. Figure 1: An illustration of the first two OTD modes, colored according to their stability properties (blue is stable\nand red is unstable), along a reference trajectory (green), u(t; u0). A perturbation generates a nearby trajectory\n(shown in light green color), which undergoes rapid growth along the first OTD direction, resulting in an extreme\nevent. Substituting this representation into the variational equation (3) yields the reduced-order linear equation",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 12,
    "total_chunks": 68,
    "char_count": 1431,
    "word_count": 227,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "3f893522-1b29-4c14-b25b-8e0b674ab1c9",
    "text": "which is dynamically consistent with the full system. Conversely, if η solves (11), then ξ = Vη is an exact solution\nof the full variational equation (see [6], Theorem 2.4). We therefore define the reduced linear operator L_r : R^r →R^r,\nL_r = V^T L V. 
(12) A key advantage of this reduction is that it preserves the transient instabilities of the full-order system, regardless\nof whether they arise from modal or non-modal growth, assuming that r is sufficiently large to capture the unstable\nsubspace. Since the OTD basis evolves along the trajectory, it adapts to the most unstable directions encountered\nin phase space, making it a natural projection framework. However, as discussed in [20], the eigenvalues of L_r\ncannot be interpreted as physical growth or decay rates. Instead, one may consider the symmetric part",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 13,
    "total_chunks": 68,
    "char_count": 812,
    "word_count": 133,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "5c60123c-dfbf-47da-adc9-2b065470d169",
    "text": "S_r := (1/2)(L_r + L_r^T), (13) whose eigenvalues λ_1 ≥ λ_2 ≥ · · · ≥ λ_r provide instantaneous measures of perturbation growth or decay within the\nOTD subspace. Even this measure of growth is not effective, however, and for extreme-event precursors one should\nfocus on finite-time analysis, i.e., the use of finite-time Lyapunov exponents. 2.3 Finite-Time Lyapunov Exponents The finite-time Lyapunov exponents (FTLEs) quantify the growth or decay of infinitesimal perturbations along\nthe system's trajectory over a finite time window, providing a measure of transient instability of the dynamics [1]. Unlike asymptotic Lyapunov exponents, which reflect long-time averaged behavior, FTLEs capture time-dependent\nand state-dependent amplification of disturbances. 
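A small numerical sketch of eqs. (12)-(13) on a fixed orthonormal basis (illustrative only; in practice V is the evolving OTD basis). The non-normal block is chosen so that all eigenvalues of L_r are stable while the symmetric part S_r still reports positive instantaneous growth, which is exactly why the symmetric part, rather than the spectrum of L_r, carries the growth information:

```python
import numpy as np

# full operator with a strongly non-normal 2x2 block (illustrative values)
L = np.array([[-0.1, 5.0, 0.0, 0.0],
              [ 0.0, -0.2, 0.0, 0.0],
              [ 0.0, 0.0, -0.5, 0.0],
              [ 0.0, 0.0, 0.0, -2.0]])

V = np.eye(4)[:, :2]              # fixed orthonormal "OTD" basis (assumption)
Lr = V.T @ L @ V                  # eq. (12): reduced operator
Sr = 0.5 * (Lr + Lr.T)           # eq. (13): symmetric part
growth_rates = np.sort(np.linalg.eigvalsh(Sr))[::-1]
```

Here the eigenvalues of `Lr` are -0.1 and -0.2 (asymptotically stable), yet the leading eigenvalue of `Sr` is positive: perturbations can grow transiently even though every mode decays in the long run.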
In the present setting, we assume that the system is observed\nat the current time t, and we are interested in characterizing the cumulative amplification of perturbations over\nthe preceding interval [t −T, t], where T > 0. An infinitesimal perturbation ξ_{t−T} applied at time t −T evolves\nforward to the current time t according to the linearized flow operator Ψ_{t−T}^{t} := D_u ψ_{t−T}^{t}(u), (14) where ψ_{t−T}^{t} denotes the flow map defined in Eq. (2). Consequently, the perturbation at time t satisfies ξ(t) = Ψ_{t−T}^{t}(ξ_{t−T}). (15) To measure the growth of the infinitesimal perturbations in phase space [27], the right Cauchy–Green deformation\ntensor is typically used,\nC_{t−T}^{t} = (Ψ_{t−T}^{t})^⊤ Ψ_{t−T}^{t}, (16) which is symmetric and positive definite, and its eigenvalues λ_i(t; t −T) quantify the finite-time stretching of\ninfinitesimal perturbations along orthogonal directions in phase space. We order the Cauchy–Green eigenvalues in\ndescending order,\nλ_1 ≥ λ_2 ≥ . . . ≥ λ_n ≥ 0. (17)",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 14,
    "total_chunks": 68,
    "char_count": 1704,
    "word_count": 266,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "5f13f814-8b1a-42ee-878d-76c534d2dc68",
    "text": "The finite-time Lyapunov exponents over the interval [t −T, t] are defined as Λ_i(t; t −T) = log √λ_i(t; t −T), i = 1, . . . , n. (18) Large values of Λ_i indicate strong local stretching and sensitivity to initial conditions, whereas during quiescent\nphases their magnitudes remain small, reflecting weak perturbation growth. As the system approaches an extreme\nor dissipative event, the FTLEs exhibit sharp peaks that signal the transient amplification of instabilities preceding\nthe onset of such bursts. 
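Eqs. (14)-(18) can be verified on a system whose linearized flow map is known in closed form. A sketch with a diagonal generator (an illustrative assumption, so the flow map over a window T is elementwise exponential), following the definition above, which does not normalize by 1/T:

```python
import numpy as np

def ftle_full(Phi):
    """FTLEs from a linearized flow map Phi = Psi_{t-T}^{t} (eqs. 14-18):
    eigenvalues of the right Cauchy-Green tensor C = Phi^T Phi give the
    finite-time stretching factors, and Lambda_i = log(sqrt(lambda_i))."""
    C = Phi.T @ Phi                           # eq. (16)
    lam = np.sort(np.linalg.eigvalsh(C))[::-1]  # descending, eq. (17)
    return np.log(np.sqrt(lam))               # eq. (18)

# diagonal generator a: flow map over a window T is diag(exp(a*T)),
# so the FTLEs should recover a*T exactly
a = np.array([0.8, -0.2, -1.5])
T = 2.0
Phi = np.diag(np.exp(a * T))
ftles = ftle_full(Phi)
```

In a high-dimensional system one cannot form `Phi` explicitly, which motivates the reduced-order computation described next.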
2.3.1 Reduced-order computation of FTLE The direct computation of FTLEs in high-dimensional systems is computationally prohibitive, as it requires evaluating the full Cauchy–Green tensor - in practice this means solving at least n perturbed trajectories that\nare used to quantify the Cauchy–Green tensor. To address this limitation, Babaee et al. [7] developed\na reduced-order framework for computing FTLEs utilizing the subspace spanned by OTD modes, which adapt\ndynamically to transient instabilities. It was proved that, under suitable conditions, the OTD modes converge\nexponentially fast to the dominant eigendirections of the Cauchy–Green tensor corresponding to the strongest\nfinite-time instabilities, i.e., those associated with the largest FTLEs. We summarize the reduced-order procedure\nfor computing FTLEs over the finite-time interval [t −T, t]. Utilize the given data, expressed through a trajectory u(t) over the interval [t −T, t]. OTD subspace construction. Compute the r-dimensional OTD basis corresponding to this trajectory\nusing Eq. (9). Evolution of the reduced-order fundamental matrix. Evolve the reduced fundamental solution matrix\nY_{t−T}^{t} ∈R^{r×r} according to\n˙Y_{t−T}^{t} = L_r Y_{t−T}^{t}, Y_{t−T}^{t−T} = I_r, (19)\nwhere L_r denotes the projection of the full linearized operator L onto the OTD subspace; see Eq. (12). Here,\nI_r denotes the identity matrix in R^{r×r}.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 15,
    "total_chunks": 68,
    "char_count": 1888,
    "word_count": 288,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "e81b3360-b271-4788-8d0f-a685c67308e7",
    "text": "Reduced-order Cauchy–Green tensor. 
Construct the reduced-order right Cauchy–Green tensor R_{t−T}^{t} = (Y_{t−T}^{t})^⊤ Y_{t−T}^{t}, (20) with eigenvalues γ_1 ≥ γ_2 ≥ · · · ≥ γ_r ≥ 0. The finite-time Lyapunov exponents in the reduced subspace over the interval\n[t −T, t] are given by\nΓ_i(t; t −T) = log √γ_i(t; t −T), i = 1, . . . , r. (21) 3 Extreme Event Precursors We now formulate a fully data-driven framework for the real-time prediction of extreme events in chaotic, high-dimensional dynamical systems.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 16,
    "total_chunks": 68,
    "char_count": 480,
    "word_count": 83,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "293b2b4a-4515-48bf-aca9-e8b7fb6eb413",
    "text": "The framework consists of two components. The first component provides a\nmethodology for computing reduced-order FTLEs using just data, i.e., the sequence of snapshots {u(t_k)}_{k=0}^{N}. The\nsecond component is predictive: the leading FTLE, ˆΓ_1, computed from system observations up to time t,\nprovides dynamical information that is mapped directly to the predicted value of the observable, ˆz, at a prescribed\nlead time t + τ. If the predicted value exceeds the threshold condition ˆz(t + τ) ≥z∗, we have identified an\nextreme event.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. 
Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 17,
    "total_chunks": 68,
    "char_count": 541,
    "word_count": 86,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "440995e6-1230-484a-8e87-203dfd1b431c",
    "text": "In this section we discuss in detail the various steps involved (Figure 2). Figure 2: Illustration of the training algorithm. A long sequence of snapshots of the system state allows for the\napproximation of the dynamics (step 2). The dynamics are used to approximate the action of the linearized flow on the\nOTD subspace, which allows for parsimonious evolution of the OTD modes (step 3). A computation of the FTLE\nis performed within the OTD subspace (step 4). The final step 5 is the machine learning of a map from the dominant\nFTLE to the predicted observable for extreme events. 3.1 Dynamics and OTD modes from data We assume an equation-agnostic setup where only a discrete time series of the system state is available. To compute\nOTD modes, it is essential to obtain an approximation of the variational equation.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 18,
    "total_chunks": 68,
    "char_count": 807,
    "word_count": 136,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "f3a1e9f2-c4a8-4daa-8da2-8b365433f976",
    "text": "This is achieved by first modeling\nin a data-driven way the system dynamics, ˆF (Figure 2 - step 2). Here the hat denotes the approximation of the\ndynamical system. A broad class of methods exists for inferring the dynamical system, ˆF, directly from data [30, 49, 50, 52]. 
Here\nwe assume that the snapshots are sampled along long trajectories with a uniform and sufficiently small sampling\ntime-step ∆t, so we approximate ˆF(u) using a fourth-order central finite-difference scheme in time. See Appendix\nA.1 for details of this step.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 19,
    "total_chunks": 68,
    "char_count": 535,
    "word_count": 86,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "317e9bdc-5018-4308-9d00-56542baafeca",
    "text": "To evolve OTD modes it is sufficient to compute the action of the linearized operator L(u) just on the OTD modes. We employ the widely used practice of estimating Jacobian-vector products through finite differences of the vector\nfield [4]. This matrix-free approach avoids reconstructing or storing the Jacobian (linearized operator), which is\ncritical for high-dimensional systems.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 20,
    "total_chunks": 68,
    "char_count": 381,
    "word_count": 56,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "68df7752-6331-4128-af98-586a6ea93bd7",
    "text": "Specifically, the directional derivative of F(u) along v is approximated as Lv ≈ ( ˆF(u + ϵv) −ˆF(u) ) / ϵ, (22) where ϵ ∈ R is a small finite-difference step, which allows us to compute the action of the linearized operator L\non the OTD modes v. With these ingredients in place, we compute the OTD eq. 
(9) (Figure 2 - step 3) and the\nreduced-order linearized operator ˆLr using the projection formula (12) in a fully data-driven way.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 21,
    "total_chunks": 68,
    "char_count": 426,
    "word_count": 76,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "04800434-b7a7-4aa5-9d13-a7e01342fbfc",
    "text": "It is important to emphasize that the learned dynamical system, ˆF(u), is never directly used for prediction or\nforecast of extreme events. It is employed only to approximate the variational flow, and from there the OTD modes. In other words, the approximated dynamics are only used to characterize the local neighborhood of the system\nstate and its transient instabilities.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 22,
    "total_chunks": 68,
    "char_count": 370,
    "word_count": 58,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "7d975177-5f90-4e3a-8f7d-cdb43f054302",
    "text": "3.2 Mapping FTLEs to extremes We are now in a position to compute FTLEs, using the reduced-order linearized operator, ˆLr, and the process\ndescribed in Section 2.3.1 (Figure 2 - step 4). We focus on the leading FTLE, ˆΓ1(t), and train a model\nthat performs long-horizon prediction, over lead time τ, of extreme events, quantified through the observable, z(t + τ) (Figure 2\n- step 5). 
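The matrix-free Jacobian–vector product of Eq. (22) can be sketched as below. The function names are illustrative, and the test vector field is a toy example with a known Jacobian, not the learned dynamics ˆF of the paper.

```python
import numpy as np

def jvp_fd(F, u, v, eps=1e-6):
    """Approximate L(u) v = (dF/du) v by a forward difference of the vector
    field (Eq. 22); the Jacobian is never formed or stored."""
    return (F(u + eps * v) - F(u)) / eps

# Toy field F(u) = (u0^2, u0*u1) with analytical Jacobian [[2u0, 0], [u1, u0]].
F = lambda u: np.array([u[0] ** 2, u[0] * u[1]])
u = np.array([1.0, 2.0])
v = np.array([1.0, 0.0])
Lv = jvp_fd(F, u, v)   # exact directional derivative is [2.0, 2.0]
```

Because only evaluations of F are needed, the same call works whether F is an analytical right-hand side or a model fitted from snapshots.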
The predictive capability of the model in capturing such extremes is evaluated through binary classification metrics\nthat differentiate between extreme and non-extreme events, supplemented by conditional statistical measures that\nassess the forecasting skill of the precursor in the vicinity of extreme occurrences. 3.2.1 Deep learning a precursor model The objective of this step is to forecast the future evolution of the observable z(t) for times t ∈[t, t + τ], given\nthe value of the dominant FTLE, ˆΓ1(t). In addition to the value of the FTLE we will also include its time\nderivative ˆΓ′1(t). The inclusion of the derivative provides information on the instantaneous growth rate of local\ninstabilities, enriching the temporal context available to the forecasting model. Together, the two quantities form a two-channel input sequence for the forecasting model:\nπ(t) = [ ˆΓ1(t), ˆΓ′1(t) ]. The problem is formulated as a sequence-to-sequence learning task, where a nonlinear operator is trained to map the\nrecent history of the precursor to the corresponding future dissipation response over a prediction horizon [t, t + τ]. The lookback window of length ∆, partitioned into n∆ steps, provides the model with sufficient temporal context\nto capture both the amplitude and rate of change of the system's instability, while the horizon length τ and its partition into\nnτ steps specify the number of future steps to be predicted. To implement this forecast step we employ a Transformer\narchitecture (details in Appendix A.2).",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. 
Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 23,
    "total_chunks": 68,
    "char_count": 1849,
    "word_count": 293,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "9307d8ce-cea0-4751-a6d0-3f8bf1df9bb9",
    "text": "Nθ : R^(2 n∆) −→ R^(nτ), ˆZ = Nθ(Π), (23) where\nΠ = [ π(t −∆), . . . , π(t) ], ˆZ = [ ˆz(t + τ/nτ), . . . , ˆz(t + τ) ]. Here Nθ denotes the machine-learning model parameterized by θ, trained to approximate the nonlinear mapping\nbetween the temporal evolution of the precursors π(t) and the observable z(t). To train the prediction model, we employ an output-weighted loss function designed to emphasize\nthe accurate prediction of rare, high-magnitude events. Following the concept of output-weighted regression [8,41],\nwe employ a weighting scheme inversely proportional to the probability density of the target variable, thereby\namplifying the influence of rare events on the total loss. The output-weighted mean absolute error (MAEOW) loss\nis defined as LOW = Ez[ | ˆz −z | / pz(z) ], (24) where pz(z) denotes the probability density function (PDF) of the true output and acts as a weighting function\nthat adjusts the contribution of each sample according to its rarity. In practice, the empirical form of Eq. (24) is\nestimated as LOW = (1/N) Σ_{j=1}^{N} | ˆzj −zj | / pz(zj), where zj and ˆzj denote the true and predicted values of the observable at sample j, and pz(zj) is estimated via\nkernel density estimation (KDE) from the training data. This formulation increases the penalty for errors in regions where the probability density pz(zj) is small, directing\nthe optimization toward better prediction of rare and extreme events. Since each sample's contribution to the loss\nis scaled by 1/pz(zj), values with smaller probability (i.e., smaller denominators) lead to larger loss terms, while\nfrequent, nominal states contribute less. 
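The output-weighted loss of Eq. (24) can be sketched as follows. The function name and the use of a Gaussian KDE with Silverman's bandwidth rule are illustrative choices; the paper only specifies that pz is estimated by KDE from the training data.

```python
import numpy as np

def ow_mae(z_true, z_pred, z_train):
    """Output-weighted MAE (Eq. 24 sketch): each absolute error is divided by
    a kernel-density estimate of p_z at the true target, so rare targets
    dominate the loss."""
    z_true = np.asarray(z_true, float)
    z_pred = np.asarray(z_pred, float)
    z_train = np.asarray(z_train, float)
    # Gaussian KDE with Silverman's bandwidth (illustrative choice).
    h = 1.06 * z_train.std() * len(z_train) ** (-1 / 5)
    dens = np.exp(-0.5 * ((z_true[:, None] - z_train) / h) ** 2)
    pz = dens.mean(axis=1) / (h * np.sqrt(2 * np.pi))
    return float(np.mean(np.abs(z_pred - z_true) / pz))   # weights 1 / p_z(z_j)

# Equal absolute errors, but the error on a rare target (z = 3 for a standard
# normal) is penalized far more than the same error on a common target.
rng = np.random.default_rng(0)
z_train = rng.normal(size=2000)
loss_common = ow_mae([0.0], [0.1], z_train)
loss_rare = ow_mae([3.0], [3.1], z_train)
```

The comparison at the end makes the rarity weighting concrete: identical errors, very different contributions to the loss.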
3.3 Summary of the prediction algorithm The prediction algorithm can thus be summarized in the following steps (Figure 3):",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 24,
    "total_chunks": 68,
    "char_count": 1745,
    "word_count": 293,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "591cb80a-4ffa-4cab-8324-3acf3ba46510",
    "text": "Use the history of the system state, u(t), and compute the OTD modes, v1(t; u), ..., vr(t; u) up to the current\ntime, t. Figure 3: Illustration of the prediction steps: 1.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 25,
    "total_chunks": 68,
    "char_count": 171,
    "word_count": 32,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "a8ca3f11-519f-4441-8bb4-a0c68f6951c4",
    "text": "Computation of OTD modes; 2. Computation of the associated\nFTLEs; 3. Prediction of the observable of interest for extreme events. Compute the FTLEs, Γi(t; t −T), i = 1, ..., r, up to the current time, t, starting from t −T and over a\nfinite-time horizon of length T. Use the prediction map to compute the value of the observable over the prediction horizon, z(t + τ). 3.4 Performance measures for precursors of extreme events To evaluate the predictive skill of the proposed precursors, we adopt several binary classification metrics [26]. 
Specifically, each prediction outcome is categorized as one of four outcomes: a true positive (TP) when an extreme\nevent is both observed and correctly predicted, i.e., z > z⋆ when ˆz > ˆz⋆; a true negative (TN), when a non-extreme\nevent is correctly identified, i.e., z < z⋆ and ˆz < ˆz⋆; a false positive (FP), when an extreme is predicted but does\nnot occur, i.e., z < z⋆ and ˆz > ˆz⋆; and a false negative (FN), when an actual extreme event is missed by the\nmodel, i.e., z > z⋆ and ˆz < ˆz⋆. To assess the accuracy of the precursors, we employ the following criteria, which\nare effective in dealing with the strongly unbalanced character of datasets that contain rare extreme events: 1) F1-score provides a unified quantitative metric that balances the precursor's ability to both correctly identify\nand accurately predict extreme events. It combines two complementary measures: the precision, which denotes\nthe probability that an event predicted as extreme is indeed a true extreme (reflecting the model's reliability in\navoiding false alarms), and the recall, which denotes the probability that an event that is truly extreme is correctly\nidentified as such (reflecting the model's ability to capture all extreme occurrences). These quantities are defined as S(ˆz⋆) = TP(ˆz⋆) / ( TP(ˆz⋆) + FP(ˆz⋆) ) [Precision], R(ˆz⋆) = TP(ˆz⋆) / ( TP(ˆz⋆) + FN(ˆz⋆) ) [Recall]. The F1-score is computed as the harmonic mean of precision and recall: F1 = 2 × (S × R) / (S + R). This formulation penalizes models that achieve high performance on only one of the two metrics and attains its\nmaximum value, F1 = 1, when both precision and recall are perfect (i.e., the model neither generates false extremes\nnor misses true ones). However, the F1 score depends explicitly on the chosen threshold ˆz⋆. 
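The TP/FP/FN bookkeeping and the precision, recall, and F1 definitions above can be sketched directly; the function name is ours and the thresholds in the demo are arbitrary.

```python
import numpy as np

def precision_recall_f1(z, z_hat, z_star, z_hat_star):
    """Binary-classification skill of a precursor: an event is 'true' when
    z > z_star and 'predicted' when z_hat > z_hat_star."""
    event = np.asarray(z) > z_star
    alarm = np.asarray(z_hat) > z_hat_star
    tp = np.sum(event & alarm)    # extreme observed and predicted
    fp = np.sum(~event & alarm)   # false alarm
    fn = np.sum(event & ~alarm)   # missed extreme
    S = tp / (tp + fp) if tp + fp else 0.0          # precision
    R = tp / (tp + fn) if tp + fn else 0.0          # recall
    F1 = 2 * S * R / (S + R) if S + R else 0.0      # harmonic mean
    return S, R, F1

# One hit, one false alarm, one miss -> S = R = F1 = 0.5.
z = np.array([0.0, 2.0, 2.0, 0.0])
z_hat = np.array([2.0, 2.0, 0.0, 0.0])
S, R, F1 = precision_recall_f1(z, z_hat, 1.0, 1.0)
```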
2) Area under the Precision-Recall Curve (AUC): To mitigate the dependence of the evaluation metrics on the\nprediction threshold ˆz⋆, we adopt a threshold-independent measure by integrating precision and recall over the\nentire range of ˆz⋆.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 26,
    "total_chunks": 68,
    "char_count": 2546,
    "word_count": 421,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "e1c2a449-b102-4f72-ae19-62dc94fb06e6",
    "text": "Specifically, we first fix the extreme-event threshold z⋆ used to label true events and construct\nthe precision–recall (PR) curve by varying the prediction threshold ˆz⋆ and plotting precision as a function of\nrecall. This curve provides a comprehensive view of how the model's classification performance varies as the\ndecision boundary changes. The area under the PR curve (AUC) is then defined as AUC = ∫_0^1 S(R) dR = ∫_{−∞}^{∞} S(ˆz⋆) (∂R/∂ˆz⋆) dˆz⋆, (25) where S denotes precision as a function of recall R. If ˆz⋆ is set too low, most extreme events will be correctly identified (high recall) but many false positives will occur\n(low precision). Conversely, if ˆz⋆ is too high, false positives are minimized but numerous true extremes are missed\n(low recall). A robust predictor achieves simultaneously high precision and recall over a wide range of ˆz⋆ values. The AUC thus provides a scalar, threshold-independent measure of this robustness, with larger AUC values (closer\nto 1) indicating more consistent and reliable detection performance. 
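The PR-AUC of Eq. (25) can be approximated from finite data by sweeping the prediction threshold over the observed ˆz values; the step-wise (average-precision style) accumulation used here is one standard discretization, chosen by us for the sketch.

```python
import numpy as np

def pr_auc(z, z_hat, z_star):
    """Area under the precision-recall curve (Eq. 25 sketch): sweep the
    prediction threshold from high to low and accumulate precision times
    the recall increment."""
    event = np.asarray(z) > z_star
    auc, prev_R = 0.0, 0.0
    for t in np.sort(np.unique(z_hat))[::-1]:    # high -> low threshold
        alarm = np.asarray(z_hat) >= t
        tp = np.sum(event & alarm)
        fp = np.sum(~event & alarm)
        fn = np.sum(event & ~alarm)
        if tp == 0:
            continue
        R, S = tp / (tp + fn), tp / (tp + fp)
        auc += (R - prev_R) * S
        prev_R = R
    return auc

# A perfectly ranked precursor reaches AUC = 1; an inverted one does far worse.
z = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
auc_good = pr_auc(z, z, z_star=2.5)
auc_bad = pr_auc(z, z[::-1], z_star=2.5)
```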
3) Maximum adjusted area under the curve, α∗: Following [26], the maximum adjusted area under the curve\ncriterion compares the AUC associated with an extreme-event rate ω, denoted AUC(ω), to that of an uninformed\n(random) predictor whose expected AUC equals ω. The difference AUC(ω) −ω quantifies the gain in predictive\nskill relative to random guessing, and the maximum of this difference over all possible event rates defines α∗ = max_{ω∈[0,1]} [ AUC(ω) −ω ]. (26) The quantity α∗ identifies the event rate at which the predictor achieves the greatest improvement over a random\nclassifier, highlighting the regime where the model most effectively separates extreme from quiescent states. Unlike\nthe standard AUC, this measure is fully threshold-independent and particularly suitable for evaluating predictors\nin highly unbalanced datasets containing rare events.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 27,
    "total_chunks": 68,
    "char_count": 1896,
    "word_count": 292,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "ac932a2b-6d88-45de-bbc8-1c0b550baaa6",
    "text": "4) Extreme Event Count: To assess the model's ability to reproduce the temporal occurrence of extreme events,\nwe employ the extreme event count approach. This measure quantifies how well the model captures the number of\ndistinct extreme events within a given time window. Formally, a time instant tj is classified as an extreme event,\ndenoted tEE, if it satisfies tEE : { tj s.t. (∂z/∂t)|_{tj} = 0 and z(tj) > z⋆ }. The total number of extreme events occurring within the interval [t1, t2] is then given by NEE(t1, t2) = Σ_j δ_{tj,tEE}, where δ_{tj,tEE} = 1 if tj corresponds to an identified extreme event and 0 otherwise. 
To prevent spurious detections\ndue to high-frequency noise, a minimum temporal separation between successive peaks is imposed, defined as the\ncharacteristic period associated with the dominant extreme-event frequency, TEE = 1/fEE. While the approach\ndepends on two user-defined parameters (the threshold z⋆ and the minimum separation TEE), the extreme event\ncount provides a direct and interpretable measure of the model's forecasting skill. In order to compare the predicted and true counts of extreme events we evaluate the absolute difference, ∆NEE = | NEE^true −NEE^pred |,",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 28,
    "total_chunks": 68,
    "char_count": 1187,
    "word_count": 193,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "6bd3e148-2d00-4903-83bb-84c97d329c0d",
    "text": "as an indicator of how accurately the model reproduces the true frequency of extreme occurrences over time. Smaller ∆NEE values indicate better agreement between the predicted and observed event statistics. Quantification of the tail statistics To assess how closely the learned model reproduces the true probability\ndensity function pz, we compare their distributions, with particular attention to the behavior of the tails. Following [41], we employ a metric that measures the average absolute difference between the logarithms of the true and\npredicted densities over the intersection of their respective supports. The metric is normalized by the size of this\nintersection, thereby penalizing cases where the overlap between the two distributions is limited.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. 
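The extreme event count with a minimum peak separation can be sketched as follows; the function name and the greedy strongest-peak-first suppression are our illustrative choices for enforcing the TEE spacing.

```python
import numpy as np

def count_extreme_events(z, z_star, t, T_ee):
    """Count distinct extreme events: local maxima with z > z_star, keeping
    peaks at least T_ee apart so one burst is not counted several times."""
    peaks = [j for j in range(1, len(z) - 1)
             if z[j] > z_star and z[j] >= z[j - 1] and z[j] >= z[j + 1]]
    kept = []
    for j in sorted(peaks, key=lambda j: -z[j]):   # strongest peaks first
        if all(abs(t[j] - t[k]) >= T_ee for k in kept):
            kept.append(j)
    return len(kept)

# Two bursts; the first has a secondary peak 0.2 time units after the main
# one, which the T_ee = 1.0 separation suppresses -> count is 2, not 3.
t = np.arange(100) * 0.1
z = np.zeros(100)
z[20], z[21], z[22], z[23] = 1.0, 1.2, 1.0, 1.1
z[60] = 2.0
n_events = count_extreme_events(z, z_star=0.5, t=t, T_ee=1.0)
```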
Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 29,
    "total_chunks": 68,
    "char_count": 761,
    "word_count": 112,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "fd79d00d-8d89-4535-a29f-e4ce518dcd7d",
    "text": "D(pz, ˆpz) = ( 1 / |Ω(pz, ˆpz)| ) ∫_{Ω(pz,ˆpz)} | log(pz(z)) −log(ˆpz(z)) | dz, (27) where Ω(pz, ˆpz) ≈ supp(pz) ∩ supp(ˆpz), and ˆpz denotes the density estimated from the learned model. Because both pz and ˆpz are empirically approximated\nfrom finite data, their exact support is unknown, and the behavior of low-density regions is difficult to capture\naccurately. Nevertheless, since D is sensitive to both the magnitude and extent of overlap, we approximate each\ndistribution's support as the interval spanning the observed data range. While this underestimates the true width\nof the support, it provides a consistent and practical approach for computing D.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 30,
    "total_chunks": 68,
    "char_count": 651,
    "word_count": 99,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "42c9c9a8-13dc-422a-a796-5d4aa6a79812",
    "text": "4 Kolmogorov Flow as a Prototype Model for Extreme Events To demonstrate the developed scheme we employ the Kolmogorov flow. The two-dimensional Kolmogorov flow is\na canonical solution of the incompressible Navier–Stokes equations subject to a sinusoidal body force [24]. The\ngoverning equations are ∂u/∂t = −u · ∇u −∇p + ν∇2u + f, ∇· u = 0, (28) where u(x, t) ∈ R2 is the velocity field, p(x, t) is the pressure, and ν = 1/Re is the kinematic viscosity, inversely\nproportional to the Reynolds number. 
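The tail-statistics distance of Eq. (27) can be sketched on a shared density grid; the function name, the positivity mask as the common-support proxy, and the trapezoidal quadrature are our assumptions for the sketch.

```python
import numpy as np

def log_density_distance(p, p_hat, z):
    """Mean absolute log-density difference over the (approximate) common
    support, normalized by the support length (Eq. 27 sketch). p and p_hat
    are density values evaluated on the grid z."""
    mask = (p > 0) & (p_hat > 0)               # approximate common support
    zc = z[mask]
    diff = np.abs(np.log(p[mask]) - np.log(p_hat[mask]))
    integral = np.sum((diff[1:] + diff[:-1]) * np.diff(zc)) / 2  # trapezoid
    return integral / (zc[-1] - zc[0])

# Identical densities give D = 0; a unit shift of a standard normal gives
# a strictly positive distance (the log-difference is |0.5 - z|).
z = np.linspace(-5.0, 5.0, 1001)
p = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
p_shift = np.exp(-0.5 * (z - 1.0) ** 2) / np.sqrt(2 * np.pi)
d_same = log_density_distance(p, p, z)
d_shift = log_density_distance(p, p_shift, z)
```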
The external forcing is taken to be a sinusoidal shear in the x-direction, f(x) = sin(ny) e1, e1 = (1, 0)T, n ∈N.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 31, + "total_chunks": 68, + "char_count": 608, + "word_count": 108, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "ac2c82f6-ab79-4da5-bffd-305b67055b97", + "text": "The domain is the two-dimensional torus T2 = [0, 2π]2 with periodic boundary conditions. The solution is given\nby the time-dependent velocity–pressure pair (u, p). For forcing wavenumber n = 1, the Kolmogorov flow admits a stable laminar solution for all Reynolds numbers [24]. In contrast, for n > 1 and sufficiently large Reynolds numbers, the laminar solution loses stability. As demonstrated\nin [14,38], the flow undergoes a transition to spatiotemporal chaos. The resulting turbulent attractor has a highdimensional structure, with its dimension growing approximately linearly with the Reynolds number. This property\nrenders the analysis of intermittency in turbulent flows particularly challenging. In the present study, we focus\non the case n = 4 and Re = 40, where the Kolmogorov flow exhibits chaotic dynamics and evolves on a strange\nattractor. Important properties of the Kolmogorov flow are the energy input I, the energy dissipation D, and the kinetic\nenergy E. 
These quantities satisfy the energy balance law dE/dt = I −D, and are defined as: I(t) = (1/L^2) ∫_Ω u(x, t) · f(x) dx, (29), D(t) = (ν/L^2) ∫_Ω |ω(x, t)|^2 dx, (30), E(t) = (1/(2L^2)) ∫_Ω |u(x, t)|^2 dx, (31), where L = 2π is the side length of the domain Ω = [0, L]^2 and ω = ∇× u is the scalar vorticity field in two\ndimensions.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 32,
    "total_chunks": 68,
    "char_count": 1288,
    "word_count": 225,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "387d8171-f19b-43da-acdc-3fb9a003fa83",
    "text": "A ubiquitous feature of turbulent fluid flows is intermittency, manifested as sudden burst-like excursions in observable quantities. In this work, the energy dissipation D(t) is the primary observable and the quantity of interest\nused to track extreme events, as illustrated in Figure 4. We define extreme events as instances in which D(t) exceeds a threshold equal to two standard deviations above its mean.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 33,
    "total_chunks": 68,
    "char_count": 408,
    "word_count": 63,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "63ef175e-4910-41dc-8640-d2f168c0aad3",
    "text": "According to eq. (29), increases in the energy\ninput I(t) arise from transient alignment between the velocity field u and the external forcing f. 
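The diagnostics of Eqs. (29)–(31) can be sketched for a periodic velocity field on a uniform grid; the function name, the spectral vorticity computation, and the laminar sanity check are our illustrative choices, not the paper's solver.

```python
import numpy as np

def energy_diagnostics(u, v, nu, n=4):
    """Energy input I, dissipation D and kinetic energy E (Eqs. 29-31) for a
    2*pi-periodic N x N velocity field (u, v) with forcing f = sin(n*y) e1.
    Vorticity w = dv/dx - du/dy is obtained by spectral differentiation."""
    N = u.shape[0]
    L = 2 * np.pi
    y = np.arange(N) * L / N
    Y = np.meshgrid(y, y, indexing="ij")[1]       # x along axis 0, y along axis 1
    k = np.fft.fftfreq(N, d=1.0 / N)              # integer wavenumbers on [0, 2*pi)
    KX, KY = np.meshgrid(k, k, indexing="ij")
    w = np.real(np.fft.ifft2(1j * KX * np.fft.fft2(v) - 1j * KY * np.fft.fft2(u)))
    dA = (L / N) ** 2
    I = np.sum(u * np.sin(n * Y)) * dA / L**2     # Eq. (29)
    D = nu * np.sum(w**2) * dA / L**2             # Eq. (30)
    E = 0.5 * np.sum(u**2 + v**2) * dA / L**2     # Eq. (31)
    return I, D, E

# Sanity check on the steady laminar profile u = sin(n*y)/(nu*n^2), v = 0,
# for which the balance dE/dt = I - D forces I = D exactly.
N, nu, n = 64, 1 / 40, 4
y = np.arange(N) * 2 * np.pi / N
Y = np.meshgrid(y, y, indexing="ij")[1]
I, D, E = energy_diagnostics(np.sin(n * Y) / (nu * n**2), np.zeros((N, N)), nu, n)
```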
Such alignment\nproduces sharp surges in I(t), which, through the energy balance relation, translate into corresponding peaks in\nthe energy dissipation D(t). These observations indicate that the onset of extreme dissipation events is associated\nwith the growth of perturbations that become transiently aligned with the forcing direction. Figure 4: Time evolution of the energy dissipation D(t) along a trajectory of Kolmogorov flow with n = 4 and\nRe = 40. The signal exhibits small-amplitude background oscillations around D ≈0.1, punctuated by intermittent\nburst-like excursions corresponding to extreme dissipation events. 4.1 OTD modes and the Kolmogorov flow The governing equations of the Kolmogorov flow can be written in projected form as: ∂u/∂t = F(u) = P[ −u · ∇u + ν∇2u + f ], (32) where P denotes the Leray projection onto the divergence-free subspace, enforcing ∇· u = 0 and eliminating the\npressure term so that the dynamics are entirely described by the velocity field. Within this setting, the nonlinear\noperator F : U →U governs the time evolution of the velocity field according to the projected Navier–Stokes\ndynamics. While in the developed prediction framework the OTD equations are fully data-driven, we derive the variational equations for Navier–Stokes purely for the purpose of numerical comparison. Specifically, linearizing F about a state u yields\nthe linearized Navier–Stokes operator LNS(u), which acts on a perturbation field v as LNS(u)v = P[ −u · ∇v −v · ∇u + ν∇2v ]. (33) The Leray projection P again ensures that the perturbation dynamics remain within the divergence-free subspace. Substituting LNS(u; v) into the OTD evolution equation (9) yields the evolution of the ith OTD mode, ˙vi = LNS(u; vi) −⟨LNS(u; vi), vi⟩vi −Σ_{k=1}^{i−1} [ ⟨LNS(u; vi), vk⟩+ ⟨LNS(u; vk), vi⟩ ] vk, (34)",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. 
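The right-hand side of the OTD evolution (Eq. 34) can be sketched for a generic linear operator; plain Euclidean inner products stand in for the L2 inner product of the flow problem, and the function name is ours. A useful structural check, verified below, is that for orthonormal modes the right-hand side is tangent to the orthonormality constraint: V⊤dV + dV⊤V = 0.

```python
import numpy as np

def otd_rhs(V, LV):
    """Right-hand side of the OTD equation (Eq. 34 sketch) for modes stacked
    as columns of V, given LV = action of the linearized operator on V."""
    r = V.shape[1]
    dV = np.zeros_like(V)
    for i in range(r):
        # remove the self-projection and couplings to lower-index modes
        rhs = LV[:, i] - (V[:, i] @ LV[:, i]) * V[:, i]
        for k in range(i):
            rhs -= (V[:, k] @ LV[:, i] + V[:, i] @ LV[:, k]) * V[:, k]
        dV[:, i] = rhs
    return dV

# Orthonormality preservation: V^T dV is antisymmetric for orthonormal V.
rng = np.random.default_rng(1)
A = rng.standard_normal((10, 10))                   # stand-in linearized operator
V = np.linalg.qr(rng.standard_normal((10, 3)))[0]   # orthonormal modes
dV = otd_rhs(V, A @ V)
M = V.T @ dV
```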
Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 34,
    "total_chunks": 68,
    "char_count": 1935,
    "word_count": 312,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "a0f529a1-b33a-46bb-aa32-99348607076f",
    "text": "which are divergence-free, mutually orthogonal, and normalized in the L2 sense. This initialization is performed\nboth for the fully data-driven OTD equation and for the exact one, derived above (computed only for the purpose\nof numerical comparison). In practice, the OTD system Eq. (9) is integrated on a 64 × 64 spatial grid using a\nfourth-order Runge–Kutta (RK4) scheme. We apply the proposed data-driven framework to the two-dimensional Kolmogorov flow using time-resolved snapshots of the velocity field {u(tk)}, k = 0, . . . , N. Snapshots are obtained by integrating the Navier–Stokes equations (32) in\nFourier space over a time horizon of 30,000 time units, using a temporal discretization of ∆t = 0.1 time units. This procedure yields a total of N = 300,000 state snapshots. Of these, 70% are used for training the proposed\nframework, while the remaining 30% are reserved for testing and performance evaluation. All results reported in\nthis section correspond to the test dataset.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 36,
    "total_chunks": 68,
    "char_count": 976,
    "word_count": 152,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "fd628af6-ea14-47a9-843e-896898667d8e",
    "text": "Extreme events are quantified by the energy dissipation rate D(t),\nwhich serves as the target observable, i.e., z(t) = D(t). 
We first examine the data-driven computation of FTLEs using the reduced-order formulation based on OTD\nmodes, relying solely on the time-resolved snapshots of the velocity field. We then evaluate the predictive skill of\nthe leading FTLE as a mechanism-based precursor by evaluating its ability to forecast the future evolution of D(t)\nover the prescribed lead time t + τ with particular emphasis on its ability to capture extreme dissipation events. 5.1 Reduced-order stability measures and precursors of dissipation The first step is to obtain a good approximation of the dynamics ˆF. A detailed analysis of this step is provided\nin Appendix B.1.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 37, + "total_chunks": 68, + "char_count": 771, + "word_count": 122, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "6ccef462-ccbf-4816-b156-25b17c960807", + "text": "Based on the learned operator, we approximate the linearized operator LNS and proceed with\nthe computation of the OTD modes. A comparison between the data-driven and equation-based OTD modes is\npresented in Appendix B.2. Dependence of FTLE on the number of OTD modes and finite-time horizon, T. As outlined in the\nmethodological framework of Section 3 (step 3), the reduced FTLEs are computed from the data-driven operator\nˆLr(t), which represents the projection of the learned tangent dynamics onto an r-dimensional OTD subspace. This reduced representation captures the most dynamically active directions responsible for transient growth and\nfinite-time amplification. 
To evaluate how the dimensionality of the data-driven OTD subspace, r, influences the\naccuracy of the dominant FTLE, we consider three configurations with r = 2, 6, and 8 modes and evaluate the corresponding leading FTLE, ˆΓ1, in each case. Figure 5(a) displays these three cases. Figure 5: (a) Leading reduced-order finite-time Lyapunov exponent (FTLE), ˆΓ1, computed within subspaces\nspanned by r = 2, 6, 8 OTD modes. The results show that the FTLE converges for r ≥ 6, whereas smaller\nsubspaces (r = 2) underestimate transient growth rates. (b) Leading reduced-order FTLE, ˆΓ1, computed over\ndifferent integration horizons T = 5, 10, and 20 s; short horizons (T = 5 s) resolve a sequence of localized instability\nepisodes that precede the main event, while increasing T smooths these fluctuations, diminishing the FTLE's\nsensitivity as an early-warning indicator. (c) Comparison between the data-driven FTLE and the one\nobtained from the analytical variational equations, for r = 6. (d) Superposition of the dominant FTLE and the\nobservable of interest, D(t) (both quantities are normalized for clarity). The close temporal alignment between\nthe peaks demonstrates that the FTLE effectively captures the buildup of instability preceding dissipation bursts.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 38,
    "total_chunks": 68,
    "char_count": 1934,
    "word_count": 292,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "19e8f70f-6d18-453e-a563-4bea49208e73",
    "text": "We note that the two-mode approximation underestimates the growth rate, indicating that it fails to encompass all dominant instability\ndirections. 
Increasing the subspace dimension leads to rapid convergence of the estimated maximum FTLE, with\nthe results for r = 6 and r = 8 being nearly identical. These findings indicate that a six-dimensional OTD subspace\ncaptures the essential transient growth dynamics with sufficient accuracy, representing the most efficient choice\nthat balances fidelity and computational cost. A further comparison with the FTLE computed from the analytical\nOTD equations confirms that the fully data-driven computation accurately captures the OTD directions (Figure 5(c)). An additional quantity examined is the finite-time horizon T, over which the FTLEs are evaluated. To\nassess the sensitivity of the FTLEs to the interval length, the leading FTLE, ˆΓ1, is computed for T = 5, 10, and\n20 s. Figure 5(b) shows these three representative cases.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 39,
    "total_chunks": 68,
    "char_count": 968,
    "word_count": 148,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "062abe60-3c69-467c-b3d2-18f3bbd1c2de",
    "text": "The shortest horizon (T = 5 s, red curve) yields the\nmost distinct and temporally aligned precursor signal, effectively resolving the buildup of small-scale instabilities\nthat precede dissipation bursts. Intermediate horizons (T = 10 s, blue curve) retain partial correspondence with\nthe dissipation dynamics but exhibit increasing temporal smoothing and delay. The longest horizon (T = 20 s,\ngreen curve) performs the worst, producing a heavily smoothed and phase-lagged response that obscures short-term fluctuations. 
These results demonstrate that the predictive sharpness of ˆΓ1 deteriorates with increasing T,\nemphasizing the need to select a horizon consistent with the characteristic time scales of transient amplification. Accordingly, T = 5 s is adopted in this work as the optimal horizon for subsequent analyses. Each FTLE represents the finite-time exponential growth rate of perturbations along a specific direction within\nthe evolving OTD subspace. The leading FTLE, ˆΓ1, corresponds to the most rapidly amplifying perturbation\ndirection and thus reflects the dominant local instability mechanism at a given instant.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 40, + "total_chunks": 68, + "char_count": 1129, + "word_count": 163, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "beb96ed7-31c0-4a59-8662-cb45cd47f303", + "text": "Figure 5(d) examines the\ntemporal relationship between ˆΓ1 and the energy dissipation rate, D. Peaks in ˆΓ1 precede sharp increases in D,\nnotably near t ≈120 s and t ≈350 s. This consistent lead–lag behavior reveals that transient instability episodes\nprecede energy dissipation bursts, indicating the effectiveness of ˆΓ1 as a precursor of upcoming extremes. From a\nphysical perspective, the peaks in ˆΓ1 correspond to intervals of intensified local stretching and strain amplification,\nduring which perturbations grow rapidly and generate sharper velocity and vorticity gradients. The increase in\ngradient magnitude accelerates the transfer of energy toward smaller scales, thereby initiating the forward cascade\nthat ultimately manifests as energy dissipation bursts. 
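The leading reduced-order FTLE described above has a compact numerical core: it is the maximal singular-value growth of the fundamental solution of the linearized dynamics restricted to the r-dimensional subspace, normalized by the horizon T. The following Python sketch illustrates this idea; it is not the paper's code, and the frozen matrix `Lr` is a hypothetical stand-in for the OTD-projected linearized operator.

```python
import numpy as np

def leading_ftle(Lr, T, n_steps=1000):
    """Leading finite-time Lyapunov exponent over horizon T for the
    reduced linear system dPhi/dt = Lr @ Phi: integrate the fundamental
    solution Phi and return (1/T) * log of its largest singular value."""
    r = Lr.shape[0]
    Phi = np.eye(r)
    dt = T / n_steps
    for _ in range(n_steps):
        Phi = Phi + dt * (Lr @ Phi)   # explicit Euler step
    sigma_max = np.linalg.svd(Phi, compute_uv=False)[0]
    return np.log(sigma_max) / T

# For a frozen diagonal operator the FTLE approaches the largest eigenvalue.
Gamma1 = leading_ftle(np.diag([0.8, -0.3]), T=5.0)
```

In the full method the operator varies along the trajectory and the subspace itself evolves under the OTD equations, so Φ is accumulated with a time-dependent reduced operator; the constant-operator case above only fixes the ideas.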
5.2 Energy dissipation prediction", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 41, + "total_chunks": 68, + "char_count": 804, + "word_count": 114, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "8b4eb015-b152-4fab-9f26-c666255f9d6b", + "text": "To assess the effectiveness of the selected precursor for long-horizon forecasting, we evaluate the Transformer-based\nmodel (Section 3.2.1) across multiple lead times τ ∈{2, 5, 7, 10, 12, 15}. The model input is a two-channel precursor\nsequence Π, comprising the leading FTLE ˆΓ1 and its time derivative ˆΓ′1, sampled over a lookback window of length ∆up to the present time t. See Appendix A.2 for details of the hyperparameter tuning. The sequence-to-sequence\nmodel (Eq. (23)) predicts the evolution of energy dissipation ˆZ over the interval [t, t + τ]; however, evaluation\nfocuses on the terminal prediction ˆZ(t + τ) as a measure of long-horizon predictive skill. Training over the full\nforecast window provides supervision at intermediate times, guiding the model to learn the continuous temporal\nevolution toward extreme events.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. 
Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 42, + "total_chunks": 68, + "char_count": 835, + "word_count": 129, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "8eeadad6-46d3-4d59-94aa-3382dd7f5ee4", + "text": "To evaluate the proposed FTLE-based precursor within the context of existing methodologies, we compare it\nagainst a Fourier-based precursor that has been widely used in prior studies. Specifically, [21] identified the Fourier\nmode α(1, 0) ∈C as an effective indicator of extreme events in Kolmogorov flow, with systematic reductions in\n|α(1, 0)| preceding bursts in energy dissipation. This precursor has since been adopted in subsequent extreme-event\nforecasting studies. For instance, [5] utilized the real and imaginary components of α(1, 0) as input channels in\ntime-series models to predict extreme energy dissipation, demonstrating predictive capability over short forecast\nhorizons (τ < 5). In this work, we compare this established Fourier-based approach with the proposed FTLE-based\nprecursor to assess their relative performance at longer forecast horizons.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 43, + "total_chunks": 68, + "char_count": 867, + "word_count": 124, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "7e07c914-66fb-4b2c-9b1a-74ec8a5fc32a", + "text": "Extreme events are defined as instances in which the energy dissipation exceeds its mean by more than two\nstandard deviations. 
To quantify extreme-event detection, we employ the binary classification metrics described in\nSection 3.4: the F1-score, the area under the precision–recall curve (AUC), the adjusted AUC metric α∗, and the\nabsolute deviation in the number of detected extremes, |∆NEE|. Figure 6 summarizes the results across prediction\nhorizons τ. All metrics exhibit decreasing performance with increasing lead time τ, reflecting the growing difficulty\nof long-horizon prediction.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 44,
    "total_chunks": 68,
    "char_count": 591,
    "word_count": 85,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "a7cc55ea-e30d-4741-a338-6a29c20b39c2",
    "text": "Nevertheless, the FTLE-based predictor consistently outperforms the Fourier-mode-based model, achieving higher precision–recall performance and smaller event-count deviations across all horizons. This advantage is most pronounced for τ ≤10, where the FTLE-based approach maintains near-optimal F1 and\nAUC values. Although performance degrades for both methods at τ = 15, the FTLE-based predictor retains a\nsubstantial performance advantage. Figure 7 compares predicted and true energy-dissipation signals for representative horizons τ = 10 and τ = 15. The\nleft column shows forecasts based on the FTLE-based precursor, while the right column corresponds to the Fourier-mode precursor; blue curves denote ground truth and red curves the predictions. At τ = 10, the FTLE-based\nmodel accurately reproduces the dissipation dynamics, with predicted peaks closely aligned in timing, amplitude,\nand frequency with observed bursts.
Even at τ = 15, the FTLE-informed forecasts remain largely coherent\nwith the ground truth, capturing the timing and magnitude of most large-amplitude events, albeit with a modest\nincrease in false positives. In contrast, forecasts based on the Fourier precursor deteriorate with increasing horizon:\npartial agreement is observed at τ = 10, while at τ = 15 temporal alignment is largely lost and fluctuations are\noverestimated, underscoring the limited ability of Fourier observables to constrain long-horizon dynamics. Figure 8 compares the probability density functions of the predicted and true energy dissipation for different lead\ntimes τ, with discrepancies quantified by the metric D (Eq. (27)). At short horizons (τ = 5), low D values indicate\naccurate reconstruction of both the bulk statistics and the onset of the tail. As τ increases, discrepancies arise\nprimarily in the high-dissipation regime, where capturing heavy-tailed behavior becomes essential. The FTLE-based forecasts preserve good agreement in the tail up to τ ≈10–12, whereas the Fourier-based forecasts show\npronounced tail attenuation and probability shifts toward moderate dissipation, resulting in a sharp increase in D. We have introduced a fully data-driven framework for long-horizon prediction of extreme events in high-dimensional\nchaotic dynamical systems, with emphasis on extremes generated by internal transient instabilities. The key idea\nis to move beyond purely statistical indicators and instead construct interpretable, dynamics-informed precursors\nthat encode the dynamical pathways responsible for extreme-event formation.
We have relied on the concept of FTLEs which provide a natural description of the transient instability growth that\nprecedes extreme events; however, since their classical computation is prohibitively expensive in high-dimensional\nsystems, we adopt a reduced-order, data-driven approach based on OTD modes to compute FTLEs efficiently.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 45, + "total_chunks": 68, + "char_count": 2872, + "word_count": 405, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "46d8462a-8397-40eb-8ea3-d15625d6b84b", + "text": "Since\nthe governing equations are unavailable and only state observations are assumed, the system dynamics required for\nOTD evolution are approximated directly from data and used solely to infer the local variational behavior. The\nresulting leading FTLE provides a physically grounded measure of finite-time instability growth, capturing the\ndynamical buildup preceding extreme excursions. We demonstrate how these FTLE-based precursors, computed\nusing information available up to the present time, can be integrated with a Transformer-based architecture to\nenable long-horizon prediction of extreme events. The developed architecture is applied to the two-dimensional Kolmogorov flow, where extremes manifest as intermittent bursts of energy dissipation. Across multiple evaluation criteria—including binary classification metrics\nand statistical consistency measures—the FTLE-based approach consistently achieves higher predictive skill and\nrobustness, particularly at forecast horizons beyond those previously attainable with Fourier-based precursors. 
Figure 6: Performance comparison of binary classification metrics for forecasting extreme events using the FTLE-based precursor (black circles) and the Fourier-mode-based precursor (grey squares) as a function of the prediction\nhorizon τ. Panels show: (a) the F1-score, (b) the area under the precision–recall curve (AUC), (c) the adjusted\nAUC metric α∗, and (d) the absolute deviation in the number of detected extremes between prediction and\ntruth, |∆NEE|. The results indicate that the FTLE-based predictor maintains higher precision–recall performance\nand smaller event-count deviations across increasing forecast horizons, underscoring its superior robustness and\npredictive skill in capturing the onset of extreme events. Figure 7: Predicted and true energy dissipation D(t) at prediction horizons τ = 10 and τ = 15 time units. Blue\ncurves denote the ground-truth dissipation signal and red curves indicate the model predictions. The left column\nshows forecasts obtained using the leading FTLE-based precursor, while the right column corresponds to forecasts\nbased on the Fourier coefficient approach [5]. Figure 8: Probability density functions of the Kolmogorov flow energy dissipation D(t) for different prediction\nhorizons τ. The left column shows results based on predictions using the leading FTLE precursor ˆΓ1(t), while\nthe right column corresponds to predictions using the Fourier coefficient α(1, 0, t). The blue curves represent the\nground-truth distributions, and the red curves correspond to the predicted distributions. The reported value of\nD quantifies the discrepancy between the two PDFs and is defined in Eq. (27).",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. 
Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 46, + "total_chunks": 68, + "char_count": 2694, + "word_count": 372, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "f2a76933-deaa-4f88-aecc-5104f284443d", + "text": "Specifically, when FTLE-based precursors are used, predictive performance degrades more gradually with increasing lead time.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 47, + "total_chunks": 68, + "char_count": 124, + "word_count": 15, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "53e776c5-e7b4-485f-8eee-3840ec1fb489", + "text": "These results demonstrate that explicitly encoding transient instability mechanisms enables a\nmeaningful extension of practical prediction horizons for rare extreme events. As the proposed framework relies\nonly on time-resolved state observations and data-driven approximations of local dynamics, it is directly applicable to a broad class of high-dimensional systems in which extremes arise from transient instabilities, including\nturbulent flows, geophysical systems, and other complex multiscale dynamics. The research has been supported by the Vannevar Bush Faculty Fellowship N000142512059 as well as the AFOSR\ngrant FA9550-23-1-0517. No external datasets were used.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. 
Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 48, + "total_chunks": 68, + "char_count": 671, + "word_count": 89, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "22262c07-dbb5-40e8-8951-6d42d41dd87b", + "text": "All results were generated computationally. The code required to reproduce the\nfindings and figures is available at https://github.com/Eirini-Katsidoniotaki/Precursors_Extreme_Events. Appendix A: Details on the Methodology A.1 Approximating the system dynamics We construct a data-driven approximation of the nonlinear operator which governs the instantaneous evolution of the system state as described in Eq. (1). Since the governing equations\nare not assumed to be available, F is inferred directly from observed state trajectories. Let {u(tk)}Nk=0 denote discrete snapshots of the system. Time derivatives are estimated from the data using a\nfourth-order central finite-difference scheme, yielding the training dataset D = {(u(tk), ˙u(tk))}Nk=1. We note that F is an unbounded operator in the infinite-dimensional function space due to the presence of differential terms that amplify high-frequency components.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 49, + "total_chunks": 68, + "char_count": 913, + "word_count": 123, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "c7422c5f-9a7a-4c32-820b-4e5b6ab74033", + "text": "This renders direct learning ill-conditioned in the continuous setting. 
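The derivative-estimation step just described (fourth-order central differences on equally spaced snapshots) can be sketched as follows. This is an illustrative 1D example on a synthetic trajectory, not the actual velocity-field data; the same 5-point stencil applies componentwise to full snapshots.

```python
import numpy as np

# Fourth-order central finite differences on equally spaced snapshots
# u(t_k), yielding the training pairs (u(t_k), u_dot(t_k)).
def central_diff_4th(u, dt):
    """u: array of shape (N, ...) of snapshots; returns du/dt on the
    interior points k = 2, ..., N-3 using the 5-point stencil."""
    return (-u[4:] + 8.0 * u[3:-1] - 8.0 * u[1:-3] + u[:-4]) / (12.0 * dt)

t = np.linspace(0.0, 1.0, 101)
u = np.sin(2.0 * np.pi * t)          # synthetic scalar trajectory
du = central_diff_4th(u, t[1] - t[0])
# interior estimate matches the exact derivative 2*pi*cos(2*pi*t) to
# fourth order in the sampling interval
```

Note that the first and last two snapshots carry no derivative estimate, which is one reason a sufficiently fine and long sampling of the trajectory is assumed.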
To ensure numerical stability and well-posedness, the learning problem is formulated on the\nfinite-dimensional state space Ud associated with the numerical discretization of the trajectory, where ˆF provides a\nstable approximation suitable for subsequent linearization and instability analysis. We then learn an approximation\nof the dynamics via a nonlinear map\nˆF : u(t) ↦ ˆ˙u(t). The learned operator ˆF is used to approximate the local linearized dynamics required for the\ncomputation of data-driven OTD modes. In particular, ˆF is substituted into the finite-difference formulation of\nEq. (22) to evaluate Jacobian–vector products, Lv, enabling the efficient computation of reduced-order instability\nmeasures. Three neural architectures with complementary inductive biases are employed; their specific constructions are\ndetailed in the following section.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 50,
    "total_chunks": 68,
    "char_count": 930,
    "word_count": 126,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "aee27e75-5f5e-49f1-a368-eb07d699b8a0",
    "text": "In the present study, the temporal resolution of the data is sufficiently fine\nto permit accurate estimation of time derivatives ˙u(t). For datasets sampled more coarsely in time, it may be\npreferable to infer the discrete flow map S : u(t) → u(t + h) instead; see, e.g., [34,36,46]. A.2 Time series forecasting A.2.1 Transformer-based model The objective is to learn a nonlinear operator that maps the temporal history ∆ of the precursor to the future\nevolution of the observable z(t) over a prediction horizon τ, discretized into nτ steps,\nNθ : Π ↦ ˆZ = [ˆz(t + τ/nτ), . . . , ˆz(t + τ)] ∈ Rnτ .
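Because only the learned map ˆF is available, the Jacobian–vector products Lv needed above can be formed matrix-free by finite differences in state space. A minimal sketch follows; the cubic `F_hat` is a toy stand-in for the trained network, not the paper's model.

```python
import numpy as np

def jacobian_vector_product(F_hat, u, v, eps=1e-6):
    """Approximate L v = (dF/du)|_u v via a central difference of the
    learned state-to-derivative map F_hat along the direction v."""
    return (F_hat(u + eps * v) - F_hat(u - eps * v)) / (2.0 * eps)

# Toy dynamics with known Jacobian diag(1 - 3 u^2), for verification only.
F_hat = lambda u: -u**3 + u
u = np.array([0.5, -1.0])
v = np.array([1.0, 0.0])
Lv = jacobian_vector_product(F_hat, u, v)
# analytic value: diag(1 - 3 u^2) @ v = [0.25, 0.0]
```

One such product per OTD mode and time step is all the reduced-order formulation requires, which is what makes the approach tractable in high dimensions.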
We employ a time-series forecasting model based on the Transformer architecture [53,54]. The model learns\nthe operator Nθ by constructing context-aware representations of the two-channel precursor sequence Π. Each precursor vector is mapped to a latent representation via a learned convolutional embedding that captures\nlocal temporal structure, and augmented with a fixed sinusoidal positional encoding to preserve temporal ordering. The encoder embeds the precursor sequence into a latent space of dimension d, where Win ∈ R2×d is a learnable projection matrix and P denotes positional encoding, which injects temporal\nordering into the sequence. Each encoder layer consists of a multi-head self-attention (MHSA) block followed by a position-wise feed-forward\nnetwork (FFN), combined with residual connections and layer normalization: Z(ℓ) = E(ℓ−1) + MHSA(E(ℓ−1)), E(ℓ) = Z(ℓ) + FFN(Z(ℓ)). For each attention head, the queries, keys, and values are given by Q = EWQ, K = EWK, V = EWV ,",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 51,
    "total_chunks": 68,
    "char_count": 1586,
    "word_count": 258,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "5af71944-2710-4ec8-9029-8783741e661d",
    "text": "The encoder output Henc provides a compact representation of the precursor history, capturing both short- and\nlong-range temporal dependencies in the instability dynamics. The decoder generates predictions conditioned on the encoded precursor representation. The initial\ndecoder input sequence D(0) consists of the most recent nℓ observed values of the target observable, followed by\nplaceholders for the future prediction horizon, where nℓ denotes the label length.
Each decoder layer then applies\nmasked self-attention, encoder–decoder (cross) attention, and a feed-forward network: U(ℓ) = D(ℓ−1) + MaskedAttn(D(ℓ−1)), V(ℓ) = U(ℓ) + CrossAttn(U(ℓ), Henc), D(ℓ) = V(ℓ) + FFN(V(ℓ)). The masking enforces causality, ensuring that predictions at future times depend only on precursor information\navailable up to the current forecast step. A final linear projection maps the decoder output to the predicted\nobservable,\nˆZ = DoutWout. Informer-specific efficiency. To enable efficient learning from long precursor sequences, we adopt the Informer\narchitecture, which replaces full self-attention with a probabilistic sparse attention mechanism.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 53,
    "total_chunks": 68,
    "char_count": 1138,
    "word_count": 161,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "cff97216-1b94-4be2-87f3-842d8a8ea4a5",
    "text": "This reduces the\ncomputational complexity from O(n∆²) to approximately O(n∆ log n∆). In addition, convolutional distillation\nlayers are employed within the encoder to progressively downsample the temporal dimension, retaining the most\ninformative components of the precursor history while improving scalability. From a dynamical systems perspective, the model learns a non-Markovian, data-driven approximation of the mapping\nˆz(t + τ) ≈ Nθ(π(t), π(t − ∆t), . . .), where the attention mechanism adaptively identifies and weights the most dynamically informative segments of\nthe precursor history for long-horizon forecasting of extreme events. Look-back window length: We performed a systematic study of the look-back length ∆ to evaluate its effect\non predictive performance.
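A minimal single-head version of the masked self-attention used in the decoder can be written as follows. This is an illustrative sketch with full dense attention, not Informer's ProbSparse variant, and the shapes and random weights are assumptions for demonstration only.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def masked_self_attention(E, WQ, WK, WV):
    """Single-head causal attention: position i attends only to j <= i."""
    n, _ = E.shape
    Q, K, V = E @ WQ, E @ WK, E @ WV
    scores = Q @ K.T / np.sqrt(WQ.shape[1])
    mask = np.triu(np.ones((n, n), dtype=bool), k=1)   # future positions
    scores[mask] = -np.inf                             # enforce causality
    return softmax(scores) @ V

rng = np.random.default_rng(0)
n, d = 6, 4
E = rng.standard_normal((n, d))
W = [rng.standard_normal((d, d)) for _ in range(3)]
out = masked_self_attention(E, *W)
```

The upper-triangular mask is what realizes the causality property described above: row i of the attention weights is zero for all j > i, so the first position reduces exactly to its own value vector.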
Specifically, we tested look-back windows of ∆ = 2τ, 3τ, 4τ, and 5τ, where τ denotes\nthe characteristic time scale. The results showed that a look-back length of 4τ provided the most accurate time\nseries predictions, achieving the best performance under the AUC criterion. Throughout all experiments, the label\nlength was fixed to nℓ = ∆/2.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 54,
    "total_chunks": 68,
    "char_count": 1111,
    "word_count": 162,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "543c506d-7610-4bc0-a030-b8f770ee1a1b",
    "text": "Hyperparameter tuning. In addition to the look-back window analysis, we performed hyperparameter tuning\nto optimize the model performance. The hyperparameters considered included the attention factor, embedding\ndimension, number of attention heads, numbers of encoder and decoder layers, feedforward network dimension,\nand dropout rate. The attention factor was fixed to 7, which provides a favorable balance between contextual\nrepresentation in the probabilistic attention mechanism and computational efficiency. The embedding dimension\nwas varied in {128, 256, 512}, while the number of attention heads was fixed at 8. The number of encoder layers\nwas chosen from {2, 3, 4}, and the number of decoder layers from {2, 3}. The feedforward network dimension was\nvaried in {256, 512, 1024}, and the dropout rate in {0.05, 0.1, 0.2}. Hyperparameter tuning was conducted using a random search strategy for prediction horizons τ ∈ {10, 15}. Model\nperformance was evaluated using the AUC metric.
Based on predictive performance and computational efficiency,\nthe final configuration was selected as follows: attention factor = 7, embedding dimension = 256, attention heads\n= 8, encoder layers = 3, decoder layers = 3, feedforward dimension = 1024, and dropout rate = 0.1. This\nconfiguration yielded stable training and consistently strong performance across prediction horizons.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 55, + "total_chunks": 68, + "char_count": 1370, + "word_count": 201, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "09164354-2da5-4754-8211-59d40cccd5ac", + "text": "Appendix B: Details on the Results B.1 Reconstruction of the approximated dynamics, ˆF B.1.1 Deep learning models used for the approximation To reconstruct the data-driven operator ˆF of the Kolmogorov flow, we evaluate several neural architectures designed\nto infer the time derivative of the velocity field directly from state snapshots, ˆ˙u = ˆF(u). The operator ˆF is learned\non a 64 × 64 spectral grid defining the discrete state space Ud, using velocity field snapshots spanning 30,000 time\nunits with temporal resolution ∆t = 0.1. Seventy percent of the data are used for training and the remaining\nthirty percent for testing, and all results reported in this section correspond to the test set.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. 
Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 56, + "total_chunks": 68, + "char_count": 702, + "word_count": 113, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "aa7564fa-ac0a-4cfd-9a93-8c8324c2fd5c", + "text": "All models are trained using the Adam optimizer with a batch size of 64, learning rate 10−3, and weight decay 10−4. A step-based learning-rate scheduler is applied to stabilize training by reducing the learning rate after a prescribed\nnumber of epochs. Training is carried out for 300 epochs, with periodic checkpointing to ensure reproducibility\nand limit overfitting.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 57, + "total_chunks": 68, + "char_count": 369, + "word_count": 57, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "1681c819-591d-4527-81d6-952a9289bd09", + "text": "Fourier Neural Operator (FNO). We employ the FNO [46] to approximate the nonlinear operator F : Ud →Ud, ˆ˙u = ˆF(u), owing to its ability to learn mappings between function spaces while preserving nonlocal and multiscale structure. In the present setting, the FNO provides a discrete realization of an operator-learning framework acting on the\nfinite-dimensional subspace Ud induced by the spectral discretization. Each FNO layer updates the feature field u(l) according to", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. 
Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 58,
    "total_chunks": 68,
    "char_count": 473,
    "word_count": 72,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "eb6ec0e9-8052-4f2e-9945-db8182375d3e",
    "text": "u(l+1)(x) = σ( W u(l)(x) + F−1( R · F(u(l)) )(x) ), where F and F−1 denote the Fourier and inverse Fourier transforms, R is a learnable linear operator acting on a\ntruncated set of Fourier modes, W is a pointwise linear map, and σ is a nonlinear activation. This formulation\nenables efficient representation of global interactions through spectral convolutions while retaining local nonlinear\neffects via pointwise operations. In our implementation, the FNO employs nmodes = (32, 32) retained Fourier modes, 7 Fourier layers, and 64 hidden\nchannels per layer, with two input and output channels corresponding to the velocity components. The model\nis trained using paired state–derivative samples, D, to minimize the discrepancy between predicted and reference\ntime derivatives.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 59,
    "total_chunks": 68,
    "char_count": 773,
    "word_count": 120,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "d5ed16b1-fb81-45d6-8767-6bf9dc03c8ad",
    "text": "Residual UNet++ (ResUNet++). We employ the ResUNet++ architecture [28], an encoder–decoder convolutional network with residual and skip connections, to approximate the nonlinear operator ˆF.
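The FNO layer update above can be illustrated with a single-channel 1D sketch; the paper's implementation is 2D with (32, 32) retained modes and 64 channels, and the scalar `W` and complex weights `R` here are toy assumptions.

```python
import numpy as np

def fourier_layer(u, W, R, n_modes):
    """One FNO-style update: pointwise linear term W*u plus a spectral
    convolution that multiplies the lowest n_modes Fourier coefficients
    by learnable weights R, then a pointwise nonlinearity."""
    u_hat = np.fft.rfft(u)
    filtered = np.zeros_like(u_hat)
    filtered[:n_modes] = R * u_hat[:n_modes]      # truncate + multiply
    spectral = np.fft.irfft(filtered, n=u.size)   # back to physical space
    return np.tanh(W * u + spectral)              # nonlinearity sigma

x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
u = np.sin(x)
R = np.ones(8, dtype=complex)                     # identity spectral weights
out = fourier_layer(u, W=0.5, R=R, n_modes=8)
```

With identity weights on the retained modes the single-mode input passes through the spectral branch unchanged, so the layer reduces to tanh(1.5 sin x); training replaces `R` and `W` with learned values.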
The architecture\nincorporates residual blocks for stable training, squeeze-and-excitation (SE) modules for adaptive channel-wise\nfeature recalibration, and attention mechanisms to enhance multiscale representation and focus on dynamically\nrelevant regions. Multiscale context is captured through an Atrous Spatial Pyramid Pooling (ASPP) module at the bottleneck,\nenabling aggregation of information across multiple receptive fields and supporting simultaneous representation of\nlarge-scale vortical structures and fine-scale gradients. The decoder employs attention-guided feature fusion and\nresidual refinement to recover spatial detail while preserving global coherence. A final 1 × 1 convolution projects\nthe decoded features to the predicted time derivative ˆ˙u. In our implementation, the network consists of four encoder stages with channel dimensions (64, 128, 256, 512) and\nthree decoder stages with (128, 64, 32). The ASPP module uses dilation rates {1, 6, 12, 18}.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 60, + "total_chunks": 68, + "char_count": 1165, + "word_count": 155, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "88ec1fcc-54f8-4eca-8eec-be70eb97c18b", + "text": "All convolutions use\n3 × 3 kernels with batch normalization and SiLU activations, replacing the original ReLU to improve smoothness\nand differentiability, which is beneficial for downstream Jacobian and OTD computations. Overall, ResUNet++ provides a strong inductive bias for multiscale flow reconstruction, balancing global context\nand fine-scale spatial detail in the approximation of the Kolmogorov flow dynamics. Residual CNN (ResCNN). 
As a baseline model, we employ a ResCNN to approximate the nonlinear operator\nˆF using purely local spatial interactions. The architecture consists of an initial convolutional layer followed by six\nresidual convolutional blocks, each containing two 3×3 convolutions with batch normalization, Tanh activation, and\ndropout. Residual skip connections enable the network to learn incremental corrections to the input representation,\nimproving optimization stability and preserving spatial information across layers. The network terminates with a final convolution projecting the features to the two output channels corresponding\nto the predicted time derivatives ˆ˙uh. While the ResCNN lacks explicit mechanisms to capture global or multiscale\ninteractions, it provides a lightweight and computationally efficient baseline that isolates the role of local nonlinear\nfeature extraction in modeling the Kolmogorov flow dynamics. In chaotic PDE systems, such as the Kolmogorov flow, the target field exhibits strongly multiscale\nbehavior, containing both low-frequency, large-scale structures and high-frequency, small-scale fluctuations.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 61,
    "total_chunks": 68,
    "char_count": 1573,
    "word_count": 210,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "f3a869b7-a201-43cb-8d91-05f6ba321382",
    "text": "When\ntraining using the standard mean-square error or L2 loss, the metric quantifies only the average difference in\namplitude across space. Consequently, the model can reproduce the coarse, low-frequency patterns while neglecting\nfine-scale gradients or oscillations.
However, these high-frequency details are precisely those that encode key\nphysical quantities such as energy dissipation, vorticity, and turbulent structures.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 62, + "total_chunks": 68, + "char_count": 426, + "word_count": 56, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "4096bef6-7f77-435f-b546-8660df3e9247", + "text": "Minimizing only the L2 loss\ntherefore leads to over-smoothed predictions and a loss of physical accuracy. To address this limitation, several studies [9, 17, 31] have introduced Sobolev training, in which the loss function\nincorporates higher-order derivatives of the prediction error. This approach penalizes discrepancies not only in\nthe field values but also in their spatial derivatives, thereby improving the representation of fine-scale physical\nfeatures. We adopt the same principle in the training of our model.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. 
Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 63,
    "total_chunks": 68,
    "char_count": 519,
    "word_count": 76,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "54b64a6a-4148-465f-8470-aa70d59504f5",
    "text": "For a function f(x), the Sobolev norm of order k with exponent p is defined as\n∥f∥_{k,p} = ( Σ_{i=0}^{k} ∥f^{(i)}∥_p^p )^{1/p}.\nFor p = 2, this becomes the Hilbert-space Sobolev norm, denoted by H^k,\n∥f∥_{H^k}^2 = Σ_{|α|≤k} ∥D^α f∥_{L^2}^2,\nwhere the operator D denotes a partial derivative of the function f and α specifies the order of differentiation.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 64,
    "total_chunks": 68,
    "char_count": 322,
    "word_count": 61,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "ce404e69-54de-4179-a702-13c9b77a0417",
    "text": "In expanded form, this becomes\n∥f∥_{H^k}^2 = ∥f∥_{L^2}^2 + ∥∇f∥_{L^2}^2 + ∥D^2 f∥_{L^2}^2 + · · · + ∥D^k f∥_{L^2}^2.\nIn Fourier space, the same norm can be written as\n∥f∥_{H^k}^2 = Σ_{n=−∞}^{∞} (1 + n^2 + · · · + n^{2k}) |ˆf(n)|^2,\nwhere ˆf(n) denotes the Fourier coefficients of f. In this formulation, higher frequencies n are weighted more heavily, since derivatives amplify the high-frequency content of the signal. The order k determines which spatial features are emphasized: k = 0 corresponds to the L^2 norm, which captures coarse-scale structures; k = 1 (the H^1 norm) includes first derivatives, associated with velocity gradients; and k = 2 (the H^2 norm) includes second derivatives, capturing vorticity and dissipation effects. In our framework, we define\nf = ˆF_h(u_h) − F_h(u_h)\nas the discrepancy between the predicted and reference velocity time derivatives. 
The training objective is then formulated as\nL_Sob = ∥f∥_{H^2}^2.\nThis choice encourages the model to capture high-frequency information and higher-order spatial derivatives and moments, ensuring that it learns the correct small-scale flow structures, maintains accurate gradient information, and preserves the physical interpretability of the predicted dynamics.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 65,
    "total_chunks": 68,
    "char_count": 1176,
    "word_count": 184,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "514a8d11-13ef-4acb-9518-f5ca99d84789",
    "text": "B.1.2 Evaluate reconstruction accuracy\nThe accuracy of the learned map ˆF is assessed by comparing predicted time derivatives with their reference values on the test dataset. Three neural architectures are considered: an FNO, a ResUNet++, and a ResCNN. Errors are quantified using Sobolev norms of order k = 0, 1, and 2, capturing discrepancies in field magnitude as well as first- and second-order spatial derivatives:\nH_k = ∥ˆ˙u − ˙u∥_{H^k}, k = 0, 1, 2,",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 66,
    "total_chunks": 68,
    "char_count": 445,
    "word_count": 72,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "bf30099c-e327-47d0-8fda-da796e7746d1",
    "text": "These metrics provide a scale-aware measure of each model's ability to reproduce both large-scale flow structures and fine-scale gradients. 
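The wavenumber-weighted H^2 objective described above can be sketched numerically with FFT-based derivatives on a periodic grid. This is our own illustrative reconstruction, not the paper's implementation; the function name, grid size, and spectral evaluation of the norm are assumptions:

```python
import numpy as np

def sobolev_h2_loss(pred, ref, domain_size=2 * np.pi):
    """Squared H^2 norm of the error f = pred - ref on a periodic 2-D grid.

    Spectral form of the Sobolev objective: each derivative order weights a
    Fourier coefficient by a power of the wavenumber magnitude, so
    ||f||_{H^2}^2 = sum_k (1 + |k|^2 + |k|^4) |f_hat(k)|^2.
    """
    f = pred - ref
    n = f.shape[0]
    # Integer wavenumbers for a periodic domain of length `domain_size`.
    k = 2 * np.pi * np.fft.fftfreq(n, d=domain_size / n)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2
    f_hat = np.fft.fft2(f) / n**2        # normalized Fourier coefficients
    weight = 1.0 + k2 + k2**2            # derivative orders 0, 1, 2
    return float(np.sum(weight * np.abs(f_hat) ** 2))
```

Compared with a plain L2 loss, high-wavenumber errors incur a much larger penalty here: an error mode at |k| = 4 is weighted (1 + 16 + 256)/(1 + 1 + 1) = 91 times more than a same-amplitude mode at |k| = 1, which is what discourages over-smoothed predictions.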
As reported in Table 1, the FNO attains the lowest error across all Sobolev orders, indicating the most accurate approximation of the underlying dynamical operator. The ResUNet++ yields consistently higher, yet comparable, errors, while the ResCNN exhibits errors nearly an order of magnitude larger, reflecting limited capacity to represent the multiscale and nonlocal interactions inherent to the flow.\nModel | H_0 | H_1 | H_2\nFNO | 2 × 10^{−3} | 4 × 10^{−3} | 1.4 × 10^{−2}\nResUNet++ | 7 × 10^{−3} | 1 × 10^{−2} | 2.4 × 10^{−2}\nResCNN | 7 × 10^{−1} | 9 × 10^{−1} | 1.3 × 10^{0}\nTable 1: Comparison of learned operator accuracy measured in Sobolev norms ∥ˆ˙u − ˙u∥_{H^k} for k = 0, 1, 2.\nFigure 9 illustrates representative comparisons between reference and reconstructed time derivatives of the vorticity field, ˙ω = ∇ × ˙u, at successive time instants. The vorticity derivative serves as a compact diagnostic of instantaneous dynamics, highlighting coherent vortical structures and sharp gradients associated with nonlinear interactions. The FNO and ResUNet++ accurately reproduce the dominant structures and their temporal evolution, whereas the ResCNN systematically underestimates local gradients, resulting in smoother fields and increased localized error. Absolute-error maps remain approximately an order of magnitude smaller than the signal amplitude, confirming faithful recovery of the dominant flow features. The corresponding energy spectra (bottom row of Fig. 9) provide a complementary, scale-wise assessment. The spectra obtained from the FNO and ResUNet++ closely match the reference solution over a broad range of wavenumbers, with deviations confined to the highest, dissipative scales. 
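The scale-wise spectra used for this comparison can be computed by radially binning Fourier energy into integer-wavenumber shells. The sketch below is a hypothetical illustration (the function name and rounded-shell binning are our assumptions), not the authors' diagnostic code:

```python
import numpy as np

def energy_spectrum(u, v):
    """Radially binned kinetic-energy spectrum E(k) for a periodic 2-D field.

    Fourier energy 0.5 * (|u_hat|^2 + |v_hat|^2) is summed over annular
    shells of (rounded) integer wavenumber magnitude |k|.
    """
    n = u.shape[0]
    u_hat = np.fft.fft2(u) / n**2
    v_hat = np.fft.fft2(v) / n**2
    e = 0.5 * (np.abs(u_hat) ** 2 + np.abs(v_hat) ** 2)
    k = np.fft.fftfreq(n, d=1.0 / n)     # integer wavenumbers 0..n/2-1, -n/2..-1
    kx, ky = np.meshgrid(k, k, indexing="ij")
    shells = np.rint(np.sqrt(kx**2 + ky**2)).astype(int)
    return np.bincount(shells.ravel(), weights=e.ravel())  # energy per shell
```

For a single-mode field all energy lands in one shell; for multiscale flows, reference and learned fields can then be compared shell by shell, with mismatches typically concentrated at the largest wavenumbers.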
These discrepancies reflect the inherent smoothness bias of neural architectures, arising from spectral truncation, convolutional filtering, and smooth activation functions.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. Sapsis"
    ],
    "published_date": "2026-03-11",
    "primary_category": "",
    "arxiv_url": "http://arxiv.org/abs/2603.10777v1",
    "chunk_index": 67,
    "total_chunks": 68,
    "char_count": 1962,
    "word_count": 286,
    "chunking_strategy": "semantic"
  },
  {
    "chunk_id": "b8f9d81f-fac9-45ad-be1d-49853b40cdbe",
    "text": "In the present configuration, the Kolmogorov flow is dominated by low-frequency modes, and all models adequately resolve the dynamically relevant scales. For flows exhibiting stronger high-frequency activity or slower spectral decay, architectures specifically designed to enhance fine-scale spectral fidelity may be required [34]. The results indicate that ˆF provides an accurate, stable, and physically consistent approximation of the true dynamics. This fidelity enables reliable evaluation of the Jacobian–vector products Lv, which serve as the foundation for the computation of the OTD modes.\nB.2 Data-driven OTD modes\nFigure 10 compares the curl of the first six OTD modes of the Kolmogorov flow, ∇ × v_i(t) for i = 1, . . . , 6, at two instants. For each mode, the top two rows show the equation-based and the data-driven prediction; the third row shows the absolute error; the bottom row reports the energy spectra at each time.",
    "paper_id": "2603.10777",
    "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events",
    "authors": [
      "Eirini Katsidoniotaki",
      "Themistoklis P. 
Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 68, + "total_chunks": 68, + "char_count": 935, + "word_count": 146, + "chunking_strategy": "semantic" + }, + { + "chunk_id": "3b78fc0f-41a0-45a7-9632-f1efb7b84775", + "text": "The leading OTD modes, v1 and v2, represent the most energetic directions of instantaneous instability growth\nand are characterized by large, coherent vortical structures. The data-driven predictions accurately reproduce their\nspatial organization and temporal evolution between t and t + 7, preserving orientation and circulation strength. Errors are weak and localized near vortex peripheries, consistent with small phase offsets rather than amplitude\ndiscrepancies, and the predicted and reference energy spectra show near-perfect agreement across the resolved\nwavenumber range. Higher-order modes (v3–v6) exhibit increasingly intricate and less coherent flow patterns associated with weaker instability directions. While localized discrepancies become more pronounced with increasing mode index—particularly\nnear regions of strong gradients and filamentary structures—the data-driven modes retain the correct overall structure and temporal evolution. The corresponding energy spectra remain in close agreement with the reference\nsolution across the energy-containing and inertial ranges, indicating that the essential multiscale structure of the\ninstability subspace is preserved.", + "paper_id": "2603.10777", + "title": "Dynamics-Informed Deep Learning for Predicting Extreme Events", + "authors": [ + "Eirini Katsidoniotaki", + "Themistoklis P. Sapsis" + ], + "published_date": "2026-03-11", + "primary_category": "", + "arxiv_url": "http://arxiv.org/abs/2603.10777v1", + "chunk_index": 69, + "total_chunks": 68, + "char_count": 1184, + "word_count": 152, + "chunking_strategy": "semantic" + } +] \ No newline at end of file