new

Get trending papers in your email inbox!

Subscribe

byAK and the research community

Mar 14

Implicit factorized transformer approach to fast prediction of turbulent channel flows

Transformer neural operators have recently become an effective approach for surrogate modeling of systems governed by partial differential equations (PDEs). In this paper, we introduce a modified implicit factorized transformer (IFactFormer-m) model which replaces the original chained factorized attention with parallel factorized attention. The IFactFormer-m model successfully performs long-term predictions for turbulent channel flow, whereas the original IFactFormer (IFactFormer-o), Fourier neural operator (FNO), and implicit Fourier neural operator (IFNO) exhibit a poor performance. Turbulent channel flows are simulated by direct numerical simulation using fine grids at friction Reynolds numbers Re_{tau}approx 180,395,590, and filtered to coarse grids for training neural operator. The neural operator takes the current flow field as input and predicts the flow field at the next time step, and long-term prediction is achieved in the posterior through an autoregressive approach. The results show that IFactFormer-m, compared to other neural operators and the traditional large eddy simulation (LES) methods including dynamic Smagorinsky model (DSM) and the wall-adapted local eddy-viscosity (WALE) model, reduces prediction errors in the short term, and achieves stable and accurate long-term prediction of various statistical properties and flow structures, including the energy spectrum, mean streamwise velocity, root mean square (rms) values of fluctuating velocities, Reynolds shear stress, and spatial structures of instantaneous velocity. Moreover, the trained IFactFormer-m is much faster than traditional LES methods. By analyzing the attention kernels, we elucidate the reasons why IFactFormer-m converges faster and achieves a stable and accurate long-term prediction compared to IFactFormer-o. Code and data are available at: https://github.com/huiyu-2002/IFactFormer-m.

Robust Spectral Anomaly Detection in EELS Spectral Images via Three Dimensional Convolutional Variational Autoencoders

We introduce a Three-Dimensional Convolutional Variational Autoencoder (3D-CVAE) for automated anomaly detection in Electron Energy Loss Spectroscopy Spectrum Imaging (EELS-SI) data. Our approach leverages the full three-dimensional structure of EELS-SI data to detect subtle spectral anomalies while preserving both spatial and spectral correlations across the datacube. By employing negative log-likelihood loss and training on bulk spectra, the model learns to reconstruct bulk features characteristic of the defect-free material. In exploring methods for anomaly detection, we evaluated both our 3D-CVAE approach and Principal Component Analysis (PCA), testing their performance using Fe L-edge peak shifts designed to simulate material defects. Our results show that 3D-CVAE achieves superior anomaly detection and maintains consistent performance across various shift magnitudes. The method demonstrates clear bimodal separation between normal and anomalous spectra, enabling reliable classification. Further analysis verifies that lower dimensional representations are robust to anomalies in the data. While performance advantages over PCA diminish with decreasing anomaly concentration, our method maintains high reconstruction quality even in challenging, noise-dominated spectral regions. This approach provides a robust framework for unsupervised automated detection of spectral anomalies in EELS-SI data, particularly valuable for analyzing complex material systems.

Accelerating Transformers with Spectrum-Preserving Token Merging

Increasing the throughput of the Transformer architecture, a foundational component used in numerous state-of-the-art models for vision and language tasks (e.g., GPT, LLaVa), is an important problem in machine learning. One recent and effective strategy is to merge token representations within Transformer models, aiming to reduce computational and memory requirements while maintaining accuracy. Prior works have proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top k similar tokens. However, these methods have significant drawbacks, such as sensitivity to token-splitting strategies and damage to informative tokens in later layers. This paper presents a novel paradigm called PiToMe, which prioritizes the preservation of informative tokens using an additional metric termed the energy score. This score identifies large clusters of similar tokens as high-energy, indicating potential candidates for merging, while smaller (unique and isolated) clusters are considered as low-energy and preserved. Experimental findings demonstrate that PiToMe saved from 40-60\% FLOPs of the base models while exhibiting superior off-the-shelf performance on image classification (0.5\% average performance drop of ViT-MAE-H compared to 2.6\% as baselines), image-text retrieval (0.3\% average performance drop of CLIP on Flickr30k compared to 4.5\% as others), and analogously in visual questions answering with LLaVa-7B. Furthermore, PiToMe is theoretically shown to preserve intrinsic spectral properties of the original token space under mild conditions

Low-energy Injection and Nonthermal Particle Acceleration in Relativistic Magnetic Turbulence

Relativistic magnetic turbulence has been proposed as a process for producing nonthermal particles in high-energy astrophysics. Particle energization may be contributed by both magnetic reconnection and turbulent fluctuations, but their interplay is poorly understood. It has been suggested that during magnetic reconnection the parallel electric field dominates particle acceleration up to the lower bound of the power-law particle spectrum, but recent studies show that electric fields perpendicular to magnetic field can play an important, if not dominant role. In this study, we carry out 2D fully kinetic particle-in-cell simulations of magnetically dominated decaying turbulence in a relativistic pair plasma. For a fixed magnetization parameter sigma_0=20, we find that the injection energy {varepsilon}_{rm inj} converges with increasing domain size to {varepsilon}_{rm inj}simeq 10m_ec^2. In contrast, the power-law index, the cut-off energy, and the power-law extent increase steadily with domain size. We trace a large number of particles and evaluate the contributions of the work done by the parallel (W_parallel) and perpendicular (W_perp) electric fields during both the injection phase and the post-injection phase. We find that during the injection phase, the W_perp contribution increases with domain size, suggesting that it may eventually dominate injection for a sufficiently large domain. In contrast, both components contribute equally during the post-injection phase, insensitive to the domain size. For high energy ({varepsilon}varepsilon_{rm inj}) particles, W_perp dominates the subsequent energization. These findings may improve our understanding of nonthermal particles and their emissions in astrophysical plasmas.

Separating source-intrinsic and Lorentz invariance violation induced delays in the very high energy emission of blazar flares

Aims: The aim of the present study is to explore how to disentangle energy-dependent time delays due to a possible Lorentz invariance violation (LIV) at Planck scale from intrinsic delays expected in standard blazar flares. Methods: We first characterise intrinsic time delays in BL Lacs and Flat Spectrum Radio Quasars in standard one-zone time-dependent synchrotron self-Compton or external Compton models, during flares produced by particle acceleration and cooling processes. We simulate families of flares with both intrinsic and external LIV-induced energy-dependent delays. Discrimination between intrinsic and LIV delays is then investigated in two different ways. A technique based on Euclidean distance calculation between delays obtained in the synchrotron and in the inverse-Compton spectral bumps is used to assess their degree of correlation. A complementary study is performed using spectral hardness versus intensity diagrams in both energy ranges. Results: We show that the presence of non-negligible LIV effects, which essentially act only at very high energies (VHE), can drastically reduce the strong correlation expected between the X-ray and the VHE gamma-ray emission in leptonic scenarios. The LIV phenomenon can then be hinted at measuring the Euclidean distance d_{E} from simultaneous X-ray and gamma-ray flare monitoring. Large values of minimal distance d_{E,min} would directly indicate the influence of non-intrinsic time delays possibly due to LIV in SSC flares. LIV effects can also significantly modify the VHE hysteresis patterns in hardness-intensity diagrams and even change their direction of rotation as compared to the X-ray behaviour. Both observables could be used to discriminate between LIV and intrinsic delays, provided high quality flare observations are available.

The Mira-Titan Universe IV. High Precision Power Spectrum Emulation

Modern cosmological surveys are delivering datasets characterized by unprecedented quality and statistical completeness; this trend is expected to continue into the future as new ground- and space-based surveys come online. In order to maximally extract cosmological information from these observations, matching theoretical predictions are needed. At low redshifts, the surveys probe the nonlinear regime of structure formation where cosmological simulations are the primary means of obtaining the required information. The computational cost of sufficiently resolved large-volume simulations makes it prohibitive to run very large ensembles. Nevertheless, precision emulators built on a tractable number of high-quality simulations can be used to build very fast prediction schemes to enable a variety of cosmological inference studies. We have recently introduced the Mira-Titan Universe simulation suite designed to construct emulators for a range of cosmological probes. The suite covers the standard six cosmological parameters {omega_m,omega_b, sigma_8, h, n_s, w_0} and, in addition, includes massive neutrinos and a dynamical dark energy equation of state, {omega_{nu}, w_a}. In this paper we present the final emulator for the matter power spectrum based on 111 cosmological simulations, each covering a (2.1Gpc)^3 volume and evolving 3200^3 particles. An additional set of 1776 lower-resolution simulations and TimeRG perturbation theory results for the power spectrum are used to cover scales straddling the linear to mildly nonlinear regimes. The emulator provides predictions at the two to three percent level of accuracy over a wide range of cosmological parameters and is publicly released as part of this paper.

On the statistical theory of self-gravitating collisionless dark matter flow: Scale and redshift variation of velocity and density distributions

This paper studies the scale and redshift variation of density and velocity distributions in self-gravitating collisionless dark matter flow by a halo-based non-projection approach. All particles are divided into halo and out-of-halo particles for redshift variation of distributions. Without projecting particle fields onto a structured grid, the scale variation is analyzed by identifying all particle pairs on different scales r. We demonstrate that: i) Delaunay tessellation can be used to reconstruct the density field. The density correlation, spectrum, and dispersion functions were obtained, modeled, and compared with the N-body simulation; ii) the velocity distributions are symmetric on both small and large scales and are non-symmetric with a negative skewness on intermediate scales due to the inverse energy cascade at a constant rate varepsilon_u; iii) On small scales, the even order moments of pairwise velocity Delta u_L follow a two-thirds law (-varepsilon_ur)^{2/3}, while the odd order moments follow a linear scaling langle(Delta u_L)^{2n+1}rangle=(2n+1)langle(Delta u_L)^{2n}ranglelangleDelta u_Lrangler; iv) The scale variation of the velocity distributions was studied for longitudinal velocities u_L or u_L^{'}, pairwise velocity (velocity difference) Delta u_L=u_L^{'}-u_L and velocity sum Sigma u_L=u^{'}_L+u_L. Fully developed velocity fields are never Gaussian on any scale, despite that they can initially be Gaussian; v) On small scales, u_L and Sigma u_L can be modeled by a X distribution to maximize the system entropy; vi) On large scales, Delta u_L and Sigma u_L can be modeled by a logistic or a X distribution; vii) the redshift variation of the velocity distributions follows the evolution of the X distribution involving a shape parameter alpha(z) decreasing with time.

The nature of an imaginary quasi-periodic oscillation in the soft-to-hard transition of MAXI J1820+070

A recent study shows that if the power spectra (PS) of accreting compact objects consist of a combination of Lorentzian functions that are coherent in different energy bands but incoherent with each other, the same is true for the Real and Imaginary parts of the cross spectrum (CS). Using this idea, we discovered imaginary quasi-periodic oscillations (QPOs) in NICER observations of the black hole candidate MAXI J1820+070. The imaginary QPOs appear as narrow features with a small Real and large Imaginary part in the CS but are not significantly detected in the PS when they overlap in frequency with other variability components. The coherence function drops and the phase lags increase abruptly at the frequency of the imaginary QPO. We show that the multi-Lorentzian model that fits the PS and CS of the source in two energy bands correctly reproduces the lags and the coherence, and that the narrow drop of the coherence is caused by the interaction of the imaginary QPO with other variability components. The imaginary QPO appears only in the decay of the outburst, during the transition from the high-soft to the low-hard state of MAXI J1820+070, and its frequency decreases from approximately 5 Hz to around 1 Hz as the source spectrum hardens. We also analysed the earlier observations of the transition, where no narrow features were seen, and we identified a QPO in the PS that appears to evolve into the imaginary QPO as the source hardens. As for the type-B and C QPOs in this source, the rms spectrum of the imaginary QPO increases with energy. The lags of the imaginary QPO are similar to those of the type-B and C QPOs above 2 keV but differ from the lags of those other QPOs below that energy. While the properties of this imaginary QPO resemble those of type-C QPOs, we cannot rule out that it is a new type of QPO.

A mechanism to generate varying speed of light via Higgs-dilaton coupling: Theory and cosmological applications

We allow the Higgs field Phi to interact with a dilaton field chi of the background spacetime via the coupling chi^2,Phi^daggerPhi. Upon spontaneous gauge symmetry breaking, the Higgs VEV becomes proportional to chi. While traditionally this linkage is employed to make the Planck mass and particle masses dependent on chi, we present an textit alternative mechanism: the Higgs VEV will be used to construct Planck's constant hbar and speed of light c. Specifically, each open set vicinity of a given point x^* on the spacetime manifold is equipped with a replica of the Glashow-Weinberg-Salam action operating with its own effective values of hbar_* and c_* per hbar_*proptochi^{-1/2}(x^*) and c_*proptochi^{1/2}(x^*), causing these ``fundamental constants'' to vary alongside the dynamical field chi. Moreover, in each open set around x^*, the prevailing value chi(x^*) determines the length and time scales for physical processes occurring in this region as lproptochi^{-1}(x^*) and tauproptochi^{-3/2}(x^*). This leads to an textit anisotropic relation tau^{-1}propto l^{-3/2} between the rate of clocks and the length of rods, resulting in a distinct set of novel physical phenomena. For late-time cosmology, the variation of c along the trajectory of light waves from distant supernovae towards the Earth-based observer necessitates modifications to the Lema\^itre redshift relation and the Hubble law. These modifications are capable of: (1) Accounting for the Pantheon Catalog of SNeIa through a declining speed of light in an expanding Einstein--de Sitter universe, thus avoiding the need for dark energy; (2) Revitalizing Blanchard-Douspis-Rowan-Robinson-Sarkar's CMB power spectrum analysis that bypassed dark energy [A&A 412, 35 (2003)]; and (3) Resolving the H_0 tension without requiring a dynamical dark energy component.

Galaxy Spectra neural Network (GaSNet). II. Using Deep Learning for Spectral Classification and Redshift Predictions

Large sky spectroscopic surveys have reached the scale of photometric surveys in terms of sample sizes and data complexity. These huge datasets require efficient, accurate, and flexible automated tools for data analysis and science exploitation. We present the Galaxy Spectra Network/GaSNet-II, a supervised multi-network deep learning tool for spectra classification and redshift prediction. GaSNet-II can be trained to identify a customized number of classes and optimize the redshift predictions for classified objects in each of them. It also provides redshift errors, using a network-of-networks that reproduces a Monte Carlo test on each spectrum, by randomizing their weight initialization. As a demonstration of the capability of the deep learning pipeline, we use 260k Sloan Digital Sky Survey spectra from Data Release 16, separated into 13 classes including 140k galactic, and 120k extragalactic objects. GaSNet-II achieves 92.4% average classification accuracy over the 13 classes (larger than 90% for the majority of them), and an average redshift error of approximately 0.23% for galaxies and 2.1% for quasars. We further train/test the same pipeline to classify spectra and predict redshifts for a sample of 200k 4MOST mock spectra and 21k publicly released DESI spectra. On 4MOST mock data, we reach 93.4% accuracy in 10-class classification and an average redshift error of 0.55% for galaxies and 0.3% for active galactic nuclei. On DESI data, we reach 96% accuracy in (star/galaxy/quasar only) classification and an average redshift error of 2.8% for galaxies and 4.8% for quasars, despite the small sample size available. GaSNet-II can process ~40k spectra in less than one minute, on a normal Desktop GPU. This makes the pipeline particularly suitable for real-time analyses of Stage-IV survey observations and an ideal tool for feedback loops aimed at night-by-night survey strategy optimization.

Galaxy Spectra neural Networks (GaSNets). I. Searching for strong lens candidates in eBOSS spectra using Deep Learning

With the advent of new spectroscopic surveys from ground and space, observing up to hundreds of millions of galaxies, spectra classification will become overwhelming for standard analysis techniques. To prepare for this challenge, we introduce a family of deep learning tools to classify features in one-dimensional spectra. As the first application of these Galaxy Spectra neural Networks (GaSNets), we focus on tools specialized at identifying emission lines from strongly lensed star-forming galaxies in the eBOSS spectra. We first discuss the training and testing of these networks and define a threshold probability, PL, of 95% for the high quality event detection. Then, using a previous set of spectroscopically selected strong lenses from eBOSS, confirmed with HST, we estimate a completeness of ~80% as the fraction of lenses recovered above the adopted PL. We finally apply the GaSNets to ~1.3M spectra to collect a first list of ~430 new high quality candidates identified with deep learning applied to spectroscopy and visually graded as highly probable real events. A preliminary check against ground-based observations tentatively shows that this sample has a confirmation rate of 38%, in line with previous samples selected with standard (no deep learning) classification tools and follow-up by Hubble Space Telescope. This first test shows that machine learning can be efficiently extended to feature recognition in the wavelength space, which will be crucial for future surveys like 4MOST, DESI, Euclid, and the Chinese Space Station Telescope (CSST).

Signatures of the Shock Interaction as an Additional Power Source in the Nebular Spectra of SN 2023ixf

Red supergiants may lose significant mass through steady winds and episodic eruptions in the final 100-1000 years before the core collapses, shaping their circumstellar environment. Interaction between supernova (SN) ejecta and distant circumstellar material (CSM) can generate shocks, which can energize the ejecta and serve as a key power source during the nebular phase of the SN. In the present work, we investigate the nebular spectrum of SN 2023ixf, observed one year post-explosion (at +363 d) with the recently commissioned WEAVE instrument on the 4.2m William Herschel Telescope. This marks the first supernova spectrum captured with WEAVE. In this spectrum, Halpha exhibits a peculiar evolution, flanked by blueward and redward broad components centred at simpm 5650,km,s^{-1} from the rest velocity of Halpha, which are seen for only a few SNe to date. These features indicate energy deposition from shocks generated by the interaction of ejecta with a CSM expelled nearly 350 - 640 years pre-explosion. Comparisons of the +363 d spectrum with model spectra from the literature, that include varying shock powers, suggest a shock power of at least sim 5 times 10 ^{40},erg,s^{-1} at this epoch. Additionally, analysis of the [O I] doublet, along with other prominent emission lines, provides evidence for clumpiness, dust formation, and asymmetry within the ejecta and/or the surrounding CSM. These emission lines also helped to constrain the oxygen mass (approx0.19^{scriptscriptstyle +0.08}_{scriptscriptstyle -0.04} M_odot), He-core mass (<3 M_odot) and the zero-age main sequence mass (lesssim 12 M_odot) of the progenitor of SN 2023ixf. The comparison with other Type II SNe highlights SN 2023ixf's unique shock interaction signatures and evidence of dust formation, setting it apart in terms of evolution and dynamics.

Probing X-ray Timing and Spectral Variability in the Blazar PKS 2155-304 Over a Decade of XMM-Newton Observations

Blazars, a class of active galactic nuclei (AGN) powered by supermassive black holes, are known for their remarkable variability across multiple timescales and wavelengths. With advancements in both ground- and space-based telescopes, our understanding of AGN central engines has significantly improved. However, the mechanisms driving this variability remain elusive, and continue to fascinate both theorists and observers alike. The primary objective of this study is to constrain the X-ray variability properties of the TeV blazar PKS 2155-304. We conduct a comprehensive X-ray spectral and timing analysis, focusing on both long-term and intra-day variability. This analysis uses data from 22 epochs of XMM-Newton EPIC-pn observations, collected over 15 years (2000-2014). To investigate the variability of the source, we applied both timing and spectral analyses. For the timing analysis, we estimated fractional variability, variability amplitude, minimum variability timescales, flux distribution, and power spectral density (PSD). In the spectral analysis, we fitted the X-ray spectra using power-law, log-parabola, and broken power-law (BPL) models to determine the best-fitting parameters. Additionally, we studied the hardness ratio (HR). We observed moderate intra-day variability in most of the light curves. Seven out of the twenty-two observations showed a clear bimodal flux distribution, indicating the presence of two distinct flux states. Our analysis revealed a variable power-law PSD slope. Most HR plots did not show significant variation with flux, except for one observation (OBSID 0124930501), where HR increased with flux (Count/s). The fitted X-ray spectra favored the BPL model for the majority of observations. The findings of this work shed light on the intraday variability of blazars, providing insights into the non-thermal jet processes that drive the observed flux variations.