sBERTsci
The model is based on uncased_L-2_H-512_A-8 and was trained with TSDAE on a dataset of 20k downloaded and preprocessed physics papers from arXiv, covering all fields of physics in proportion to the number of papers published each year on arXiv.
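TSDAE (Transformer-based Sequential Denoising Auto-Encoder) trains an encoder by corrupting each input sentence and asking a decoder to reconstruct the original from the sentence embedding. The sketch below illustrates only the corruption step. It assumes token deletion with a ratio of 0.6 and plain whitespace tokenization; `delete_noise` and its parameters are hypothetical names for illustration and are not taken from this model's training code.

```python
import numpy as np

def delete_noise(text, del_ratio=0.6, rng=None):
    """Randomly drop whitespace-separated tokens from `text` (hypothetical helper)."""
    rng = rng if rng is not None else np.random.default_rng()
    words = text.split()
    if not words:
        return text
    # Keep a token when its random draw exceeds the deletion ratio.
    keep = rng.random(len(words)) > del_ratio
    # Never return an empty sentence: keep one random token if all were dropped.
    if not keep.any():
        keep[rng.integers(len(words))] = True
    return " ".join(w for w, k in zip(words, keep) if k)

sentence = "The Navier-Stokes equations describe the motion of viscous fluids"
noisy = delete_noise(sentence, rng=np.random.default_rng(0))
print(noisy)
```

During training, the encoder would see the noisy sentence and the decoder would be asked to reproduce the original one.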
Usage (Sentence-Transformers)
Using this model becomes easy when you have sentence-transformers installed:
pip install -U sentence-transformers
1. Calculating similarity between texts
# Use text corpora from the different fields of physics
# Generated by DeepSeek R1 1776
# Astrophysics
ast_texts = [
'The universe is a vast expanse filled with celestial wonders—galaxies, stars, planets—but visible matter accounts for less than 5% of its total content. The rest comprises dark matter (27%) and dark energy (68%), two enigmatic components that shape the cosmos yet elude direct detection.',
'Experiments like the Large Hadron Collider (LHC) seek dark matter particles such as WIMPs (Weakly Interacting Massive Particles). Meanwhile, projects like the Vera C. Rubin Observatory aim to map dark matter distribution via sky surveys.',
'Upcoming technologies like the James Webb Space Telescope (JWST) could uncover clues by studying early galaxies. Understanding dark matter and dark energy remains pivotal to solving cosmic puzzles like the universe’s ultimate fate, whether it will expand endlessly or collapse in a "Big Crunch."',
]
# Laser physics
las_texts = [
'Lasers (Light Amplification by Stimulated Emission of Radiation) are ubiquitous in modern technology due to their coherence and precision.',
'Lasers rely on stimulated emission, a process where photons trigger excited atoms to emit identical photons, creating a cascading effect within an optical cavity. Mirrors amplify this light into a focused beam, which requires population inversion (more atoms in excited states than ground states). Common types include gas lasers (e.g., CO₂), solid-state lasers (Nd:YAG), and semiconductor diodes used in everyday devices.',
'In medicine, lasers enable minimally invasive surgeries such as LASIK eye correction and cancer treatments via photodynamic therapy. In manufacturing, ultrafast lasers cut materials with nanometer precision, enabling microelectronics fabrication. Fiber-optic communication systems use laser pulses for high-speed data transfer across continents. Quantum cascade lasers operating in terahertz ranges are advancing spectroscopy and security scanning.',
]
# Hydrodynamics
hyd_texts = [
'Hydrodynamics is the study of fluids in motion, governed by principles such as conservation of mass, momentum, and energy. The Navier-Stokes equations form the cornerstone of this field, describing how velocity, pressure, and viscosity interact in fluid flows. Applications range from aerodynamics—optimizing aircraft design to reduce drag—to marine engineering, where understanding wave dynamics improves ship stability and offshore platform resilience. Turbulence remains a central challenge; its chaotic nature complicates predictions despite advances in computational fluid dynamics (CFD). Modern research leverages high-performance computing to simulate complex flows, such as atmospheric weather patterns or blood circulation in biomedical devices. Innovations like microfluidic systems also rely on hydrodynamic principles for lab-on-a-chip technologies used in diagnostics and drug delivery.',
'In environmental science, hydrodynamics plays a critical role in modeling pollutant dispersion in rivers and oceans. For instance, oil spill trajectories are predicted using Lagrangian particle-tracking models combined with hydrodynamic data from satellites and sensors. Similarly, coastal engineers use shallow-water equations to simulate tsunami propagation and storm surges, aiding disaster preparedness. Renewable energy sectors benefit too: tidal turbines harness kinetic energy from water currents, while pumped-storage hydropower systems balance grid demand. Recent breakthroughs include bio-inspired designs mimicking fish locomotion to create efficient underwater drones. However, scaling laboratory results to real-world scenarios requires addressing nonlinear interactions between fluid layers and boundaries—a frontier explored through advanced visualization techniques like particle image velocimetry.',
'The interplay between hydrodynamics and thermodynamics is vital in industrial processes such as cooling systems or chemical reactors. Multiphase flows—where gases, liquids, or solids coexist—are studied to optimize heat exchangers or catalytic converters. In astrophysics, hydrodynamic simulations model supernova explosions or accretion disks around black holes. Challenges persist in non-Newtonian fluids (e.g., polymers or blood), where viscosity varies with stress, demanding tailored mathematical frameworks. Emerging machine learning methods now augment traditional CFD by accelerating simulations and identifying flow patterns from sparse data. As climate change intensifies hydrological cycles, hydrodynamic models will grow indispensable for managing water resources and mitigating flood risks in urban landscapes.',
]
# Materials science
mat_texts = [
'Lithium-ion batteries benefit from nanostructured electrodes boosting capacity and charging speeds. Solid-state batteries using ceramic electrolytes promise safer, higher-density storage. Perovskite solar cells with nanoscale layers achieve record efficiencies rivaling silicon.',
'Scanning electron microscopy (SEM) and transmission electron microscopy (TEM) reveal atomic structures. X-ray diffraction analyzes crystal defects, guiding material optimization.',
'Nanotoxicity risks require rigorous safety assessments. Bio-based materials like cellulose nanocrystals offer eco-friendly alternatives. Recycling rare-earth elements from e-waste using nanotechnology addresses resource scarcity.',
]
# Quantum Computers
qua_texts = [
'Quantum computers leverage quantum-mechanical phenomena like superposition and entanglement to perform calculations exponentially faster than classical computers for specific problems. Qubits—quantum bits—encode information as 0s, 1s, or both simultaneously (superposition), enabling parallel processing of vast datasets. Entanglement links qubits such that their states correlate instantaneously across distances, even when separated spatially. Current architectures include superconducting circuits (IBM Quantum), trapped ions (Honeywell), and photonic systems (Xanadu). Applications span cryptography with Shor’s algorithm threatening RSA encryption to drug discovery via molecular simulations impractical on classical hardware.',
'Despite their potential, quantum computers face significant hurdles: qubit coherence times are fleeting due to environmental noise (decoherence), necessitating error-correcting codes like surface codes or topological qubits. Scaling remains another challenge; today’s devices have fewer than 1,000 qubits—insufficient for practical advantage over classical systems. Hybrid quantum-classical algorithms like VQE (Variational Quantum Eigensolver) bridge this gap by offloading select tasks to quantum processors while relying on classical optimization. Meanwhile, quantum supremacy milestones—such as Google’s 2019 Sycamore experiment—demonstrate feasibility but lack immediate real-world utility.',
'The future of quantum computing hinges on material science breakthroughs to stabilize qubits and reduce error rates. Photonic quantum computing promises room-temperature operation using light pulses but struggles with gate fidelity. Quantum networking aims to connect processors via quantum entanglement distribution for secure communication (quantum internet). Industries from finance to logistics anticipate optimization gains through quantum algorithms like Grover’s search or QAOA (Quantum Approximate Optimization Algorithm). As governments invest billions into national initiatives (e.g., U.S.’s National Quantum Initiative Act), interdisciplinary collaboration will drive progress toward fault-tolerant systems capable of revolutionizing fields from AI training to fusion energy modeling.',
]
# Then encode the corpora
from sentence_transformers import SentenceTransformer
usd_md = SentenceTransformer("AnKph/sBERTsci")
total = usd_md.encode(ast_texts +
                      las_texts +
                      hyd_texts +
                      mat_texts +
                      qua_texts)
# Calculate the pairwise cosine distance (1 - cosine similarity)
import numpy as np
from scipy.spatial.distance import cosine
total_ar = np.array(total)
n_texts = total_ar.shape[0]
csin_res = np.zeros((n_texts, n_texts))
for i_in in range(n_texts):
    for j_in in range(n_texts):
        csin_res[i_in, j_in] = cosine(total_ar[i_in], total_ar[j_in])
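The double loop above can also be written as a single matrix operation. This is a minimal sketch, assuming the embeddings form a 2-D array with one non-zero row per text; `pairwise_cosine_distance` is a hypothetical helper, and the random `demo` array only stands in for the output of `encode`.

```python
import numpy as np

def pairwise_cosine_distance(embeddings):
    """Return the matrix of cosine distances (1 - cosine similarity) between rows."""
    emb = np.asarray(embeddings, dtype=float)
    # Scale every row to unit length so the dot product equals cosine similarity.
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T

# Stand-in data: 15 random vectors instead of real sentence embeddings.
demo = np.random.default_rng(0).normal(size=(15, 8))
dist = pairwise_cosine_distance(demo)
print(dist.shape)
```

Up to floating-point rounding, the result should match the matrix filled in by the loop, with values near 0 on the diagonal.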
# And look at the result:
import matplotlib.pyplot as plt
import seaborn as sns
ax = sns.heatmap(csin_res, linewidth=0.5)
plt.show()
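To read the heatmap numerically, the distance matrix can be averaged over blocks of texts from the same field. The sketch below is only an illustration: `block_means` is a hypothetical helper, the labels are assumed to be given one per text in the order the lists were concatenated before encoding, and the small hand-made matrix stands in for `csin_res`.

```python
import numpy as np

def block_means(dist, labels):
    """Average a square distance matrix over every pair of label groups."""
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    names = list(dict.fromkeys(labels.tolist()))  # unique labels, original order
    means = np.zeros((len(names), len(names)))
    for a, name_a in enumerate(names):
        for b, name_b in enumerate(names):
            # Select the rows of group a and the columns of group b.
            block = dist[np.ix_(labels == name_a, labels == name_b)]
            means[a, b] = block.mean()
    return names, means

# Stand-in for csin_res: two fields with two texts each.
toy = np.array([
    [0.0, 0.2, 0.8, 0.9],
    [0.2, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.1],
    [0.9, 0.8, 0.1, 0.0],
])
names, means = block_means(toy, ["ast", "ast", "las", "las"])
print(names)
print(means)
```

Note that the within-field averages include the zero self-distances on the diagonal. Small diagonal entries in `means` relative to the off-diagonal ones would suggest that texts from the same field sit closer together in embedding space.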
Model tree for AnKph/sBERTsci
Base model
google-bert/bert-base-uncased