nomadicsynth committed
Commit ac1681f
1 Parent(s): f272480

Add detailed background and motivation sections to README

Files changed (1): README.md (+19 -1)
README.md CHANGED
@@ -38,6 +38,18 @@ This demo uses the first prototype of the model, trained on a dataset of **10,00
 
 ---
 
+## Background and Motivation
+
+Scientific progress often depends on connecting ideas across papers, fields, and years of literature. But with the volume of research growing exponentially, it's increasingly difficult for any one person — or even a team — to stay on top of it all. As a result, valuable connections between papers often go unnoticed simply because the right expert never read both.
+
+In 2024, Luo et al. published a landmark study in *Nature Human Behaviour* showing that **large language models (LLMs) can outperform human experts** in predicting the results of neuroscience experiments by integrating knowledge across the scientific literature. Their model, **BrainGPT**, demonstrated that tuning a general-purpose LLM (such as Mistral-7B) on domain-specific data yields a model that can synthesize insights surpassing human forecasting ability. Notably, the authors found that models as small as 7B parameters performed well — an insight that shaped the direction of this project.
+
+Inspired by this work — and by a YouTube breakdown from physicist and science communicator Sabine Hossenfelder — this project began as an attempt to explore similar methods of knowledge integration at the level of paper-pair relationships. The goal: to train a model that could recognize and reason about **conceptual, methodological, or application-level connections** between research papers, even when those links might be overlooked due to fragmentation in the literature.
+
+Originally conceived as a perplexity-ranking experiment using LLMs directly (mirroring Luo et al.'s evaluation method), the project gradually evolved into what it is now — **Inkling**, a reasoning-aware embedding model fine-tuned on LLM-rated abstract pairings, built to help researchers uncover links that would be obvious — *if only someone had the time to read everything*.
+
+---
+
 ## Why Inkling?
 
 > Because the right connection is often obvious — once someone points it out.
@@ -46,6 +58,12 @@ Researchers today are overwhelmed by volume. Inkling helps restore those missed-
 
 ---
 
+## Citation
+
+> Luo, X., Rechardt, A., Sun, G. et al. Large language models surpass human experts in predicting neuroscience results. *Nat Hum Behav* **9**, 305–315 (2025). [https://doi.org/10.1038/s41562-024-02046-9](https://doi.org/10.1038/s41562-024-02046-9)
+
+---
+
 ## Status
 
-Inkling is in **alpha** and under active development. The current model is hosted via Gradio, with a Hugging Face Space available for live interaction and feedback. Contributions, feedback, and collaboration are welcome.
+Inkling is in **alpha** and under active development. The current model is hosted via Gradio, with a Hugging Face Space available for live interaction and feedback. Contributions, feedback, and collaboration are welcome.
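
For context on the perplexity-ranking experiment the new Background section mentions: in Luo et al.'s evaluation, a language model scores two versions of an abstract (one with the real result, one altered), and the lower-perplexity version counts as the model's prediction. Below is a minimal sketch of that idea using the Hugging Face `transformers` API; the model name and abstracts are placeholders chosen for illustration, not anything from the Inkling codebase (Luo et al. tuned a Mistral-7B base, for which `gpt2` stands in here).

```python
# Minimal sketch of perplexity ranking (Luo et al.'s evaluation method).
# Illustrative only: "gpt2" stands in for the Mistral-7B base Luo et al.
# used, and the abstracts are invented placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the causal LM (lower = less surprising)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the mean
        # next-token cross-entropy as `loss`; exp(loss) is perplexity.
        loss = model(**enc, labels=enc["input_ids"]).loss
    return torch.exp(loss).item()

# Two versions of an abstract: one with the real result, one altered.
real = "Stimulating the region improved recall in all participants."
altered = "Stimulating the region impaired recall in all participants."

# The model "predicts" the outcome it finds less surprising.
choice = "real" if perplexity(real) < perplexity(altered) else "altered"
print(choice)
```

As the Background section notes, Inkling itself moved on from this ranking setup to an embedding model, so the analogous operation in the current project is a similarity comparison between abstract embeddings rather than a perplexity comparison.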