Commit · ac1681f
1 Parent(s): f272480
Add detailed background and motivation sections to README

README.md CHANGED
@@ -38,6 +38,18 @@ This demo uses the first prototype of the model, trained on a dataset of **10,00
 
 ---
 
+## Background and Motivation
+
+Scientific progress often depends on connecting ideas across papers, fields, and years of literature. But with the volume of research growing exponentially, it's increasingly difficult for any one person (or even a team) to stay on top of it all. As a result, valuable connections between papers often go unnoticed simply because the right expert never read both.
+
+In 2024, Luo et al. published a landmark study in *Nature Human Behaviour* showing that **large language models (LLMs) can outperform human experts** in predicting the results of neuroscience experiments by integrating knowledge across the scientific literature. Their model, **BrainGPT**, demonstrated how tuning a general-purpose LLM (like Mistral-7B) on domain-specific data could synthesize insights that surpass human forecasting ability. Notably, the authors found that models as small as 7B parameters performed well, an insight that shaped the foundation of this project.
+
+Inspired by this work, and by a YouTube breakdown from physicist and science communicator Sabine Hossenfelder, this project began as an attempt to explore similar methods of knowledge integration at the level of paper-pair relationships. The goal: to train a model that could recognize and reason about **conceptual, methodological, or application-level connections** between research papers, even when those links might be overlooked due to fragmentation in the literature.
+
+Originally conceived as a perplexity-ranking experiment using LLMs directly (mirroring Luo et al.'s evaluation method), the project gradually evolved into what it is now: **Inkling**, a reasoning-aware embedding model fine-tuned on LLM-rated abstract pairings, built to help researchers uncover links that would be obvious, *if only someone had the time to read everything*.
+
+---
+
 ## Why Inkling?
 
 > Because the right connection is often obvious, once someone points it out.
@@ -46,6 +58,12 @@ Researchers today are overwhelmed by volume. Inkling helps restore those missed-
 
 ---
 
+## Citation
+
+> Luo, X., Rechardt, A., Sun, G. et al. Large language models surpass human experts in predicting neuroscience results. *Nat Hum Behav* **9**, 305–315 (2025). [https://doi.org/10.1038/s41562-024-02046-9](https://doi.org/10.1038/s41562-024-02046-9)
+
+---
+
 ## Status
 
-Inkling is in **alpha** and under active development. The current model is hosted via Gradio, with a Hugging Face Space available for live interaction and feedback. Contributions, feedback, and collaboration are welcome.
+Inkling is in **alpha** and under active development. The current model is hosted via Gradio, with a Hugging Face Space available for live interaction and feedback. Contributions, feedback, and collaboration are welcome.
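
The perplexity-ranking method referenced in the new Background section (Luo et al.'s evaluation setup) amounts to asking which of two candidate abstracts a language model finds less surprising. A minimal sketch of that idea, assuming a Hugging Face causal LM; the model name and abstract texts here are illustrative, not the project's actual code:

```python
# Sketch of perplexity ranking between two versions of an abstract.
# Model name and abstracts are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model.eval()

def perplexity(text: str) -> float:
    """Exponentiated mean next-token cross-entropy over the text."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # averaged next-token loss
    return torch.exp(loss).item()

# The version the model finds less surprising counts as its "prediction".
real = "Abstract stating the actual experimental result ..."
altered = "Same abstract with the result changed ..."
prediction = real if perplexity(real) < perplexity(altered) else altered
```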
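Likewise, a rough sketch of what "an embedding model fine-tuned on LLM-rated abstract pairings" could look like with the sentence-transformers library; the base model, scores, and example abstracts are assumptions for illustration, not Inkling's actual training setup:

```python
# Sketch: fine-tune a sentence-embedding model so that cosine similarity
# between two abstracts tracks an LLM-assigned relatedness score.
# Base model, scores, and abstracts are illustrative assumptions.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose base

# Each example pairs two abstracts with an LLM-rated score in [0, 1].
train_examples = [
    InputExample(texts=["Abstract about sparse coding ...",
                        "Abstract about dictionary learning ..."], label=0.9),
    InputExample(texts=["Abstract about sparse coding ...",
                        "Abstract about soil microbiomes ..."], label=0.1),
]
loader = DataLoader(train_examples, shuffle=True, batch_size=16)

# CosineSimilarityLoss pulls embedding similarity toward the rated score.
model.fit(train_objectives=[(loader, losses.CosineSimilarityLoss(model))],
          epochs=1, warmup_steps=10)

# Inference: rank unseen paper pairs by cosine similarity of their embeddings.
emb = model.encode(["New abstract A ...", "New abstract B ..."],
                   convert_to_tensor=True)
print(util.cos_sim(emb[0], emb[1]).item())
```

A regression-style loss like this lets the embedding space inherit the LLM's graded judgments of relatedness, rather than collapsing them into hard positive/negative pairs.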