Spaces: nomadicsynth/inkling

nomadicsynth committed on Apr 23

Commit: d89b95a
Parent: ac1681f

Update README

Files changed (1)

README.md +1 -1

@@ -44,7 +44,7 @@ Scientific progress often depends on connecting ideas across papers, fields, and
 In 2024, Luo et al. published a landmark study in *Nature Human Behaviour* showing that **large language models (LLMs) can outperform human experts** in predicting the results of neuroscience experiments by integrating knowledge across the scientific literature. Their model, **BrainGPT**, demonstrated how tuning a general-purpose LLM (like Mistral-7B) on domain-specific data could synthesize insights that surpass human forecasting ability. Notably, the authors found that models as small as 7B parameters performed well — an insight that influenced the foundation for this project.
-Inspired by this work — and a YouTube breakdown by physicist and science communicator Sabine Hossenfelder — this project began as an attempt to explore similar methods of knowledge integration at the level of paper-pair relationships. The goal: to train a model that could recognize and reason about **conceptual, methodological, or application-level connections** between research papers, even when those links might be overlooked due to fragmentation in the literature.
+Inspired by this work — and a YouTube breakdown by physicist and science communicator **Sabine Hossenfelder**, titled *["AIs Predict Research Results Without Doing Research"](https://www.youtube.com/watch?v=Qgrl3JSWWDE)* — this project began as an attempt to explore similar methods of knowledge integration at the level of paper-pair relationships. Her clear explanation and commentary sparked the idea to apply this paradigm not just to forecasting outcomes, but to identifying latent connections between published studies.
 Originally conceived as a perplexity-ranking experiment using LLMs directly (mirroring Luo et al.'s evaluation method), the project gradually evolved into what it is now — **Inkling**, a reasoning-aware embedding model fine-tuned on LLM-rated abstract pairings, and built to help researchers uncover links that would be obvious — *if only someone had the time to read everything*.