Nicolay Rusnachenko

nicolay-r

https://nicolay-r.github.io/

AI & ML interests

Information Retrieval・Medical Multimodal NLP (🖼+📝) Research Fellow @BU_Research・software developer http://arekit.io・PhD in NLP

Recent Activity

reacted to mmhamdy's post with 👀 about 10 hours ago

⛓ Evaluating Long Context #2: SCROLLS and ZeroSCROLLS
In this series of posts tracing the history of long context evaluation, we started with Long Range Arena (LRA). Introduced in 2020, Long Range Arena (LRA) is one of the earliest benchmarks designed to tackle the challenge of long context evaluation. However, it was introduced to evaluate the transformer architecture in general rather than LLMs.
📜 The SCROLLS benchmark, introduced in 2022, addresses this gap in NLP/LLM research. SCROLLS challenges models with tasks that require reasoning over extended sequences (by 2022 standards). So, what does it offer?
1️⃣ Long Text Focus: SCROLLS (unlike LRA) focuses mainly on text and contains inputs with thousands of words, testing models' ability to synthesize information across lengthy documents.
2️⃣ Diverse Tasks: Includes summarization, question answering, and natural language inference across domains like literature, science, and business.
3️⃣ Unified Format: All datasets are available in a text-to-text format, facilitating easy evaluation and comparison of models.
Building on SCROLLS, ZeroSCROLLS takes long text evaluation to the next level by focusing on zero-shot learning. Other features include:
1️⃣ New Tasks: Introduces tasks like sentiment aggregation and sorting book chapter summaries.
2️⃣ Leaderboard: A live leaderboard encourages continuous improvement and competition among researchers.
💡 What are some other landmark benchmarks in the history of long context evaluation? Feel free to share your thoughts and suggestions in the comments.
- SCROLLS Paper: https://huggingface.co/papers/2201.03533
- ZeroSCROLLS Paper: https://huggingface.co/papers/2305.14196

reacted to sequelbox's post with 🧠 about 10 hours ago

Raiden is here! 63k creative-reasoning and analytic-reasoning prompts answered by DeepSeek's 685b R1 model!
- All prompts from https://huggingface.co/datasets/microsoft/orca-agentinstruct-1M-v1 and all responses from https://huggingface.co/deepseek-ai/DeepSeek-R1
- A deep look at R1's reasoning skills! Use as you will.
Get it now, for everyone :) https://huggingface.co/datasets/sequelbox/Raiden-DeepSeek-R1

reacted to sequelbox's post with 👀 about 10 hours ago

Organizations

None yet

nicolay-r's activity

liked 2 models 1 day ago

facebook/mgenre-wiki

Text2Text Generation • Updated Jan 24, 2023 • 558 • 28

sapienzanlp/relik-entity-linking-base

Updated Aug 7, 2024 • 94 • 2

liked a dataset 7 days ago

open-thoughts/OpenThoughts-114k

Viewer • Updated about 10 hours ago • 228k • 47.1k • 409

liked a Space 8 days ago

502

Qwen2.5 Max Demo

🐢

Send messages for chatbot responses

liked a model 13 days ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

Text Generation • Updated 3 days ago • 515k • 398

liked a model 16 days ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated 3 days ago • 2.94M • 8.3k

liked a Space 28 days ago

337

Open Medical-LLM Leaderboard

🥇

Browse and submit LLM evaluations

liked a model 28 days ago

johnsnowlabs/JSL-MedLlama-3-8B-v2.0

Text Generation • Updated Apr 30, 2024 • 12.1k • 30

liked a model 4 months ago

meta-llama/Llama-3.2-3B-Instruct

Text Generation • Updated Oct 24, 2024 • 1.59M • 968

liked a model 7 months ago

hyy-33/hyy33-WASSA-2024-Track-2

Updated Jul 9, 2024 • 2

liked 6 models 8 months ago

liked 4 models 10 months ago

xtuner/llava-phi-3-mini-hf

Image-to-Text • Updated Apr 25, 2024 • 3.95k • 49

xtuner/llava-llama-3-8b-v1_1

Image-Text-to-Text • Updated Apr 28, 2024 • 53 • 120

AIRI-Institute/OmniFusion

Updated Apr 10, 2024 • 56

google-bert/bert-base-uncased

Fill-Mask • Updated Feb 19, 2024 • 85.8M • 2.1k