Thomas Yiu

legolasyiu

AI & ML interests

Artificial intelligence, SFT, DPO fine-tuning, ORPO fine-tuning

Recent Activity

upvoted a collection 1 day ago

updated a model 6 days ago

EpistemeAI/ReasoningCore-1.0-3B-Instruct-r01-Reflect-Math

liked a Space 6 days ago

open-llm-leaderboard/comparator

legolasyiu's activity

upvoted a collection 1 day ago

Gemma 3

All versions of Google's new multimodal models in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 25 items • Updated about 4 hours ago • 29

upvoted 2 collections about 1 month ago

Reasoning models

Reasoning models • 15 items • Updated 13 days ago • 2

DeepSeek R1 (All Versions)

DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 1 day ago • 209

upvoted an article about 1 month ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28

• 803

upvoted a collection 3 months ago

Polypsyche Core

Why limit AGI to one “super brain” when you can have multiple minds in one? Psyche of Logic, Psyche of Emotion, Psyche of Improv. • 5 items • Updated Dec 16, 2024 • 1

upvoted 3 collections 4 months ago

self-learning, self-reflect AI

1 item • Updated Nov 26, 2024 • 1

agent

2 items • Updated Jan 2 • 2

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 21 days ago • 246

upvoted 2 papers 5 months ago

DA-Code: Agent Data Science Code Generation Benchmark for Large Language Models

Paper • 2410.07331 • Published Oct 9, 2024 • 5

DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?

Paper • 2409.07703 • Published Sep 12, 2024 • 67

upvoted a collection 6 months ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 576

upvoted a paper 6 months ago

KTO: Model Alignment as Prospect Theoretic Optimization

Paper • 2402.01306 • Published Feb 2, 2024 • 16

upvoted 5 collections 7 months ago

Fireball-12B

Collections of Fireball-12B • 5 items • Updated Aug 29, 2024 • 1

Fireball-Llama-3.1 collections

Fine-tuned Llama 3.1 with different approaches • 3 items • Updated Aug 19, 2024 • 1

EpistemeAI's codegemma-2-9b ggufs

EpistemeAI's fine-tuned Gemma 2 9B GGUF • 4 items • Updated Aug 19, 2024 • 1

Fireball-Llama-3.1 collection

5 items • Updated Oct 29, 2024 • 1

Direct Preference Optimization Datasets

Datasets suitable for DPO based on having 'chosen', 'rejected', and 'prompt' columns. Created using librarian-bots/dataset-column-search-api • 5007 items • Updated Feb 8 • 6

upvoted 2 papers about 1 year ago

Masked Audio Generation using a Single Non-Autoregressive Transformer

Paper • 2401.04577 • Published Jan 9, 2024 • 43

Simple and Controllable Music Generation

Paper • 2306.05284 • Published Jun 8, 2023 • 149