metadata
library_name: transformers.js
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
license: apache-2.0
datasets:
  - Kukedlc/dpo-orpo-spanish-15k
language:
  - en
  - es

Fine-Tuned Model

fjmgAI/b1-R1-1.5B-ONNX

Base Model

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Fine-Tuning Method

Fine-tuning was performed using Unsloth, an efficient fine-tuning framework optimized for low-resource environments, together with Hugging Face's TRL library. The resulting model weights were then converted with ONNX Runtime to make them compatible with Transformers.js.

Dataset

Kukedlc/dpo-orpo-spanish-15k

Description

A Spanish-language dataset containing 15,000 examples, designed for Direct Preference Optimization (DPO) or Odds Ratio Preference Optimization (ORPO).

Adaptation

The dataset was adapted to a reasoning-based format for GRPO (Group Relative Policy Optimization) so that its preference data could guide the model during fine-tuning. This adaptation is intended to improve alignment with instruction-following tasks in Spanish.
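
As a rough illustration of what such an adaptation could look like, the sketch below maps one preference record to a reasoning-style prompt record. The field names (`question`, `chosen`, `rejected`), the system prompt, and the output shape are assumptions made for illustration only; the dataset's actual column names and the author's preprocessing are not documented here.

```javascript
// Hypothetical system prompt asking the model to reason inside <think> tags.
const SYSTEM_PROMPT =
  "Responde en español. Razona primero dentro de <think>...</think> y luego da la respuesta final.";

// Map one preference record (assumed fields: question, chosen, rejected)
// to a chat-style prompt plus a reference answer a reward function could use.
function toReasoningExample(record) {
  return {
    prompt: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: record.question.trim() },
    ],
    // Keep the preferred answer as the reference; the rejected one is dropped.
    reference: record.chosen.trim(),
  };
}

// Made-up record, not taken from the dataset.
const example = toReasoningExample({
  question: " ¿Cuál es la capital de España? ",
  chosen: "La capital de España es Madrid.",
  rejected: "Barcelona.",
});
console.log(example.prompt[1].content);
```

Whether the rejected answers were discarded or reused (for example as negative references) is not stated, so dropping them here is only one possible choice.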

Fine-Tuning Details

  • The model was trained using the GRPO algorithm, leveraging structured preference data to refine its response generation.
  • The focus was on retaining the model's instructional abilities while improving its understanding and generation of Spanish text.
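
For readers unfamiliar with GRPO (Group Relative Policy Optimization), its core idea is to score several sampled completions of the same prompt and normalize each reward against the group. The snippet below is a minimal sketch of that normalization step only, not the author's training code; the reward values are made up, and the population standard deviation is used for simplicity (real implementations may differ).

```javascript
// Group-relative advantages: normalize each reward by the mean and
// standard deviation of the rewards within its group of completions.
function groupRelativeAdvantages(rewards, eps = 1e-8) {
  const n = rewards.length;
  const mean = rewards.reduce((sum, r) => sum + r, 0) / n;
  const variance = rewards.reduce((sum, r) => sum + (r - mean) ** 2, 0) / n;
  const std = Math.sqrt(variance);
  // eps avoids division by zero when all rewards in the group are equal.
  return rewards.map((r) => (r - mean) / (std + eps));
}

// Hypothetical rewards for four completions of the same prompt.
const advantages = groupRelativeAdvantages([1.0, 0.0, 0.5, 0.5]);
console.log(advantages);
```

Completions that score above the group mean receive positive advantages and are reinforced; those below the mean receive negative ones.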

Usage (Transformers.js)

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

npm i @huggingface/transformers

Example: Text generation with fjmgAI/b1-R1-1.5B-ONNX

import { pipeline, TextStreamer } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "fjmgAI/b1-R1-1.5B-ONNX",
  { dtype: "q4f16" },
);

// Define the list of messages
const messages = [
  { role: "user", content:  "Resuelve esta ecuación: x^2 - 3x + 2 = 0" },
];

// Create text streamer
const streamer = new TextStreamer(generator.tokenizer, {
  skip_prompt: true,
  // callback_function: (text) => { }, // Optional callback function
});

// Generate a response
const output = await generator(messages, { max_new_tokens: 512, do_sample: false, streamer });
console.log(output[0].generated_text.at(-1).content);

Example output:

<think>
To solve the quadratic equation \( x^2 - 3x + 2 = 0 \), I'll start by factoring the left-hand side. I need to find two numbers that multiply to 2 and add up to -3. These numbers are -1 and -2.

Next, I'll rewrite the equation as \( (x - 1)(x - 2) = 0 \). 

Using the zero product property, I'll set each factor equal to zero:
1. \( x - 1 = 0 \) leads to \( x = 1 \).
2. \( x - 2 = 0 \) leads to \( x = 2 \).

Therefore, the solutions to the equation are \( x = 1 \) and \( x = 2 \).
</think>

To solve the quadratic equation:

\[
x^2 - 3x + 2 = 0
\]

**Step 1: Factor the Quadratic**

We look for two numbers that multiply to \( +2 \) and add up to \( -3 \). These numbers are \( -1 \) and \( -2 \).

\[
x^2 - 3x + 2 = (x - 1)(x - 2) = 0
\]

**Step 2: Apply the Zero Product Property**

If the product of two factors is zero, at least one of the factors must be zero.

\[
x - 1 = 0 \quad \text{or} \quad x - 2 = 0
\]

**Step 3: Solve for \( x \)**

\[
x = 1 \quad \text{or} \quad x = 2
\]

**Final Answer:**

\[
\boxed{1 \text{ and } 2}
\]
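
Because the generated text contains the reasoning inside `<think>` tags followed by the final answer, an application will often want to separate the two. The helper below is a small sketch of one way to do that; `splitReasoning` is a hypothetical name, not part of Transformers.js, and it assumes at most one `</think>` tag in the output.

```javascript
// Split generated text into the <think> reasoning and the visible answer.
// Also handles output where the opening <think> tag is missing.
function splitReasoning(text) {
  const close = "</think>";
  const end = text.indexOf(close);
  if (end === -1) {
    // No reasoning block found: treat everything as the answer.
    return { reasoning: "", answer: text.trim() };
  }
  const reasoning = text.slice(0, end).replace("<think>", "").trim();
  const answer = text.slice(end + close.length).trim();
  return { reasoning, answer };
}

// Shortened stand-in for a generated response.
const sample = "<think>\nFactor the quadratic.\n</think>\n\nx = 1 or x = 2";
const { reasoning, answer } = splitReasoning(sample);
console.log(answer);
```

In the pipeline example above, the string to pass in would be `output[0].generated_text.at(-1).content`.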

Purpose

This fine-tuned model is intended for Spanish-language applications that need an efficient, instruction-following model with a lightweight reasoning process.

  • Developed by: fjmgAI
  • License: apache-2.0

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named onnx).