---
base_model: AlignmentResearch/Llama-3.3-Tiny-Classifier
---

# Random LoRA Adapter for Reward Model

This is a randomly initialized LoRA adapter for the AlignmentResearch/Llama-3.3-Tiny-Classifier model, intended for use as a reward model.

## Details

- **Base model:** AlignmentResearch/Llama-3.3-Tiny-Classifier
- **Adapter type:** Reward
- **Seed:** 0
- **LoRA rank:** 16
- **LoRA alpha:** 32
- **Target modules:** all-linear

## Usage

```python
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the base classifier and its tokenizer
base_model = AutoModelForSequenceClassification.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Classifier")
tokenizer = AutoTokenizer.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Classifier")

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(base_model, "AlignmentResearch/Llama-3.3-Tiny-Classifier-lora-reward-0")
```

This reward adapter was created for testing purposes and contains random weights.
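Because the adapter's weights are drawn from a fixed seed (0), regenerating them is deterministic. A minimal sketch of seeded random initialization in PyTorch, illustrative only and not the exact routine used to produce this adapter:

```python
import torch

# Seeding the global RNG makes "random" weight draws reproducible.
torch.manual_seed(0)
first = torch.randn(16, 64)   # e.g. a hypothetical rank-16 LoRA matrix

torch.manual_seed(0)
second = torch.randn(16, 64)  # same seed -> identical tensor

assert torch.equal(first, second)
```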