---
library_name: transformers
license: llama3.2
base_model: meta-llama/Llama-3.2-3B-Instruct
tags:
  - trl
  - orpo
  - generated_from_trainer
model-index:
  - name: Echo-IE-3B-v0.1
    results: []
---

# Echo-IE-3B-v0.1

This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct), trained with ORPO via `trl` on a dataset that is not documented in this card. It achieves the following results on the evaluation set (the objective behind these metric names is sketched after the list):

- Loss: 0.1513
- Rewards/chosen: -0.0341
- Rewards/rejected: -0.2967
- Rewards/accuracies: 0.9688
- Rewards/margins: 0.2626
- Logps/rejected: -2.9667
- Logps/chosen: -0.3407
- Logits/rejected: 0.9754
- Logits/chosen: 0.9495
- Nll Loss: 0.1436
- Log Odds Ratio: -0.0679
- Log Odds Chosen: 3.8381
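The metric names above follow `trl`'s ORPO implementation. For context, a sketch of the ORPO objective (Hong et al., 2024) that produces them:

$$
\mathcal{L}_{\text{ORPO}} = \mathcal{L}_{\text{NLL}} - \beta \,\log \sigma\!\left(\log \frac{\operatorname{odds}_\theta(y_w \mid x)}{\operatorname{odds}_\theta(y_l \mid x)}\right),
\qquad
\operatorname{odds}_\theta(y \mid x) = \frac{P_\theta(y \mid x)}{1 - P_\theta(y \mid x)}
$$

where $y_w$ and $y_l$ are the chosen and rejected responses. `Nll Loss` corresponds to the supervised term and `Log Odds Ratio` to the mean log-sigmoid odds-ratio term. The card does not record $\beta$, but since `trl` reports rewards as $\beta$ times the average log-probabilities, the ratio of `Rewards/chosen` to `Logps/chosen` (-0.0341 / -0.3407) suggests $\beta \approx 0.1$.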

## Model description

More information needed

## Intended uses & limitations

More information needed
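In the meantime, a minimal inference sketch, assuming the standard `transformers` chat workflow for a Llama 3.2 Instruct derivative; the hub id `Rakuto/Echo-IE-3B-v0.1` is an assumption based on this repo's name, not something the card documents:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rakuto/Echo-IE-3B-v0.1"  # assumed repo id; adjust to the actual hub path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize what ORPO fine-tuning does in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```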

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

- learning_rate: 6e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10
- num_epochs: 10
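A hedged reconstruction of this setup with `trl`'s `ORPOTrainer`, assuming the versions listed under "Framework versions" below. The dataset is a placeholder (it is not documented in this card), and `beta=0.1` is inferred from the rewards/log-probability ratio above rather than recorded:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_id = "meta-llama/Llama-3.2-3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Placeholder: the actual preference data (prompt/chosen/rejected columns)
# is not documented in this card.
dataset = load_dataset("your-org/your-preference-dataset")

config = ORPOConfig(
    output_dir="Echo-IE-3B-v0.1",
    beta=0.1,                       # inferred, not recorded in the card
    learning_rate=6e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_steps=10,
    seed=42,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the optimizer defaults.
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,  # renamed `processing_class` in newer trl releases
)
trainer.train()
```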

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
| 0.4124 | 1.0 | 44 | 0.3215 | -0.0650 | -0.1203 | 1.0 | 0.0553 | -1.2029 | -0.6500 | 0.4436 | 0.4952 | 0.2843 | -0.3461 | 0.9531 |
| 0.2262 | 2.0 | 88 | 0.2173 | -0.0463 | -0.1815 | 1.0 | 0.1352 | -1.8149 | -0.4629 | 0.6564 | 0.6969 | 0.2000 | -0.1529 | 2.1688 |
| 0.1567 | 3.0 | 132 | 0.1848 | -0.0402 | -0.2383 | 0.9688 | 0.1981 | -2.3833 | -0.4018 | 0.8186 | 0.8245 | 0.1734 | -0.0986 | 2.9871 |
| 0.1483 | 4.0 | 176 | 0.1688 | -0.0372 | -0.2683 | 0.9688 | 0.2311 | -2.6830 | -0.3718 | 0.9081 | 0.8980 | 0.1594 | -0.0814 | 3.4098 |
| 0.1313 | 5.0 | 220 | 0.1597 | -0.0355 | -0.2806 | 0.9688 | 0.2451 | -2.8056 | -0.3550 | 0.9172 | 0.9010 | 0.1511 | -0.0746 | 3.6020 |
| 0.1173 | 6.0 | 264 | 0.1558 | -0.0348 | -0.2900 | 0.9688 | 0.2552 | -2.9003 | -0.3481 | 0.9633 | 0.9417 | 0.1476 | -0.0712 | 3.7352 |
| 0.131 | 7.0 | 308 | 0.1525 | -0.0342 | -0.2935 | 0.9688 | 0.2592 | -2.9346 | -0.3424 | 0.9745 | 0.9506 | 0.1446 | -0.0690 | 3.7927 |
| 0.1097 | 8.0 | 352 | 0.1516 | -0.0341 | -0.2956 | 0.9688 | 0.2614 | -2.9556 | -0.3411 | 0.9658 | 0.9406 | 0.1438 | -0.0684 | 3.8234 |
| 0.0973 | 9.0 | 396 | 0.1512 | -0.0340 | -0.2965 | 0.9688 | 0.2625 | -2.9653 | -0.3403 | 0.9691 | 0.9441 | 0.1434 | -0.0682 | 3.8372 |
| 0.1161 | 10.0 | 440 | 0.1513 | -0.0341 | -0.2967 | 0.9688 | 0.2626 | -2.9667 | -0.3407 | 0.9754 | 0.9495 | 0.1436 | -0.0679 | 3.8381 |

### Framework versions

- Transformers 4.44.2
- Pytorch 2.4.1+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1