---
language:
- en
datasets:
- mindchain/wikitext2
- yahma/alpaca-cleaned
metrics:
- perplexity
- accuracy
base_model:
- TinyLlama/TinyLlama_v1.1
model-index:
- name: TinyLlama_v1.1_mix_wikitext_alpaca_1bit_BitDistiller_baseline
  results:
  - task:
      type: multiple-choice
      name: QA Benchmarking
    dataset:
      type: allenai/arc
      name: ARC-Challenge
      config: challenge
      split: test
    metrics:
    - type: accuracy
      name: Accuracy
      value: 0.2150170648464164
    - type: accuracy
      name: Normalized Accuracy
      value: 0.24744027303754265
  - task:
      type: multiple-choice
      name: QA Benchmarking
    dataset:
      type: hellaswag
      name: HellaSwag
      split: test
    metrics:
    - type: accuracy
      name: Accuracy
      value: 0.2568213503286198
    - type: accuracy
      name: Normalized Accuracy
      value: 0.253359888468433
  - task:
      type: multiple-choice
      name: QA Benchmarking
    dataset:
      type: piqa
      name: PIQA
      split: validation
    metrics:
    - type: accuracy
      name: Accuracy
      value: 0.5282916213275299
    - type: accuracy
      name: Normalized Accuracy
      value: 0.5027203482845702
  - task:
      type: multiple-choice
      name: QA Benchmarking
    dataset:
      type: winogrande
      name: Winogrande
      split: test
    metrics:
    - type: accuracy
      name: Accuracy
      value: 0.5122336227308603
  - task:
      type: multiple-choice
      name: QA Benchmarking
    dataset:
      type: aggregated
      name: QA-Avg
    metrics:
    - type: accuracy
      name: QA Average
      value: 0.3780991480835666
---
# TinyLlama_v1.1_1bit_BitDistiller
This is a 1-bit quantized version of TinyLlama v1.1, produced with BitDistiller using asymmetric quantization and self-distillation (CAKLD) to retain accuracy under extreme compression. The model is fine-tuned on the WikiText-2 and Alpaca-cleaned datasets and evaluated on multiple-choice QA benchmarks.
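
The core idea of the asymmetric 1-bit scheme can be illustrated with a short PyTorch sketch: each weight row is clipped asymmetrically and collapsed onto two per-row reconstruction levels. This is only an illustration under simplified assumptions (the function name, the fixed clipping ratio, and the midpoint split are hypothetical), not the actual BitDistiller quantizer.

```python
import torch

def asym_1bit_quantize(w: torch.Tensor, clip_ratio: float = 0.9) -> torch.Tensor:
    """Toy asymmetric 1-bit quantizer (per output row), for illustration only."""
    # Asymmetric clipping: shrink the range toward its own min/max rather than
    # clipping symmetrically around zero.
    lo = w.amin(dim=-1, keepdim=True) * clip_ratio
    hi = w.amax(dim=-1, keepdim=True) * clip_ratio
    w_c = torch.clamp(w, min=lo, max=hi)

    # 1-bit partition at the midpoint of the clipped range.
    mask = w_c >= (lo + hi) / 2

    # Reconstruction levels: the mean of each side minimizes the L2 error
    # for the chosen partition.
    n_hi = mask.sum(dim=-1, keepdim=True).clamp(min=1)
    n_lo = (~mask).sum(dim=-1, keepdim=True).clamp(min=1)
    level_hi = (w_c * mask).sum(dim=-1, keepdim=True) / n_hi
    level_lo = (w_c * ~mask).sum(dim=-1, keepdim=True) / n_lo

    # Each weight is replaced by one of two per-row values (1 bit + scales).
    return torch.where(mask, level_hi, level_lo)
```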
Key Features:
- 1-bit quantization for ultra-efficient inference.
- Asymmetric weight clipping to reduce precision loss.
- CAKLD knowledge distillation to preserve performance (sketched below).
- Tested on ARC-Challenge, HellaSwag, PIQA, and Winogrande.
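
For the distillation objective, CAKLD is commonly described as a confidence-weighted blend of forward and reverse KL divergence between teacher and student. The sketch below assumes that formulation; the function name is hypothetical, and the estimation of the blending coefficient `gamma` (the teacher's average confidence in the BitDistiller paper) is omitted.

```python
import torch.nn.functional as F

def cakld_loss(student_logits, teacher_logits, gamma: float):
    """Illustrative confidence-aware KD loss: a gamma-weighted blend of
    forward (mean-seeking) and reverse (mode-seeking) KL divergence."""
    t_logp = F.log_softmax(teacher_logits, dim=-1)
    s_logp = F.log_softmax(student_logits, dim=-1)

    # Forward KL: D_KL(teacher || student)
    forward_kl = F.kl_div(s_logp, t_logp, log_target=True, reduction="batchmean")
    # Reverse KL: D_KL(student || teacher)
    reverse_kl = F.kl_div(t_logp, s_logp, log_target=True, reduction="batchmean")

    return gamma * forward_kl + (1.0 - gamma) * reverse_kl
```

A higher `gamma` pushes the objective toward the mean-seeking forward KL, a lower one toward the mode-seeking reverse KL.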