Move evaluation to front
README.md CHANGED
@@ -22,14 +22,23 @@ pipeline_tag: summarization
## **Model Details**
This is a **LoRA fine-tuned adapter** built on [**meta-llama/Llama-3.2-1B-Instruct**](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct). It is designed for scientific paper summarization tasks and leverages **Low-Rank Adaptation (LoRA)** to enhance model performance efficiently while maintaining a low computational overhead.

-
+
+## **Performance Comparison**
+| Model | ROUGE-1 | ROUGE-2 | ROUGE-3 | ROUGE-L |
+|---------------------------|----------|----------|----------|----------|
+| **Llama-3.2-1B-Instruct** | 36.69 | 7.47 | 1.95 | 19.36 |
+| **Llama-PaperSummarization-LoRA** | **41.56** | **11.31** | **2.67** | **21.86** |
+
+The model was evaluated on a **6K-sample test set** using **ROUGE scores** with the following settings:
+- **Decoding Strategy**: Beam search (beam size = 4)
+

## **Dataset**
The model was fine-tuned on the [**armanc/scientific_papers**](https://huggingface.co/datasets/armanc/scientific_papers) dataset. Below are the details of the dataset splits:
- **Training Set**: 20K samples
- **Validation Set**: 6K samples
+- **Test Set**: 6K samples

----

## **LoRA Configuration**
- **Trainable Parameters**: 850K (~7% of base model parameters)
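As a usage note for the adapter documented above: the excerpt does not include an inference snippet, so the following is a minimal sketch of loading the base model plus the LoRA adapter with PEFT and generating a summary with beam search (beam size 4, matching the evaluation setting). The adapter repo id and the prompt format are placeholders and assumptions, not taken from the card.

```python
# Minimal sketch: load Llama-3.2-1B-Instruct, apply the LoRA adapter, summarize.
# The adapter repo id below is a placeholder; the card does not give one here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B-Instruct"
adapter_id = "your-username/Llama-PaperSummarization-LoRA"  # placeholder, not confirmed

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Prompt format is an assumption; the card does not specify one.
paper_text = "..."  # paper body or abstract to summarize
messages = [{"role": "user", "content": f"Summarize the following paper:\n\n{paper_text}"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

# Beam search with beam size 4, matching the evaluation setting above.
with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=256, num_beams=4, do_sample=False)

summary = tokenizer.decode(output[0, inputs.shape[-1]:], skip_special_tokens=True)
print(summary)
```

Using `num_beams=4` with `do_sample=False` keeps decoding deterministic, which is what a beam-search evaluation setup implies.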
@@ -46,19 +55,6 @@ The model was fine-tuned on the [**armanc/scientific_papers**](https://huggingfa
- **Training Duration**: 28 hours
- **Training Scripts**: [gabe-zhang/paper2summary](https://github.com/gabe-zhang/paper2summary)

----
-
-## **Evaluation**
-The model was evaluated on a **6K-sample test set** using **ROUGE scores** with the following settings:
-- **Decoding Strategy**: Beam search (beam size = 4)
-
-### **Performance Comparison**
-| Model | ROUGE-1 | ROUGE-2 | ROUGE-3 | ROUGE-L |
-|---------------------------|----------|----------|----------|----------|
-| **Llama-3.2-1B-Instruct** | 36.69 | 7.47 | 1.95 | 19.36 |
-| **Llama-PaperSummarization-LoRA** | **41.56** | **11.31** | **2.67** | **21.86** |
-
----

## **License**
This repository contains a **LoRA fine-tuned adapter** derived from the Llama 3.2 model.
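For context on the split sizes listed under **Dataset**, a sketch of loading similarly sized splits with 🤗 Datasets follows. The card does not state which subset of armanc/scientific_papers (arxiv or pubmed) was used or how the 20K/6K/6K samples were drawn, so the arxiv subset and contiguous slices are assumptions for illustration only.

```python
# Sketch of drawing 20K/6K/6K splits from armanc/scientific_papers.
# The subset ("arxiv") and the contiguous slicing are illustrative assumptions.
from datasets import load_dataset

train_set = load_dataset("armanc/scientific_papers", "arxiv", split="train[:20000]", trust_remote_code=True)
val_set = load_dataset("armanc/scientific_papers", "arxiv", split="validation[:6000]", trust_remote_code=True)
test_set = load_dataset("armanc/scientific_papers", "arxiv", split="test[:6000]", trust_remote_code=True)

# Each example pairs the full paper body with its abstract, which serves as
# the reference summary.
print(train_set.column_names)                     # ['article', 'abstract', 'section_names']
print(len(train_set), len(val_set), len(test_set))  # 20000 6000 6000
```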
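The ROUGE numbers in the comparison tables could be reproduced with a standard ROUGE scorer; the sketch below uses the Hugging Face `evaluate` wrapper, which is an assumption since the card does not name the implementation. The placeholder strings stand in for beam-search outputs and reference abstracts.

```python
# Sketch of the ROUGE computation described in the card. The exact scorer is
# not stated; this uses the `evaluate` ROUGE wrapper as an assumption.
# Placeholder strings keep the snippet runnable; real predictions would come
# from beam-search generation (beam size 4) with the adapter.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["placeholder model summary of a scientific paper"],
    references=["placeholder reference abstract of the same paper"],
    rouge_types=["rouge1", "rouge2", "rouge3", "rougeL"],
    use_stemmer=True,
)
# The tables above report scores as percentages (e.g. ROUGE-1 = 41.56).
print({k: round(v * 100, 2) for k, v in scores.items()})
```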