gabe-zhang committed
Commit cc5927f · verified · 1 Parent(s): ff19ed3

Move evaluation to front

Files changed (1): README.md +11 -15
README.md CHANGED
@@ -22,14 +22,23 @@ pipeline_tag: summarization
  ## **Model Details**
  This is a **LoRA fine-tuned adapter** built on [**meta-llama/Llama-3.2-1B-Instruct**](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct). It is designed for scientific paper summarization tasks and leverages **Low-Rank Adaptation (LoRA)** to enhance model performance efficiently while maintaining a low computational overhead.

- ---
+
+ ## **Performance Comparison**
+ | Model | ROUGE-1 | ROUGE-2 | ROUGE-3 | ROUGE-L |
+ |---------------------------|----------|----------|----------|----------|
+ | **Llama-3.2-1B-Instruct** | 36.69 | 7.47 | 1.95 | 19.36 |
+ | **Llama-PaperSummarization-LoRA** | **41.56** | **11.31** | **2.67** | **21.86** |
+
+ The model was evaluated on a **6K-sample test set** using **ROUGE scores** with the following settings:
+ - **Decoding Strategy**: Beam search (beam size = 4)
+

  ## **Dataset**
  The model was fine-tuned on the [**armanc/scientific_papers**](https://huggingface.co/datasets/armanc/scientific_papers) dataset. Below are the details of the dataset splits:
  - **Training Set**: 20K samples
  - **Validation Set**: 6K samples
+ - **Test Set**: 6K samples

- ---

  ## **LoRA Configuration**
  - **Trainable Parameters**: 850K (~7% of base model parameters)
@@ -46,19 +55,6 @@ The model was fine-tuned on the [**armanc/scientific_papers**](https://huggingfa
  - **Training Duration**: 28 hours
  - **Training Scripts**: [gabe-zhang/paper2summary](https://github.com/gabe-zhang/paper2summary)

- ---
-
- ## **Evaluation**
- The model was evaluated on a **6K-sample test set** using **ROUGE scores** with the following settings:
- - **Decoding Strategy**: Beam search (beam size = 4)
-
- ### **Performance Comparison**
- | Model | ROUGE-1 | ROUGE-2 | ROUGE-3 | ROUGE-L |
- |---------------------------|----------|----------|----------|----------|
- | **Llama-3.2-1B-Instruct** | 36.69 | 7.47 | 1.95 | 19.36 |
- | **Llama-PaperSummarization-LoRA** | **41.56** | **11.31** | **2.67** | **21.86** |
-
- ---

  ## **License**
  This repository contains a **LoRA fine-tuned adapter** derived from the Llama 3.2 model.
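
For readers skimming this diff, the sketch below shows one way the adapter described in the README could be loaded and run with the beam-search setting (beam size = 4) that the evaluation section reports. It is a minimal, illustrative sketch, not part of this commit: the adapter repository id, the prompt wording, and the generation length are assumptions.

```python
# Hypothetical usage sketch (assumptions marked below), using transformers + peft.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B-Instruct"
adapter_id = "gabe-zhang/Llama-PaperSummarization-LoRA"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter
model.eval()

paper_text = "..."  # full text of a scientific paper (placeholder)
# Assumed prompt format; the actual template used for training is in the linked scripts.
messages = [{"role": "user", "content": f"Summarize the following paper:\n\n{paper_text}"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

# Beam search with beam size 4, matching the decoding strategy stated in the README.
output_ids = model.generate(input_ids=input_ids, num_beams=4, max_new_tokens=256)
summary = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(summary)
```

Scoring such generations with ROUGE against reference summaries would correspond to the comparison table moved to the front in this commit; the exact evaluation pipeline lives in the gabe-zhang/paper2summary scripts referenced in the README.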