---
license: apache-2.0
datasets:
- armanc/scientific_papers
language:
- en
base_model:
- google/flan-t5-small
tags:
- summarization
- research-papers
- arxiv
- t5
---

# arxiv-summarization

This model is a fine-tuned version of [`google/flan-t5-small`](https://huggingface.co/google/flan-t5-small), trained on the arXiv subset of the `armanc/scientific_papers` dataset. It is optimized for **summarizing scientific abstracts**.

## Model Details

- **Base Model:** `google/flan-t5-small`
- **Training Data:** arXiv research papers (`article` → `abstract`)
- **Fine-Tuned Task:** Text summarization
- **Use Case:** Generating short summaries of long research papers
- **License:** Apache 2.0

## How to Use

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model = T5ForConditionalGeneration.from_pretrained("Talina06/arxiv-summarization")
tokenizer = T5Tokenizer.from_pretrained("Talina06/arxiv-summarization")

# Prefix the input with the task instruction.
text = "Summarize: Deep learning is being used to advance medical research, particularly in cancer detection."
# Truncate over-long inputs; 512 tokens is an assumed cap, not a documented limit of this model.
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# max_new_tokens bounds the summary length; 64 is an illustrative value.
summary_ids = model.generate(**inputs, max_new_tokens=64)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print("Generated Summary:", summary)
```
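
This card does not describe how inputs longer than the model's context window are handled. One common workaround, sketched here as an assumption rather than the author's method, is to split the paper into chunks, summarize each chunk, and join the partial summaries. The names `chunk_words` and `summarize_long` and the 300-word default are hypothetical; `summarize_fn` stands for any callable that wraps the `model.generate` call from the snippet above.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int = 300) -> List[str]:
    """Split text into consecutive chunks of at most max_words whitespace-separated words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long(text: str, summarize_fn: Callable[[str], str], max_words: int = 300) -> str:
    """Summarize each chunk independently and join the partial summaries with spaces."""
    return " ".join(summarize_fn(chunk) for chunk in chunk_words(text, max_words))


# Stand-in summarizer for demonstration only: keeps the first three words of a chunk.
# A real summarize_fn would tokenize the chunk, call model.generate, and decode the result.
def first_three_words(chunk: str) -> str:
    return " ".join(chunk.split()[:3])


paper = " ".join(f"w{i}" for i in range(7))
print(chunk_words(paper, max_words=3))
print(summarize_long(paper, first_three_words, max_words=3))
```

Word counts are only a rough proxy for token counts, so the chunk size should be chosen conservatively. Summarizing chunks independently also loses cross-chunk context, which may reduce summary quality.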

## Training Details

- **Training Data:** 100k+ arXiv research papers
- **Training Framework:** Hugging Face Transformers
- **Hyperparameters:**
  - Learning Rate: `5e-5`
  - Batch Size: `8`
  - Epochs: `10`
- **Hardware Used:** TPU & GPU
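
To give a sense of scale, the hyperparameters above can be turned into an approximate count of optimizer steps. This is a rough sketch under stated assumptions: exactly 100,000 training examples (the card only says "100k+"), a batch size of 8 that is global rather than per-device, and no gradient accumulation. `FineTuneConfig` and `optimizer_steps` are hypothetical names, not part of the training code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as listed in this card."""
    learning_rate: float = 5e-5
    batch_size: int = 8
    epochs: int = 10


def optimizer_steps(num_examples: int, cfg: FineTuneConfig) -> tuple:
    """Return (steps per epoch, total steps), assuming one optimizer step per batch."""
    steps_per_epoch = math.ceil(num_examples / cfg.batch_size)
    return steps_per_epoch, steps_per_epoch * cfg.epochs


cfg = FineTuneConfig()
# 100_000 is an assumed lower bound taken from the "100k+" figure.
per_epoch, total = optimizer_steps(100_000, cfg)
print(per_epoch, total)  # 12500 125000
```

With more than 100,000 examples, or with multiple devices each processing a batch of 8, the actual step counts would differ accordingly.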

## Limitations

- ❌ May struggle with **very technical** papers (e.g., complex math formulas).

## Example Summaries

| **Original Abstract** | **Generated Summary** |
|----------------------|----------------------|
| "Deep learning has transformed many fields... We propose a new CNN for cancer detection..." | "A CNN model is proposed for cancer detection using deep learning." |
| "Quantum computing has shown potential for cryptographic applications..." | "Quantum computing can be used in cryptography." |