arxiv-summarization
This model is a fine-tuned version of google/flan-t5-small,
trained on the arxiv subset of the armanc/scientific_papers dataset. It is optimized for summarizing scientific abstracts.
Model Details
- Base Model: google/flan-t5-small
- Training Data: Arxiv Research Papers (article → abstract)
- Fine-Tuned Task: Text Summarization
- Use Case: Generate shorter summaries of long research papers
- License: Apache 2.0
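The article → abstract pairing above can be sketched as a small preprocessing step. This is only an illustration: the field names `article` and `abstract` come from this card, while the helper name `to_pairs`, the output keys `source` and `target`, and the reuse of the "Summarize: " prefix at training time are assumptions.

```python
# Hypothetical preprocessing sketch: turn (article, abstract) records into
# (source, target) text pairs for a sequence-to-sequence model.

PREFIX = "Summarize: "  # assumption: same prefix as in the inference example


def to_pairs(batch):
    """Map a batch with 'article' and 'abstract' lists to model inputs and targets."""
    return {
        "source": [PREFIX + article.strip() for article in batch["article"]],
        "target": [abstract.strip() for abstract in batch["abstract"]],
    }


example = {
    "article": ["  Deep learning has transformed many fields. "],
    "abstract": ["A survey of deep learning.\n"],
}
print(to_pairs(example))
```

A function of this shape could be passed to `datasets.Dataset.map(..., batched=True)`, since it takes and returns dictionaries of lists.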
How to Use
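# T5Tokenizer needs the sentencepiece package to be installed.
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model = T5ForConditionalGeneration.from_pretrained("Talina06/arxiv-summarization")
tokenizer = T5Tokenizer.from_pretrained("Talina06/arxiv-summarization")

# Prefix the input text with the "Summarize: " instruction and tokenize it.
text = "Summarize: Deep learning is being used to advance medical research, particularly in cancer detection."
inputs = tokenizer(text, return_tensors="pt")

# Generate the summary token IDs and decode them back to text.
summary_ids = model.generate(**inputs)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print("Generated Summary:", summary)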
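The snippet above relies on the checkpoint's default generation settings, which may cut summaries short or fail on inputs longer than the model expects. A minimal sketch, assuming you want explicit control, wraps the same steps in a helper that truncates long inputs and sets a length limit and beam search. The helper name `summarize` and the values 512, 64, and 4 are illustrative assumptions, not settings documented for this model.

```python
# Hypothetical helper: the same tokenize -> generate -> decode steps as above,
# with explicit truncation and generation settings (all values are assumptions).

def summarize(text, model, tokenizer, max_input_tokens=512, max_new_tokens=64, num_beams=4):
    """Return a summary string for `text` using an already loaded model and tokenizer."""
    prompt = "Summarize: " + text
    # Truncate inputs that exceed the chosen token budget.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    # Beam search with an explicit cap on the number of generated tokens.
    summary_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
        early_stopping=True,
    )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

With `model` and `tokenizer` loaded as shown earlier, a call would look like `summarize("Deep learning is being used to advance medical research.", model, tokenizer)`.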
Training Details
- Training Data: 100k+ Arxiv research papers
- Training Framework: Hugging Face Transformers
- Hyperparameters:
  - Learning Rate: 5e-5
  - Batch Size: 8
  - Epochs: 10
- Hardware Used: TPU & GPU
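To make the hyperparameters above concrete, the sketch below collects them as keyword arguments and estimates the resulting number of optimizer steps. Only the learning rate, batch size, and epoch count come from this card. The output path, the `predict_with_generate` flag, the reading of the batch size as per device, and the use of exactly 100,000 examples (the card says "100k+") are assumptions.

```python
import math

# Values from this card, plus clearly marked assumptions.
TRAINING_KWARGS = {
    "output_dir": "arxiv-summarization",  # hypothetical output path
    "learning_rate": 5e-5,                # from the card
    "per_device_train_batch_size": 8,     # from the card; assumed to be per device
    "num_train_epochs": 10,               # from the card
    "predict_with_generate": True,        # assumption, common for summarization
}


def estimate_optimizer_steps(num_examples, batch_size, epochs):
    """Optimizer steps on a single device without gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


# Lower bound, treating "100k+" as exactly 100,000 training examples.
print(estimate_optimizer_steps(100_000, 8, 10))  # 125000
```

These keys match parameters of `transformers.Seq2SeqTrainingArguments`, so the dictionary could be unpacked into it as `Seq2SeqTrainingArguments(**TRAINING_KWARGS)`. Whether the original run used that class is not stated in this card.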
Limitations
- ❌ May struggle with very technical papers (e.g., papers dense with complex math formulas).
Example Summaries
Original Abstract | Generated Summary |
---|---|
"Deep learning has transformed many fields... We propose a new CNN for cancer detection..." | "A CNN model is proposed for cancer detection using deep learning." |
"Quantum computing has shown potential for cryptographic applications..." | "Quantum computing can be used in cryptography." |