---
license: mit
pipeline_tag: summarization
---
# ANNOUNCEMENT:
!!! The previous pipeline was broken by a Hugging Face system update. !!!
With the updated pipeline, the generated text may look like this:
```
Today was a beautiful day, I have done all my homework. I wonder is this I need? Thank you so much for your help.TL;DRThank You
```
Please filter the result by the `TL;DR` marker.
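
A minimal sketch of that filtering step, assuming the text after the first `TL;DR` marker is the summary and the text before it is the echoed input. The helper name `split_tldr` is hypothetical and not part of this repository; returning the whole string as the summary when no marker is found is also just one possible choice.

```python
def split_tldr(generated_text, marker="TL;DR"):
    """Split generated text at the first marker into (echoed_input, summary)."""
    before, found, after = generated_text.partition(marker)
    if not found:
        # Assumption: without a marker, treat the whole string as the summary.
        return "", generated_text.strip()
    return before.strip(), after.strip()


text = (
    "Today was a beautiful day, I have done all my homework. "
    "I wonder is this I need? Thank you so much for your help.TL;DRThank You"
)
echoed_input, summary = split_tldr(text)
print(summary)  # Thank You
```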
## Model Description
This is a project for the course Multimedia System at Leiden University, fall 2023. The model is part of the self-designed software built for the final project, and it summarizes short texts.
The original summarization model is from https://towardsdatascience.com/text-summarization-with-gpt2-and-layer-ai-599625085d8e .
## Training Datasets
- Amazon review dataset https://www.kaggle.com/datasets/kritanjalijain/amazon-reviews?select=amazon_review_polarity_csv.tgz
- Twitter crawler dataset https://www.kaggle.com/datasets/tripathiharsh/training
- Emotion analysis dataset https://huggingface.co/datasets/dair-ai/emotion
- Kindle review dataset https://www.kaggle.com/datasets/meetnagadia/amazon-kindle-book-review-for-sentiment-analysis/data
## TODO
- Improve training data quality
- Model analysis
## Notification
Due to limitations in Hugging Face's technical support, this project only works with the Interface code below. The first API call may take more than 1 minute while the model is downloaded and loaded. Please wait patiently, thank you!
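
One way to handle that first-call delay is to retry until the model is ready. The sketch below assumes that, while the model is loading, the API replies with a JSON object containing `error` and `estimated_time` fields; that response shape is an assumption, and `query_until_ready` and its parameters are hypothetical names, not part of this repository.

```python
import time


def query_until_ready(call, max_attempts=5, max_wait=10.0, sleep=time.sleep):
    """Call `call()` repeatedly until it stops reporting that the model is loading.

    `call` is any zero-argument function that returns the parsed JSON response.
    """
    result = None
    for _ in range(max_attempts):
        result = call()
        # Assumption: a loading model answers with a dict that has both
        # "error" and "estimated_time" (in seconds).
        loading = (
            isinstance(result, dict)
            and "error" in result
            and "estimated_time" in result
        )
        if not loading:
            return result
        # Wait for the estimated time, capped at max_wait seconds.
        sleep(min(float(result["estimated_time"]), max_wait))
    # Give up and return the last response, whatever it was.
    return result
```

With the `query` function from the Interface section, this could be used as `query_until_ready(lambda: query({"inputs": "..."}))`.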
## Fine-tune Code
You can check "text_summary.py" in the uploaded files, and download the model to test and modify it for this task.
## Interface (legacy)
Please see the announcement above.
```python
import requests

API_URL = "https://api-inference.huggingface.co/models/Lin0He/text-summary-gpt2-short"
headers = {"Authorization": "Bearer <Your API tokens>"}


def query(payload):
    # Send the text to the hosted inference endpoint and return the parsed JSON.
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()


output = query({
    "inputs": "Today was a beautiful day. I have done all my homework, and baked some cookies with my friends. What a lovely day."
})
print(output)
```