---
license: mit
pipeline_tag: summarization
---

# ANNOUNCEMENT

**The previous pipeline was broken by a Hugging Face system update.**

With the updated pipeline, the generated text may look like this:

```
Today was a beautiful day, I have done all my homework. I wonder is this I need? Thank you so much for your help.TL;DRThank You
```

Please filter the result by `TL;DR`: in the sample above, the summary is the text that follows the marker.
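
The filtering step could be sketched as below. This is a minimal illustration, not the project's own code: the helper name `extract_summary` is hypothetical, and it assumes the summary is everything after the first `TL;DR` marker, as in the sample output above. How the generated string is pulled out of the API response is left out, since the response layout is not described here.

```python
def extract_summary(generated_text: str, marker: str = "TL;DR") -> str:
    """Return the text after the first marker, or "" if the marker is absent.

    Hypothetical helper: assumes the model emits "<input text>TL;DR<summary>".
    """
    _, found, summary = generated_text.partition(marker)
    return summary.strip() if found else ""


sample = (
    "Today was a beautiful day, I have done all my homework. "
    "I wonder is this I need? Thank you so much for your help.TL;DRThank You"
)
print(extract_summary(sample))
```

Returning an empty string when the marker is missing lets the caller tell "no summary found" apart from a real summary without raising an exception.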

## Model Description

This model was built for the course Multimedia System at Leiden University, fall 2023. It is part of the self-designed software for the final project and summarizes short texts.

The original summarization model is from https://towardsdatascience.com/text-summarization-with-gpt2-and-layer-ai-599625085d8e

## Training Datasets

- Amazon review dataset: https://www.kaggle.com/datasets/kritanjalijain/amazon-reviews?select=amazon_review_polarity_csv.tgz
- Twitter crawler dataset: https://www.kaggle.com/datasets/tripathiharsh/training
- Emotion analysis dataset: https://huggingface.co/datasets/dair-ai/emotion
- Kindle review dataset: https://www.kaggle.com/datasets/meetnagadia/amazon-kindle-book-review-for-sentiment-analysis/data

## TODO

- Improve training data quality
- Model analysis

## Notification

Due to limitations of Hugging Face's technical support, this project only works with the Interface code below. The first time you call the API, it may take more than 1 minute to download and load the model before it responds. Please be patient, thank you!
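
One way to handle the slow first call is to retry until the model answers. The sketch below is an illustration under stated assumptions, not part of the project: `query_with_retry` is a hypothetical helper, `post` stands for any callable that sends the payload and returns decoded JSON (such as the `query` function in the Interface section), and it assumes that a model which is still loading answers with a JSON object containing an `"error"` key.

```python
import time


def query_with_retry(post, payload, retries=5, wait_seconds=20.0, sleep=time.sleep):
    """Call post(payload) until the result no longer looks like an error.

    Assumption: a model that is still loading yields a dict with an "error" key.
    Returns the last result, even if every attempt returned an error.
    """
    result = None
    for attempt in range(retries):
        result = post(payload)
        # Anything that is not an error object is treated as a real answer.
        if not (isinstance(result, dict) and "error" in result):
            return result
        # Wait before the next attempt, but not after the final one.
        if attempt < retries - 1:
            sleep(wait_seconds)
    return result
```

With the defaults, a call such as `query_with_retry(query, {"inputs": "..."})` waits up to roughly 80 seconds in total; adjust `retries` and `wait_seconds` to suit.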

## Fine-tune Code

You can find `text_summary.py` among the uploaded files. Download the model to test it and to modify it for this task.

## Interface (legacy)

Please see the announcement above.

```python
import requests

# Hosted inference endpoint for this model.
API_URL = "https://api-inference.huggingface.co/models/Lin0He/text-summary-gpt2-short"
# Replace the placeholder with your own Hugging Face API token.
headers = {"Authorization": "Bearer <Your API tokens>"}


def query(payload):
    """POST the payload to the inference endpoint and return the decoded JSON."""
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()


output = query({
    "inputs": "Today was a beautiful day. I have done all my homework, and baked some cookies with my friends. What a lovely day."
})
print(output)
```