---
license: mit
pipeline_tag: summarization
---

# ANNOUNCEMENT:

!!! The previous pipeline was broken by a Hugging Face system update. !!!
In the updated pipeline, the generated text may look like this:
```
Today was a beautiful day, I have done all my homework. I wonder is this I need? Thank you so much for your help.TL;DRThank You
```
Please filter the result by splitting on the `TL;DR` marker.
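One way to do this filtering is sketched below. It assumes the summary is the text that follows the first `TL;DR` marker, as in the example output above; the function name `extract_summary` and the fallback for a missing marker are illustrative choices, not part of the uploaded code.

```python
def extract_summary(generated_text: str, marker: str = "TL;DR") -> str:
    """Return the text after the first marker (assumed to be the summary).

    Hypothetical helper: if the marker is missing, the whole text is
    returned unchanged apart from surrounding whitespace.
    """
    _, found, summary = generated_text.partition(marker)
    if not found:
        return generated_text.strip()
    return summary.strip()


generated = (
    "Today was a beautiful day, I have done all my homework. "
    "I wonder is this I need? Thank you so much for your help.TL;DRThank You"
)
print(extract_summary(generated))
```

If the text before the marker is needed as well, `str.partition` already returns it as the first element of the tuple.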

## Model Description
This project was created for the course Multimedia System at Leiden University in fall 2023. The model is part of self-designed software, built for the final project, that summarizes short texts.
The original summarization model comes from https://towardsdatascience.com/text-summarization-with-gpt2-and-layer-ai-599625085d8e .

## Training Datasets
- Amazon review dataset https://www.kaggle.com/datasets/kritanjalijain/amazon-reviews?select=amazon_review_polarity_csv.tgz
- Twitter crawler dataset https://www.kaggle.com/datasets/tripathiharsh/training
- Emotion analysis dataset https://huggingface.co/datasets/dair-ai/emotion
- Kindle review dataset https://www.kaggle.com/datasets/meetnagadia/amazon-kindle-book-review-for-sentiment-analysis/data

## TODO
- Improve training data quality
- Model analysis

## Notification
Because of how Hugging Face supports this model, the project only works through the Interface code below. The first time you call the API, it may take more than a minute to download and start the model. Please wait patiently, thank you!
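Waiting for the first call can be automated with a small retry loop. The sketch below assumes that, while the model is still loading, the API answers with a JSON object containing an `"error"` key; that assumption, the helper name `query_with_wait`, and its parameters are illustrative and not part of the uploaded code. The `send` argument is any function that posts a payload and returns the parsed JSON response.

```python
import time


def query_with_wait(send, payload, max_wait_s=120.0, pause_s=5.0, sleep=time.sleep):
    """Call send(payload) until it stops returning an error object.

    Assumption: a response that is a dict with an "error" key means the
    model is not ready yet. After max_wait_s seconds of pausing, the last
    response is returned as-is so the caller can inspect it.
    """
    waited = 0.0
    while True:
        result = send(payload)
        not_ready = isinstance(result, dict) and "error" in result
        if not not_ready or waited >= max_wait_s:
            return result
        sleep(pause_s)
        waited += pause_s
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.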

## Fine-tune Code
See `text_summary.py` in the uploaded files. You can also download the model to test it or adapt it for this task.

## Interface (legacy)
Please see the announcement above.
```python
import requests

# Hosted inference endpoint for this model.
API_URL = "https://api-inference.huggingface.co/models/Lin0He/text-summary-gpt2-short"
# Replace the placeholder with your own Hugging Face API token.
headers = {"Authorization": "Bearer <Your API tokens>"}

def query(payload):
    """Send the payload to the inference API and return the parsed JSON reply."""
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "Today was a beautiful day. I have done all my homework, and baked some cookies with my friends. What a lovely day."
})
print(output)
```