Commit 8848760 by amirulhazym · verified · 1 Parent(s): 440686c

Update README.md

Files changed (1)
  1. README.md +0 -80
README.md CHANGED
@@ -1,80 +0,0 @@
- Sentiment Analysis API
-
- Overview
- This mini-project is a web API that classifies text sentiment as Positive, Negative, or Neutral using the cardiffnlp/twitter-roberta-base-sentiment model from Hugging Face. Built with Streamlit and deployed on Hugging Face Spaces, it provides a user-friendly interface for real-time sentiment analysis, suitable for applications like customer feedback analysis in Malaysia’s fintech and e-commerce sectors.
-
- Live Demo
- Will be updated here after deployment.
-
- Features
-
- Classifies text sentiment with confidence scores and visual breakdowns (progress bar, bar chart).
- Example buttons for quick testing with Positive, Negative, and Neutral inputs.
- Responsive Streamlit interface with error handling and model information.
- Optimized model loading with @st.cache_resource for efficient deployment.
-
- Installation
-
- Clone the repository:
- git clone https://github.com/amirulhazym/sentiment-analysis-api.git
- cd sentiment-analysis-api
-
-
- Create and activate a virtual environment:
- python -m venv sa-env
- .\sa-env\Scripts\activate # On Windows
-
-
- Install dependencies:
- pip install -r requirements.txt
-
-
- Run the app locally:
- streamlit run app.py
-
-
-
- Usage
-
- Access the app via the live URL or locally.
- Enter text in the text area or click one of the example buttons (Positive, Negative, Neutral).
- Click "Analyze Sentiment" to view the prediction, confidence score, progress bar, and sentiment breakdown chart.
- Expand the "About the Model" section for details on the underlying RoBERTa model.
-
- Model Details
-
- Model: cardiffnlp/twitter-roberta-base-sentiment (RoBERTa-base)
- Training Data: pre-trained on ~58M tweets, fine-tuned on the TweetEval benchmark
- Classes: Negative (LABEL_0), Neutral (LABEL_1), Positive (LABEL_2)
- Performance: ~85% accuracy on tweet_eval test set (100 samples)
- Limitations: Optimized for short, English, Twitter-like texts; performance may vary on long or non-English inputs.
-
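The class list above implies a small translation step between the model's raw labels (LABEL_0, LABEL_1, LABEL_2) and the names shown to the user. The following is a minimal sketch of that step, not the code from this repository's app.py: the helper name `top_sentiment` and the `LABEL_NAMES` table are hypothetical, and the input is assumed to be a list of `{"label", "score"}` dictionaries, which is the shape a text-classification pipeline typically returns when asked for all class scores.

```python
# Hypothetical lookup table built from the "Classes" line of the README.
LABEL_NAMES = {
    "LABEL_0": "Negative",
    "LABEL_1": "Neutral",
    "LABEL_2": "Positive",
}


def top_sentiment(scores):
    """Return (readable_label, confidence) for the highest-scoring class.

    `scores` is assumed to be a list of {"label": ..., "score": ...} dicts,
    one entry per class.
    """
    best = max(scores, key=lambda item: item["score"])
    return LABEL_NAMES[best["label"]], best["score"]


# Hand-written example scores (illustrative values, not real model output).
example = [
    {"label": "LABEL_0", "score": 0.05},
    {"label": "LABEL_1", "score": 0.15},
    {"label": "LABEL_2", "score": 0.80},
]
print(top_sentiment(example))
```

Keeping the mapping in one dictionary means the per-class scores can also be relabelled for the breakdown chart with the same table.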
- Metrics
-
- Accuracy: ~70% on tweet_eval test set (100 samples).
- Precision/Recall: Qualitatively consistent with the model’s reported performance; full metrics are pending further testing due to the 1-day time constraint.
-
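For reference, the accuracy figure above is the fraction of samples whose predicted label matches the gold label. The sketch below shows that calculation under stated assumptions: the function name `accuracy` and the toy label lists are illustrative and do not come from the repository's evaluation script.

```python
def accuracy(predicted, expected):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty sample")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Toy data (illustrative only): 3 of 4 predictions match the gold labels.
gold = ["Positive", "Negative", "Neutral", "Positive"]
pred = ["Positive", "Negative", "Positive", "Positive"]
print(accuracy(pred, gold))
```

With a 100-sample test set, each misclassified sample moves this number by one percentage point, so small samples give only a coarse estimate.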
- Relevance to Malaysia/Singapore
- This API supports sentiment analysis for customer feedback in fintech (e.g., Grab, CIMB) and e-commerce (e.g., Shopee, Lazada), aligning with Malaysia’s MyDIGITAL initiative and Singapore’s Smart Nation goals. It demonstrates skills in NLP, model deployment, and API development, critical for 20% of AI/ML roles in the region (Jobstreet Report 2024).
-
- Limitations
-
- Limited to single-text input; no batch processing.
- English-focused; performance on Bahasa Malaysia is suboptimal (e.g., "Saya suka produk ini!" misclassified as Neutral).
- May require fine-tuning for domain-specific applications (e.g., Malaysian social media).
-
- Future Improvements
-
- Fine-tune on Malaysia-specific data (e.g., Malay tweets from brands like AirAsia).
- Add support for Bahasa Malaysia to address local language needs.
- Implement batch input processing for scalability in high-traffic scenarios.
- Add a user feedback mechanism for continuous improvement.
-
- Credits
-
- Hugging Face Transformers for the pre-trained model.
- Streamlit for the web interface.
- PyTorch for the deep learning framework.
-
- Author
- Amirulhazym, AI/ML Enthusiast