malavika4089 committed on
Commit
d16540a
·
verified ·
1 Parent(s): ff1888c

Update README.md

  1. README.md +101 -1
README.md CHANGED
@@ -10,5 +10,105 @@ pinned: false
10
  license: mit
11
  short_description: It is to get the model to tell truth about real world news
12
  ---
13
 
14
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
13
+ # Misinformation Detection Tool
14
+
15
+ ## Overview
16
+
17
+ Misinformation is a significant problem in today's digital age: it shapes public opinion and spreads unreliable news. This project addresses the problem with a **Misinformation Detection Tool** built on **Hugging Face Transformers**. The system classifies a given news article or statement as genuine or fake.
18
+
19
+ ## Problem Statement
20
+
21
+ The rapid spread of misinformation through online platforms and social media has created a need for reliable tools to combat fake news. Identifying fake news manually is time-consuming and prone to bias. This project automates fake news detection with natural language processing (NLP) techniques, aiming for classification that is both scalable and accurate.
22
+
23
+ ## Objective
24
+
25
+ The objective of this project is to develop and deploy a machine learning model that analyzes textual data and classifies it as either **real** or **fake** news. The model is built with Hugging Face Transformers and deployed on Hugging Face Spaces to make it accessible and scalable.
26
+
27
+ ## Features
28
+
29
+ - **Deep Learning Model**: Built on Hugging Face Transformers for state-of-the-art text classification.
30
+ - **Scalable Deployment**: Deployed on Hugging Face Spaces for easy integration and access.
31
+ - **Real-Time Prediction**: Provides instant results for news articles or headlines.
32
+
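As a rough illustration of what a real-time prediction involves, the sketch below turns a classifier's two raw output scores (logits) into a label and a confidence value. The label order `("fake", "real")` and the example scores are assumptions made for illustration; the actual order depends on how the fine-tuned model was configured.

```python
import math

# Assumed label order for illustration; check the fine-tuned model's own config.
LABELS = ("fake", "real")


def softmax(logits):
    """Convert raw scores to probabilities (max subtracted for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_prediction(logits, labels=LABELS):
    """Return (label, confidence) for the highest-probability class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical scores, as a sequence-classification head might emit for one headline.
label, confidence = to_prediction([-1.2, 2.3])
print(f"{label} ({confidence:.1%})")
```

Reporting the confidence alongside the label lets a user see how strongly the model leans toward its answer.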
33
+ ## Methodology
34
+
35
+ 1. **Data Collection**:
36
+ - Collected datasets from Kaggle and other reliable sources containing labeled news articles.
37
+ 2. **Data Preprocessing**:
38
+
39
+ - Cleaned and tokenized text data.
40
+ - Removed stop words and special characters, and lemmatized the remaining tokens.
41
+
42
+ 3. **Model Selection**:
43
+
44
+ - Used a pre-trained transformer model (e.g., BERT, RoBERTa) from Hugging Face.
45
+ - Fine-tuned the model on the fake news dataset.
46
+
47
+ 4. **Training**:
48
+
49
+ - Split the dataset into training and validation sets.
50
+ - Used the PyTorch backend for training with optimization techniques.
51
+
52
+ 5. **Evaluation**:
53
+
54
+ - Measured performance using metrics like accuracy, precision, recall, and F1-score.
55
+ - Evaluated the model on a held-out test dataset to check that it generalizes.
56
+
57
+ 6. **Deployment**:
58
+ - Deployed the model on Hugging Face Spaces for public access.
59
+ - Created an API for real-time predictions.
60
+
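Steps 2 and 5 above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the stop-word list is a tiny placeholder, lemmatization is left out because it needs an external library, and `clean_text` and `binary_metrics` are hypothetical helper names. Treating `"fake"` as the positive class is also an assumption.

```python
import re

# Tiny placeholder stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "on"}


def clean_text(text):
    """Lowercase, strip URLs and special characters, then drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove links
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # remove special characters
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def binary_metrics(y_true, y_pred, positive="fake"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


tokens = clean_text("The Senate is voting on a NEW bill! https://example.com/x")
scores = binary_metrics(
    ["fake", "fake", "real", "real"],
    ["fake", "real", "real", "fake"],
)
print(tokens)
print(scores)
```

Scikit-learn, listed under Technologies Used, provides equivalent metric functions; the hand-written version only makes the definitions explicit.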
61
+ ## Scope
62
+
63
+ - **Immediate Use**: Detects fake news from textual inputs such as headlines or article links.
64
+ - **Future Enhancements**:
65
+ - Incorporating language detection and translation for multilingual support.
66
+ - Extending the dataset to include more diverse topics and sources.
67
+ - Integrating video and audio analysis for multimedia content.
68
+ - Expanding the database used for fact-checking and knowledge graphs.
69
+
70
+ ## Installation and Usage
71
+
72
+ ### Local Setup
73
+
74
+ 1. Clone the repository:
75
+
76
+ ```bash
77
+ git lfs install
78
+ git clone https://huggingface.co/spaces/malavika4089/misinformation-truthteller
79
+ cd misinformation-truthteller
80
+ ```
81
+
82
+ 2. Install dependencies:
83
+
84
+ ```bash
85
+ pip install -r requirements.txt
86
+ ```
87
+
88
+ 3. Run the Streamlit app:
89
+ ```bash
90
+ streamlit run app.py
91
+ ```
92
+
93
+ ### Access Deployed Model
94
+
95
+ The model is deployed on Hugging Face Spaces. You can access it via the [live link](https://huggingface.co/spaces/malavika4089/misinformation-truthteller).
96
+
97
+ ## Dataset
98
+
99
+ The dataset used for this project was sourced from:
100
+
101
+ - [Kaggle Fake And Real News Dataset](https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset)
102
+
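A minimal sketch of how such a dataset might be turned into labeled training data. It assumes the data arrives as two CSV files, one of fake articles and one of real articles, each with `title` and `text` columns; that layout, the helper names `load_labeled` and `train_val_split`, and the in-memory sample rows are assumptions for illustration, so check the downloaded files before relying on them.

```python
import csv
import io
import random


def load_labeled(handle, label):
    """Read one CSV of articles and attach the same label to every row."""
    return [
        {"title": row["title"], "text": row["text"], "label": label}
        for row in csv.DictReader(handle)
    ]


def train_val_split(rows, val_fraction=0.2, seed=42):
    """Shuffle reproducibly, then split into (train, validation) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# In-memory stand-ins for the two files; with real data, pass
# open(path, newline="", encoding="utf-8") instead.
fake_csv = io.StringIO("title,text,subject,date\nShock claim,Made-up body,News,2017\n")
true_csv = io.StringIO("title,text,subject,date\nBudget passes,Real body,politicsNews,2017\n")

records = load_labeled(fake_csv, "fake") + load_labeled(true_csv, "real")
train_rows, val_rows = train_val_split(records, val_fraction=0.5)
print(len(train_rows), len(val_rows))
```

Fixing the shuffle seed keeps the training and validation sets identical across runs, which makes evaluation results comparable.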
103
+ ## Technologies Used
104
+
105
+ - **Programming Language**: Python
106
+ - **Libraries**: Hugging Face Transformers, PyTorch, Scikit-learn, NumPy, Pandas, Streamlit
107
+ - **Deployment**: Hugging Face Spaces
108
+ - **Tools**: Colab
109
+
110
+ ## License
111
+
112
+ This project is licensed under the [MIT License](LICENSE).
113
+
114