CamiloVega committed on
Commit
9bd40a4
·
verified ·
1 Parent(s): a6f5353

Update README.md

  1. README.md +159 -1
README.md CHANGED
@@ -10,4 +10,162 @@ pinned: false
10
  license: mit
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
13
+ # All-in-One News Generator
14
+
15
+ An AI-powered application that assists journalists and content creators in generating well-structured news articles by processing multiple types of input sources.
16
+
17
+ Created by [Camilo Vega](https://www.linkedin.com/in/camilo-vega-169084b1/), AI Consultant
18
+
19
+ ![News Generator](https://via.placeholder.com/800x400?text=News+Generator+App)
20
+
21
+ ## Features
22
+
23
+ - **Multi-Source Input Processing**:
24
+   - Audio and video transcription using OpenAI's Whisper model
25
+   - Social media content extraction (text and video)
26
+   - Document analysis (PDF, DOCX, XLSX, CSV)
27
+   - Web content extraction
28
+
29
+ - **Advanced AI Article Generation**:
30
+   - Produces well-structured news articles following journalistic principles
31
+   - Automatically answers the 5 Ws (Who, What, When, Where, Why) in the first paragraph
32
+   - Maintains quote integrity with 80% direct quotes
33
+   - Customizable tone (serious, neutral, lighthearted)
34
+   - Adjustable article length
35
+
36
+ - **User-Friendly Interface**:
37
+   - Organized tab-based input system
38
+   - Real-time transcription preview
39
+   - Simple one-click draft generation
40
+
41
+ ## Installation
42
+
43
+ ### Prerequisites
44
+
45
+ - Python 3.8 or higher
46
+ - [OpenAI API Key](https://platform.openai.com/)
47
+ - Required packages (see requirements below)
48
+
49
+ ### Step 1: Clone the repository
50
+
51
+ ```bash
52
+ git clone https://github.com/yourusername/news-generator.git
53
+ cd news-generator
54
+ ```
55
+
56
+ ### Step 2: Create a virtual environment
57
+
58
+ ```bash
59
+ python -m venv venv
60
+ source venv/bin/activate # On Windows, use: venv\Scripts\activate
61
+ ```
62
+
63
+ ### Step 3: Install dependencies
64
+
65
+ ```bash
66
+ pip install -r requirements.txt
67
+ ```
68
+
69
+ ### Step 4: Set up your OpenAI API key
70
+
71
+ ```bash
72
+ # On Linux/Mac
73
+ export OPENAI_API_KEY="your-api-key-here"
74
+
75
+ # On Windows (Command Prompt; omit the quotes, or they become part of the value)
76
+ set OPENAI_API_KEY=your-api-key-here
77
+ ```
78
+
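As a minimal sketch of how the application might read this variable at startup (the function name `load_api_key` is hypothetical and not taken from `app.py`), the key can be loaded with a clear error when it is missing:

```python
import os


def load_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    key = env.get(name, "").strip()
    # Windows Command Prompt stores quotes literally when the value is
    # written as set NAME="value", so strip any surrounding quotes.
    key = key.strip('"').strip("'")
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the app.")
    return key
```

Failing early like this gives a readable message at launch instead of an authentication error on the first request.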
79
+ ### Requirements
80
+
81
+ If the repository does not already include a `requirements.txt` file, create one with the following dependencies before running Step 3:
82
+
83
+ ```
84
+ openai
85
+ openai-whisper
86
+ gradio
87
+ pydub
88
+ PyMuPDF
89
+ python-docx
90
+ pandas
91
+ requests
92
+ beautifulsoup4
93
+ moviepy
94
+ yt-dlp
95
+ ```
96
+
97
+ ## Usage
98
+
99
+ ### Starting the application
100
+
101
+ ```bash
102
+ python app.py
103
+ ```
104
+
105
+ By default, Gradio serves the application at `http://127.0.0.1:7860`. Open that address in your web browser.
106
+
107
+ ### Using the application
108
+
109
+ 1. **Input your requirements**:
110
+    - Enter your news article instructions
111
+    - Describe the key facts of your news story
112
+    - Set the desired word count and tone
113
+
114
+ 2. **Add your sources**:
115
+    - Upload audio/video files for automatic transcription
116
+    - Add social media URLs to extract content
117
+    - Include web URLs for additional information
118
+    - Upload documents (PDF, DOCX, XLSX, CSV) to extract relevant data
119
+
120
+ 3. **Generate your draft**:
121
+    - Click "Generate Draft" to create your news article
122
+    - Review the transcriptions to verify source accuracy
123
+    - Use the generated draft as a starting point for your news story
124
+
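The inputs listed in step 1 (instructions, key facts, word count, tone) presumably end up in a prompt for the language model. The following is only an illustrative sketch of that idea; `build_prompt`, `ALLOWED_TONES`, and the prompt wording are assumptions, not the application's actual code:

```python
ALLOWED_TONES = ("serious", "neutral", "lighthearted")


def build_prompt(instructions, facts, word_count=500, tone="neutral"):
    """Combine the user's inputs into a single prompt string."""
    if tone not in ALLOWED_TONES:
        raise ValueError(f"tone must be one of {ALLOWED_TONES}, got {tone!r}")
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return "\n".join(
        [
            "Write a news article draft.",
            f"Instructions: {instructions.strip()}",
            f"Key facts: {facts.strip()}",
            f"Tone: {tone}",
            f"Target length: about {word_count} words",
            "Answer Who, What, When, Where, and Why in the first paragraph.",
        ]
    )
```

Validating the tone and word count before calling the API keeps malformed requests from consuming tokens.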
125
+ ## Technical Details
126
+
127
+ ### Key Components
128
+
129
+ - **Whisper Model**: Large-scale speech recognition model for accurate audio transcription
130
+ - **yt-dlp**: Library for downloading videos from various platforms
131
+ - **BeautifulSoup**: Web scraping tool for extracting content from URLs
132
+ - **OpenAI API**: Powers the advanced language generation capabilities
133
+ - **Gradio**: Creates the user-friendly web interface
134
+
135
+ ### Architecture
136
+
137
+ The application follows a modular design with specialized functions for different types of content processing:
138
+
139
+ - Audio/video processing pipeline:
140
+   1. Download or read file
141
+   2. Convert to audio if needed
142
+   3. Preprocess audio for quality
143
+   4. Transcribe using Whisper
144
+
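Steps 2 to 4 of the audio/video pipeline above can be sketched as follows. This is a rough illustration under stated assumptions: it calls the `ffmpeg` command-line tool directly for conversion and preprocessing (the project lists `moviepy` and `pydub`, so the real code likely differs), and the helper names `normalized_path`, `ffmpeg_command`, and `transcribe_media` are hypothetical. The download step is omitted.

```python
import subprocess
from pathlib import Path


def normalized_path(src):
    """Path of the preprocessed WAV file written next to the source file."""
    src = Path(src)
    return src.with_name(src.stem + "_16k.wav")


def ffmpeg_command(src, dst):
    """Build an ffmpeg call that drops video and writes mono 16 kHz audio."""
    # -y overwrites the output, -vn ignores any video stream,
    # -ac 1 downmixes to mono, -ar 16000 resamples to 16 kHz.
    return ["ffmpeg", "-y", "-i", str(src), "-vn", "-ac", "1", "-ar", "16000", str(dst)]


def transcribe_media(path, model_name="base"):
    """Convert, preprocess, and transcribe a local audio or video file."""
    import whisper  # provided by the openai-whisper package

    wav = normalized_path(path)
    subprocess.run(ffmpeg_command(path, wav), check=True)
    model = whisper.load_model(model_name)
    return model.transcribe(str(wav))["text"].strip()
```

The `whisper` import sits inside the function so that the helpers can be used and tested without loading the model.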
145
+ - Document processing:
146
+   - PDF: Extract text from all pages
147
+   - DOCX: Extract text from all paragraphs
148
+   - XLSX/CSV: Convert to string representation
149
+
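One way to picture the document processing described above is a dispatch table keyed by file extension. The sketch below uses the libraries listed in the requirements (PyMuPDF, python-docx, pandas); the function names and the `EXTRACTORS` mapping are hypothetical, and reading `.xlsx` with pandas additionally assumes an Excel engine such as `openpyxl` is installed.

```python
from pathlib import Path


def extract_pdf(path):
    import fitz  # PyMuPDF

    with fitz.open(str(path)) as doc:
        # Concatenate the plain text of every page.
        return "\n".join(page.get_text() for page in doc)


def extract_docx(path):
    from docx import Document  # python-docx

    return "\n".join(p.text for p in Document(str(path)).paragraphs)


def extract_table(path):
    import pandas as pd

    path = Path(path)
    df = pd.read_csv(path) if path.suffix.lower() == ".csv" else pd.read_excel(path)
    return df.to_string(index=False)


EXTRACTORS = {
    ".pdf": extract_pdf,
    ".docx": extract_docx,
    ".xlsx": extract_table,
    ".csv": extract_table,
}


def extract_text(path):
    """Pick an extractor by file extension and return the document text."""
    suffix = Path(path).suffix.lower()
    if suffix not in EXTRACTORS:
        raise ValueError(f"Unsupported document type: {suffix or path}")
    return EXTRACTORS[suffix](path)
```

Supporting a new format then only requires one more extractor function and one more entry in the mapping.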
150
+ - Web content:
151
+   - Extract text from URLs
152
+   - Process social media content (both text and video)
153
+
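The web content step can be illustrated with `requests` and BeautifulSoup, both listed in the requirements. This is a sketch only: the `SOCIAL_HOSTS` set, `is_social_url`, and `fetch_page_text` are assumptions made for the example, and the real application may route social media URLs differently (for instance through `yt-dlp` for video).

```python
from urllib.parse import urlparse

# Hypothetical list of hosts treated as social media in this example.
SOCIAL_HOSTS = {"x.com", "twitter.com", "instagram.com", "facebook.com", "tiktok.com", "youtube.com"}


def is_social_url(url):
    """Return True when the URL's host is a known social media domain."""
    host = (urlparse(url).hostname or "").lower()
    return host in SOCIAL_HOSTS or any(host.endswith("." + h) for h in SOCIAL_HOSTS)


def fetch_page_text(url, timeout=10):
    """Download a page and return its visible text."""
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Remove elements whose contents are never shown to the reader.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    return soup.get_text(separator="\n", strip=True)
```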
154
+ - Knowledge base compilation:
155
+   - Organize all sources into a structured format
155
+   - Prepare transcriptions with proper attribution
156
+   - Format content for AI processing
158
+
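The knowledge base compilation described above could look roughly like this. The `Source` dataclass, its fields, and the header format are hypothetical choices for illustration; the README does not specify the actual structure.

```python
from dataclasses import dataclass


@dataclass
class Source:
    kind: str    # e.g. "audio", "document", "web"
    origin: str  # file name or URL, used for attribution
    text: str    # transcription or extracted text


def compile_knowledge_base(sources):
    """Join all non-empty sources into one attributed text block."""
    sections = []
    for index, source in enumerate(sources, start=1):
        body = source.text.strip()
        if not body:
            continue  # skip sources that produced no text
        header = f"[Source {index}] {source.kind} | {source.origin}"
        sections.append(f"{header}\n{body}")
    return "\n\n".join(sections)
```

Keeping the origin in each header lets the generated draft attribute quotes to a specific file or URL.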
159
+ ## License
160
+
161
+ This project is licensed under the MIT License - see the LICENSE file for details.
162
+
163
+ ## Acknowledgments
164
+
165
+ - OpenAI for the Whisper and GPT models
166
+ - Gradio team for the web interface framework
167
+ - All open-source libraries utilized in this project
168
+
169
+ ---
170
+
171
+ © 2025 Camilo Vega. All Rights Reserved.