Commit 3bf50ea
Reinitialize repository without large files
- .github/workflows/sync_to_huggingface_space.yml +18 -0
- .gitignore +0 -0
- README.md +141 -0
- api.py +25 -0
- app.py +84 -0
- requirements.txt +0 -0
- scrapes.py +81 -0
- sentiV_v2.py +153 -0
- tts_hindi_edgetts.py +21 -0
- utils.py +151 -0
.github/workflows/sync_to_huggingface_space.yml
ADDED
@@ -0,0 +1,18 @@
+name: Sync to Hugging Face hub
+on:
+  push:
+    branches: [main]
+  workflow_dispatch:
+
+jobs:
+  sync-to-hub:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+        with:
+          fetch-depth: 0
+          lfs: true
+      - name: Push to hub
+        env:
+          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+        run: git push -f https://Shakespeared101:[email protected]/spaces/Shakespeared101/news-summarise-tts main
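For the `${{ secrets.HF_TOKEN }}` reference above to resolve, the token has to exist as a repository secret on the GitHub side. A minimal way to add it, assuming the GitHub CLI (`gh`) is installed and authenticated against the repository that hosts this workflow:

```bash
# Store a Hugging Face write token (from hf.co/settings/tokens) as the
# HF_TOKEN repository secret; gh prompts for the value interactively.
gh secret set HF_TOKEN
```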
.gitignore
ADDED
Binary file (146 Bytes)
README.md
ADDED
@@ -0,0 +1,141 @@
+---
+title: "News Summarizer & TTS"
+emoji: "📰"
+colorFrom: "blue"
+colorTo: "green"
+sdk: "streamlit"
+app_file: "app.py"
+pinned: false
+---
+# News Summarisation and Hindi TTS Application
+
+## Project Overview
+
+This project is a web-based application that extracts news articles from multiple sources for a given company, summarizes the articles using advanced NLP techniques (with both Transformer-based and fallback methods), performs sentiment analysis with visual graphs, translates the generated summary to Hindi, and finally converts the Hindi summary into an audio file via text-to-speech (TTS). The application is built using FastAPI for the backend and Streamlit for the frontend, ensuring a smooth and interactive user experience.
+
+## Features
+
+- **News Extraction:**
+  Extracts news articles from multiple sources using web scraping techniques.
+
+- **Summarization:**
+  Generates a combined summary using a Transformer-based summarizer (with fallback to Sumy if needed).
+
+- **Sentiment Analysis:**
+  Analyzes the sentiment of the news content and visualizes the comparative sentiment (Positive, Negative, Neutral) as a bar graph using matplotlib.
+
+- **Translation:**
+  Translates the summary from English to Hindi using deep-translator for improved quality.
+
+- **Text-to-Speech (TTS):**
+  Converts the Hindi summary into an audio file using Edge TTS.
+
+## Setup Instructions
+
+### Dependencies
+
+Install all required packages using the commands below:
+
+```bash
+pip install fastapi uvicorn streamlit transformers newspaper3k beautifulsoup4 edge-tts selenium webdriver-manager spacy nltk sumy sacremoses requests deep-translator matplotlib
+python -m spacy download en_core_web_sm
+python -c "import nltk; nltk.download('vader_lexicon'); nltk.download('punkt')"
+```
+
+### Running the FastAPI Backend
+
+In your project directory, run:
+
+```bash
+uvicorn api:app --reload
+```
+
+This will start the backend server at [http://127.0.0.1:8000](http://127.0.0.1:8000).
+
+### Running the Streamlit Frontend
+
+In another terminal (or a new tab), run:
+
+```bash
+streamlit run app.py
+```
+
+This will launch the web interface where you can input a company name and interact with the application.
+
+## Project Structure
+
+- **`api.py`**
+  Contains the FastAPI application which exposes endpoints for processing news, generating summaries, performing sentiment analysis, translating summaries to Hindi, and creating TTS audio.
+
+- **`utils.py`**
+  Houses utility functions for:
+  - Extracting articles from news URLs.
+  - Generating combined summaries using Transformer models with Sumy as a fallback.
+  - Translating text to Hindi using deep-translator.
+  - Performing comparative sentiment analysis and generating a matplotlib bar chart.
+  - Generating TTS audio from the Hindi summary.
+
+- **`app.py`**
+  Provides a simple and interactive web-based interface using Streamlit. Users can input a company name, view extracted news and summaries, see the sentiment analysis graph, and play the generated TTS audio.
+
+- **`scrapes.py`**
+  Contains functions for scraping valid news URLs and extracting article content from web pages.
+
+- **`sentiV_v2.py`**
+  Implements sentiment analysis on the article content using both NLTK’s VADER and Transformer-based methods.
+
+- **`tts_hindi_edgetts.py`**
+  Utilizes Edge TTS to convert text to speech and saves the output as an audio file.
+
+- **`.gitignore`**
+  Ensures that large or unnecessary files (like the virtual environment folder `venv/`) are not tracked by Git.
+
+## Deployment Details
+
+The application can be deployed on platforms like [Hugging Face Spaces](https://huggingface.co/spaces), Heroku, or Render. For example, if deployed on Hugging Face Spaces:
+
+- The repository is linked to a new Space.
+- The Streamlit interface is used as the main application.
+- The deployment link (e.g., `https://huggingface.co/spaces/your-username/news-summarisation`) will be provided in the repository README for access.
+
+## Usage Instructions
+
+1. **Launch the Application:**
+   Run the FastAPI backend and Streamlit frontend as described above.
+
+2. **Input a Company Name:**
+   On the Streamlit interface, enter the name of a company (e.g., "Tesla", "Netflix") and click the "Fetch News" button.
+
+3. **View Results:**
+   - **News Articles:**
+     See a list of extracted news articles along with their metadata (title, URL, date, sentiment, excerpt).
+   - **Sentiment Analysis:**
+     View the comparative sentiment counts and a bar chart visualizing the distribution of positive, negative, and neutral articles.
+   - **Summaries:**
+     Read the combined summary of the news and the translated Hindi summary.
+   - **Audio:**
+     Play the TTS-generated audio of the Hindi summary.
+
+## Limitations & Future Improvements
+
+### Limitations:
+
+- Reliance on web scraping can sometimes result in incomplete article extraction due to website restrictions.
+- The summarization and translation quality might vary based on input length and complexity.
+- TTS accuracy depends on the Edge TTS service and may not always be perfect.
+
+### Future Improvements:
+
+- Integrate more robust error handling and fallback mechanisms.
+- Enhance the UI for better user experience.
+- Expand the number of news sources and improve the filtering of relevant content.
+- Implement caching to reduce API call latency.
+- Explore additional TTS options for higher quality audio output.
+
+## License
+
+This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
+
+## Contributing
+
+Contributions are welcome! Please see the [CONTRIBUTING](CONTRIBUTING.md) file for guidelines on how to contribute to this project.
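Once the backend is up, a quick sanity check is to call the `/news/{company_name}` endpoint directly rather than going through the UI. A sketch, assuming the FastAPI server is running locally on the default port 8000:

```bash
# Request processed news for one company and pretty-print the JSON response.
curl -s http://127.0.0.1:8000/news/Tesla | python -m json.tool
```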
api.py
ADDED
@@ -0,0 +1,25 @@
+from fastapi import FastAPI
+from utils import process_news
+
+app = FastAPI(title="News Summarization & TTS API")
+
+@app.get("/")
+def read_root():
+    return {"message": "Welcome to the News Summarization & TTS API"}
+
+@app.get("/news/{company_name}")
+def get_news(company_name: str):
+    """
+    Fetch processed news for a given company.
+    Returns:
+      • A list of articles with title, URL, date, content, sentiment, and score.
+      • A combined summary of all articles.
+      • A Hindi translated summary.
+      • The TTS audio file path.
+      • Comparative sentiment analysis including a visual graph.
+    """
+    return process_news(company_name)
+
+if __name__ == "__main__":
+    import uvicorn
+    uvicorn.run(app, host="0.0.0.0", port=8000)
app.py
ADDED
@@ -0,0 +1,84 @@
+import os
+import threading
+import time
+import requests
+import streamlit as st
+import uvicorn
+from fastapi import FastAPI
+from utils import process_news
+
+import spacy
+try:
+    spacy.load("en_core_web_sm")
+except OSError:
+    import os
+    os.system("python -m spacy download en_core_web_sm")
+
+# FastAPI app setup
+api = FastAPI(title="News Summarization & TTS API")
+
+@api.get("/")
+def read_root():
+    return {"message": "Welcome to the News Summarization & TTS API"}
+
+@api.get("/news/{company_name}")
+def get_news(company_name: str):
+    return process_news(company_name)
+
+# Function to run FastAPI in a separate thread
+def run_fastapi():
+    uvicorn.run(api, host="0.0.0.0", port=8000)
+
+# Start FastAPI in a separate thread
+threading.Thread(target=run_fastapi, daemon=True).start()
+
+# Streamlit app setup
+API_URL = "http://127.0.0.1:8000"  # Since FastAPI runs in the same Space
+
+st.title("News Summarization and Hindi TTS Application")
+company = st.text_input("Enter Company Name", "")
+
+if st.button("Fetch News"):
+    if company.strip() == "":
+        st.warning("Please enter a valid company name.")
+    else:
+        with st.spinner("Fetching and processing news..."):
+            time.sleep(2)  # Give FastAPI some time to start
+            try:
+                response = requests.get(f"{API_URL}/news/{company}")
+                if response.status_code == 200:
+                    data = response.json()
+                    st.header(f"News for {data['company']}")
+
+                    for article in data["articles"]:
+                        st.subheader(article.get("title", "No Title"))
+                        st.markdown(f"**URL:** [Read More]({article.get('url', '#')})")
+                        st.markdown(f"**Date:** {article.get('date', 'N/A')}")
+                        st.markdown(f"**Sentiment:** {article.get('sentiment', 'Neutral')} (Score: {article.get('score', 0):.2f})")
+                        st.markdown(f"**Excerpt:** {article.get('content','')[:300]}...")
+                        st.markdown("---")
+
+                    st.subheader("Comparative Sentiment Analysis")
+                    comp_sent = data.get("comparative_sentiment", {})
+                    st.write({k: comp_sent[k] for k in ["Positive", "Negative", "Neutral"]})
+
+                    if "graph" in comp_sent and os.path.exists(comp_sent["graph"]):
+                        st.image(comp_sent["graph"], caption="Sentiment Analysis Graph")
+
+                    st.subheader("Final Combined Summary")
+                    st.write(data.get("final_summary", "No summary available."))
+
+                    st.subheader("Hindi Summary")
+                    st.write(data.get("hindi_summary", ""))
+
+                    st.subheader("Hindi Summary Audio")
+                    audio_path = data.get("tts_audio", None)
+                    if audio_path and os.path.exists(audio_path):
+                        with open(audio_path, "rb") as audio_file:
+                            st.audio(audio_file.read(), format='audio/mp3')
+                    else:
+                        st.error("Audio file not found or TTS generation failed.")
+                else:
+                    st.error("Failed to fetch news from the API. Please try again.")
+            except requests.exceptions.ConnectionError:
+                st.error("API is not running yet. Please wait a moment and try again.")
requirements.txt
ADDED
Binary file (1.26 kB)
scrapes.py
ADDED
@@ -0,0 +1,81 @@
+import requests
+import re
+from bs4 import BeautifulSoup
+from newspaper import Article
+from selenium import webdriver
+from selenium.webdriver.chrome.service import Service
+from selenium.webdriver.chrome.options import Options
+from webdriver_manager.chrome import ChromeDriverManager
+
+
+def get_valid_news_urls(company_name):
+    search_url = f'https://www.google.com/search?q={company_name}+news&tbm=nws'
+    headers = {'User-Agent': 'Mozilla/5.0'}
+    response = requests.get(search_url, headers=headers)
+
+    if response.status_code != 200:
+        print("⚠️ Google News request failed!")
+        return []
+
+    soup = BeautifulSoup(response.text, 'html.parser')
+    links = []
+    for g in soup.find_all('a', href=True):
+        url_match = re.search(r'(https?://\S+)', g['href'])
+        if url_match:
+            url = url_match.group(1).split('&')[0]
+            if "google.com" not in url:
+                links.append(url)
+
+    return links[:10]  # Limit to top 10 results
+
+def extract_article_content(url):
+    try:
+        article = Article(url)
+        article.download()
+        article.parse()
+        return article.text
+    except Exception as e:
+        print(f"⚠️ Newspaper3k failed: {e}")
+
+    try:
+        response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
+        if response.status_code != 200:
+            raise Exception("Request failed")
+        soup = BeautifulSoup(response.text, 'html.parser')
+        paragraphs = soup.find_all('p')
+        return '\n'.join(p.text for p in paragraphs if p.text)
+    except Exception as e:
+        print(f"⚠️ BeautifulSoup failed: {e}")
+
+    try:
+        options = Options()
+        options.add_argument("--headless")  # Run in headless mode
+        options.add_argument("--no-sandbox")
+        options.add_argument("--disable-dev-shm-usage")
+        driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
+        driver.get(url)
+        page_content = driver.page_source
+        driver.quit()
+        soup = BeautifulSoup(page_content, 'html.parser')
+        paragraphs = soup.find_all('p')
+        return '\n'.join(p.text for p in paragraphs if p.text)
+    except Exception as e:
+        print(f"⚠️ Selenium failed: {e}")
+
+    return None
+
+def main():
+    company_name = input("Enter company name: ")
+    print(f"\n🔎 Searching news for: {company_name}\n")
+    urls = get_valid_news_urls(company_name)
+
+    for i, url in enumerate(urls, 1):
+        print(f"\n🔗 Article {i}: {url}\n")
+        content = extract_article_content(url)
+        if content:
+            print("📰 Extracted Content:\n", content[:], "...")
+        else:
+            print("⚠️ Failed to extract content....")
+
+if __name__ == "__main__":
+    main()
sentiV_v2.py
ADDED
@@ -0,0 +1,153 @@
+import requests
+import re
+import spacy
+import nltk
+from bs4 import BeautifulSoup
+from newspaper import Article
+from transformers import pipeline
+from selenium import webdriver
+from selenium.webdriver.chrome.service import Service
+from selenium.webdriver.chrome.options import Options
+from webdriver_manager.chrome import ChromeDriverManager
+from nltk.sentiment import SentimentIntensityAnalyzer
+import time
+
+# Download NLTK resources
+nltk.download('vader_lexicon')
+sia = SentimentIntensityAnalyzer()
+
+# Load spaCy Named Entity Recognition model
+nlp = spacy.load("en_core_web_sm")
+
+# Load BERT Sentiment Analyzer
+bert_sentiment = pipeline("sentiment-analysis", model="siebert/sentiment-roberta-large-english")
+
+def get_valid_news_urls(company_name):
+    search_url = f'https://www.google.com/search?q={company_name}+news&tbm=nws'
+    headers = {'User-Agent': 'Mozilla/5.0'}
+    try:
+        response = requests.get(search_url, headers=headers)
+        response.raise_for_status()
+    except requests.RequestException as e:
+        print(f"⚠️ Google News request failed: {e}")
+        return []
+
+    soup = BeautifulSoup(response.text, 'html.parser')
+    links = set()
+    for g in soup.find_all('a', href=True):
+        url_match = re.search(r'(https?://\S+)', g['href'])
+        if url_match:
+            url = url_match.group(1).split('&')[0]
+            if "google.com" not in url:  # Ignore Google-related URLs
+                links.add(url)
+
+    return list(links)[:10]  # Limit to top 10 results
+
+def extract_article_content(url):
+    try:
+        article = Article(url)
+        article.download()
+        article.parse()
+        if article.text.strip():
+            return article.text
+    except Exception as e:
+        print(f"⚠️ Newspaper3k failed: {e}")
+
+    try:
+        response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
+        response.raise_for_status()
+        soup = BeautifulSoup(response.text, 'html.parser')
+        paragraphs = soup.find_all('p')
+        text = '\n'.join(p.text for p in paragraphs if p.text)
+        if text.strip():
+            return text
+    except Exception as e:
+        print(f"⚠️ BeautifulSoup failed: {e}")
+
+    try:
+        options = Options()
+        options.add_argument("--headless")
+        driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
+        driver.get(url)
+        time.sleep(3)  # Allow time for JavaScript to load content
+        page_content = driver.page_source
+        driver.quit()
+
+        soup = BeautifulSoup(page_content, 'html.parser')
+        paragraphs = soup.find_all('p')
+        text = '\n'.join(p.text for p in paragraphs if p.text)
+        if text.strip():
+            return text
+    except Exception as e:
+        print(f"⚠️ Selenium failed: {e}")
+
+    return None
+
+def filter_relevant_sentences(text, company_name):
+    doc = nlp(text)
+    relevant_sentences = []
+
+    for sent in text.split('. '):
+        doc_sent = nlp(sent)
+        for ent in doc_sent.ents:
+            if company_name.lower() in ent.text.lower():
+                relevant_sentences.append(sent)
+                break
+
+    return '. '.join(relevant_sentences) if relevant_sentences else text
+
+def analyze_sentiment(text):
+    if not text.strip():
+        return "Neutral", 0.0
+
+    vader_scores = sia.polarity_scores(text)
+    vader_compound = vader_scores['compound']
+
+    try:
+        bert_result = bert_sentiment(text[:512])[0]  # Limit to 512 tokens
+        bert_label = bert_result['label']
+        bert_score = bert_result['score']
+        bert_value = bert_score if bert_label == "POSITIVE" else -bert_score
+    except Exception as e:
+        print(f"⚠️ BERT sentiment analysis failed: {e}")
+        bert_value = 0.0
+
+    final_sentiment = (vader_compound + bert_value) / 2
+
+    if final_sentiment > 0.2:
+        return "Positive", final_sentiment
+    elif final_sentiment < -0.2:
+        return "Negative", final_sentiment
+    else:
+        return "Neutral", final_sentiment
+
+def main():
+    company_name = input("Enter company name: ")
+    print(f"\n🔎 Searching news for: {company_name}\n")
+    urls = get_valid_news_urls(company_name)
+
+    if not urls:
+        print("❌ No valid news URLs found.")
+        return
+
+    seen_articles = set()
+
+    for i, url in enumerate(urls, 1):
+        if url in seen_articles:
+            continue
+        seen_articles.add(url)
+
+        print(f"\n🔗 Article {i}: {url}\n")
+        content = extract_article_content(url)
+
+        if content:
+            filtered_text = filter_relevant_sentences(content, company_name)
+            sentiment, score = analyze_sentiment(filtered_text)
+
+            print(f"📰 Extracted Content:\n{filtered_text[:500]}...")
+            print(f"📊 Sentiment: {sentiment} (Score: {score:.2f})")
+        else:
+            print("⚠️ Failed to extract content....")
+
+if __name__ == "__main__":
+    main()
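`analyze_sentiment` above averages VADER's compound score with a signed RoBERTa classifier score and maps the result onto three labels. A minimal interactive check, assuming the module's model and lexicon downloads succeed on import (importing it loads both, which can take a while on first run):

```python
from sentiV_v2 import analyze_sentiment

# Returns a (label, score) tuple; averages above 0.2 map to "Positive",
# below -0.2 to "Negative", everything in between to "Neutral".
label, score = analyze_sentiment("Tesla shares rose after strong quarterly deliveries.")
print(label, round(score, 2))
```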
tts_hindi_edgetts.py
ADDED
@@ -0,0 +1,21 @@
+import edge_tts
+import asyncio
+
+async def text_to_speech_hindi(text, output_file="news_sample.mp3"):
+    """
+    Convert text to Hindi speech and save as an audio file using Edge TTS.
+    """
+    if not text.strip():
+        print("⚠️ No text provided for TTS.")
+        return
+
+    print("🎙️ Generating Hindi speech...")
+    communicate = edge_tts.Communicate(text, voice="hi-IN-MadhurNeural")
+    await communicate.save(output_file)
+
+    print(f"✅ Audio saved as {output_file}")
+    return output_file
+
+# Example usage
+if __name__ == "__main__":
+    asyncio.run(text_to_speech_hindi("आज की मुख्य खबरें टेस्ला के बारे में हैं।"))
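The voice is hard-coded to `hi-IN-MadhurNeural` above. If a different Hindi voice is wanted, the `edge-tts` package ships a command-line entry point that can list what the service offers; a sketch, assuming the package's console script is on PATH:

```bash
# Enumerate available neural voices and keep only the Hindi (hi-IN) entries.
edge-tts --list-voices | grep "hi-IN"
```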
utils.py
ADDED
@@ -0,0 +1,151 @@
+import asyncio
+import nltk
+import matplotlib.pyplot as plt
+from scrapes import get_valid_news_urls, extract_article_content
+from sentiV_v2 import analyze_sentiment
+from newspaper import Article, Config
+from deep_translator import GoogleTranslator  # Replaced googletrans with deep-translator
+
+# Helper: Chunk text into smaller parts based on a fixed word count
+def chunk_text_by_words(text, chunk_size=100):
+    words = text.split()
+    return [' '.join(words[i:i+chunk_size]) for i in range(0, len(words), chunk_size)]
+
+def process_articles(company_name):
+    """Extract articles with metadata from news URLs and only keep those relevant to the company."""
+    urls = get_valid_news_urls(company_name)
+    articles = []
+    # Set up a custom config with a browser user-agent to help avoid 403 errors
+    user_agent = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
+                  'AppleWebKit/537.36 (KHTML, like Gecko) '
+                  'Chrome/92.0.4515.159 Safari/537.36')
+    config = Config()
+    config.browser_user_agent = user_agent
+    config.request_timeout = 10
+
+    for url in urls:
+        try:
+            art = Article(url, config=config)
+            art.download()
+            art.parse()
+            content = art.text.strip() if art.text.strip() else extract_article_content(url)
+            # Filter out articles that do not mention the company (case-insensitive)
+            if not content or company_name.lower() not in content.lower():
+                continue
+            article_data = {
+                "title": art.title if art.title else "No Title",
+                "url": url,
+                "date": str(art.publish_date) if art.publish_date else "N/A",
+                "content": content
+            }
+            sentiment, score = analyze_sentiment(content)
+            article_data["sentiment"] = sentiment
+            article_data["score"] = score
+            articles.append(article_data)
+        except Exception as e:
+            print(f"Error processing article {url}: {e}")
+    return articles
+
+def generate_combined_summary(articles):
+    """Generate a combined summary from articles.
+    First attempts to use a transformers pipeline; if it fails, falls back to Sumy."""
+    combined_text = " ".join([article["content"] for article in articles])
+    if not combined_text.strip():
+        return ""
+    # Try using transformers summarizer
+    try:
+        from transformers import pipeline
+        summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
+        summary = summarizer(combined_text, max_length=150, min_length=50, do_sample=False)
+        return summary[0]["summary_text"]
+    except Exception as e:
+        print(f"Transformers summarization failed: {e}")
+    # Fallback using Sumy extraction-based summarization
+    try:
+        from sumy.parsers.plaintext import PlaintextParser
+        from sumy.nlp.tokenizers import Tokenizer
+        from sumy.summarizers.lex_rank import LexRankSummarizer
+        parser = PlaintextParser.from_string(combined_text, Tokenizer("english"))
+        summarizer_sumy = LexRankSummarizer()
+        summary_sentences = summarizer_sumy(parser.document, sentences_count=5)
+        summarized_text = " ".join(str(sentence) for sentence in summary_sentences)
+        return summarized_text if summarized_text else combined_text[:500]
+    except Exception as e2:
+        print(f"Sumy summarization failed: {e2}")
+        return combined_text[:500]
+
+def translate_to_hindi(text):
+    """Translate English text to Hindi using deep-translator for better quality."""
+    try:
+        translator = GoogleTranslator(source='auto', target='hi')
+        return translator.translate(text)
+    except Exception as e:
+        print(f"Translation failed: {e}")
+        return text
+
+def comparative_analysis(articles):
+    """Perform comparative sentiment analysis across articles and generate a bar chart."""
+    pos, neg, neu = 0, 0, 0
+    for article in articles:
+        sentiment = article.get("sentiment", "Neutral")
+        if sentiment == "Positive":
+            pos += 1
+        elif sentiment == "Negative":
+            neg += 1
+        else:
+            neu += 1
+
+    # Create a bar chart using matplotlib
+    labels = ['Positive', 'Negative', 'Neutral']
+    counts = [pos, neg, neu]
+    plt.figure(figsize=(6, 4))
+    bars = plt.bar(labels, counts, color=['green', 'red', 'gray'])
+    plt.title("Comparative Sentiment Analysis")
+    plt.xlabel("Sentiment")
+    plt.ylabel("Number of Articles")
+    for bar, count in zip(bars, counts):
+        height = bar.get_height()
+        plt.text(bar.get_x() + bar.get_width()/2., height, str(count), ha='center', va='bottom')
+    image_path = "sentiment_analysis.png"
+    plt.savefig(image_path)
+    plt.close()
+    return {"Positive": pos, "Negative": neg, "Neutral": neu, "graph": image_path}
+
+def generate_tts_audio(text, output_file="news_summary.mp3"):
+    """Generate TTS audio file from text using Edge TTS (via tts_hindi_edgetts.py)."""
+    try:
+        from tts_hindi_edgetts import text_to_speech_hindi
+        return asyncio.run(text_to_speech_hindi(text, output_file))
+    except Exception as e:
+        print(f"TTS generation failed: {e}")
+        return None
+
+def process_news(company_name):
+    """
+    Process news by:
+      • Extracting articles and metadata (only those relevant to the company)
+      • Generating a combined summary of article contents
+      • Translating the summary to Hindi
+      • Generating a Hindi TTS audio file
+      • Performing comparative sentiment analysis with visual output
+    """
+    articles = process_articles(company_name)
+    summary = generate_combined_summary(articles)
+    hindi_summary = translate_to_hindi(summary)
+    tts_audio = generate_tts_audio(hindi_summary)
+    sentiment_distribution = comparative_analysis(articles)
+    result = {
+        "company": company_name,
+        "articles": articles,
+        "comparative_sentiment": sentiment_distribution,
+        "final_summary": summary,
+        "hindi_summary": hindi_summary,
+        "tts_audio": tts_audio  # file path for the generated audio
+    }
+    return result
+
+if __name__ == "__main__":
+    company = input("Enter company name: ")
+    import json
+    data = process_news(company)
+    print(json.dumps(data, indent=4, ensure_ascii=False))
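For reference, the dictionary assembled by `process_news` above (and returned as JSON by the `/news/{company_name}` endpoint in `api.py`) has the following shape; the values shown here are illustrative placeholders, not real output:

```json
{
  "company": "Tesla",
  "articles": [
    {
      "title": "...",
      "url": "https://example.com/article",
      "date": "N/A",
      "content": "...",
      "sentiment": "Positive",
      "score": 0.45
    }
  ],
  "comparative_sentiment": {
    "Positive": 1,
    "Negative": 0,
    "Neutral": 0,
    "graph": "sentiment_analysis.png"
  },
  "final_summary": "...",
  "hindi_summary": "...",
  "tts_audio": "news_summary.mp3"
}
```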