---
title: AI Study Assistant
emoji: 🏆
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 5.18.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: 'Summarize text, generate questions, and create concept maps'
---

AI Study Assistant

This project provides an interactive tool that performs multiple tasks related to text analysis. Given a text in either Arabic or English, the tool will:

Summarize the text.
Generate several questions based on the summary.
Create a concept map linking the summary to the generated questions.

The interface is built with Gradio, and the summarization and question-generation steps use pre-trained models from the Hugging Face Hub.

Project Objectives

The goal of this project is to demonstrate a comprehensive tool that can:

Summarize text in both Arabic and English.
Generate relevant questions based on the summarized text.
Create a concept map to visually represent the relationship between the summary and the questions.

By providing an easy-to-use interface, the application helps users analyze a piece of text efficiently by producing a summary, a set of questions, and a concept map from a single input.

Implemented Pipelines

The project implements three key pipelines:

Text Summarization:
- A pre-trained model (csebuetnlp/mT5_multilingual_XLSum) is used to generate a concise summary of the input text.
- The model tokenizes the input, processes it, and outputs a summary.
Question Generation:
- Using the valhalla/t5-small-e2e-qg model, this pipeline generates multiple questions based on the summary of the input text. It uses text-to-text generation for this task, returning several relevant questions to probe deeper into the text.
Concept Map Generation:
- The generated summary and questions are then used to create a concept map, visualizing the relationships between the summary and the questions. The map is generated using the graphviz library, and the output is displayed as an image.
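
As a rough illustration of the first pipeline above, the summarization step could look something like the following. This is a minimal sketch, not the code in app.py: the function names, the 512-token input limit, and the generation settings (beam count, summary length) are assumptions chosen for the example. Only the model name comes from this README.

```python
import re

# Model named in this README; everything else below is illustrative.
SUMMARIZER_NAME = "csebuetnlp/mT5_multilingual_XLSum"


def normalize_whitespace(text: str) -> str:
    """Collapse newlines and repeated whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def load_summarizer(model_name: str = SUMMARIZER_NAME):
    """Download (or load from cache) the tokenizer and model."""
    # Imported lazily so the pure-Python helper above works without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return tokenizer, model


def summarize(text: str, tokenizer, model, max_summary_tokens: int = 84) -> str:
    """Tokenize the input, run beam search, and decode the summary."""
    inputs = tokenizer(
        normalize_whitespace(text),
        return_tensors="pt",
        truncation=True,
        max_length=512,  # assumed input limit; longer text is truncated
    )
    output_ids = model.generate(
        **inputs,
        max_length=max_summary_tokens,  # assumed summary length cap
        num_beams=4,
        no_repeat_ngram_size=2,
    )[0]
    return tokenizer.decode(output_ids, skip_special_tokens=True)


# Example call (downloads the model on first use):
#     tokenizer, model = load_summarizer()
#     print(summarize("Some long article text ...", tokenizer, model))
```

Loading the tokenizer and model once and passing them into `summarize` avoids reloading the weights on every request, which matters in an interactive app.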

How to Use the Interface

Open the AI Study Assistant Space.
In the input box, enter a piece of text in either Arabic or English.
Select the language of the text (Arabic or English) from the dropdown menu.
Click "Submit" to receive:
- A Summary of the text.
- A list of Questions generated from the summary.
- A Concept Map that visually links the summary and questions.

You can also use one of the pre-defined example texts by selecting from the examples section.
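
The input and output widgets described in the steps above could be wired together roughly as follows. This is a sketch under assumptions: `analyze_stub` and `build_interface` are hypothetical names, the stub returns placeholder values instead of calling any model, and the labels and example text are illustrative rather than copied from app.py.

```python
def analyze_stub(text: str, language: str):
    """Hypothetical stand-in for the real pipeline.

    Returns (summary, questions, concept_map_image). The real app would call
    the summarization, question-generation, and Graphviz steps here.
    """
    summary = f"[{language}] {text.strip()[:80]}"
    questions = "1. (question placeholder)"
    return summary, questions, None  # None leaves the image output empty


def build_interface(analyze_fn=analyze_stub):
    """Assemble a Gradio UI with a text box, a language dropdown, and three outputs."""
    import gradio as gr  # lazy import: only needed when building the UI

    return gr.Interface(
        fn=analyze_fn,
        inputs=[
            gr.Textbox(lines=8, label="Input text"),
            gr.Dropdown(choices=["Arabic", "English"], value="English", label="Language"),
        ],
        outputs=[
            gr.Textbox(label="Summary"),
            gr.Textbox(label="Questions"),
            gr.Image(label="Concept Map"),
        ],
        examples=[
            ["Artificial intelligence is a branch of computer science.", "English"],
        ],
        title="AI Study Assistant",
    )


# To start the app locally:
#     build_interface().launch()
```

Passing the analysis function in as a parameter keeps the UI definition separate from the model code, so the layout can be tried out before any model is downloaded.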

Model and Pipeline Choices

Summarization Model: The model used for text summarization is csebuetnlp/mT5_multilingual_XLSum. It is an mT5 (multilingual T5) checkpoint fine-tuned for summarization on the multilingual XL-Sum dataset, which includes both Arabic and English. A single model can therefore accept input in either language and produce a summary in that language.
Question Generation Model: We use the valhalla/t5-small-e2e-qg model for question generation. It is a T5-small checkpoint fine-tuned for end-to-end question generation, framed as a text-to-text task: the model receives a passage and emits a sequence of questions. Because the underlying T5-small checkpoint is English-centric, questions generated from Arabic summaries may be less reliable than those generated from English ones.
Concept Map Generation: The concept map is generated using the graphviz library. This library is widely used for creating visual diagrams and graphs. The concept map helps to visualize the relationship between the summary and the generated questions, making it easier for the user to understand the core ideas and key points.

Bilingual Implementation

This project handles both Arabic and English text inputs, allowing for bilingual summarization, question generation, and concept map creation. Here's how it's addressed:

Model Choice: The csebuetnlp/mT5_multilingual_XLSum model supports multiple languages, including Arabic and English. This allows the tool to generate accurate summaries in both languages.
Text Preprocessing: The user is asked to select the language of the input text (Arabic or English), which ensures that the model applies the correct preprocessing steps for tokenization and summarization.
Question Generation: The summary is passed to the valhalla/t5-small-e2e-qg model regardless of the selected language. Since this model is built on the English-centric T5-small checkpoint, the generated questions are likely to be more reliable for English text than for Arabic.

This bilingual implementation allows users to work with text in either Arabic or English seamlessly, making it suitable for a wide range of users and text types.

Requirements

To run this project locally, you need a Python version supported by the pinned Gradio release (Gradio 5 requires Python 3.10 or higher). You also need to install the following Python libraries:

pip install gradio
pip install transformers
pip install torch
pip install graphviz
pip install pillow

Gradio: Provides the interactive interface for the application.
Transformers: Hugging Face's library for pre-trained models, used for summarization and question generation.
Torch: PyTorch, the backend used for model inference.
Graphviz: Used to create and render the concept map. The Python package is a wrapper around the Graphviz system tools, which must be installed separately.
Pillow: The maintained fork of the Python Imaging Library (PIL), used to handle image output for the concept map.
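
Before launching the app locally, it can help to confirm that the libraries listed above are importable and that the Graphviz `dot` executable is on the PATH. The check below is a small sketch using only the standard library. The function name and the choice to look for `dot` specifically are assumptions made for this example.

```python
import importlib.util
import shutil

# Import names for the packages listed above (Pillow is imported as PIL).
REQUIRED_MODULES = ("gradio", "transformers", "torch", "graphviz", "PIL")


def missing_requirements(modules=REQUIRED_MODULES, binaries=("dot",)):
    """Return the names of modules and system binaries that cannot be found."""
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    missing += [
        f"{binary} (system binary)"
        for binary in binaries
        if shutil.which(binary) is None
    ]
    return missing


# Example:
#     problems = missing_requirements()
#     if problems:
#         print("Missing:", ", ".join(problems))
```

`importlib.util.find_spec` locates a top-level module without importing it, so the check stays fast even for heavy packages such as torch.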

Example Usage

Input:
- Text: "Artificial intelligence is a branch of computer science that aims to create intelligent machines that work and react like humans."
- Language: English

Output:
- Summary: "AI is a branch of computer science aimed at creating intelligent machines that mimic human behavior."
- Questions:
  - "What is artificial intelligence?"
  - "What is the goal of artificial intelligence?"
  - "What is the role of computers in artificial intelligence?"
- Concept Map: A visual map linking the summary to the generated questions.
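
A list of questions like the one in this example could be obtained by prompting the question-generation model and splitting its raw output. The sketch below assumes two conventions that are not stated in this README and should be checked against the model card: that the input is prefixed with `generate questions: ` and that the model separates questions with the literal marker `<sep>`. The function names and generation settings are illustrative.

```python
QG_MODEL_NAME = "valhalla/t5-small-e2e-qg"  # model named in this README


def split_questions(generated_text: str, separator: str = "<sep>") -> list[str]:
    """Split raw model output into unique, non-empty questions (order kept)."""
    questions = []
    for part in generated_text.split(separator):
        question = part.strip()
        if question and question not in questions:
            questions.append(question)
    return questions


def generate_questions(summary: str, generate_text) -> list[str]:
    """Prompt a text-to-text callable and parse the questions it returns.

    `generate_text` is any function mapping a prompt string to the decoded
    output string, which keeps this helper independent of the model backend.
    """
    prompt = "generate questions: " + summary  # assumed prompt prefix
    return split_questions(generate_text(prompt))


def make_generate_text(model_name: str = QG_MODEL_NAME, max_length: int = 128):
    """Build a prompt-to-text callable backed by a seq2seq model."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer  # lazy import

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    def generate_text(prompt: str) -> str:
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
        output_ids = model.generate(**inputs, max_length=max_length, num_beams=4)[0]
        return tokenizer.decode(output_ids, skip_special_tokens=True)

    return generate_text


# Example (downloads the model on first use):
#     questions = generate_questions(summary, make_generate_text())
```

Deduplicating the parsed questions is a defensive choice, since small generation models sometimes repeat themselves within a single output.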

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference