import streamlit as st

# Page configuration
st.set_page_config(
    layout="wide",
    initial_sidebar_state="auto"
)

# Custom CSS for better styling
st.markdown("""
""", unsafe_allow_html=True)

# Title
st.markdown('<div class="main-title">Automatically Answer Questions (OPEN BOOK)</div>', unsafe_allow_html=True)

# Introduction Section
st.markdown("""

Open-book question answering is a task where a model generates answers based on provided text or documents. Unlike closed-book models, open-book models utilize external sources to produce responses, making them more accurate and versatile in scenarios where the input text provides essential context.

This page explores how to implement an open-book question-answering pipeline using state-of-the-art NLP techniques. We use a T5 Transformer model, which is well-suited for generating detailed answers by leveraging the information contained within the input text.

""", unsafe_allow_html=True) # T5 Transformer Overview st.markdown('
Understanding the T5 Transformer for Open-Book QA
', unsafe_allow_html=True) st.markdown("""

The T5 (Text-To-Text Transfer Transformer) model by Google excels in converting various NLP tasks into a unified text-to-text format. For open-book question answering, the model takes a question and relevant context as input, generating a detailed and contextually appropriate answer.

The T5 model's ability to utilize provided documents makes it especially powerful in applications where the accuracy of the response is enhanced by access to supporting information, such as research tools, educational applications, or any system where the input text contains critical data.

""", unsafe_allow_html=True) # Performance Section st.markdown('
Performance and Benchmarks
', unsafe_allow_html=True) st.markdown("""

In open-book settings, the T5 model has been benchmarked across a variety of question-answering datasets, demonstrating that it generates accurate and comprehensive answers when given relevant context. Its performance is particularly strong in tasks that require a deep understanding of the input text to produce correct, context-aware responses.

Open-book T5 models are especially valuable in applications that require dynamic interaction with content, making them ideal for domains such as customer support, research, and educational technologies.

""", unsafe_allow_html=True) # Implementation Section st.markdown('
Implementing Open-Book Question Answering
', unsafe_allow_html=True) st.markdown("""

The following example demonstrates how to implement an open-book question answering pipeline using Spark NLP. The pipeline includes a document assembler and the T5 model to generate answers based on the input text.

""", unsafe_allow_html=True) st.code(''' from sparknlp.base import * from sparknlp.annotator import * from pyspark.ml import Pipeline from pyspark.sql.functions import col, expr document_assembler = DocumentAssembler()\\ .setInputCol("text")\\ .setOutputCol("documents") t5 = T5Transformer()\\ .pretrained(model_name)\\ .setTask("question:")\\ .setMaxOutputLength(200)\\ .setInputCols(["documents"])\\ .setOutputCol("answers") pipeline = Pipeline().setStages([document_assembler, t5]) data = spark.createDataFrame([["What is the impact of climate change on polar bears?"]]).toDF("text") result = pipeline.fit(data).transform(data) result.select("answers.result").show(truncate=False) ''', language='python') # Example Output st.text(""" +------------------------------------------------+ |answers.result | +------------------------------------------------+ |Climate change significantly affects polar ... | +------------------------------------------------+ """) # Model Info Section st.markdown('
Choosing the Right Model for Open-Book QA
', unsafe_allow_html=True) st.markdown("""

When selecting a model for open-book question answering, it's important to consider the specific needs of your application. Below are some of the available models, each offering different strengths based on their transformer architecture:

Among these models, t5_base and longformer_qa_large_4096_finetuned_triviaqa are highly recommended for their strong performance in generating accurate and contextually rich answers, especially in scenarios with long input texts. For faster responses with an emphasis on efficiency, distilbert_base_cased_qa_squad2 and deberta_v3_xsmall_qa_squad2 are excellent choices. Specialized tasks may benefit from models like albert_qa_xxlarge_tweetqa or roberta_qa_roberta_base_squad2_covid, depending on the domain.

Explore the available models on the Spark NLP Models Hub to find the one that best suits your needs.

""", unsafe_allow_html=True) # Footer # References Section st.markdown('
References
', unsafe_allow_html=True) st.markdown("""
""", unsafe_allow_html=True) st.markdown('
Community & Support
', unsafe_allow_html=True) st.markdown("""
""", unsafe_allow_html=True) st.markdown('
Quick Links
', unsafe_allow_html=True) st.markdown("""
""", unsafe_allow_html=True)