File size: 8,648 Bytes
5da5ab8 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 |
import streamlit as st
# Page configuration
st.set_page_config(
layout="wide",
initial_sidebar_state="auto"
)
# Custom CSS for better styling
st.markdown("""
<style>
.main-title {
font-size: 36px;
color: #4A90E2;
font-weight: bold;
text-align: center;
}
.sub-title {
font-size: 24px;
color: #4A90E2;
margin-top: 20px;
}
.section {
background-color: #f9f9f9;
padding: 15px;
border-radius: 10px;
margin-top: 20px;
}
.section h2 {
font-size: 22px;
color: #4A90E2;
}
.section p, .section ul {
color: #666666;
}
.link {
color: #4A90E2;
text-decoration: none;
}
</style>
""", unsafe_allow_html=True)
# Title
st.markdown('<div class="main-title">Automatically Answer Questions (CLOSED BOOK)</div>', unsafe_allow_html=True)
# Introduction Section
st.markdown("""
<div class="section">
<p>Closed-book question answering is a challenging task where a model is expected to generate accurate answers to questions without access to external information or documents during inference. This approach relies solely on the pre-trained knowledge embedded within the model, making it ideal for scenarios where retrieval-based methods are not feasible.</p>
<p>In this page, we will explore how to implement a pipeline that can automatically answer questions in a closed-book setting using state-of-the-art NLP techniques. We utilize a T5 Transformer model fine-tuned for closed-book question answering, providing accurate and contextually relevant answers to a variety of trivia questions.</p>
</div>
""", unsafe_allow_html=True)
# T5 Transformer Overview
st.markdown('<div class="sub-title">Understanding the T5 Transformer for Closed-Book QA</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<p>The T5 (Text-To-Text Transfer Transformer) model by Google is a versatile transformer-based model designed to handle a wide range of NLP tasks in a unified text-to-text format. For closed-book question answering, T5 is fine-tuned to generate answers directly from its internal knowledge without relying on external sources.</p>
<p>The model processes input questions and, based on its training, generates a text response that is both relevant and accurate. This makes it particularly effective in applications where access to external data sources is limited or impractical.</p>
</div>
""", unsafe_allow_html=True)
# Performance Section
st.markdown('<div class="sub-title">Performance and Benchmarks</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<p>The T5 model has been extensively benchmarked on various question-answering datasets, including natural questions and trivia challenges. In these evaluations, the closed-book variant of T5 has shown strong performance, often producing answers that are correct and contextually appropriate, even when the model is not allowed to reference any external data.</p>
<p>This makes the T5 model a powerful tool for generating answers in applications such as virtual assistants, educational tools, and any scenario where pre-trained knowledge is sufficient to provide responses.</p>
</div>
""", unsafe_allow_html=True)
# Implementation Section
st.markdown('<div class="sub-title">Implementing Closed-Book Question Answering</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<p>The following example demonstrates how to implement a closed-book question answering pipeline using Spark NLP. The pipeline includes a document assembler, a sentence detector to identify questions, and the T5 model to generate answers.</p>
</div>
""", unsafe_allow_html=True)
st.code('''
from sparknlp.base import *
from sparknlp.annotator import *
from pyspark.ml import Pipeline
from pyspark.sql.functions import col, expr
document_assembler = DocumentAssembler()\\
.setInputCol("text")\\
.setOutputCol("documents")
sentence_detector = SentenceDetectorDLModel\\
.pretrained("sentence_detector_dl", "en")\\
.setInputCols(["documents"])\\
.setOutputCol("questions")
t5 = T5Transformer()\\
.pretrained("google_t5_small_ssm_nq")\\
.setTask('trivia question:')\\
.setInputCols(["questions"])\\
.setOutputCol("answers")
pipeline = Pipeline().setStages([document_assembler, sentence_detector, t5])
data = spark.createDataFrame([["What is the capital of France?"]]).toDF("text")
result = pipeline.fit(data).transform(data)
result.select("answers.result").show(truncate=False)
''', language='python')
# Example Output
st.text("""
+---------------------------+
|answers.result |
+---------------------------+
|[Paris] |
+---------------------------+
""")
# Model Info Section
st.markdown('<div class="sub-title">Choosing the Right T5 Model</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<p>Several T5 models are available, each pre-trained on different datasets and tasks. For closed-book question answering, it's important to select a model that has been fine-tuned specifically for this task. The model used in the example, "google_t5_small_ssm_nq," is optimized for answering trivia questions in a closed-book setting.</p>
<p>For more complex or varied question-answering tasks, consider using larger T5 models like T5-Base or T5-Large, which may offer improved accuracy and context comprehension. Explore the available models on the <a class="link" href="https://sparknlp.org/models?annotator=T5Transformer" target="_blank">Spark NLP Models Hub</a> to find the best fit for your application.</p>
</div>
""", unsafe_allow_html=True)
# Footer
# References Section
st.markdown('<div class="sub-title">References</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<ul>
<li><a class="link" href="https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html" target="_blank">Google AI Blog</a>: Exploring Transfer Learning with T5</li>
<li><a class="link" href="https://sparknlp.org/models?annotator=T5Transformer" target="_blank">Spark NLP Model Hub</a>: Explore T5 models</li>
<li>Model used: <a class="link" href="https://sparknlp.org/2022/05/31/google_t5_small_ssm_nq_en_3_0.html" target="_blank">google_t5_small_ssm_nq</a></li>
<li><a class="link" href="https://github.com/google-research/text-to-text-transfer-transformer" target="_blank">GitHub</a>: T5 Transformer repository</li>
<li><a class="link" href="https://arxiv.org/abs/1910.10683" target="_blank">T5 Paper</a>: Detailed insights from the developers</li>
</ul>
</div>
""", unsafe_allow_html=True)
st.markdown('<div class="sub-title">Community & Support</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<ul>
<li><a class="link" href="https://sparknlp.org/" target="_blank">Official Website</a>: Documentation and examples</li>
<li><a class="link" href="https://join.slack.com/t/spark-nlp/shared_invite/zt-198dipu77-L3UWNe_AJ8xqDk0ivmih5Q" target="_blank">Slack</a>: Live discussion with the community and team</li>
<li><a class="link" href="https://github.com/JohnSnowLabs/spark-nlp" target="_blank">GitHub</a>: Bug reports, feature requests, and contributions</li>
<li><a class="link" href="https://medium.com/spark-nlp" target="_blank">Medium</a>: Spark NLP articles</li>
<li><a class="link" href="https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos" target="_blank">YouTube</a>: Video tutorials</li>
</ul>
</div>
""", unsafe_allow_html=True)
st.markdown('<div class="sub-title">Quick Links</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<ul>
<li><a class="link" href="https://sparknlp.org/docs/en/quickstart" target="_blank">Getting Started</a></li>
<li><a class="link" href="https://nlp.johnsnowlabs.com/models" target="_blank">Pretrained Models</a></li>
<li><a class="link" href="https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples/python/annotation/text/english" target="_blank">Example Notebooks</a></li>
<li><a class="link" href="https://sparknlp.org/docs/en/install" target="_blank">Installation Guide</a></li>
</ul>
</div>
""", unsafe_allow_html=True)
|