Update app.py
app.py
CHANGED
@@ -226,7 +226,8 @@ bi_encoder_type = st.sidebar.selectbox(
 top_k = st.sidebar.slider("Number of Top Hits Generated",min_value=1,max_value=5,value=2)
 
 st.markdown(
-    """
+    """
+    - The app supports asymmetric semantic search, which seeks to improve search accuracy for documents/URLs by understanding the content of the search query, in contrast to traditional search engines, which only find documents based on lexical matches.
     - The idea behind semantic search is to embed all entries in your corpus, whether they be sentences, paragraphs, or documents, into a vector space. At search time, the query is embedded into the same vector space and the closest embeddings from your corpus are found. These entries should have a high semantic overlap with the query.
     - The all-* models were trained on all available training data (more than 1 billion training pairs) and are designed as general-purpose models. The all-mpnet-base-v2 model provides the best quality, while all-MiniLM-L6-v2 is 5 times faster and still offers good quality. The models used have been trained on broad datasets; however, if your document/corpus is specialised, such as for science or economics, the results returned might be unsatisfactory.""")
 
@@ -240,7 +241,6 @@ st.markdown(
 st.markdown(
     """Code and App Inspiration Source: [Sentence Transformers](https://www.sbert.net/examples/applications/retrieve_rerank/README.html)""")
 
-
 st.markdown(
     """Quick summary of the purposes of a Bi- and Cross-encoder below; the image and info were adapted from [www.sbert.net](https://www.sbert.net/examples/applications/semantic-search/README.html):""")
 
@@ -289,7 +289,7 @@ if search:
     if bi_encoder_type:
 
         with st.spinner(
-            text=f"Loading {bi_encoder_type}
+            text=f"Loading {bi_encoder_type} bi-encoder and embedding document into vector space. This might take a few seconds depending on the length of your document..."
        ):
            corpus_embeddings = bi_encoder(bi_encoder_type,passages)
            cross_encoder = cross_encoder()
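For context, the retrieval step the new description refers to can be sketched with the sentence-transformers API roughly as follows. This is a minimal illustration, not app.py's actual code: the model name, corpus, and query here are assumed placeholders.

```python
# A minimal sketch of the bi-encoder semantic search flow described above.
# The model name, corpus, and query are illustrative assumptions, not
# values taken from app.py.
from sentence_transformers import SentenceTransformer, util

# Embed every corpus entry into the vector space once, up front.
bi_encoder = SentenceTransformer("all-MiniLM-L6-v2")
passages = [
    "Semantic search embeds documents and queries into a shared vector space.",
    "Lexical search engines match documents on overlapping keywords only.",
]
corpus_embeddings = bi_encoder.encode(passages, convert_to_tensor=True)

# At search time, embed the query into the same space and return the
# closest corpus embeddings by cosine similarity.
query_embedding = bi_encoder.encode("How does semantic search work?", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(round(hit["score"], 3), passages[hit["corpus_id"]])
```

Because the corpus is embedded once and only the query is encoded per search, the bi-encoder stays fast even as the document grows, which is why the spinner message in the diff warns about embedding time rather than query time.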
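The "Bi- and Cross-encoder" summary the commit links to pairs that retrieval with a re-ranking step. Below is a minimal sketch of that step, assuming the publicly available cross-encoder/ms-marco-MiniLM-L-6-v2 checkpoint; the diff does not show which model the app's cross_encoder() actually loads, so treat the name and data as illustrative.

```python
# A minimal sketch of cross-encoder re-ranking, assuming an ms-marco
# checkpoint from sentence-transformers. Query and candidates are
# illustrative, not taken from app.py.
from sentence_transformers import CrossEncoder

cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How does semantic search work?"
candidates = [
    "Semantic search embeds documents and queries into a shared vector space.",
    "Lexical search engines match documents on overlapping keywords only.",
]

# A cross-encoder scores each (query, passage) pair jointly rather than
# comparing precomputed embeddings, so it is slower but more accurate.
scores = cross_encoder.predict([(query, passage) for passage in candidates])
reranked = sorted(zip(scores, candidates), reverse=True)
for score, passage in reranked:
    print(round(float(score), 3), passage)
```

Scoring every pair jointly is too slow to run over a whole corpus, which is why this stage only re-ranks the small top_k set the bi-encoder returns (capped at 5 by the slider in the diff above).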