nickmuchi committed on
Commit d9e6ed8 · 1 Parent(s): 7f6b4d5

Update app.py

Files changed (1):
  1. app.py +5 -4
app.py CHANGED
@@ -17,6 +17,7 @@ import validators
 import nltk
 import warnings
 import streamlit as st
+from PIL import Image
 
 nltk.download('punkt')
 
@@ -225,9 +226,9 @@ bi_encoder_type = st.sidebar.selectbox(
 top_k = st.sidebar.slider("Number of Top Hits Generated",min_value=1,max_value=5,value=2)
 
 st.markdown(
-"""-The app supports asymmetric Semantic search which seeks to improve search accuracy of documents/URL by understanding the content of the search query in contrast to traditional search engines which only find documents based on lexical matches.
--The idea behind semantic search is to embed all entries in your corpus, whether they be sentences, paragraphs, or documents, into a vector space. At search time, the query is embedded into the same vector space and the closest embeddings from your corpus are found. These entries should have a high semantic overlap with the query.
--The all-* models where trained on all available training data (more than 1 billion training pairs) and are designed as general purpose models. The all-mpnet-base-v2 model provides the best quality, while all-MiniLM-L6-v2 is 5 times faster and still offers good quality. The models used have been trained on broad datasets, however, if your document/corpus is specialised, such as for science or economics, the results returned might be unsatisfactory.""")
+"""- The app supports asymmetric Semantic search which seeks to improve search accuracy of documents/URL by understanding the content of the search query in contrast to traditional search engines which only find documents based on lexical matches.
+- The idea behind semantic search is to embed all entries in your corpus, whether they be sentences, paragraphs, or documents, into a vector space. At search time, the query is embedded into the same vector space and the closest embeddings from your corpus are found. These entries should have a high semantic overlap with the query.
+- The all-* models where trained on all available training data (more than 1 billion training pairs) and are designed as general purpose models. The all-mpnet-base-v2 model provides the best quality, while all-MiniLM-L6-v2 is 5 times faster and still offers good quality. The models used have been trained on broad datasets, however, if your document/corpus is specialised, such as for science or economics, the results returned might be unsatisfactory.""")
 
 st.markdown("""There models available to choose from:""")
 
@@ -247,7 +248,7 @@ st.markdown(
 """- Bi-Encoder (Retrieval): The Bi-encoder is responsible for independently embedding the sentences and search queries into a vector space. The result is then passed to the cross-encoder for checking the relevance/similarity between the query and sentences.
 - Cross-Encoder (Re-Ranker): A re-ranker based on a Cross-Encoder can substantially improve the final results for the user. The query and a possible document is passed simultaneously to transformer network, which then outputs a single score between 0 and 1 indicating how relevant the document is for the given query. The cross-encoder further boost the performance, especially when you search over a corpus for which the bi-encoder was not trained for.""")
 
-st.image('encoder.png', caption='Retrieval and Re-Rank')
+st.image(Image.open('encoder.png'), caption='Retrieval and Re-Rank')
 
 st.markdown("""
 In order to use the app:
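The retrieve-and-re-rank pipeline described in the app's markdown above can be sketched as follows. This is an illustrative toy, not the app's actual code: `bi_encoder_embed` and `cross_encoder_score` are hypothetical stand-ins for the sentence-transformers models (e.g. all-MiniLM-L6-v2) the app selects in its sidebar, kept dependency-free so the two-stage flow is easy to follow.

```python
def bi_encoder_embed(text):
    # Stand-in for a bi-encoder: bag-of-words counts over a tiny vocabulary.
    vocab = ["semantic", "search", "lexical", "vector", "query"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(u, v):
    # Cosine similarity between two vectors; 0.0 if either is all-zero.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def cross_encoder_score(query, doc):
    # Stand-in for a cross-encoder: scores the (query, doc) pair jointly,
    # here via word-overlap (Jaccard) ratio in [0, 1].
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d)

def retrieve_and_rerank(query, corpus, top_k=2):
    # Stage 1 (bi-encoder / retrieval): embed query and documents
    # independently, keep the top_k nearest by cosine similarity.
    q_emb = bi_encoder_embed(query)
    hits = sorted(corpus, key=lambda d: cosine(q_emb, bi_encoder_embed(d)),
                  reverse=True)[:top_k]
    # Stage 2 (cross-encoder / re-rank): re-score each candidate together
    # with the query and return them in the new order.
    return sorted(hits, key=lambda d: cross_encoder_score(query, d),
                  reverse=True)

corpus = [
    "semantic search embeds documents into a vector space",
    "lexical search matches exact words only",
    "a recipe for banana bread",
]
print(retrieve_and_rerank("semantic search in a vector space", corpus, top_k=2))
```

In the real app the first stage is cheap enough to run over the whole corpus, while the more expensive cross-encoder only sees the `top_k` candidates, which is what makes the re-ranking step affordable.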