nickmuchi committed
Commit dff451a · 1 Parent(s): 974da6a

Update app.py

Files changed (1):
  1. app.py +14 -10
app.py CHANGED
@@ -223,26 +223,30 @@ top_k = st.sidebar.slider("Number of Top Hits Generated",min_value=1,max_value=5
223
 
224
  st.markdown(
225
  """
226
- - The app supports asymmetric semantic search, which seeks to improve search accuracy over documents/URLs by understanding the content of the search query, in contrast to traditional search engines, which only find documents based on lexical matches.
227
- - The idea behind semantic search is to embed all entries in your corpus, whether they be sentences, paragraphs, or documents, into a vector space. At search time, the query is embedded into the same vector space and the closest embeddings from your corpus are found. These entries should have a high semantic overlap with the query.
228
- - The all-* models were trained on all available training data (more than 1 billion training pairs) and are designed as general-purpose models. The all-mpnet-base-v2 model provides the best quality, while all-MiniLM-L6-v2 is 5 times faster and still offers good quality. The models used have been trained on broad datasets; however, if your document/corpus is specialised, such as for science or economics, the results returned might be unsatisfactory.""")
229
 
230
  st.markdown("""There models available to choose from:""")
231
 
232
  st.markdown(
233
- """Model Source:
234
- - Bi-Encoders - [multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1), [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2), [multi-qa-MiniLM-L6-cos-v1](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1) and [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
235
- - Cross-Encoder - [cross-encoder/ms-marco-MiniLM-L-12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2)""")
 
236
 
237
  st.markdown(
238
- """Code and App Inspiration Source: [Sentence Transformers](https://www.sbert.net/examples/applications/retrieve_rerank/README.html)""")
 
239
 
240
  st.markdown(
241
- """Quick summary of the purposes of a Bi and Cross-encoder below, the image and info were adapted from [www.sbert.net](https://www.sbert.net/examples/applications/semantic-search/README.html):""")
 
242
 
243
  st.markdown(
244
- - Bi-Encoder (Retrieval): The Bi-Encoder is responsible for independently embedding the sentences and the search query into a vector space. The results are then passed to the cross-encoder, which checks the relevance/similarity between the query and the sentences.
245
- - Cross-Encoder (Re-Ranker): A re-ranker based on a Cross-Encoder can substantially improve the final results for the user. The query and a candidate document are passed simultaneously to the transformer network, which then outputs a single score between 0 and 1 indicating how relevant the document is to the given query. The cross-encoder further boosts performance, especially when you search over a corpus the bi-encoder was not trained on.""")
 
246
 
247
  st.image(Image.open('encoder.png'), caption='Retrieval and Re-Rank')
248
 
 
223
 
224
  st.markdown(
225
  """
226
+ - The app supports asymmetric semantic search, which seeks to improve search accuracy over documents/URLs by understanding the content of the search query, in contrast to traditional search engines, which only find documents based on lexical matches.
227
+ - The idea behind semantic search is to embed all entries in your corpus, whether they be sentences, paragraphs, or documents, into a vector space. At search time, the query is embedded into the same vector space and the closest embeddings from your corpus are found. These entries should have a high semantic overlap with the query.
228
+ - The all-* models were trained on all available training data (more than 1 billion training pairs) and are designed as general-purpose models. The all-mpnet-base-v2 model provides the best quality, while all-MiniLM-L6-v2 is 5 times faster and still offers good quality. The models used have been trained on broad datasets; however, if your document/corpus is specialised, such as for science or economics, the results returned might be unsatisfactory.""")
229
 
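A minimal sketch of the asymmetric semantic search flow described in the bullets above, assuming the sentence-transformers package; the model choice, corpus, and query are illustrative placeholders, not the app's exact configuration:

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative bi-encoder; any of the hub models listed below would work.
bi_encoder = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1')

# Embed every corpus entry (sentence, paragraph, or document) into a vector space.
corpus = [
    "Semantic search embeds text and compares meaning in vector space.",
    "Traditional search engines match documents on exact keywords.",
]
corpus_embeddings = bi_encoder.encode(corpus, convert_to_tensor=True)

# At search time, embed the query into the same space and return the closest entries.
query_embedding = bi_encoder.encode("How does semantic search work?", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {corpus[hit['corpus_id']]}")
```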
230
  st.markdown("""There models available to choose from:""")
231
 
232
  st.markdown(
233
+ """
234
+ Model Source:
235
+ - Bi-Encoders - [multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1), [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2), [multi-qa-MiniLM-L6-cos-v1](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1) and [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
236
+ - Cross-Encoder - [cross-encoder/ms-marco-MiniLM-L-12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2)""")
237
 
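As a sketch of how these hub IDs plug in (again assuming the sentence-transformers package; the specific bi-encoder picked here is one plausible choice, not necessarily the app's default):

```python
from sentence_transformers import SentenceTransformer, CrossEncoder

# Retrieval: any of the four bi-encoder hub IDs listed above can be used here.
bi_encoder = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# Re-ranking: the cross-encoder listed above.
cross_encoder = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-12-v2')
```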
238
  st.markdown(
239
+ """
240
+ Code and App Inspiration Source: [Sentence Transformers](https://www.sbert.net/examples/applications/retrieve_rerank/README.html)""")
241
 
242
  st.markdown(
243
+ """
244
+ Quick summary of the purposes of a Bi- and Cross-Encoder below; the image and info were adapted from [www.sbert.net](https://www.sbert.net/examples/applications/semantic-search/README.html):""")
245
 
246
  st.markdown(
247
+ """
248
+ - Bi-Encoder (Retrieval): The Bi-Encoder is responsible for independently embedding the sentences and the search query into a vector space. The results are then passed to the cross-encoder, which checks the relevance/similarity between the query and the sentences.
249
+ - Cross-Encoder (Re-Ranker): A re-ranker based on a Cross-Encoder can substantially improve the final results for the user. The query and a candidate document are passed simultaneously to the transformer network, which then outputs a single score between 0 and 1 indicating how relevant the document is to the given query. The cross-encoder further boosts performance, especially when you search over a corpus the bi-encoder was not trained on.""")
250
 
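A hedged sketch of the retrieve-and-rerank pipeline described in the two bullets above: the bi-encoder retrieves candidates cheaply, then the cross-encoder jointly scores each (query, document) pair to re-rank them. Model names, corpus, and query are illustrative:

```python
from sentence_transformers import SentenceTransformer, CrossEncoder, util

bi_encoder = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1')
cross_encoder = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-12-v2')

corpus = [
    "Re-rankers score query-document pairs jointly with a transformer.",
    "Keyword search relies on exact lexical matches.",
    "Bi-encoders embed queries and documents independently.",
]
query = "How do re-rankers improve search results?"

# Stage 1 (Bi-Encoder): embed query and corpus independently, retrieve top candidates.
corpus_embeddings = bi_encoder.encode(corpus, convert_to_tensor=True)
query_embedding = bi_encoder.encode(query, convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)[0]

# Stage 2 (Cross-Encoder): score each (query, candidate) pair jointly, then re-rank.
pairs = [(query, corpus[hit['corpus_id']]) for hit in hits]
scores = cross_encoder.predict(pairs)
for score, (_, doc) in sorted(zip(scores, pairs), reverse=True, key=lambda x: x[0]):
    print(f"{score:.3f}  {doc}")
```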
251
  st.image(Image.open('encoder.png'), caption='Retrieval and Re-Rank')
252