title: Hebrew Dentsit
emoji: 🏢
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.10.0
app_file: app.py
pinned: false
short_description: A RAG agent Hebrew Speaking Dentist
Do you want to consult a dentist? Do you speak Hebrew? Consulting a dentist can be expensive... This is why I built a Hebrew RAG Dentist Agent that you can talk to.
Warning: The Agent (chatbot) can still hallucinate and make up "fake" facts, and it shouldn’t be used as an alternative to an expert dentist. Use of this chatbot is at your own responsibility.
This RAG Agent is based on Q&A data collected from 3 top Israeli forums. The data was collected using a scraper and saved into a SQL DB. Then the titles and questions were embedded into vectors using the free 'MPA/sambert' HuggingFace encoder model (this model was found to perform well on Hebrew medical jargon). The vectors were stored, a hundred at a time, in the NoSQL Pinecone vector database, with answer_id as metadata.
The answers were converted into vector embeddings using the same free encoder ('MPA/sambert') and stored in Pinecone under a different key, with the answer as metadata.
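
The batched storage step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Space's actual code: the helper names (`batched`, `make_question_records`, `store_in_batches`), the `q-` id prefix, and the `(question_id, text, answer_id)` row layout are all hypothetical, and the real encoder and the Pinecone upsert call are replaced by plain callables passed in by the caller.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

BATCH_SIZE = 100  # "a hundred at a time", as described above


def batched(items: Sequence[dict], size: int = BATCH_SIZE) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def make_question_records(
    rows: Sequence[Tuple[int, str, int]],
    embed: Callable[[str], List[float]],
) -> List[dict]:
    """Build one record per question.

    `rows` holds (question_id, text, answer_id) tuples (assumed layout);
    `embed` stands in for the encoder model.
    """
    return [
        {"id": f"q-{qid}", "values": embed(text), "metadata": {"answer_id": aid}}
        for qid, text, aid in rows
    ]


def store_in_batches(records: Sequence[dict], upsert: Callable[[list], None]) -> int:
    """Pass the records to `upsert` one batch at a time; return the batch count.

    `upsert` stands in for the vector database write call.
    """
    count = 0
    for batch in batched(records):
        upsert(list(batch))
        count += 1
    return count
```

Passing the encoder and the write call in as arguments keeps the batching logic easy to check without a model or a database connection.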
Now all that is left is the RAG Agent, which is composed of a Retriever, a Reranker, and a Generator:
1) The Retriever embeds the user question (using the free 'MPA/sambert' HuggingFace encoder model) and runs an ANN search with a cosine similarity metric and top_k set to 50.
2) The Reranker uses the list of top_k ids to fetch the answer vectors, with the answers as metadata, in a second query against the Pinecone database. It then computes the cosine similarity between each answer embedding and the question embedding using the sklearn method and re-sorts the answers by that score. Afterwards, it selects the top_n (equal to 5) answers, keeping only answers whose similarity to the question embedding is 0.7 or higher.
3) The Generator is a paid API, Anthropic's Claude 3.5 Sonnet, a decoder that was not trained specifically on medical jargon. However, with the right prompt and the right context the results are pretty good.
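
The retrieve-then-rerank flow above can be sketched in plain Python. This is a simplified illustration, not the author's implementation: an exact sort over an in-memory list stands in for Pinecone's ANN search, a hand-written `cosine` function stands in for the sklearn cosine similarity call, and the names `retrieve`, `rerank`, `question_index`, and `answer_store` are hypothetical. Only the values top_k = 50, top_n = 5, and the 0.7 threshold come from the description.

```python
import math

TOP_K = 50       # candidates returned by the retriever
TOP_N = 5        # answers handed to the generator
THRESHOLD = 0.7  # minimum cosine similarity to the question embedding


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(question_vec, question_index, top_k=TOP_K):
    """Return the answer ids of the top_k stored questions closest to the query.

    `question_index` is a list of (answer_id, question_vector) pairs; the exact
    sort here is a stand-in for an approximate nearest-neighbour query.
    """
    ranked = sorted(
        question_index,
        key=lambda item: cosine(question_vec, item[1]),
        reverse=True,
    )
    return [answer_id for answer_id, _ in ranked[:top_k]]


def rerank(question_vec, answer_ids, answer_store, top_n=TOP_N, threshold=THRESHOLD):
    """Score candidate answers against the question and keep the best ones.

    `answer_store` maps answer_id -> (answer_vector, answer_text).
    """
    scored = []
    for answer_id in answer_ids:
        vec, text = answer_store[answer_id]
        score = cosine(question_vec, vec)
        if score >= threshold:  # drop answers that are not similar enough
            scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_n]]
```

The texts returned by `rerank` would then be placed in the generator's prompt as context. If no answer reaches the threshold the list is empty, so the caller has to decide how to respond in that case.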
The whole work, from inception to completion, was done by me (Eli Borodach).
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference |