tony-42069 committed on
Commit b3d3a90 · 1 Parent(s): 80e51c6

Add project documentation

Files changed (1)
  1. README.md +122 -0
README.md ADDED
@@ -0,0 +1,122 @@
+ # Commercial Real Estate Knowledge Assistant
+
+ ![Commercial Lending 101](Dataset/commercial-lending-101.png)
+
+ A Retrieval-Augmented Generation (RAG) chatbot for commercial real estate (CRE) concepts. Built with Azure OpenAI, LangChain, and Streamlit, it indexes CRE documents supplied as PDFs and answers questions using the retrieved passages as context.
+
+ ## 🚀 Deployments
+ - **Live Demo**: [Try it on Hugging Face Spaces](https://huggingface.co/spaces/tony-42069/cre-chatbot-rag)
+
+ ## 🌟 Key Features
+ - **Multi-Document Support**: Process and analyze multiple PDF documents simultaneously
+ - **Intelligent PDF Processing**: Advanced document analysis and text extraction
+ - **Azure OpenAI Integration**: Leveraging GPT-3.5 Turbo for accurate, contextual responses
+ - **Semantic Search**: Using Azure OpenAI embeddings for precise context retrieval
+ - **Vector Storage**: Efficient document indexing with ChromaDB
+ - **Modern UI**: Beautiful chat interface with message history and source tracking
+ - **Enterprise-Ready**: Comprehensive logging and error handling
+
+ ## 🎯 Use Cases
+ - **Training & Education**: Help new CRE professionals understand industry concepts
+ - **Quick Reference**: Instant access to definitions and explanations
+ - **Document Analysis**: Extract insights from CRE documentation
+ - **Knowledge Base**: Build and query your own CRE knowledge repository
+
+ ## 🚀 Quick Start
+
+ ### Prerequisites
+ - Python 3.8+
+ - Azure OpenAI Service access with:
+   - `gpt-35-turbo` model deployment
+   - `text-embedding-ada-002` model deployment
+
+ ### Installation
+ 1. Clone the repository:
+ ```bash
+ git clone https://github.com/tony-42069/cre-chatbot-rag.git
+ cd cre-chatbot-rag
+ ```
+
+ 2. Create and activate a virtual environment:
+ ```bash
+ python -m venv venv
+ venv\Scripts\activate        # Windows
+ # source venv/bin/activate   # macOS/Linux
+ ```
+
+ 3. Install dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 4. Create a `.env` file with your Azure OpenAI credentials:
+ ```env
+ AZURE_OPENAI_ENDPOINT=your_endpoint_here
+ AZURE_OPENAI_KEY=your_key_here
+ AZURE_OPENAI_DEPLOYMENT_NAME=your_gpt_deployment_name
+ AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME=text-embedding-ada-002
+ ```
+
+ 5. Run the application:
+ ```bash
+ streamlit run app/main.py
+ ```
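
Before starting the app, it can help to confirm that the four variables from step 4 are actually set. The following is a minimal sketch using only the standard library; it is not taken from the project's code. The variable names come from the `.env` example above, while `AzureSettings`, `parse_env_text`, and `load_settings` are hypothetical names introduced here for illustration.

```python
import os
from dataclasses import dataclass

# Variable names as listed in the .env example in step 4.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_KEY",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME",
)


@dataclass(frozen=True)
class AzureSettings:
    """Hypothetical settings container; not part of the project."""
    endpoint: str
    key: str
    chat_deployment: str
    embedding_deployment: str


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env=None):
    """Build AzureSettings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # Fail early with the names only; never echo secret values.
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return AzureSettings(
        endpoint=env["AZURE_OPENAI_ENDPOINT"],
        key=env["AZURE_OPENAI_KEY"],
        chat_deployment=env["AZURE_OPENAI_DEPLOYMENT_NAME"],
        embedding_deployment=env["AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME"],
    )
```

For example, `load_settings(parse_env_text(open(".env").read()))` would validate a local `.env` file, and `load_settings()` would check the process environment instead. Whether the project itself loads the file this way is not stated in the README.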
+
+ ## 🔌 Embedding
+ To embed this chatbot in your website, use the following HTML code:
+
+ ```html
+ <iframe
+   src="https://tony-42069-cre-chatbot-rag.hf.space"
+   frameborder="0"
+   width="850px"
+   height="450px"
+ ></iframe>
+ ```
+
+ ## 💡 Features
+
+ ### Modern Chat Interface
+ - Clean, professional design
+ - Persistent chat history
+ - Source context tracking
+ - Multiple document management
+ - Real-time processing feedback
+
+ ### Advanced RAG Implementation
+ - Semantic chunking of documents
+ - Azure OpenAI embeddings for accurate retrieval
+ - Context-aware answer generation
+ - Multi-document knowledge base
+ - Source attribution for answers
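
The retrieval flow listed above (chunk, embed, retrieve, attribute sources) can be sketched in plain Python. This is an illustration under stated assumptions, not the project's implementation: `toy_embed` is a bag-of-words stand-in for the Azure OpenAI embedding model, a plain list stands in for ChromaDB, fixed-size word windows stand in for semantic chunking, and every function name is hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping word windows (stand-in for semantic chunking)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def toy_embed(text):
    """Bag-of-words term counts; a placeholder for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, indexed_chunks, top_k=2):
    """Rank (source, chunk) pairs by similarity to the question."""
    q = toy_embed(question)
    scored = [(cosine(q, toy_embed(chunk)), source, chunk)
              for source, chunk in indexed_chunks]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


def build_prompt(question, hits):
    """Assemble a prompt whose context lines carry their source name."""
    context = "\n".join(f"[{source}] {chunk}" for _, source, chunk in hits)
    return (
        "Answer using only the context below and cite the sources.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

In a real deployment the toy embedding would be replaced by calls to the embedding deployment and the list by a vector store query; the prompt returned by `build_prompt` would then be sent to the chat model. Tagging each chunk with its source file is what makes source attribution possible in the answer.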
+
+ ### Enterprise Security
+ - Secure credential management
+ - Azure OpenAI integration
+ - Local vector storage with ChromaDB
+ - Comprehensive error handling
+ - Detailed logging system
+
+ ## 🛠️ Technical Stack
+ - **Frontend**: Streamlit
+ - **Language Models**: Azure OpenAI (GPT-3.5 Turbo)
+ - **Embeddings**: Azure OpenAI (text-embedding-ada-002)
+ - **Vector Store**: ChromaDB
+ - **PDF Processing**: PyPDF2
+ - **Framework**: LangChain
+
+ ## 📚 Documentation
+ - [Azure OpenAI Service](https://azure.microsoft.com/en-us/products/cognitive-services/openai-service/)
+ - [Streamlit](https://streamlit.io/)
+ - [LangChain](https://python.langchain.com/)
+ - [ChromaDB](https://www.trychroma.com/)
+
+ ## 🤝 Contributing
+ Contributions are welcome! Please feel free to submit a Pull Request.
+
+ ## 📄 License
+ This project is licensed under the MIT License - see the LICENSE file for details.
+
+ ## 🙏 Acknowledgments
+ - Azure OpenAI team for providing the powerful language models
+ - LangChain community for the excellent RAG framework
+ - Streamlit team for the amazing web framework