wisdom196473 committed on
Commit ef7b203 · 1 Parent(s): 656e047

update README

Files changed (2)
  1. .ipynb_checkpoints/README-checkpoint.md +67 -16
  2. README.md +57 -17
.ipynb_checkpoints/README-checkpoint.md CHANGED
@@ -1,21 +1,28 @@
1
  # Amazon E-commerce Visual Assistant
2
 
3
- A multimodal AI assistant that helps users search and explore Amazon products through natural language and image-based interactions.
4
 
5
- ## Features
6
 
7
- - Text and image-based product search
8
- - Product comparisons and recommendations
9
- - Visual product recognition
10
- - Detailed product information retrieval
11
- - Price analysis and comparison
12
 
13
- ## Technologies Used
14
 
15
- - FashionCLIP for visual understanding
16
- - Mistral-7B Language Model for text generation
17
- - FAISS for efficient similarity search
18
- - Streamlit for the user interface
19
 
20
  ## Setup and Installation
21
 
@@ -35,11 +42,55 @@ pip install -r requirements.txt
35
  streamlit run amazon_app.py
36
  ```
37
 
38
- ## Project Structure
39
 
40
- - `amazon_app.py`: Main Streamlit application
41
- - `model.py`: Core AI model implementations
42
- - `requirements.txt`: Project dependencies
43
 
44
  ## Future Directions
45
 
 
1
+ ---
2
+ title: Amazon E-commerce Visual Assistant
3
+ emoji: 🛍️
4
+ colorFrom: blue
5
+ colorTo: green
6
+ sdk: streamlit
7
+ sdk_version: "1.28.0"
8
+ app_file: amazon_app.py
9
+ pinned: false
10
+ ---
11
+
12
  # Amazon E-commerce Visual Assistant
13
 
14
+ A multimodal AI assistant leveraging the Amazon Product Dataset 2020 to provide comprehensive product search and recommendations through natural language and image-based interactions[1].
15
 
16
+ ## Project Overview
17
 
18
+ This conversational AI system combines advanced language and vision models to enhance e-commerce customer support, enabling accurate, context-aware responses to product-related queries[1].
19
 
20
+ ## Project Structure
21
 
22
+ - `amazon_app.py`: Main Streamlit application
23
+ - `model.py`: Core AI model implementations
24
+ - `Vision_AI.ipynb`: EDA, Embedding Model, LLM
25
+ - `requirements.txt`: Project dependencies
26
 
27
  ## Setup and Installation
28
 
 
42
  streamlit run amazon_app.py
43
  ```
44
 
45
+ ## Technical Architecture
46
 
47
+ ### Data Processing & Storage
48
+ - Standardized text fields and normalized numeric attributes
49
+ - Enhanced metadata indices for categories, price ranges, keywords, brands
50
+ - Validated image quality and managed duplicates
51
+ - Structured data storage in Parquet format[1]
52
+
53
+ ### Model Components
54
+ - **Vision-Language Integration**: FashionCLIP for multimodal embedding generation
55
+ - **Vector Search**: FAISS with hybrid retrieval combining embedding similarity and metadata filtering
56
+ - **Language Model**: Mistral-7B with 4-bit quantization
57
+ - **RAG Framework**: Context-enhanced response generation[1]
58
+
59
+ ### Performance Metrics
60
+ - Recall@1: 0.6385
61
+ - Recall@10: 0.9008
62
+ - Precision@1: 0.6385
63
+ - NDCG@10: 0.7725[1]
64
+
65
+ ## Implementation Details
66
+
67
+ ### Core Features
68
+ - Text and image-based product search
69
+ - Product comparisons and recommendations
70
+ - Visual product recognition
71
+ - Detailed product information retrieval
72
+ - Price analysis and comparison[1]
73
+
74
+ ### Technologies Used
75
+ - FashionCLIP for visual understanding
76
+ - Mistral-7B Language Model (4-bit quantized)
77
+ - FAISS for similarity search
78
+ - Google Vertex AI for vector storage
79
+ - Streamlit for user interface[1]
80
+
81
+ ## Challenges & Solutions
82
+
83
+ ### Technical Challenges Addressed
84
+ - Image processing with varying quality
85
+ - GPU memory optimization
86
+ - Efficient embedding storage
87
+ - Query response accuracy[1]
88
+
89
+ ### Implemented Solutions
90
+ - Robust image validation pipeline
91
+ - 4-bit model quantization
92
+ - Optimized batch processing
93
+ - Enhanced metadata enrichment[1]
94
 
95
  ## Future Directions
96
 
README.md CHANGED
@@ -11,22 +11,18 @@ pinned: false
11
 
12
  # Amazon E-commerce Visual Assistant
13
 
14
- A multimodal AI assistant that helps users search and explore Amazon products through natural language and image-based interactions.
15
 
16
- ## Features
17
 
18
- - Text and image-based product search
19
- - Product comparisons and recommendations
20
- - Visual product recognition
21
- - Detailed product information retrieval
22
- - Price analysis and comparison
23
 
24
- ## Technologies Used
25
 
26
- - FashionCLIP for visual understanding
27
- - Mistral-7B Language Model for text generation
28
- - FAISS for efficient similarity search
29
- - Streamlit for the user interface
30
 
31
  ## Setup and Installation
32
 
@@ -46,14 +42,58 @@ pip install -r requirements.txt
46
  streamlit run amazon_app.py
47
  ```
48
 
49
- ## Project Structure
50
 
51
- - `amazon_app.py`: Main Streamlit application
52
- - `model.py`: Core AI model implementations
53
- - `requirements.txt`: Project dependencies
54
 
55
  ## Future Directions
56
 
57
  - [ ] Fine-Tune FashionClip embedding model based on the specific domain data
58
  - [ ] Fine-Tune large language model to improve its generalization capabilities
59
- - [ ] Develop feedback loops for continuous improvement
 
11
 
12
  # Amazon E-commerce Visual Assistant
13
 
14
+ A multimodal AI assistant leveraging the Amazon Product Dataset 2020 to provide comprehensive product search and recommendations through natural language and image-based interactions[1].
15
 
16
+ ## Project Overview
17
 
18
+ This conversational AI system combines advanced language and vision models to enhance e-commerce customer support, enabling accurate, context-aware responses to product-related queries[1].
19
 
20
+ ## Project Structure
21
 
22
+ - `amazon_app.py`: Main Streamlit application
23
+ - `model.py`: Core AI model implementations
24
+ - `Vision_AI.ipynb`: EDA, Embedding Model, LLM
25
+ - `requirements.txt`: Project dependencies
26
 
27
  ## Setup and Installation
28
 
 
42
  streamlit run amazon_app.py
43
  ```
44
 
45
+ ## Technical Architecture
46
 
47
+ ### Data Processing & Storage
48
+ - Standardized text fields and normalized numeric attributes
49
+ - Enhanced metadata indices for categories, price ranges, keywords, brands
50
+ - Validated image quality and managed duplicates
51
+ - Structured data storage in Parquet format[1]
52
+
53
+ ### Model Components
54
+ - **Vision-Language Integration**: FashionCLIP for multimodal embedding generation
55
+ - **Vector Search**: FAISS with hybrid retrieval combining embedding similarity and metadata filtering
56
+ - **Language Model**: Mistral-7B with 4-bit quantization
57
+ - **RAG Framework**: Context-enhanced response generation[1]
58
+
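The "hybrid retrieval" bullet above pairs embedding similarity with metadata filtering. The sketch below is one plausible reading of that idea, not the code in `model.py`: it filters candidates by metadata first and then ranks the survivors by cosine similarity. It uses brute-force pure Python as a stand-in for a FAISS index, and the field names (`category`, `price`, `embedding`), the `hybrid_search` helper, and the toy 2-dimensional vectors are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_vec, products, k=3, category=None, max_price=None):
    """Filter by metadata, then rank the remaining products by similarity.

    Hypothetical helper: the real project may filter after the vector
    search instead, or push the filter into the index itself.
    """
    candidates = [
        p for p in products
        if (category is None or p["category"] == category)
        and (max_price is None or p["price"] <= max_price)
    ]
    ranked = sorted(
        candidates,
        key=lambda p: cosine(query_vec, p["embedding"]),
        reverse=True,
    )
    return [p["id"] for p in ranked[:k]]


# Toy catalogue with made-up embeddings (real FashionCLIP vectors are much larger).
products = [
    {"id": "A", "category": "shoes", "price": 40.0, "embedding": [1.0, 0.0]},
    {"id": "B", "category": "shoes", "price": 120.0, "embedding": [0.9, 0.1]},
    {"id": "C", "category": "bags", "price": 30.0, "embedding": [1.0, 0.0]},
    {"id": "D", "category": "shoes", "price": 25.0, "embedding": [0.0, 1.0]},
]

result = hybrid_search([1.0, 0.0], products, k=2, category="shoes", max_price=50.0)
print(result)
```

Filtering before ranking keeps the similarity computation small when the filter is selective; filtering after the vector search is the usual alternative when the index cannot apply metadata constraints itself.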
59
+ ### Performance Metrics
60
+ - Recall@1: 0.6385
61
+ - Recall@10: 0.9008
62
+ - Precision@1: 0.6385
63
+ - NDCG@10: 0.7725[1]
64
+
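For readers unfamiliar with the metrics listed above, here is a minimal sketch of how Recall@k, Precision@k, and NDCG@k are commonly computed with binary relevance. It is not the evaluation code from `Vision_AI.ipynb`; the function names and the three toy queries are assumptions. Note that when every query has exactly one relevant item, Recall@1 and Precision@1 coincide, which is consistent with the two equal values reported above.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top k results."""
    return len(set(ranked[:k]) & relevant) / len(relevant)


def precision_at_k(ranked, relevant, k):
    """Fraction of the top k results that are relevant."""
    return len(set(ranked[:k]) & relevant) / k


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the ideal gain."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(min(len(relevant), k)))
    return dcg / ideal if ideal else 0.0


def mean_metric(metric, queries, k):
    """Average a metric over (ranked_list, relevant_set) pairs."""
    return sum(metric(ranked, relevant, k) for ranked, relevant in queries) / len(queries)


# Toy evaluation set: one relevant product per query (hypothetical IDs).
queries = [
    (["p1", "p2", "p3"], {"p1"}),  # relevant item at rank 1
    (["p4", "p5", "p6"], {"p5"}),  # relevant item at rank 2
    (["p7", "p8", "p9"], {"p0"}),  # relevant item not retrieved
]

recall_1 = mean_metric(recall_at_k, queries, 1)
precision_1 = mean_metric(precision_at_k, queries, 1)
recall_10 = mean_metric(recall_at_k, queries, 10)
ndcg_10 = mean_metric(ndcg_at_k, queries, 10)
print(recall_1, precision_1, recall_10, ndcg_10)
```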
65
+ ## Implementation Details
66
+
67
+ ### Core Features
68
+ - Text and image-based product search
69
+ - Product comparisons and recommendations
70
+ - Visual product recognition
71
+ - Detailed product information retrieval
72
+ - Price analysis and comparison[1]
73
+
74
+ ### Technologies Used
75
+ - FashionCLIP for visual understanding
76
+ - Mistral-7B Language Model (4-bit quantized)
77
+ - FAISS for similarity search
78
+ - Google Vertex AI for vector storage
79
+ - Streamlit for user interface[1]
80
+
81
+ ## Challenges & Solutions
82
+
83
+ ### Technical Challenges Addressed
84
+ - Image processing with varying quality
85
+ - GPU memory optimization
86
+ - Efficient embedding storage
87
+ - Query response accuracy[1]
88
+
89
+ ### Implemented Solutions
90
+ - Robust image validation pipeline
91
+ - 4-bit model quantization
92
+ - Optimized batch processing
93
+ - Enhanced metadata enrichment[1]
94
 
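To make the link between 4-bit quantization and GPU memory concrete, the following back-of-the-envelope sketch estimates weight storage alone. It assumes a nominal 7 billion parameters and ignores quantization constants, activations, and the KV cache, so real usage will be higher; `weight_memory_gb` is a hypothetical helper, not part of the project.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NOMINAL_PARAMS = 7e9  # assumption: nominal size implied by the "7B" name

fp16_gb = weight_memory_gb(NOMINAL_PARAMS, 16)  # 16-bit weights
int4_gb = weight_memory_gb(NOMINAL_PARAMS, 4)   # 4-bit weights

print(f"fp16: {fp16_gb:.1f} GB, 4-bit: {int4_gb:.1f} GB")
```

Under these assumptions the weights shrink by a factor of four, from roughly 14 GB to roughly 3.5 GB, which is the kind of saving that lets a 7B model fit on a single consumer-grade GPU.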
95
  ## Future Directions
96
 
97
  - [ ] Fine-Tune FashionClip embedding model based on the specific domain data
98
  - [ ] Fine-Tune large language model to improve its generalization capabilities
99
+ - [ ] Develop feedback loops for continuous improvement