uparekh01151 committed on
Commit 5cc5417 · 1 Parent(s): c9b6ebc

remove: redundant documentation files

- Remove README_HF_SPACES.md and DEPLOYMENT_SUMMARY.md
- Keep README.md as single source of documentation
- Streamline project documentation structure

Files changed (2)
  1. DEPLOYMENT_SUMMARY.md +0 -93
  2. README_HF_SPACES.md +0 -197
DEPLOYMENT_SUMMARY.md DELETED
@@ -1,93 +0,0 @@
1
- # DataEngEval - Deployment Summary
2
-
3
- ## 🚀 Ready for Hugging Face Spaces Deployment
4
-
5
- ### Space Details
6
- - **Space Name**: `DataEngEval`
7
- - **URL**: `https://huggingface.co/spaces/your-username/DataEngEval`
8
- - **SDK**: Gradio
9
- - **Hardware**: CPU Basic
10
-
11
- ### ✅ Code Status: READY
12
-
13
- #### Required Files Present
14
- - ✅ `app.py` - Main Gradio application
15
- - ✅ `requirements.txt` - Lightweight dependencies (no heavy ML libs)
16
- - ✅ `config/` - All configuration files
17
- - ✅ `src/` - Source code modules
18
- - ✅ `tasks/` - Multi-use-case datasets
19
- - ✅ `prompts/` - SQL templates
20
-
21
- #### HF Spaces Optimized
22
- - ✅ **No heavy dependencies**: No torch, transformers, accelerate
23
- - ✅ **Remote inference**: Uses Hugging Face Inference API
24
- - ✅ **Mock mode**: Works without API keys
25
- - ✅ **Lightweight**: Fast deployment and startup
26
-
27
- ### 🎯 Multi-Use-Case Support
28
-
29
- #### 1. SQL Generation
30
- - **Dataset**: NYC Taxi Small
31
- - **Dialects**: Presto, BigQuery, Snowflake
32
- - **Metrics**: Correctness, execution, result matching
33
-
34
- #### 2. Code Generation
35
- - **Python**: Algorithms, data structures, OOP
36
- - **Go**: Algorithms, HTTP handlers, concurrency
37
- - **Metrics**: Syntax, compilation, execution, quality
38
-
39
- #### 3. Documentation Generation
40
- - **Technical Docs**: API docs, function docs, installation guides
41
- - **API Documentation**: OpenAPI, GraphQL, REST endpoints
42
- - **Metrics**: Accuracy, completeness, clarity, format compliance
43
-
44
- ### 🔑 HF_TOKEN Setup
45
-
46
- #### Get Your Token
47
- 1. Go to [Hugging Face Settings](https://huggingface.co/settings/tokens)
48
- 2. Click "New token"
49
- 3. Choose "Read" access
50
- 4. Copy the token
51
-
52
- #### Add to Space
53
- 1. Go to Space Settings → Secrets
54
- 2. Add `HF_TOKEN` with your token
55
- 3. **Without token**: App works in mock mode (perfect for demos!)
56
-
57
- ### 🚀 Deployment Steps
58
-
59
- #### Option A: Git Push (Recommended)
60
- ```bash
61
- # Initialize git
62
- git init
63
- git add .
64
- git commit -m "Initial commit for DataEngEval"
65
-
66
- # Add HF Space as remote
67
- git remote add hf https://huggingface.co/spaces/your-username/DataEngEval
68
-
69
- # Push to HF
70
- git push hf main
71
- ```
72
-
73
- #### Option B: Direct Upload
74
- - Upload all files via HF Spaces web interface
75
-
76
- ### 📊 What You'll Get
77
-
78
- #### Without HF_TOKEN (Mock Mode)
79
- - ✅ Full functionality demonstration
80
- - ✅ Realistic code generation (mock)
81
- - ✅ Complete evaluation pipeline
82
- - ✅ Leaderboard and metrics
83
- - ✅ Perfect for demos and testing
84
-
85
- #### With HF_TOKEN (Real Models)
86
- - ✅ Real Hugging Face model inference
87
- - ✅ Actual code generation from models
88
- - ✅ Production-ready evaluation
89
- - ✅ Real performance metrics
90
-
91
- ### 🎉 Ready to Deploy!
92
-
93
- Your DataEngEval Space is **100% ready** for deployment! 🚀
 
README_HF_SPACES.md DELETED
@@ -1,197 +0,0 @@
1
- # Hugging Face Spaces Deployment Guide
2
-
3
- This guide explains how to deploy the NL→SQL Leaderboard on Hugging Face Spaces.
4
-
5
- ## 🚀 Quick Deployment
6
-
7
- ### Step 1: Create a New Space
8
-
9
- 1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
10
- 2. Click "Create new Space"
11
- 3. Fill in the details:
12
-    - **Space name**: `DataEngEval` (or your preferred name)
13
-    - **License**: Choose appropriate license
14
-    - **Visibility**: Public or Private
15
-    - **SDK**: **Gradio**
16
-    - **Hardware**: CPU Basic (sufficient for this app)
17
-
18
- ### Step 2: Upload Your Code
19
-
20
- #### Option A: Git Clone and Push
21
- ```bash
22
- # Clone your repository
23
- git clone <your-repo-url>
24
- cd dataeng-leaderboard
25
-
26
- # Add Hugging Face Space as remote
27
- git remote add hf https://huggingface.co/spaces/your-username/DataEngEval
28
-
29
- # Push to Hugging Face
30
- git push hf main
31
- ```
32
-
33
- #### Option B: Direct Upload
34
- 1. Upload all files to your Space using the web interface
35
- 2. Make sure to include all files from the project structure
36
-
37
- ### Step 3: Configure Environment (Optional)
38
-
39
- 1. Go to your Space settings
40
- 2. Add secrets if needed:
41
-    - `HF_TOKEN`: Your Hugging Face API token (for real model inference)
42
- 3. The app will work without tokens using mock mode
43
-
44
- ### Step 4: Deploy
45
-
46
- The Space will automatically build and deploy. You'll see the URL once ready.
47
-
48
- ## 📁 Required Files for Deployment
49
-
50
- Make sure these files are present in your Space:
51
-
52
- ```
53
- ├── app.py # ✅ Main application
54
- ├── requirements.txt # ✅ Dependencies
55
- ├── config/
56
- │   └── models.yaml # ✅ Model configurations
57
- ├── src/
58
- │   ├── evaluator.py # ✅ Evaluation logic
59
- │   ├── models_registry.py # ✅ Model interfaces
60
- │   └── scoring.py # ✅ Scoring logic
61
- ├── tasks/ # ✅ Datasets
62
- │   ├── nyc_taxi_small/
63
- │   ├── tpch_tiny/
64
- │   └── ecommerce_orders_small/
65
- ├── prompts/ # ✅ SQL templates
66
- │   ├── template_presto.txt
67
- │   ├── template_bigquery.txt
68
- │   └── template_snowflake.txt
69
- └── README.md # ✅ Documentation
70
- ```
71
-
72
- ## 🔧 Configuration
73
-
74
- ### Model Configuration
75
-
76
- Edit `config/models.yaml` to add/remove models:
77
-
78
- ```yaml
79
- models:
80
-   - name: "Your Model"
81
-     provider: "huggingface"
82
-     model_id: "your/model-id"
83
-     params:
84
-       max_new_tokens: 256
85
-       temperature: 0.1
86
-     description: "Your model description"
87
- ```
88
-
89
- ### Environment Variables
90
-
91
- Set these in your Space settings:
92
-
93
- - `HF_TOKEN`: Hugging Face API token (optional)
94
- - `MOCK_MODE`: Set to "true" to force mock mode
95
-
96
- ## 🚀 Features
97
-
98
- ### Automatic Features
99
- - **Auto-deployment**: Changes pushed to Git trigger automatic rebuilds
100
- - **Persistent storage**: Leaderboard results persist across deployments
101
- - **Mock mode**: Works without API keys for demos
102
- - **Remote inference**: No heavy model downloads
103
-
104
- ### Performance Optimizations
105
- - Lightweight dependencies
106
- - Remote model inference
107
- - Efficient DuckDB execution
108
- - Minimal memory footprint
109
-
110
- ## 🐛 Troubleshooting
111
-
112
- ### Common Issues
113
-
114
- **Build fails**: Check that all required files are present and `requirements.txt` is correct
115
-
116
- **App doesn't start**: Verify `app.py` is in the root directory
117
-
118
- **Models not working**: Check `config/models.yaml` format and model IDs
119
-
120
- **Datasets not loading**: Ensure all dataset files are in `tasks/` directory
121
-
122
- ### Debug Mode
123
-
124
- To debug locally before deploying:
125
-
126
- ```bash
127
- # Install dependencies
128
- pip install -r requirements.txt
129
-
130
- # Run locally
131
- gradio app.py
132
-
133
- # Test with mock mode
134
- export MOCK_MODE=true
135
- gradio app.py
136
- ```
137
-
138
- ## 📊 Monitoring
139
-
140
- ### Space Logs
141
- - Check the "Logs" tab in your Space for runtime errors
142
- - Monitor memory usage in the "Settings" tab
143
-
144
- ### Performance
145
- - CPU usage should be minimal (remote inference)
146
- - Memory usage should be low (no local models)
147
- - Response times depend on Hugging Face Inference API
148
-
149
- ## 🔄 Updates
150
-
151
- ### Updating Your Space
152
- 1. Make changes to your code
153
- 2. Commit and push to your Space's Git repository
154
- 3. The Space will automatically rebuild
155
-
156
- ### Adding New Models
157
- 1. Edit `config/models.yaml`
158
- 2. Push changes to your Space
159
- 3. New models will be available immediately
160
-
161
- ### Adding New Datasets
162
- 1. Create new folder in `tasks/`
163
- 2. Add required files (`schema.sql`, `loader.py`, `cases.yaml`)
164
- 3. Push changes to your Space
165
-
166
- ## 🎯 Best Practices
167
-
168
- ### Code Organization
169
- - Keep all source code in `src/` directory
170
- - Use relative imports
171
- - Minimize dependencies in `requirements.txt`
172
-
173
- ### Performance
174
- - Use Hugging Face Inference API for models
175
- - Avoid local model loading
176
- - Keep datasets small for faster evaluation
177
-
178
- ### User Experience
179
- - Provide clear error messages
180
- - Use mock mode for demos
181
- - Include comprehensive documentation
182
-
183
- ## 📚 Additional Resources
184
-
185
- - [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)
186
- - [Gradio Documentation](https://gradio.app/docs/)
187
- - [Hugging Face Inference API](https://huggingface.co/docs/api-inference)
188
-
189
- ## 🆘 Support
190
-
191
- If you encounter issues:
192
-
193
- 1. Check the Space logs for errors
194
- 2. Verify all required files are present
195
- 3. Test locally before deploying
196
- 4. Check Hugging Face Spaces status page
197
- 5. Review the troubleshooting section above