davanstrien HF Staff committed on
Commit 5e7afcf · 1 Parent(s): ff7e103

update readme

Files changed (1)
  1. README.md +87 -10
README.md CHANGED
@@ -9,9 +9,9 @@ pinned: false
9
 
10
  # UV Scripts
11
 
12
- **Ready-to-run data processing scripts for the ML community**
13
 
14
- Run powerful ML workflows with a single command - no setup required.
15
 
16
  ## What are UV scripts?
17
 
@@ -22,18 +22,95 @@ Perfect for:
22
  - 💻 **Local processing** on your machine
23
  - 🔄 **Reproducible pipelines** that work anywhere
24
 
25
- ## Example
26
 
27
  ```bash
28
- # Convert PDFs to a dataset
29
- uv run https://huggingface.co/datasets/uv-scripts/dataset-creation/resolve/main/pdf-to-dataset.py \
30
- /path/to/pdfs \
31
- username/my-dataset
32
  ```
33
 
34
- ## Browse Scripts
35
 
36
  | Script Collection | Description | GPU Required |
37
  |-------------------|-------------|--------------|
38
- | [dataset-creation](https://huggingface.co/datasets/uv-scripts/dataset-creation) | Create datasets from PDFs and other files | ❌ |
39
- | More coming soon... | | |
 
9
 
10
  # UV Scripts
11
 
12
+ **Ready-to-run ML tools powered by UV - zero setup, maximum power**
13
 
14
+ Run state-of-the-art ML workflows with a single command. From OCR to classification, all scripts work instantly with `uv run`.
15
 
16
  ## What are UV scripts?
17
 
 
22
  - 💻 **Local processing** on your machine
23
  - 🔄 **Reproducible pipelines** that work anywhere
24
 
25
+ ## 🚀 Quick Example
26
 
27
  ```bash
28
+ # Extract text from images with state-of-the-art OCR
29
+ uv run https://huggingface.co/datasets/uv-scripts/ocr/raw/main/nanonets-ocr.py \
30
+ your-image-dataset \
31
+ your-extracted-text
32
+
33
+ # Or run on GPU with HF Jobs (no local GPU needed!)
34
+ hf jobs uv run --flavor l4x1 \
35
+ https://huggingface.co/datasets/uv-scripts/ocr/raw/main/nanonets-ocr.py \
36
+ your-images your-text
37
  ```
38
 
39
+ ## 📚 Browse Scripts
40
 
41
  | Script Collection | Description | GPU Required |
42
  |-------------------|-------------|--------------|
43
+ | [ocr](https://huggingface.co/datasets/uv-scripts/ocr) | Extract text from images with VLMs (LaTeX, tables, forms) | ✅ |
44
+ | [classification](https://huggingface.co/datasets/uv-scripts/classification) | Text classification with guaranteed valid outputs | ✅ |
45
+ | [dataset-creation](https://huggingface.co/datasets/uv-scripts/dataset-creation) | Create datasets from PDFs and files | ❌ |
46
+ | [vllm](https://huggingface.co/datasets/uv-scripts/vllm) | High-performance inference with vLLM | ✅ |
47
+
48
+ ## 🎯 Why UV Scripts?
49
+
50
+ ### Zero Setup
51
+ No virtual environments, no dependency conflicts, no installation steps. UV handles everything automatically when you run the script.
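
As a rough illustration of how a script can declare its own requirements so that `uv run` can build the environment on the fly, here is a minimal sketch using inline script metadata (PEP 723). The file name `hello_uv.py` and the `word_count` helper are hypothetical and do not come from any collection listed here; a real script would list packages such as `datasets` under `dependencies`.

```python
# /// script
# requires-python = ">=3.10"
# dependencies = []
# ///
"""hello_uv.py: hypothetical example, not part of any uv-scripts collection."""
import sys


def word_count(text: str) -> int:
    """Count whitespace-separated tokens in a string."""
    return len(text.split())


def main(argv: list[str]) -> None:
    # Join the command-line arguments, or fall back to a default sentence.
    text = " ".join(argv[1:]) or "zero setup with uv"
    print(f"{word_count(text)} words")


if __name__ == "__main__":
    main(sys.argv)
```

Saved locally, a file like this could be started with `uv run hello_uv.py some text here`; uv reads the commented metadata block before executing the script.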
52
+
53
+ ### Production Ready
54
+ These aren't demos - they're production-quality tools used by the community for real ML workflows.
55
+
56
+ ### GPU Optimized
57
+ Run on local GPUs or scale to the cloud with HF Jobs. Same script, different compute.
58
+
59
+ ### Community Driven
60
+ Browse scripts, contribute your own, and learn from the best practices of the ML community.
61
+
62
+ ## 🌟 Featured Scripts
63
+
64
+ ### OCR Any Document Dataset
65
+ Extract text from images with state-of-the-art accuracy:
66
+ ```bash
67
+ # Handles LaTeX, tables, forms, handwriting
68
+ uv run https://huggingface.co/datasets/uv-scripts/ocr/raw/main/nanonets-ocr.py \
69
+ your-images extracted-text
70
+ ```
71
+
72
+ ### Classify with Guaranteed Valid Outputs
73
+ Text classification that always returns valid labels:
74
+ ```bash
75
+ # Uses vLLM's structured generation - no invalid outputs!
76
+ uv run https://huggingface.co/datasets/uv-scripts/classification/raw/main/classify-dataset.py \
77
+ --input-dataset imdb --column text \
78
+ --labels "positive,negative" --output-dataset imdb-classified
79
+ ```
80
+
81
+ ## 🚀 Getting Started
82
+
83
+ 1. **Install UV** (one-time setup):
84
+ ```bash
85
+ curl -LsSf https://astral.sh/uv/install.sh | sh
86
+ ```
87
+
88
+ 2. **Run any script**:
89
+ ```bash
90
+ uv run https://huggingface.co/datasets/uv-scripts/[collection]/raw/main/[script].py
91
+ ```
92
+
93
+ 3. **Or use HF Jobs** (no local GPU needed):
94
+ ```bash
95
+ hf jobs uv run --flavor l4x1 [script-url] [args]
96
+ ```
97
+
98
+ ## 🤝 Contributing
99
+
100
+ We welcome scripts that:
101
+ - Solve real ML problems
102
+ - Include clear documentation
103
+ - Follow UV best practices
104
+ - Work on both local and cloud
105
+
106
+ Submit your scripts as PRs to the relevant collection or propose a new collection!
107
+
108
+ ## 📖 Learn More
109
+
110
+ - [UV Documentation](https://docs.astral.sh/uv/)
111
+ - [HF Jobs Guide](https://huggingface.co/docs/hub/spaces-gpu-jobs)
112
+ - [Script Examples](https://github.com/astral-sh/uv/tree/main/scripts)
113
+
114
+ ---
115
+
116
+ *UV Scripts is a community project showcasing the power of [UV](https://github.com/astral-sh/uv) for ML workflows.*