sander-wood committed on
Commit
355625c
·
verified ·
1 Parent(s): f319dd8

Update README.md

Files changed (1)
  1. README.md +19 -7
README.md CHANGED
@@ -164,7 +164,7 @@ pip install -r requirements.txt
164
  ```
165
 
166
  ### **Overview of `clamp3_*.py` Scripts**
167
- CLaMP 3 provides scripts for **semantic similarity calculation**, **semantic search**, and **retrieval performance evaluation** across five modalities. Simply provide the file path, and the script will automatically detect the modality and extract the relevant features.
168
 
169
  Supported formats include:
170
  - **Audio**: `.mp3`, `.wav`
@@ -179,6 +179,14 @@ Supported formats include:
179
 
180
  > **Note**: All files in a folder must belong to the same modality for processing.
181
 
 
 
 
 
 
 
 
 
182
  #### **[`clamp3_score.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_score.py) - Semantic Similarity Calculation**
183
 
184
  This script calculates semantic similarity between query and reference files. By default, it uses **pairwise mode**, but you can switch to **group mode** using the `--group` flag.
@@ -223,22 +231,26 @@ python clamp3_score.py <query_dir> <ref_dir> [--group]
223
  python clamp3_score.py query_dir ref_dir --group
224
  ```
225
 
226
- #### **[`clamp3_search.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_search.py) - Semantic Search**
227
 
228
- Run retrieval tasks by comparing a query file to reference files in `ref_dir`. The query and `ref_dir` can be **any modality**, so there are **25 possible retrieval combinations**, e.g., text-to-music, image-to-text, music-to-music, music-to-text (zero-shot music classification), etc.
229
 
230
  ```bash
231
- python clamp3_search.py <query_file> <ref_dir> [--top_k TOP_K]
232
  ```
233
 
234
- #### **[`clamp3_eval.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_eval.py) - Retrieval Performance Evaluation**
235
 
236
- Evaluates **CLaMP3's retrieval performance** on a paired dataset using metrics like **MRR** and **Hit@K**. Works the same way as **pairwise mode** in `clamp3_score.py`—requiring **matching folder structure** and **filenames** between `query_dir` and `ref_dir`.
237
 
238
  ```bash
239
- python clamp3_eval.py <query_dir> <ref_dir>
240
  ```
241
 
 
 
 
 
242
  ## **Repository Structure**
243
  - **[code/](https://github.com/sanderwood/clamp3/tree/main/code)** → Training & feature extraction scripts.
244
  - **[classification/](https://github.com/sanderwood/clamp3/tree/main/classification)** → Linear classification training and prediction.
 
164
  ```
165
 
166
  ### **Overview of `clamp3_*.py` Scripts**
167
+ CLaMP 3 provides scripts for **semantic search**, **semantic similarity calculation**, **retrieval performance evaluation**, and **feature extraction** across five modalities. Simply provide the file path, and the script will automatically detect the modality and extract the relevant features.
168
 
169
  Supported formats include:
170
  - **Audio**: `.mp3`, `.wav`
 
179
 
180
  > **Note**: All files in a folder must belong to the same modality for processing.
181
 
182
+ #### **[`clamp3_search.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_search.py) - Semantic Search**
183
+
184
+ Run retrieval tasks by comparing a query file to reference files in `ref_dir`. The query and `ref_dir` can be **any modality**, so there are **25 possible retrieval combinations**, e.g., text-to-music, image-to-music, music-to-music, music-to-text (zero-shot music classification), etc.
185
+
186
+ ```bash
187
+ python clamp3_search.py <query_file> <ref_dir> [--top_k TOP_K]
188
+ ```
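Review note: the README text does not say how `clamp3_search.py` ranks the references. A minimal sketch of one plausible approach follows, assuming each file is reduced to a single 768-dimensional global vector and that ranking uses cosine similarity. The function name `top_k_matches`, the random stand-in vectors, and the `ref_*` names are hypothetical and are not taken from the repository.

```python
import numpy as np

def top_k_matches(query_vec, ref_vecs, ref_names, top_k=3):
    """Rank references by cosine similarity to one query vector (assumed metric)."""
    # Normalise so that a dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    r = ref_vecs / np.linalg.norm(ref_vecs, axis=1, keepdims=True)
    sims = r @ q
    # Indices of the top_k highest similarities, best first.
    order = np.argsort(-sims)[:top_k]
    return [(ref_names[i], float(sims[i])) for i in order]

# Stand-in data: five random "reference" vectors and a query close to ref_2.
rng = np.random.default_rng(0)
refs = rng.normal(size=(5, 768))
names = [f"ref_{i}" for i in range(5)]
query = refs[2] + 0.01 * rng.normal(size=768)

results = top_k_matches(query, refs, names, top_k=3)
for name, score in results:
    print(f"{name}\t{score:.4f}")
```

Because the vectors carry no modality information once extracted, the same ranking step would apply to any of the query/reference pairings the README lists.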
189
+
190
  #### **[`clamp3_score.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_score.py) - Semantic Similarity Calculation**
191
 
192
  This script calculates semantic similarity between query and reference files. By default, it uses **pairwise mode**, but you can switch to **group mode** using the `--group` flag.
 
231
  python clamp3_score.py query_dir ref_dir --group
232
  ```
233
 
234
+ #### **[`clamp3_eval.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_eval.py) - Retrieval Performance Evaluation**
235
 
236
+ Evaluates **CLaMP3's retrieval performance** on a paired dataset using metrics like **MRR** and **Hit@K**. Works the same way as **pairwise mode** in `clamp3_score.py`—requiring **matching folder structure** and **filenames** between `query_dir` and `ref_dir`.
237
 
238
  ```bash
239
+ python clamp3_eval.py <query_dir> <ref_dir>
240
  ```
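Review note: for readers unfamiliar with the metrics named here, the sketch below shows how MRR and Hit@K are commonly computed from a query-by-reference similarity matrix in which query `i` is paired with reference `i`. It is an illustration under that assumption, not the code in `clamp3_eval.py`; the function name `retrieval_metrics`, the tie handling, and the toy matrix are hypothetical.

```python
import numpy as np

def retrieval_metrics(sim, ks=(1, 10)):
    """sim[i, j] is the similarity of query i to reference j.

    Assumes the correct reference for query i is reference i.
    """
    true_scores = np.diag(sim)[:, None]
    # Rank of the correct reference: 1 + number of references scoring strictly higher.
    ranks = 1 + (sim > true_scores).sum(axis=1)
    metrics = {"MRR": float(np.mean(1.0 / ranks))}
    for k in ks:
        # Hit@K: fraction of queries whose correct reference is within the top K.
        metrics[f"Hit@{k}"] = float(np.mean(ranks <= k))
    return metrics

# Toy example: the correct reference is ranked 1st, 2nd and 2nd respectively.
sim = np.array([
    [0.9, 0.1, 0.2],
    [0.8, 0.5, 0.3],
    [0.1, 0.7, 0.6],
])
metrics = retrieval_metrics(sim, ks=(1, 2))
print(metrics)  # MRR = (1 + 1/2 + 1/2) / 3, Hit@1 = 1/3, Hit@2 = 1
```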
241
 
242
+ #### **[`clamp3_embd.py`](https://github.com/sanderwood/clamp3/blob/main/clamp3_embd.py) - Feature Extraction**
243
 
244
+ If other scripts don't meet your needs, use `clamp3_embd.py` to extract features.
245
 
246
  ```bash
247
+ python clamp3_embd.py <input_dir_path> <output_dir_path> [--get_global]
248
  ```
249
 
250
+ **Feature Output:**
251
+ - **Without `--get_global`** → Shape: **(1, T, 768)** (T = time steps). Uses last hidden states before avg pooling, ideal for applications needing temporal info. Fine-tuning recommended.
252
+ - **With `--get_global`** → Shape: **(1, 768)**. Uses avg pooled features, suitable for applications needing global info, can be used directly.
253
+
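Review note: the two output shapes described above can be related with a short sketch. It assumes the global feature is a plain, unweighted mean over the time axis; the script's actual pooling may apply masking or weighting, and the on-disk file format is not stated here, so the example uses a synthetic array instead of loading a file.

```python
import numpy as np

# Synthetic stand-in for a feature extracted without --get_global: shape (1, T, 768).
T = 12  # hypothetical number of time steps
temporal = np.random.default_rng(0).normal(size=(1, T, 768)).astype(np.float32)

# Assumed pooling: unweighted mean over the time axis, giving shape (1, 768).
global_feat = temporal.mean(axis=1)

print(temporal.shape)     # (1, 12, 768)
print(global_feat.shape)  # (1, 768)
```

Keeping the `(1, T, 768)` output leaves the choice of pooling, or of a sequence model on top, to downstream code, which is consistent with the README's recommendation to fine-tune when using it.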
254
  ## **Repository Structure**
255
  - **[code/](https://github.com/sanderwood/clamp3/tree/main/code)** → Training & feature extraction scripts.
256
  - **[classification/](https://github.com/sanderwood/clamp3/tree/main/classification)** → Linear classification training and prediction.