Update README.md
README.md CHANGED
@@ -27,27 +27,20 @@ Notable differences from other available models include:
 1. Performance: CED with 10M parameters outperforms the majority of previous approaches (~80M).
 
 ### Model Sources
-- **
-- **Repository:** https://github.com/jimbozhang/hf_transformers_custom_model_ced
+- **Repository:** https://github.com/RicherMans/CED
 - **Paper:** [CED: Consistent ensemble distillation for audio tagging](https://arxiv.org/abs/2308.11957)
 - **Demo:** https://huggingface.co/spaces/mispeech/ced-base
 
-## Install
-```bash
-pip install git+https://github.com/jimbozhang/hf_transformers_custom_model_ced.git
-```
-
 ## Inference
 ```python
->>> from
->>> from ced_model.modeling_ced import CedForAudioClassification
+>>> from transformers import AutoModelForAudioClassification, AutoFeatureExtractor
 
 >>> model_name = "mispeech/ced-base"
->>> feature_extractor =
->>> model =
+>>> feature_extractor = AutoFeatureExtractor.from_pretrained(model_name, trust_remote_code=True)
+>>> model = AutoModelForAudioClassification.from_pretrained(model_name, trust_remote_code=True)
 
 >>> import torchaudio
->>> audio, sampling_rate = torchaudio.load("
+>>> audio, sampling_rate = torchaudio.load("/path-to/JeD5V5aaaoI_931_932.wav")
 >>> assert sampling_rate == 16000
 >>> inputs = feature_extractor(audio, sampling_rate=sampling_rate, return_tensors="pt")
 