lrauch committed on
Commit
cbc23a7
·
verified ·
1 Parent(s): 854b150

Update README.md

Files changed (1)
  1. README.md +7 -8
README.md CHANGED
@@ -8,6 +8,13 @@ tags: []
8
  ConvNext trained on the XCL dataset from BirdSet, covering 9736 bird species from Xeno-Canto. Please refer to the [BirdSet Paper](https://arxiv.org/pdf/2403.10380) and the
9
  [BirdSet Repository](https://github.com/DBD-research-group/BirdSet/tree/main) for further information.
10
 
11
  - The model is trained on 5-second clips of bird vocalizations.
12
  - num_channels: 1
13
  - pretrained checkpoint: facebook/convnext-base-224-22k
@@ -17,14 +24,6 @@ ConvNext trained on the XCL dataset from BirdSet, covering 9736 bird species fro
17
  - melscale: n_mels: 128, n_stft: 513
18
  - dbscale: top_db: 80
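
The two spectrogram parameters listed above can be read as follows. This is only a minimal stdlib sketch, not the BirdSet processor: it assumes a one-sided STFT with an even FFT size (so `n_stft = n_fft // 2 + 1`) and a conventional power-to-decibel conversion that clamps everything more than `top_db` below the peak. The helper names and the `amin` floor are hypothetical and chosen for illustration.

```python
import math

N_STFT = 513   # from the parameter list above
TOP_DB = 80.0  # from the parameter list above


def n_fft_from_n_stft(n_stft: int) -> int:
    """Invert the one-sided STFT relation n_stft = n_fft // 2 + 1 (even n_fft assumed)."""
    return 2 * (n_stft - 1)


def power_to_db(power_rows, top_db=TOP_DB, amin=1e-10):
    """Convert a 2-D list of power values to dB and clamp the dynamic range.

    amin is an assumed floor that keeps log10 away from zero.
    """
    db = [[10.0 * math.log10(max(value, amin)) for value in row] for row in power_rows]
    # Nothing may sit more than top_db below the loudest bin.
    floor = max(max(row) for row in db) - top_db
    return [[max(value, floor) for value in row] for row in db]


n_fft = n_fft_from_n_stft(N_STFT)          # FFT size implied by 513 frequency bins
toy_power = [[1.0, 1e-3], [1e-12, 0.5]]    # made-up power values, not real audio
toy_db = power_to_db(toy_power)
```

With `top_db: 80`, the near-silent bin in `toy_power` is raised to 80 dB below the loudest bin, while the other values pass through unchanged.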
19
 
20
- ### Model Details
21
- ConvNeXT is a pure convolutional model (ConvNet), inspired by the design of Vision Transformers, that claims to outperform them.
22
-
23
- ## How to use
24
- The BirdSet data needs a custom processor that is available in the BirdSet repository. The model does not have a processor available.
25
- The model accepts a mono image (spectrogram) as input (e.g., `torch.Size([16, 1, 128, 1024])`)
26
-
27
-
28
  ```python
29
  import torch
30
  from transformers import AutoModelForImageClassification
 
8
  ConvNext trained on the XCL dataset from BirdSet, covering 9736 bird species from Xeno-Canto. Please refer to the [BirdSet Paper](https://arxiv.org/pdf/2403.10380) and the
9
  [BirdSet Repository](https://github.com/DBD-research-group/BirdSet/tree/main) for further information.
10
 
11
+ ### Model Details
12
+ ConvNeXT is a pure convolutional model (ConvNet), inspired by the design of Vision Transformers, that claims to outperform them.
13
+
14
+ ## How to use
15
+ The BirdSet data needs a custom processor that is available in the BirdSet repository. The model does not have a processor available.
16
+ The model accepts a mono image (spectrogram) as input (e.g., `torch.Size([16, 1, 128, 1024])`)
17
+
18
  - The model is trained on 5-second clips of bird vocalizations.
19
  - num_channels: 1
20
  - pretrained checkpoint: facebook/convnext-base-224-22k
 
24
  - melscale: n_mels: 128, n_stft: 513
25
  - dbscale: top_db: 80
26
 
27
  ```python
28
  import torch
29
  from transformers import AutoModelForImageClassification