first
README.md CHANGED
@@ -13,7 +13,7 @@ tags:
   - audio-text
 ---
 # Mellow
-[[`Paper`]()] [[`GitHub`](https://github.com/soham97/Mellow)] [[
+[[`Paper`]()] [[`GitHub`](https://github.com/soham97/Mellow)] [[`🤗Checkpoint`](https://huggingface.co/soham97/Mellow)] [[`Zenodo`](https://huggingface.co/soham97/Mellow)]
 
 Mellow is a small Audio-Language Model that takes in two audios and a text prompt as input and produces free-form text as output. It is a 167M parameter model and trained on ~155 hours of audio (AudioCaps and Clotho), and achieves SoTA performance on different tasks with 50x fewer parameters.
 
@@ -43,8 +43,8 @@ python example.py
 
 ## Usage
 The MellowWrapper class allows easy interaction with the model. To use the wrapper, inputs required are:
-- `config`: The option supported is "
-- `model`: The option supported is "v0
+- `config`: The option supported is "v0"
+- `model`: The option supported is "v0"
 - `examples`: List of examples. Each example is a list containing three entries: audiopath1, audiopath2, prompt
 
 Supported functions:
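For context, the Usage section touched by this diff describes a wrapper that takes `config="v0"`, `model="v0"`, and a list of `[audiopath1, audiopath2, prompt]` examples. Below is a minimal sketch of how that might look in practice; it is not the repository's `example.py`. The `from mellow import MellowWrapper` import path, the `device`/`use_cuda` constructor arguments, and the `generate(...)` call with its keyword arguments are assumptions — only the `config`/`model` values and the three-entry example format come from the README diff above.

```python
import torch
from mellow import MellowWrapper  # assumed import path; adjust to the repo layout

# Pick a device; Mellow is small (167M parameters), so CPU also works.
cuda = torch.cuda.is_available()
device = 0 if cuda else "cpu"

# The README states "v0" is the only supported option for both config and model.
# The device/use_cuda keywords are assumptions about the wrapper's constructor.
mellow = MellowWrapper(config="v0", model="v0", device=device, use_cuda=cuda)

# Each example is [audiopath1, audiopath2, prompt], as described in the Usage section.
examples = [
    ["resource/1.wav", "resource/2.wav", "what can you infer about the surroundings from the audio?"],
    ["resource/1.wav", "resource/2.wav", "caption the two audios"],
]

# A generate-style call is assumed here; check the repo's "Supported functions"
# list and example.py for the exact method names and signatures.
responses = mellow.generate(examples=examples, max_len=300, top_p=0.8, temperature=1.0)
for r in responses:
    print(r)
```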