soham97 committed on
Commit
f00f693
·
1 Parent(s): 2c6e7ae
Files changed (1)
  1. README.md +16 -2
README.md CHANGED
@@ -1,5 +1,19 @@
1
  # Mellow
2
- [[`Paper`]()] [[`Checkpoint`]()]
3
 
4
  Mellow is a small Audio-Language Model that takes in two audios and a text prompt as input and produces free-form text as output. It is a 167M parameter model and trained on ~155 hours of audio (AudioCaps and Clotho), and achieves SoTA performance on different tasks with 50x fewer parameters.
5
 
@@ -79,7 +93,7 @@ print(f"\noutput: {response}")
79
  The composition of the ReasonAQA dataset is shown in Table. The training set is restricted to AudioCaps and Clotho audio files and the testing is performed on 6 tasks - Audio Entailment, Audio Difference, ClothoAQA, Clotho MCQ, Clotho Detail, AudioCaps MCQ and AudioCaps Detail.
80
 
81
  ![alt text](resource/data.png)
82
- - The ReasonAQA JSONs can be downloaded from Zenodo: [checkpoint \[drive\]](https://drive.google.com/file/d/1WPKgafYw2ZCifElEtHn_k3DkcVGjesqB/view?usp=sharing)
83
  - The audio files can be downloaded from their respective hosting website: [Clotho](https://zenodo.org/records/4783391) and [AudioCaps](https://github.com/cdjkim/audiocaps)
84
 
85
  ## Limitation
 
1
+ ---
2
+ license: mit
3
+ tags:
4
+ - small audio-language model
5
+ - ALM
6
+ - audio
7
+ - music
8
+ - sound events
9
+ - audio reasoning
10
+ - audio captioning
11
+ - audio question answering
12
+ - zero-shot
13
+ - audio-text
14
+ ---
15
  # Mellow
16
+ [[`Paper`]()] [[`GitHub`](https://github.com/soham97/Mellow)] [[`Checkpoint`](https://huggingface.co/soham97/Mellow)]
17
 
18
  Mellow is a small Audio-Language Model that takes in two audios and a text prompt as input and produces free-form text as output. It is a 167M parameter model and trained on ~155 hours of audio (AudioCaps and Clotho), and achieves SoTA performance on different tasks with 50x fewer parameters.
19
 
 
93
  The composition of the ReasonAQA dataset is shown in Table. The training set is restricted to AudioCaps and Clotho audio files and the testing is performed on 6 tasks - Audio Entailment, Audio Difference, ClothoAQA, Clotho MCQ, Clotho Detail, AudioCaps MCQ and AudioCaps Detail.
94
 
95
  ![alt text](resource/data.png)
96
+ - The ReasonAQA JSONs can be downloaded from Zenodo: [checkpoint](https://drive.google.com/file/d/1WPKgafYw2ZCifElEtHn_k3DkcVGjesqB/view?usp=sharing)
97
  - The audio files can be downloaded from their respective hosting website: [Clotho](https://zenodo.org/records/4783391) and [AudioCaps](https://github.com/cdjkim/audiocaps)
98
 
99
  ## Limitation