microsoft/Phi-4-multimodal-instruct
Tags: Automatic Speech Recognition, Transformers, Safetensors, 24 languages, phi4mm, text-generation, nlp, code, audio, speech-summarization, speech-translation, visual-question-answering, phi-4-multimodal, phi, phi-4-mini, custom_code
arxiv: 2503.01743
arxiv: 2407.13833
License: mit
main: Phi-4-multimodal-instruct/figures
13 contributors
History: 1 commit
garg-amit, Added model files, d93d2f6, 3 months ago
File                        Size      Last commit         Updated
audio_understand.png        42.6 kB   Added model files   3 months ago
multi_image.png             192 kB    Added model files   3 months ago
speech_qa.png               46.8 kB   Added model files   3 months ago
speech_recog_by_lang.png    90.7 kB   Added model files   3 months ago
speech_recognition.png      63.5 kB   Added model files   3 months ago
speech_summarization.png    41 kB     Added model files   3 months ago
speech_translate.png        47.7 kB   Added model files   3 months ago
speech_translate_2.png      46.3 kB   Added model files   3 months ago
vision_radar.png            174 kB    Added model files   3 months ago