Microsoft VibeVoice

August 26, 2025

Microsoft just released VibeVoice-1.5B, a state-of-the-art (SoTA) text-to-speech model, under the MIT license.

It can generate up to 90 minutes of audio and supports simultaneous generation of > 4 speakers. Streaming and a larger 7B model are incoming. It is also capable of cross-lingual and singing synthesis.

Source:

https://huggingface.co/microsoft/VibeVoice-1.5B
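
For anyone who wants to try the checkpoint linked above, here is a minimal sketch of fetching the files locally. It assumes the `huggingface_hub` package is installed (`pip install huggingface_hub`); the helper names `model_page_url` and `download_model` are hypothetical and made up for this post. It only downloads the repository files. Running inference needs Microsoft's own VibeVoice code, which is not shown here.

```python
from typing import Optional

# Repository id taken from the source link above.
MODEL_REPO = "microsoft/VibeVoice-1.5B"


def model_page_url(repo_id: str = MODEL_REPO) -> str:
    """Build the Hugging Face model page URL for a repo id (hypothetical helper)."""
    return f"https://huggingface.co/{repo_id}"


def download_model(repo_id: str = MODEL_REPO, local_dir: Optional[str] = None) -> str:
    """Download every file in the model repo and return the local folder path.

    Hypothetical helper. The import is kept inside the function so this
    file can be loaded even when huggingface_hub is not installed.
    """
    from huggingface_hub import snapshot_download

    # With local_dir=None the files land in the default Hugging Face cache.
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Example (needs network access and several GB of disk space):
# path = download_model()
# print("Model files saved to:", path)
```

Uncommenting the last two lines starts the download, so it is worth checking free disk space first.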