Microsoft VibeVoice
Microsoft just released VibeVoice - 1.5B SoTA Text to Speech model - MIT Licensed
It can generate up to 90 minutes of audio.
Supports simultaneous generation of > 4 speakers.
Streaming and a larger 7B model are incoming.
Capable of cross-lingual and singing synthesis.
Source: