Microsoft is testing an AI model that it considers “too risky” to release

Microsoft's VALL-E 2 is a zero-shot text-to-speech (TTS) model reported to achieve human parity. It introduces two techniques: Repetition Aware Sampling, which uses token repetition in the decoding history to refine performance, and Grouped Code Modeling. Evaluations on the LibriSpeech and VCTK datasets assess the robustness, naturalness, and speaker similarity of the generated speech. Potential uses include accessibility, education, interactive voice response, translation, and chatbots, but Microsoft is withholding a public release over risks such as speaker impersonation.
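To make the idea of sampling that looks at token repetition in the decoding history more concrete, here is a minimal sketch in Python. It is not Microsoft's implementation. It assumes one plausible reading of the technique: draw a token with nucleus (top-p) sampling, measure how often that token appears in a recent window of the history, and fall back to sampling from the full distribution when the repetition ratio is too high. The function names, the parameter names, and the default values (`top_p`, `window`, `ratio_threshold`) are all hypothetical and chosen only for illustration.

```python
import random
from typing import Optional, Sequence


def nucleus_sample(probs: Sequence[float], top_p: float, rng: random.Random) -> int:
    """Sample a token index from the smallest set of tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


def repetition_aware_sample(
    probs: Sequence[float],
    history: Sequence[int],
    top_p: float = 0.8,           # assumed default, illustration only
    window: int = 10,             # assumed default, must be positive
    ratio_threshold: float = 0.1, # assumed default, illustration only
    rng: Optional[random.Random] = None,
) -> int:
    """Pick the next token, switching sampling strategy when it repeats too often."""
    rng = rng or random.Random()
    token = nucleus_sample(probs, top_p, rng)

    # Look only at the most recent `window` tokens of the decoding history.
    recent = list(history)[-window:]
    if recent:
        ratio = recent.count(token) / len(recent)
        if ratio > ratio_threshold:
            # The candidate repeats too much: resample from the full distribution.
            token = rng.choices(range(len(probs)), weights=list(probs), k=1)[0]
    return token
```

With an empty history the function behaves like plain nucleus sampling. Once a token dominates the recent window, the fallback gives lower-probability tokens a chance, which is one way a decoder could avoid getting stuck in a loop.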
