Local, offline, cross-platform text-to-speech using Unity Inference.

UPDATE: Now supports Supertonic-3 and the following languages: English, Korean, Japanese, Arabic, Bulgarian, Czech, Danish, German, Greek, Spanish, Estonian, Finnish, French, Hindi, Croatian, Hungarian, Indonesian, Italian, Lithuanian, Latvian, Dutch, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Swedish, Turkish, Ukrainian, Vietnamese.

SimpleOfflineTTS is a package that lets you generate AudioClips from any text string using Unity Inference. It runs completely offline, with no networking or external services required.

It contains no platform-specific code or libraries, so it can be used on any platform supported by the Inference package (tested on Windows, Mac, WebGL and Android). Inference can be performed on the CPU or GPU.

To generate the voices, you can use either the included Supertonic-3-TTS models, which provide 10 voices and multi-language support, or free downloadable (.onnx) voice models from Piper (English-only). You can change voices in the Inspector, and you can download more voices and add them to the project.

Unlike some other speech engines, SimpleOfflineTTS does not use the eSpeak-ng libraries, which are covered by the GPL-v3 licence; its copyleft terms make them impractical for most closed-source commercial projects.

This asset uses the Supertonic-3 AI model under the Open RAIL-M License; see the THIRD_PARTY_NOTICES.txt file in the package for details.

For the Piper implementation, this asset uses a custom phonemiser built from scratch in C#, and speech models that are mostly covered by Creative Commons licences. It is worth checking the MODEL_CARD of each model you intend to use.

- Fast and high-quality text-to-speech generation
- Cross-platform
- Supports 31 languages
- Local and offline; no networking required
- CPU or GPU-accelerated
- Expandable using many free Piper voice models

I used AI to debug and refine my C# code.




