FrostSpeech TTS - Offline Text-to-Speech (TTS) & Voices Plugin
Frostember Studios
$26.99 (regular price $29.99, 10% off)
(no ratings)
Generate high-quality voiceovers directly in the Unity Editor at no per-use cost. No API fees, and no internet connection required once the engine binaries and voice models have been downloaded. Fully offline Text-to-Speech using Piper and Kokoro models, plus ElevenLabs and Edge cloud support.

Welcome to FrostSpeech TTS, the ultimate Text-to-Speech generation ecosystem for Unity. Whether you are building RPGs with thousands of dialogue lines, prototyping voiceovers, or creating final character voices, FrostSpeech delivers high-quality AI speech directly inside the Editor with zero hassle.

Stop wasting hours downloading Python scripts or paying hundreds of dollars for monthly cloud API subscriptions. FrostSpeech bridges the gap between premium AI voice generation and seamless Editor integration.

DOCUMENTATION + ROADMAP | YOUTUBE SHOWCASE | FORUM

🔥 Key Features

Zero API Costs (100% Offline): Generating high-quality AI voices in the cloud is expensive. FrostSpeech integrates state-of-the-art local neural engines (Piper and Sherpa-ONNX) directly into Unity. Generate unlimited speech completely offline, with no usage fees.

Next-Gen AI Models (Kokoro & More): Access hundreds of highly realistic, open-source voice models, including the widely acclaimed Kokoro TTS. The tool automatically downloads and manages these .onnx models behind the scenes.

Premium Cloud Integration: Need industry-standard premium voices? Switch the engine to ElevenLabs and generate cloud audio directly in your project using your own API key.

Auto-Import & Caching Workflow: Say goodbye to manually moving files. Enter your text, adjust the sliders (Speed, Noise Scale), and click Generate. The tool automatically creates the .wav or .mp3 file, imports it into your Asset Database, and readies it as an AudioClip.

Character Profile System: Maintain consistency across your game's cast. Save and load specific generation settings (engine, voice model, speed, and pitch) per character using the custom FrostSpeechTTSProfile ScriptableObjects.

One-Click Setup Wizard: Get the entire local TTS ecosystem running in seconds. On first launch, the tool automatically downloads and extracts the core binaries (Windows .exe, macOS, or Linux executables) for your operating system.

API and Online Models: ElevenLabs and Edge TTS for top-tier audio output.

💻 API & Scripting

Triggering voice generation from code is straightforward with the included FrostSpeechAPI. Pass your text and a character profile to the asynchronous API and receive a fully loaded AudioClip, either at runtime or in custom Editor tools. This suits dynamic quest dialogue systems.

Questions or need support? contact@frostemberstudios.com

Core Functionality:
Asset Type: Editor Extension / Utility / API
Primary Use: Offline Text-to-Speech (TTS) Generation, Audio Asset Creation, Dialogue Prototyping
Supported Engines: Piper (Local/Offline), Sherpa-ONNX (Local/Offline, including Kokoro models), ElevenLabs (Cloud API)

Compatibility:
Unity Version: Requires Unity 2021.3 LTS or newer.
Supported OS (Editor-Time): Windows 10/11 (x64), macOS (Intel & Apple Silicon), Linux (x64).
Build Targets: Local AI generation is designed primarily for Editor-time use, producing static .wav files that ship with your game builds. The runtime API (FrostSpeechAPI) can also generate audio at runtime, but this requires manually packaging the engine binaries for your target platform. ElevenLabs API generation works at runtime on any platform with internet access.

Architecture & Dependencies:
Dynamic Binary Downloading: To comply with Unity Asset Store guidelines (and avoid GPLv3 copyleft licensing issues with eSpeak-NG), the core engine binaries (Windows .exe, macOS, and Linux executables) and AI .onnx models are NOT included in the initial .unitypackage. The tool's built-in setup wizard downloads them automatically, on demand, into a hidden Binaries~ folder.
UI Framework: Built entirely with Unity's UI Toolkit (formerly UIElements) for a clean, responsive, and native Editor window experience.
Scripting: Includes full C# source code (namespace: FrostemberStudios.FrostSpeechTTS). Uses async/await Tasks to keep the UI responsive during downloads and generation, and UnityWebRequest for model downloading and ElevenLabs integration.

Data & Storage:
Output Formats: Automatically generates and imports .wav files (local Piper/Sherpa engines) and .mp3 files (ElevenLabs cloud API) directly into the Unity Asset Database as AudioClip objects.
Settings Storage: Uses custom FrostSpeechTTSProfile ScriptableObjects for character voice configurations, and EditorPrefs for API key storage, which keeps keys on the local machine and out of the project's assets.
Model Size: Downloading high-quality AI models requires disk space. Typical .onnx models range from 20 MB to 100 MB per voice.

Important Legal Note:
The dynamically downloaded HuggingFace AI models come with varying licenses. Some are Public Domain/MIT, while others are restricted to Non-Commercial Use (CC-BY-NC). Please verify the specific license of each voice model before using generated audio in a commercial project.
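
Illustrative sketch (not FrostSpeech code): the Auto-Import & Caching Workflow feature is described only at a high level, so the snippet below shows one common way a tool could name generated clips so that identical requests reuse an existing file. The class VoiceCacheKey, the method FileNameFor, and the voice name "en_US-voice" are hypothetical names chosen for this example; the listing does not document how FrostSpeech actually keys or stores its cache.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, NOT part of FrostSpeech: derives a stable file name
// from the text plus the per-character settings the listing mentions
// (engine, voice model, speed, pitch).
public static class VoiceCacheKey
{
    public static string FileNameFor(string text, string engine, string voice,
                                     float speed, float pitch, string extension)
    {
        // Invariant culture stops "1.25" from being written as "1,25" on
        // some locales, which would change the hash between machines.
        string payload = string.Join("\n",
            engine,
            voice,
            speed.ToString("R", CultureInfo.InvariantCulture),
            pitch.ToString("R", CultureInfo.InvariantCulture),
            text);

        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(payload));
            // Keep the first 8 bytes (16 hex characters) as the file stem.
            string stem = BitConverter.ToString(hash, 0, 8)
                .Replace("-", "")
                .ToLowerInvariant();
            return stem + extension;
        }
    }
}
```

A caller would compute VoiceCacheKey.FileNameFor("Hello there", "Piper", "en_US-voice", 1.0f, 1.0f, ".wav") and regenerate audio only if no file with that name already exists. Changing the text or any setting yields a different name, so stale clips are never reused by mistake.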


