Whisper Medium Multi (OnnxRuntime/DirectML) for Unity
They Love Games
$66.00
(no ratings)
The Whisper Medium Multi model runs on ONNX Runtime accelerated by DirectML. It is ready to drop into a 64-bit Windows project and provides fast speech transcription and translation with minimal setup and no extra downloads.

This is an audio tool and is not affected by SRP compatibility.

This package currently supports 64-bit Windows for the Unity Editor and Standalone.

Summary

Whisper Medium Multi (DirectML) is a turnkey Windows x64 Unity package that ships a single native DLL containing the OpenAI Whisper medium multilingual model, accelerated by ONNX Runtime + DirectML. It works offline in the Editor and in Windows Standalone builds, enabling fast, local language detection and transcription (in the source language) from either audio clips or microphone input (continuous or push-to-talk). For translation, the package uses an internal Qwen 2.5 text translation backend: the app transcribes with Whisper, then translates the transcript text via a dedicated `TranslateText(text, from, to)` API.

User scenarios

- You're building a game that already supports controller/keyboard/mouse, and you want speech input for accessibility and hands-free actions.
- You want a voice-driven application that reacts to spoken commands without cloud latency.
- You need multilingual transcription and optional transcript translation on a machine with no internet access, so the solution must be fully offline.

- Single native DLL with Whisper medium multilingual on ONNX Runtime + DirectML (offline, Windows x64).
- Speech-to-text from audio clips (StreamingAssets) and live microphone input.
- Includes multilingual sample WAV clips for quick testing (Chinese/French/Spanish).
- Microphone modes: continuous capture and push-to-talk.
- Language auto-detect (or user-selected language) plus transcription from mono 16 kHz PCM.
- Optional offline text translation via Qwen 2.5 (transcript text in, translated text out).
- Clean API surface: `DetectLanguage(pcm)`, `Transcribe(pcm, language)`, `TranslateText(text, from, to)` (pass an empty string for best-effort automatic language selection).
- Two example scenes: one for audio clips (transcribe/translate) and one for real-time mic dictation (continuous + push-to-talk).

- Used AI to draft Python scripts that convert the MIT-licensed Whisper medium multilingual model to ONNX.
- AI assisted with implementing the native DLL using ONNX Runtime + DirectML (embedding the model and wiring C# ↔ C++ calls).
- AI assisted with Unity scripting to hook up the UI, read microphone input and audio clips, and build the example scenes.
- AI assisted with integrating an internal Qwen-based text translation backend behind a neutral external API (`TranslateText`).
- AI helped outline unit tests and documentation; all code was manually reviewed and tested.
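
The listing states that transcription takes mono 16 kHz PCM, while source audio is often stereo at 44.1 or 48 kHz. The following is a minimal, language-neutral sketch of that conversion, written in Python for brevity. It is not part of the package and does not call it: the function names `to_mono`, `resample_linear`, and `prepare_pcm` are hypothetical, the input is assumed to be interleaved float samples (the layout Unity's `AudioClip.GetData` produces), and whether the package expects float or 16-bit integer samples is not stated in the listing. Linear interpolation is used only because it is short; it is not band-limited, so a proper resampler would give better quality when downsampling.

```python
def to_mono(samples, channels):
    """Average interleaved channels into a single mono channel."""
    if channels < 1:
        raise ValueError("channels must be >= 1")
    frames = len(samples) // channels
    return [
        sum(samples[i * channels:(i + 1) * channels]) / channels
        for i in range(frames)
    ]


def resample_linear(mono, src_rate, dst_rate=16000):
    """Resample by linear interpolation (simple, not band-limited)."""
    if not mono or src_rate == dst_rate:
        return list(mono)
    out_len = max(1, round(len(mono) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for n in range(out_len):
        pos = n * step
        i = int(pos)
        if i >= len(mono) - 1:
            # Past the last interpolation interval: hold the final sample.
            out.append(mono[-1])
        else:
            frac = pos - i
            out.append(mono[i] * (1.0 - frac) + mono[i + 1] * frac)
    return out


def prepare_pcm(samples, channels, src_rate):
    """Hypothetical helper: interleaved audio in, mono 16 kHz samples out."""
    return resample_linear(to_mono(samples, channels), src_rate)
```

In a Unity project the same two steps would be applied in C# to the sample buffer before passing it to `DetectLanguage(pcm)` or `Transcribe(pcm, language)`; one second of 48 kHz stereo (96,000 interleaved values) becomes 16,000 mono samples.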




