100% Local · 100% Private
Speech-to-Text,
Your Way
Longform & live transcription, speaker diarization, audio notebook — powered by Whisper, NeMo, VibeVoice & more. Runs entirely on your machine.
Features
Everything You Need
100% Local & Private
Your audio never leaves your machine. No cloud, no telemetry.
Multi-Backend STT
Whisper, NeMo Parakeet & Canary, VibeVoice ASR, and whisper.cpp, with native GPU acceleration on CUDA, AMD, Intel, and Apple Silicon.
Longform Transcription
Hours of audio transcribed in minutes with GPU acceleration.
Live Mode
Real-time sentence-by-sentence transcription for continuous dictation.
Speaker Diarization
Identify who said what with automatic speaker labeling and subtitling.
Audio Notebook
Calendar-based view, full-text search, and audio playback for all your notes.
LM Studio Integration
Chat with a local AI about your transcription notes via LM Studio.
Remote Access
Access your home GPU from anywhere via Tailscale, or share it on your local network.
Cross-Platform
Linux, Windows 11, and macOS with native Apple Silicon (Metal) support.
See It In Action
Tour & How-To
App Tour
A full walkthrough of all features — session, notebook, server config, and more.
Quick Start
From installation to your first transcription in under a minute.
Backstory
About This Project
TranscriptionSuite started as a personal tool and turned into a hobby project. I’m an engineer — just not a software engineer. This whole thing is vibecoded, but not blindly: for example, Dockerizing the server for easy distribution was 100% my idea.
I’m using this project to learn programming. Starting from virtually nothing, I now have a decent grasp of Python, git, uv & Docker. I started doing this because it’s fun, not to make money, though I do find that, despite my mechanical engineering degree, I want to pursue software as a career.
Since I dogfood the app every day, I’m not going to abandon it. I’ll also try to deal with bugs as soon as possible.
Inspired by RealtimeSTT.