Dictee 1.3.2 — Offline voice dictation for Plasma 6 (Wayland-native, with plasmoid)

Hi everyone,

Let me introduce dictee: a voice-dictation tool with its own Plasma 6 plasmoid, though it does more than that. Your dictation is transcribed and translated live, right where your cursor is, and there’s also a full diarization backend for your meetings and video calls. It is 100% local by default and Wayland-native, with several visual-feedback options to choose from.

dictee push-to-talk demo — press F9, speak, text appears at the cursor

What it does

  • Push-to-talk dictation with a customizable shortcut, for live transcription and translation right at the cursor.
  • 4 ASR backends switchable on the fly: Parakeet-TDT 0.6B v3 (25 langs, native punctuation, default), Canary-1B v2 (built-in translation, 48 pairs), faster-whisper (99 langs), Vosk (lightweight, strict offline).
  • Optional post-processing: regex/dictionary rules, language-aware capitalization, and LLM cleanup via local Ollama (a lightweight model such as gemma3:4b, 100% offline).
  • File transcription window with timeline player and multiple tabs: speaker diarization for up to 4 speakers (NVIDIA Sortformer), per-tab translation, and LLM analysis of diarized transcripts via your LLM service of choice, either local (Ollama, LM Studio, vLLM, Jan, or a custom endpoint) or cloud (OpenRouter, Mistral, DeepSeek, Perplexity, Groq, Claude, Gemini, OpenAI…). Export to PDF, SRT, JSON, or Markdown.
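
To give an idea of what the regex/dictionary post-processing step does, here is a minimal Python sketch. It is not dictee's actual code: the rule set, the `DICTIONARY` entries, and the `postprocess` function are hypothetical, and the capitalization step is a simple English-style rule rather than the language-aware one dictee ships.

```python
import re

# Hypothetical rules, for illustration only; dictee's real rule format may differ.
REGEX_RULES = [
    (re.compile(r"\s+([,.!?])"), r"\1"),  # drop the space before punctuation (English style)
    (re.compile(r"\s{2,}"), " "),          # collapse repeated spaces
]
DICTIONARY = {"plasma six": "Plasma 6", "way land": "Wayland"}


def postprocess(text: str) -> str:
    """Apply dictionary replacements, regex rules, then sentence capitalization."""
    for spoken, written in DICTIONARY.items():
        text = re.sub(re.escape(spoken), written, text, flags=re.IGNORECASE)
    for pattern, replacement in REGEX_RULES:
        text = pattern.sub(replacement, text)
    text = text.strip()
    # Capitalize the first letter of the text and of each new sentence.
    return re.sub(
        r"(^|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )


print(postprocess("dictee runs on plasma six , natively on way land . it works"))
# prints: Dictee runs on Plasma 6, natively on Wayland. It works
```

Running the dictionary pass before the regex pass means the replacements themselves also get their spacing and capitalization normalized.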

Plasma 6 plasmoid

Quick access to dictee and visual feedback right in the panel via the plasmoid.

dictee plasmoid widget on Plasma 6

File transcription with diarization

A dedicated window for offline transcription of audio/video files. The chunked diarization pipeline lets you work on long files.

dictee-transcribe — file transcription with speaker diarization, audio player, and per-tab translation
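
For readers unfamiliar with the SRT export mentioned in the feature list, the sketch below shows how diarized segments map onto SRT cues. It is an illustration under assumptions, not dictee's exporter: the `(start, end, speaker, text)` tuple layout and the `Speaker N:` prefix are hypothetical, and only the cue structure (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) comes from the SRT format itself.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start, end, speaker, text) tuples as numbered SRT cues."""
    cues = []
    for index, (start, end, speaker, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{speaker}: {text}\n"
        )
    # Cues are separated by one blank line.
    return "\n".join(cues)


# Hypothetical diarized segments: start and end in seconds.
segments = [
    (0.0, 2.5, "Speaker 1", "Hello."),
    (2.5, 4.0, "Speaker 2", "Hi."),
]
print(to_srt(segments))
```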

Install

curl -fsSL https://raw.githubusercontent.com/rcspam/dictee/master/install.sh | bash

Or grab the package for your distro directly from the v1.3.2 release — Debian/Ubuntu, Fedora/RHEL, Arch PKGBUILD (AUR-ready), and a generic tarball for other distros. You’ll also find a standalone .plasmoid if you only want the widget.

On .deb and .rpm, the CUDA libs are installed into a pip venv at postinst, so there is no NVIDIA repo to add. CPU-only packages exist for laptops without a discrete GPU. Since v1.3.2, the CUDA package no longer pulls the 1.5 GB of pip libs when no NVIDIA GPU is detected; it falls back to CPU automatically at runtime.
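
The post does not say how dictee detects the GPU, so the following is only a sketch of one common way to do it: check whether `nvidia-smi` is on the PATH and whether `nvidia-smi -L` lists a device. The function names `nvidia_gpu_present` and `pick_device` are hypothetical.

```python
import shutil
import subprocess


def nvidia_gpu_present(which=shutil.which, run=subprocess.run) -> bool:
    """Return True if nvidia-smi exists and lists at least one GPU.

    This is an assumed detection method, not necessarily the one dictee uses.
    """
    path = which("nvidia-smi")
    if path is None:
        return False
    try:
        # `nvidia-smi -L` prints one line per GPU, e.g. "GPU 0: ...".
        result = run([path, "-L"], capture_output=True, text=True, timeout=5)
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


def pick_device(gpu_present: bool) -> str:
    """Choose the inference device, falling back to CPU when no GPU is found."""
    return "cuda" if gpu_present else "cpu"


print(pick_device(nvidia_gpu_present()))
```

The `which` and `run` parameters exist only so the check can be exercised without real hardware.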

Repo & license

Looking for feedback

I’d particularly like to hear from Plasma 6 users on:

  1. Wayland edge cases — different compositors, multi-monitor focus, Activities.
  2. Plasmoid UX — is the cheatsheet popup useful, or noisy?
  3. Anything that breaks on a fresh install — especially on distros I test less often (openSUSE, NixOS).

Hi everyone.

For those who don't know, animation-speech is fully configurable.