I was wondering if someone has thought about bringing a voice assistant into Plasma.
There are plenty of open source solutions for both TTS and STT, with support for many languages. I’m thinking of OpenAI’s Whisper for STT and Piper for TTS. Both projects give good-quality results, and creating a voice assistant shouldn’t be a big issue for the great developers at KDE.
Besides Whisper and Piper, there is a quite powerful noise reduction library named RNNoise, and we could use llama.cpp to run an LLM such as Llama or Mistral for answering some questions (we could fall back to ChatGPT on mobile devices or old computers).
The idea would be to have a voice assistant for the desktop, capable of interacting with the installed applications or answering generic questions, for instance:
- “open firefox”: open Firefox
- “open discuss dot kde dot org”: open your default browser and go to the specified URL
- “detect bluetooth devices”: scan for Bluetooth devices and let you connect to them; it could show the desktop wizard as well
- “set an alarm for 6pm”: an alarm should sound at the specified time
- “set an appointment for tomorrow at 10”: a new entry with the specified subject should be created in KOrganizer
- “open display settings”: open System Settings at the specified section
- “open downloads folder”/“empty recycle bin”: interact with the filesystem
- “what’s the weather like?”: return your location’s weather
- “what’s the distance from moon to mars”/“tell me a recipe with low fat for today”: answer generic questions via Llama/Mistral/ChatGPT
- “create a python script to connect to the OpenAI API”/“create a bash script to create a rsync backup to a network share”: open Kate and put the script there (or create a frontend to interact with the AI assistant).
- “Turn the air conditioner off”: connect to the Home Assistant API and control your home.
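To make the list above a bit more concrete, here is a rough sketch of how the text coming out of the STT step could be mapped to desktop actions. Everything in it is an assumption for illustration only: the intent names, the regular expressions, the `plan` and `run` helpers, the use of `xdg-open`, and the guess that the folder lives directly under the home directory. Anything that matches no rule returns `None`, which is where the question could be handed to Llama/Mistral/ChatGPT instead.

```python
import re
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class Intent:
    """One hypothetical voice command: a pattern plus a command builder."""
    name: str
    pattern: re.Pattern
    build_command: Callable[[re.Match], list[str]]


def spoken_to_url(spoken: str) -> str:
    """Turn 'discuss dot kde dot org' into 'https://discuss.kde.org'."""
    host = re.sub(r"\s+dot\s+", ".", spoken.strip().lower())
    return "https://" + host


# Rules are tried in order; the patterns are illustrative, not exhaustive.
INTENTS = [
    Intent("open_url",
           re.compile(r"open (?P<site>\w+(?: dot \w+)+)"),
           lambda m: ["xdg-open", spoken_to_url(m["site"])]),
    Intent("open_folder",
           re.compile(r"open (?P<name>\w+) folder"),
           # Assumption: the folder is ~/<Name>, e.g. ~/Downloads.
           lambda m: ["xdg-open", str(Path.home() / m["name"].capitalize())]),
    Intent("open_app",
           re.compile(r"open (?P<app>[\w-]+)"),
           # Assumption: the spoken name is also the executable name.
           lambda m: [m["app"]]),
]


def plan(utterance: str) -> Optional[tuple[str, list[str]]]:
    """Return (intent name, command) for an utterance, or None if no rule fits."""
    # Lowercase, drop punctuation the STT engine may add, collapse whitespace.
    text = re.sub(r"[^\w\s-]", " ", utterance.lower())
    text = re.sub(r"\s+", " ", text).strip()
    for intent in INTENTS:
        match = intent.pattern.fullmatch(text)
        if match:
            return intent.name, intent.build_command(match)
    return None  # no match: pass the text on to the LLM instead


def run(utterance: str) -> bool:
    """Launch the planned command; returns False when nothing matched."""
    planned = plan(utterance)
    if planned is None:
        return False
    subprocess.Popen(planned[1])
    return True
```

For example, `plan("Open Firefox.")` gives `("open_app", ["firefox"])`, and `plan("open discuss dot kde dot org")` gives `("open_url", ["xdg-open", "https://discuss.kde.org"])`. A real implementation would more likely talk to the desktop over D-Bus than spawn processes, but the overall shape (normalize, match, act, fall back) would probably stay the same.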
The possibilities are endless.