Feature request: Speech integration into krunner

Motivation: accessibility

I can’t speak. I use Linux as my primary OS.
To speak, I use a Windows-only program called Tobii Communicator 5. It is fine, but not great.
I have a Tobii I13 device mounted on my wheelchair running Windows 10 (on an i5-7300U CPU). I also have a laptop running Fedora Atomic Desktop (Kinoite flavour). On that laptop, I set up a Windows virtual machine under KVM so that I can run Communicator 5.

On iOS and macOS, there is a feature called Live Speech. It works great.
To take accessibility on KDE to the next level, here is my request: KRunner with speech integration.
I know that KSpeech exists, but it does not work with, e.g., Zoom's microphone input.
I think this would be a great feature for the KDE desktop :slight_smile: - and spot on regarding KDE’s accessibility goal :smiley:

5 Likes

It’s definitely not a real implementation of what you’re asking for, but KRunner can run arbitrary commands, including speak-ng. So if you type, for example, speak-ng "hello!" into KRunner, it will speak that text.

Unfortunately espeak-ng sounds terrible, very robotic.
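
A small wrapper could make this shorter to type in KRunner. This is only a minimal sketch: the script name `say` and the `TTS_CMD` variable are made up for this example, and it assumes the script is saved as an executable somewhere on your PATH.

```shell
#!/bin/sh
# Hypothetical wrapper script, e.g. saved as ~/.local/bin/say (name is an assumption).
# TTS_CMD is an illustrative variable; it defaults to espeak-ng.
TTS_CMD="${TTS_CMD:-espeak-ng}"

say() {
    # Join all arguments into one string so no quoting is needed when typing.
    "$TTS_CMD" "$*"
}

# Speak the arguments when the script is called with some.
if [ "$#" -gt 0 ]; then
    say "$@"
fi
```

With that in place, typing say hello there into KRunner should speak the text without needing quotes.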

It may be worth looking at Piper:
GitHub - rhasspy/piper: A fast, local neural text to speech system

Piper Voice Samples

But it has not been integrated very well with Speech Dispatcher yet. Some interesting topics to follow:
Speech dispatcher integration · Issue #285 · rhasspy/piper · GitHub

Piper as speech-dispatcher plugin · rhasspy/piper · Discussion #328 · GitHub

Generating speech locally in the web browser · Issue #352 · rhasspy/piper · GitHub

You can also integrate Piper or any other TTS with the clipboard:

In actions:
.*
In command:
echo %s | piper --model /home/user/.local/share/piper/voices/en_US-libritts_r-medium.onnx -s 261 --length-scale 1.15 --sentence-silence 1.2 --noise-w 0 --output-raw | aplay -r 22050 -f S16_LE -t raw -

or, for Translate Shell:
trans -sp -n US,f -no-translate -e google -i echo %s | xclip -selection clipboard

After that, just copy some text and press the “execute action” icon in the clipboard.
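
The Piper command above passes the copied text through an unquoted echo %s, which may misbehave when the text contains quotes or other shell characters. Here is a hedged sketch of a helper that takes the text from its arguments or from standard input instead. The function name `speak_text` and the `PIPER_MODEL` variable are made up for this example, and the default model path is only an assumption based on the path used above.

```shell
#!/bin/sh
# Sketch only: speak text with Piper and play it with aplay.
# PIPER_MODEL is an illustrative variable; point it at a voice you have downloaded.
PIPER_MODEL="${PIPER_MODEL:-$HOME/.local/share/piper/voices/en_US-libritts_r-medium.onnx}"

speak_text() {
    # Use the arguments if there are any, otherwise read text from stdin.
    if [ "$#" -gt 0 ]; then
        printf '%s\n' "$*"
    else
        cat
    fi |
        piper --model "$PIPER_MODEL" --output-raw |
        aplay -r 22050 -f S16_LE -t raw -
}
```

For example, on X11 something like xclip -o -selection clipboard | speak_text would read the clipboard aloud; on Wayland, wl-paste plays the same role as xclip -o.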

Also have a look at Pied:
GitHub - Elleo/pied: Pied makes it simple to install and manage text-to-speech Piper voices for use with Speech Dispatcher.
It produces Speech Dispatcher config files and lets you quickly switch voices.

Depending on your distro and sound system, this program might be very useful:
qpwgraph - A PipeWire Graph Qt GUI Interface

It lets you connect different sound inputs and outputs.

Helvum is similar.
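
The same kind of routing can probably be set up from the command line too. Below is a rough sketch, assuming a PulseAudio-compatible server (PulseAudio itself or pipewire-pulse) with pactl and pacat available. The names `tts_sink` and `tts_mic`, the function names, and the `PIPER_MODEL` variable are all made up for this example, and I have not verified that Zoom picks up the resulting source.

```shell
#!/bin/sh
# Sketch only: expose TTS output as a microphone that other apps can select.
# PIPER_MODEL is an illustrative variable; the default path is an assumption.
PIPER_MODEL="${PIPER_MODEL:-$HOME/.local/share/piper/voices/en_US-libritts_r-medium.onnx}"

create_tts_mic() {
    # A null sink accepts audio; its ".monitor" source replays what it received.
    pactl load-module module-null-sink sink_name=tts_sink \
        sink_properties=device.description=TTS-Sink
    # Wrap the monitor as a regular source so apps can list it as a microphone.
    pactl load-module module-remap-source master=tts_sink.monitor \
        source_name=tts_mic source_properties=device.description=TTS-Mic
}

speak_to_mic() {
    # Send Piper's raw 22050 Hz mono output into the null sink instead of the speakers.
    printf '%s\n' "$*" |
        piper --model "$PIPER_MODEL" --output-raw |
        pacat --device=tts_sink --format=s16le --rate=22050 --channels=1
}
```

After running create_tts_mic once per session, the idea is to pick the TTS-Mic source as the microphone in the conferencing app and call speak_to_mic "some text" to speak into the meeting.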

I hope it helps a bit.

2 Likes

A Piper and Whisper implementation would be interesting for many reasons: not just accessibility, but also for the sake of laziness and comfort.

1 Like

Well said. But not only that. I use it for speed reading, which lets me consume online content more quickly, and I find it easier to focus my attention on listening than on reading. So it is very useful.

@Jdreioe
By the way, Piper is integrated into the Read Aloud add-on for Chrome / Chromium, and soon for Firefox. You can also install MS Edge on Linux (with Flatpak). MS Edge on Windows has online TTS of great quality. Perhaps it works on Linux as well, but I am not sure because I haven’t tested it yet.

Edit:

I think a good idea would be a widget that acts similarly to Read Aloud and is integrated with the clipboard, so that the user could select text in any application, press the widget's icon, and have it read out. The solution with clipboard actions is similar, but a widget would be much more convenient. Being able to redirect the TTS sound into other applications, for example video conferencing apps, would be great.

1 Like

Piper would be awesome! Especially because the voices are actually fine.
The feature Apple has implemented that I don't think any other OS/DE/program has yet is (quote from Apple's website) “phone call other participants hear your words spoken through their device’s speakers or headphones. Otherwise, the voice plays from your device’s speakers so you can type to speak in in-person conversations”
That would be very helpful, because in Zoom meetings other people can't really hear what I’m saying, unless it's through my iPad via Live Speech.
If KDE could add something like this, it would be awesome! Maybe just a widget with TTS, or an updated KSpeech.

1 Like