Speech-to-Text error “invalid .srt file” – Kdenlive 25.12.1


Hello,

I’m experiencing a reproducible issue with Speech Recognition / Speech-to-Text in Kdenlive 25.12.1 on Linux, and I’d like to report it with as much technical detail as possible.


Environment

  • Kdenlive version: 25.12.1

  • OS: Linux (Ubuntu-based)

  • UI language: pt-BR

  • Speech engines tested: VOSK and Whisper

  • Language model: pt-BR (installed and detected correctly)

  • Feature status: Speech Recognition configured


Steps to reproduce

  1. Open any project with a video clip containing clear, audible speech

  2. Ungroup the clip and select only the audio clip

  3. Go to Sequence → Subtitles → Speech Recognition

  4. Choose either VOSK or Whisper

  5. Start processing


Observed behavior

  • Processing starts normally

  • When it finishes, Kdenlive shows the error:

“The selected file /tmp/xxxx.srt is invalid”

  • No subtitles are created in the timeline

Critical observation

While monitoring the /tmp directory during processing, I observed that:

  • Kdenlive generates a temporary .wav file

  • This .wav file:

    • Is very small

    • Is almost empty

    • Contains only low-level noise, similar to an open microphone

    • Does NOT contain the actual audio from the selected clip

  • Because of this, no valid .srt file is generated (or the file is empty/malformed)

This indicates that the failure happens before the speech recognition engine runs.
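For anyone who wants to check the temporary .wav the same way, here is a minimal Python sketch that reports its duration and RMS level. It uses only the standard library. It assumes the file is 16-bit PCM, which I have not confirmed for the file Kdenlive writes, and the function name wav_stats is just something I made up for illustration.

```python
import array
import math
import sys
import wave


def wav_stats(path):
    """Return (duration_seconds, rms) for a 16-bit PCM WAV file.

    Assumption: the file is 16-bit PCM. Other sample widths are rejected.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        nframes = wf.getnframes()
        raw = wf.readframes(nframes)

    # Drop a trailing odd byte so the buffer maps cleanly onto 16-bit samples.
    raw = raw[: len(raw) - (len(raw) % 2)]
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    duration = nframes / rate if rate else 0.0
    # RMS over all samples (channels interleaved); near zero means near silence.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0
    return duration, rms
```

Calling wav_stats() on the file that appears in /tmp during processing gives two numbers to compare against the clip: a duration far shorter than the selected audio, or an RMS close to zero, would match what I described above.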


Implications

The issue appears to be related to:

  • Audio extraction or rendering for speech recognition

  • Possibly the MLT audio consumer used internally

  • Subtitle system integration introduced in the 25.12.x branch

Both VOSK and Whisper behave the same, which strongly suggests that the engines themselves are not the root cause.


What works

I found two workarounds:

  1. Manually create a subtitle track before running speech recognition

    • Sequence → Subtitles → Add Subtitle Track (SRT)

    • After this, speech recognition works correctly

  2. Enable “Save to file”, choose a path outside /tmp, then import the .srt manually

These workarounds avoid the automatic SRT creation/import step.
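When using the "Save to file" workaround, it can help to check the saved .srt before importing it. The sketch below counts blocks that look like SRT cues (index line, timing line, at least one text line). This is only a loose structural check that I wrote for illustration; it is not the validation Kdenlive itself performs, and count_srt_cues is a hypothetical name.

```python
import re

# Timing line such as "00:00:01,000 --> 00:00:04,000"
TIMING = re.compile(
    r"^\d{2,}:\d{2}:\d{2}[,.]\d{3}\s+-->\s+\d{2,}:\d{2}:\d{2}[,.]\d{3}"
)


def count_srt_cues(text):
    """Count blocks that look like SRT cues in the given text.

    A block counts if it has a numeric index line, a timing line,
    and at least one line of subtitle text.
    """
    # Normalise line endings and drop a leading byte-order mark if present.
    text = text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    cues = 0
    for block in re.split(r"\n\s*\n", text):
        lines = block.split("\n")
        if (
            len(lines) >= 3
            and lines[0].strip().isdigit()
            and TIMING.match(lines[1].strip())
        ):
            cues += 1
    return cues
```

A result of 0 for the saved file would be consistent with the empty/malformed .srt case described earlier; anything above 0 means the engine did produce cues and the file is at least structurally plausible.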


Additional notes

  • Audio playback in the timeline works correctly

  • Audio meters show correct levels

  • Manually exporting the same audio clip to WAV works correctly

  • Cleaning /tmp does not change the behavior


Questions

  • Is this a known bug in Kdenlive 25.12.x?

  • Is this related to recent changes in the subtitle system or speech recognition pipeline?

  • Is there a recommended workflow to avoid this issue?

I’m happy to provide logs, debug output, or additional tests if needed.

Thanks for your time!

Hello,

After some trial and error with the same issue on my end, I realised it was caused by my timeline zone (the blue selector at the top of the timeline) being too short.

Extending the zone fixed the issue for me with both Vosk and Whisper.
I think this may be the same problem you are seeing: when I changed the timeline zone, the .wav file generated in /tmp changed to match the selected audio portion.
