Hi everyone!
I’d like to propose an improvement for Kdenlive’s Whisper speech-to-text integration.
ROCm already supports PyTorch and Whisper, as highlighted in a recent ROCm blog post. As a result, a growing range of AMD hardware can run Whisper with GPU acceleration, including several Radeon RX 7000 series GPUs and the Radeon RX 9000 series (all models on Linux, some on Windows). The latest Ryzen AI processors, from the Ryzen AI 9 365 upward, are also supported on both Linux and Windows.
Given that Whisper can already take advantage of ROCm, it would be fantastic if Kdenlive could enable or expose this ROCm-based GPU acceleration for AMD GPUs and Ryzen AI hardware. This would dramatically improve transcription performance for a large number of users who rely on AMD systems.
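In case it helps the discussion, here is a rough sketch of what the device selection could look like on the Python side. It relies on one thing I'm fairly sure of: ROCm builds of PyTorch expose the GPU under the usual "cuda" device name, and torch.version.hip is set on those builds (it is None on CUDA builds). The function names pick_whisper_device and detect_whisper_device are made up for illustration. I don't know how Kdenlive's scripts actually choose the device, so treat this as an assumption-laden example rather than a patch.

```python
from typing import Optional, Tuple


def pick_whisper_device(gpu_available: bool, hip_version: Optional[str]) -> Tuple[str, str]:
    """Return (device, backend) for a Whisper run.

    Hypothetical helper for illustration; not part of Kdenlive or Whisper.
    """
    if not gpu_available:
        return ("cpu", "cpu")
    # ROCm builds of PyTorch reuse the "cuda" device name. A non-empty
    # hip_version (from torch.version.hip) marks a ROCm build.
    backend = "rocm" if hip_version else "cuda"
    return ("cuda", backend)


def detect_whisper_device() -> Tuple[str, str]:
    """Query the installed PyTorch, falling back to CPU if it is missing."""
    try:
        import torch
    except ImportError:
        return ("cpu", "cpu")
    hip_version = getattr(torch.version, "hip", None)
    return pick_whisper_device(torch.cuda.is_available(), hip_version)


if __name__ == "__main__":
    device, backend = detect_whisper_device()
    print(f"Whisper would run on device={device!r} (backend: {backend})")
```

The resulting device string could then be passed to something like whisper.load_model(name, device=device) from the openai-whisper package. Since the device name is the same for CUDA and ROCm, it may be that most of the work is in shipping or detecting a ROCm build of PyTorch rather than in changing the Whisper call itself.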
Would it be possible for the team to explore support for ROCm-accelerated Whisper within Kdenlive?
Thanks a lot for considering this!