GPU rendering: only uses 60-75% of the GPU & CPU

If I use libx264 to encode, the CPU activity reaches 100%. But with hardware encoding (h264_nvenc), the GPU and CPU are stuck between 60% and 75%. No matter what options I choose for encoder threads, parallel processing, etc., usage never goes above that range. I don’t even have a beefy GPU…

I then tried the same project and rendering settings in Shotcut. There, the GPU and CPU reach 85-100%, making the exports about 10-20% faster than in Kdenlive.

Is there any setting I can change to make Kdenlive use more processing power of the GPU/CPU? The roadmap mentions “Improved GPU support”, so maybe export speed will improve a bit in the future? :slight_smile:

Windows 10 22H2
Kdenlive 24.12.1

Thank you for the comparison. This is interesting.

Do you have the same number of effects in both projects? If so, my guess is that you used GPU effects in Shotcut, which is why Shotcut can keep the GPU busy throughout the render. In Kdenlive all effects are rendered on the CPU, so Kdenlive has to switch between CPU and GPU work all the time, and that is why it can’t use 100% of the CPU and GPU. As a result, Kdenlive needs about 10-20% more time for rendering. There is no “hidden switch” to make Kdenlive render faster.

GPU effects are on our roadmap as a mid-term goal.

I tested with and without any effects, Shotcut was always faster, but with effects the difference was more noticeable.
I just tested it again, using the latest versions of Kdenlive and Shotcut, plus an older version of Shotcut that I’ve used for a very long time without updating, because it was quite stable.

9-minute, 30 fps 4K clip, no effects, h264_nvenc, VBR 71% quality, parallel processing, GOP 15 frames, B frames 2, no additional parameters:

Kdenlive - 4:57, 156 Mb/s, CABAC / 4 ref frames
Shotcut - 4:41, 120 Mb/s, CABAC / 4 ref frames
Old Shotcut - 3:41 :open_mouth:, 146 Mb/s, CABAC / 2 ref frames

CPU and GPU activity is at 90-100% on old Shotcut only. The latest version is similar to Kdenlive (60-70%), unlike what I said in the original post; I guess I was thinking of the old version.
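To put the no-effects timings in relative terms, here is a rough Python sketch. It only uses the three render times listed above; the clip length of exactly 9 minutes is an assumption (I only know it is "about 9 minutes"), and the helper names are made up for illustration.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'M:SS' render time into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def extra_time_percent(slower: str, faster: str) -> float:
    """How much longer (in %) the slower render took than the faster one."""
    return (to_seconds(slower) / to_seconds(faster) - 1) * 100


# Assumption: the source clip is exactly 9 minutes long.
CLIP_SECONDS = 9 * 60

# Render times from the no-effects test above.
render_times = {"Kdenlive": "4:57", "Shotcut": "4:41", "Old Shotcut": "3:41"}

for name, t in render_times.items():
    secs = to_seconds(t)
    # Speed relative to real time: above 1.0 means faster than playback.
    print(f"{name}: {secs} s, {CLIP_SECONDS / secs:.2f}x real time")

for other in ("Shotcut", "Old Shotcut"):
    pct = extra_time_percent(render_times["Kdenlive"], render_times[other])
    print(f"Kdenlive took {pct:.1f}% longer than {other}")
```

By this measure the gap to current Shotcut is small, and the old Shotcut build is the real outlier.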

Now the same thing, but with a 200% saturation effect applied to the whole clip:
Kdenlive - 13:05, 177 Mb/s
Shotcut - 11:20, 120 Mb/s
Old Shotcut - 11:46, 164 Mb/s

CPU and GPU activity more or less similar in all programs, around 60-70%.
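As a rough way to see what the saturation effect costs in each program, the sketch below divides each with-effect render time by the matching no-effect time. The numbers are the ones from my two tests; the function and variable names are just for illustration.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'M:SS' or 'MM:SS' render time into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# (no effects, with 200% saturation) render times from the two tests.
timings = {
    "Kdenlive": ("4:57", "13:05"),
    "Shotcut": ("4:41", "11:20"),
    "Old Shotcut": ("3:41", "11:46"),
}


def effect_slowdown(name: str) -> float:
    """Factor by which the render got slower once the effect was added."""
    plain, with_effect = timings[name]
    return to_seconds(with_effect) / to_seconds(plain)


for name in timings:
    print(f"{name}: {effect_slowdown(name):.2f}x slower with the effect")

# Kdenlive versus current Shotcut, both with the effect applied.
gap = to_seconds(timings["Kdenlive"][1]) / to_seconds(timings["Shotcut"][1])
print(f"Kdenlive took {(gap - 1) * 100:.1f}% longer than Shotcut with the effect")
```

A single effect more than doubles the render time everywhere, so the effect processing seems to dominate over the encoder in this test.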

Despite the different bitrates, the 3 versions look identical to me. Only when zooming in to 500% can I see some tiny differences between the exported clips.

How the hell does Shotcut keep the same bitrate after applying the saturation effect? :thinking: :sweat_smile:
Also, 177 to 120Mb/s is a huge difference between Kdenlive and Shotcut. I wonder what parameters/presets they are using…
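To show what that bitrate gap means in file size, here is a back-of-the-envelope sketch. It assumes the clip is exactly 9 minutes, that "Mb/s" means megabits per second, and that the figure is the average video bitrate; audio and container overhead are ignored.

```python
def video_size_gb(bitrate_mbps: float, duration_s: float) -> float:
    """Approximate video stream size in GB (1 GB = 1000 MB) from average bitrate."""
    megabits = bitrate_mbps * duration_s
    megabytes = megabits / 8  # 8 bits per byte
    return megabytes / 1000


# Assumption: the clip is exactly 9 minutes long.
CLIP_SECONDS = 9 * 60

# Bitrates of the exports with the saturation effect applied.
bitrates_mbps = {"Kdenlive": 177, "Shotcut": 120, "Old Shotcut": 164}

for name, mbps in bitrates_mbps.items():
    print(f"{name}: about {video_size_gb(mbps, CLIP_SECONDS):.1f} GB")

# Relative difference between the Kdenlive and Shotcut exports.
ratio = bitrates_mbps["Kdenlive"] / bitrates_mbps["Shotcut"]
print(f"Kdenlive's bitrate is {(ratio - 1) * 100:.1f}% higher than Shotcut's")
```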

Anyway, it’s not that important that rendering on Kdenlive is a bit slower. I used Shotcut for several years, I’ve nothing negative to say about it, but Kdenlive is better in many ways, so I’m here for the long term. :slight_smile:
It’s great to know that GPU effects are on the roadmap.

That’s pretty much entirely down to the encoding options each program is passing to ffmpeg - and if they aren’t the same, then all bets are off as to what your timing numbers actually mean and where that time is being spent.

You can see what options kdenlive is passing in the text box in the bottom-right corner of the render dialog. I’m not sure how you might do the same for shotcut.

Whether or not each version uses GPU or CPU for effects is the main place where something other than ffmpeg (libav) can have any effect on what you are measuring - but a hard final CBR of 120Mb/s is going to be an ffmpeg parameter.

Both Shotcut and Kdenlive use MLT, so the most useful comparison is between versions built on the same MLT release. Once both are also using the same ffmpeg parameters, the measurements will best indicate what we might be missing out on optimising in Kdenlive.
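If you can get the parameter list out of both programs, a small script makes the differences easy to spot. This is only a sketch: it assumes the parameters are whitespace-separated `key=value` pairs, and the two example strings are made-up placeholders, not what either program really passes.

```python
def parse_options(text: str) -> dict:
    """Parse whitespace-separated key=value tokens into a dict."""
    options = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        # A bare flag without '=' is stored with an empty value.
        options[key] = value if sep else ""
    return options


def diff_options(a: str, b: str) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every option that differs.

    A value of None means the option is missing on that side.
    """
    pa, pb = parse_options(a), parse_options(b)
    diff = {}
    for key in sorted(pa.keys() | pb.keys()):
        va, vb = pa.get(key), pb.get(key)
        if va != vb:
            diff[key] = (va, vb)
    return diff


# Hypothetical placeholder strings, for illustration only.
kdenlive_opts = "vcodec=h264_nvenc g=15 bf=2 rc=vbr cq=25"
shotcut_opts = "vcodec=h264_nvenc g=15 bf=2 rc=cbr b=120M"

for key, (ours, theirs) in diff_options(kdenlive_opts, shotcut_opts).items():
    print(f"{key}: kdenlive={ours!r} shotcut={theirs!r}")
```

Anything that shows up in the diff (rate control mode, bitrate, preset and so on) needs to be equalised before the timing numbers can be compared fairly.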