CMake builds of some KDE projects make the system unresponsive by queuing way too many jobs

Not sure if this is the right place to post this, but since this isn’t strictly a bug affecting any built projects, I’m avoiding the bugtracker…

I’ve been building kwin and plasma-workspace on my own for a long time now because I’m applying some minor patches that change some hardcoded default values which I don’t agree with upstream. The patches themselves however are irrelevant and are not the point of this thread. All they do is force me to fork Arch Linux’s respective packages and build them myself with these patches.

What I want to talk about in this thread is that both kwin and plasma-workspace apparently have some issues with their cmake configs. The problem is that when building, way too many build jobs are queued, overwhelming the CPU scheduler and making the (desktop) system really unresponsive for the duration of the build. Other builds do not have that problem: neither the Linux kernel, which of course is not cmake-based, nor other large projects that are cmake-based.

My system uses a Ryzen 3950X (16C32T) with 64 GiB of RAM, and I have set -j32. In well-behaved build setups, this results in at most 32 build jobs being queued at a time, utilizing nearly 100% of my CPU while the system stays responsive, as the CPU scheduler can handle the workload being thrown at it.

When building kwin and plasma-workspace on the other hand (I’m not building any other KDE projects at the moment), cmake queues way too many build jobs, which can be observed in htop for example. As said, this results in the system being really unresponsive and freezing multiple times.

When forcing only one build-job (-j1), everything is fine, well, except for it not being optimal and efficient. But as soon as more than one build job is set on the cmake --build command, the number of build jobs explodes.

I am neither a C++/Qt developer, nor am I familiar with cmake itself. However, my suspicion is that it is spawning nested build jobs, ignoring the overall / top-level number of max parallel build jobs, so it exceeds the system’s capabilities.

Can someone shed some light on this? Could this be an issue with the cmake configs of these projects? Or could this be a cmake bug?

It’s not a huge problem, as it only affects me while building said projects, but it’s definitely something that I can only observe here, nowhere else. It is not a system configuration or a hardware problem.

Thanks!

What are the precise commands you are using to build them? It seems you’re not using kde-builder.

From what I understand, CMake doesn’t actually handle the number of jobs by default; it just delegates to the defaults of make or ninja. Make defaults to no parallelization, which is why you need to pass the -j flag, while ninja defaults to parallelization, which is why you don’t need to pass the -j flag. CMake can override those with the -j / --parallel flag, as mentioned in Building KDE software manually | Developer.

As said, I’m repackaging Arch’s kwin and plasma-workspace packages but with a few unrelated patches applied, so basically just this:

  • https://gitlab.archlinux.org/archlinux/packaging/packages/kwin/-/blob/5b2026b54b694bd6811224851d303b5907c913e8/PKGBUILD#L94-99
  • https://gitlab.archlinux.org/archlinux/packaging/packages/plasma-workspace/-/blob/7647145e5b7467af0f2a55b078a0dcd21737df66/PKGBUILD#L144-150

(my account is not able to post links here)

CMake doesn’t decide the job parallelization, and neither does KWin/PW. They’re just complex C++ projects that are known to be tough to build. If it’s making your system unresponsive, reduce the number of jobs.

That is not what I’m saying. I also can’t reduce the number of jobs, as said, because the number of build jobs explodes regardless of whether I set -j32, -j16 or -j8. There is something wrong with the build setup, which is why I posted this.

And btw, the backend also doesn’t matter. -G Ninja results in the same issue.

If you go to the build directory and run make -j8 or ninja -j8 (depending on which generator you used), does it still generate excess jobs?

Yes. Both make and ninja do that. Both when called explicitly or implicitly (cmake --build).

Like I’ve said, this looks like there are nested make/ninja calls somewhere in the build process, which then obviously ignore the overall max job count, as each individual child process has -j8 set itself.

But like I’ve also said, I am not a C++/Qt dev and am very much unfamiliar with cmake. All I can see is that the build process is misbehaving and ignoring the overall max job count, choking the CPU scheduler.

makepkg.conf

Sorry, but this is completely irrelevant and off topic. Of course I have set my MAKEFLAGS. Same issue with and without it. This env var is also irrelevant when -jN, aka --parallel=N, is set. This thread is not about system configuration and whatnot. Already talked about it…

I haven’t dug too deep, but I believe I can reproduce this.

I’m currently building OpenSUSE’s package of plasma-workspace. It runs with -j12. You can see a build log (from the official package) here.

Running this on my system, it sure appears to be using some number greater than 12 threads:

Let me know if I can do anything to assist.

The only thing I can see is that moc goes a little crazy and consumes all of the CPU, but again this is not something controlled by plasma-workspace. Otherwise it only uses as many jobs as I requested it to.

Edit: or I should specify, it really shouldn’t be the project’s fault that ninja/make decides to spin up so many mocs at the same time. i dunno if that’s CMake’s or make/ninja’s fault though.
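
To put a number on what htop shows, one option is to count running processes by name while the build is going. The helper below is only a sketch: `count_named` is a made-up name, and the process names `moc` and `cc1plus` are assumptions that may differ on your system.

```shell
# Count the lines on stdin that exactly match the given process name.
count_named() {
    # -c prints the number of matching lines, -x matches whole lines only,
    # -F treats the name as a literal string. grep exits non-zero when the
    # count is 0, so "|| true" keeps callers using "set -e" alive.
    grep -c -x -F -- "$1" || true
}

# Usage while a build is running (process names are assumptions):
#   ps -e -o comm= | count_named moc
#   ps -e -o comm= | count_named cc1plus
```

Comparing the two counts against the -j value you passed should show whether it is the compiler jobs or the moc processes that exceed it.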

Thanks for taking a deeper look at this.

After some googling, I found out about cmake’s AUTOGEN_PARALLEL and CMAKE_AUTOGEN_PARALLEL vars.

-DCMAKE_AUTOGEN_PARALLEL=1 appears to help. It’s a bit weird though… When I set -j8 (both make and ninja), the number of build jobs stays “low” (can’t tell how many jobs are running, but it’s definitely only a few). But when I set it back to -j32 using the make backend, there’s still massive continuous system unresponsiveness which I don’t see elsewhere when building stuff in parallel with 32 threads. The ninja backend however appears fine and there’s only initial unresponsiveness which then seems to settle quickly.

Relevant docs and issues:

  • https://cmake.org/cmake/help/latest/prop_tgt/AUTOGEN_PARALLEL.html#prop_tgt:AUTOGEN_PARALLEL
  • https://bugs.gentoo.org/934462 (more relevant links there)

According to some links in the linked Gentoo issue thread, some Linux distros as well as OpenBSD and FreeBSD set this var globally in order to prevent this kind of MOC build job multiplication (if I understand this correctly).
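
For anyone who wants to try the same thing, here is a minimal sketch of where the option goes. It only prints the two commands so they can be reviewed first; the function name, the source directory `.`, the build directory `build` and the job count 8 are placeholders, not what the Arch PKGBUILDs actually use.

```shell
# Print (not run) a configure and a build command with moc/uic parallelism
# capped at one process per autogen step. All arguments are placeholders.
autogen_capped_commands() {
    src_dir=$1
    build_dir=$2
    jobs=$3
    echo "cmake -S $src_dir -B $build_dir -DCMAKE_AUTOGEN_PARALLEL=1"
    echo "cmake --build $build_dir --parallel $jobs"
}

autogen_capped_commands . build 8
```

In a packaging script the same -D flag would presumably just be appended to the existing cmake configure arguments.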

Another seemingly relevant link

KDE Builder users also are affected by this. See `num-cores` option not working (#112) · Issues · SDK / KDE Builder · GitLab

Try prepending your command with taskset to limit how many cores it (and child processes) are allowed to use.

For example on my system I do:

taskset -c 8-16 kde-builder

This will work with any process; kde-builder, cmake, make, a game, blender, whatever. Customize the number of cores for your system of course. :slight_smile:
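
If you would rather not hardcode the range, a small helper can build the CPU list for you. This is a sketch only: `cpu_list` is a hypothetical name, and picking the first N CPUs is an assumption that ignores topology (on SMT systems the first N logical CPUs may or may not be separate cores).

```shell
# Print a CPU list "0-(N-1)" covering the first N CPUs, for "taskset -c".
cpu_list() {
    count=$1
    if [ "$count" -lt 1 ]; then
        echo "cpu_list: need at least one CPU" >&2
        return 1
    fi
    echo "0-$((count - 1))"
}

# Example: restrict a build to half of the CPUs reported by nproc.
#   half=$(( $(nproc) / 2 ))
#   taskset -c "$(cpu_list "$half")" cmake --build build --parallel "$half"
```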

Thanks for this! This made a huge improvement here with building Kate on Haiku, I can’t use the default build instructions used on Linux, we have our own, but wow! :smile:

Masking CPU cores via taskset is absolutely not the solution to this problem. Tools like cmake should be able to coordinate CPU resources without the user having to know their system’s CPU topology and specifying where things are allowed to run. Manual resource allocation is not the point of such build coordinators.


I’ve explained in my previous posts exactly what the problem is: by default, cmake spawns $(nproc) MOC compile processes for each already running build job that needs them, and the number of build jobs is also $(nproc), so it overwhelms the system’s CPU scheduler by potentially running $(nproc) * $(nproc) build jobs in parallel. It does not put the MOC processes into the overall build-job queue; it spawns them separately, on top of it. And this is a problem regardless of whether you have masked specific CPU cores for certain process trees.

Setting -DCMAKE_AUTOGEN_PARALLEL=1 and only allowing one MOC compile process per build job results in at most $(nproc) * 2 parallel build jobs, which is better, but still bad, since it ignores the overall number of allowed parallel build jobs. However, just have a look at what a difference this option makes when observing the system’s overall process/thread count while building a large project like kwin or plasma-workspace for example.
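
To make the arithmetic concrete, here is a rough sketch of the upper bounds under the model described in this post, where every running build job may start its own batch of moc processes. The function names are made up, and the numbers are worst-case estimates, not measurements.

```shell
# Worst case for moc processes alone: every build job starts a full batch.
moc_upper_bound() {
    jobs=$1
    autogen_parallel=$2
    echo $(( jobs * autogen_parallel ))
}

# Worst case including the build jobs that started those moc processes.
total_upper_bound() {
    jobs=$1
    autogen_parallel=$2
    echo $(( jobs + jobs * autogen_parallel ))
}

moc_upper_bound 32 32     # -j32 with 32 moc processes per job: prints 1024
total_upper_bound 32 1    # -j32 with -DCMAKE_AUTOGEN_PARALLEL=1: prints 64
```

So on a 32-thread machine the estimate drops from roughly a thousand processes to 64, which is still twice the requested job count.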


Since my account is still not allowed to include direct links in my posts on this forum, here’s the relevant link to the cmake docs again:
https://cmake.org/cmake/help/latest/prop_tgt/AUTOGEN_PARALLEL.html

Number of parallel moc or uic processes to start when using AUTOMOC and AUTOUIC.

  • An empty (or unset) value or the string AUTO sets the number of threads/processes to the number of physical CPUs on the host system.
  • A positive non zero integer value sets the exact thread/process count.
  • Otherwise a single thread/process is started.
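
Read literally, those three rules can be sketched as the small function below. It only mirrors the wording of the docs quoted above and is not taken from CMake's source; the function name is made up, and the physical CPU count is passed in as a parameter rather than detected.

```shell
# Resolve an AUTOGEN_PARALLEL value to a process count, following the three
# documented rules. $1 is the property value, $2 the physical CPU count.
resolve_autogen_parallel() {
    value=$1
    physical_cpus=$2
    case $value in
        ''|AUTO)
            # Empty (or unset) or the string AUTO: number of physical CPUs.
            echo "$physical_cpus" ;;
        *[!0-9]*)
            # Anything that is not a plain non-negative integer: one process.
            echo 1 ;;
        *)
            # Positive integer: use it as is. Zero falls back to one process.
            if [ "$value" -gt 0 ]; then echo "$value"; else echo 1; fi ;;
    esac
}

resolve_autogen_parallel AUTO 16   # default on a 16-core CPU
resolve_autogen_parallel 1 16      # what -DCMAKE_AUTOGEN_PARALLEL=1 asks for
```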

Sounds like you are pretty familiar with the topic! Perhaps you’d be interested in helping improve this? :slight_smile:

https://community.kde.org/Get_Involved