Random display corruption when playing certain games (with `kwin_wayland` traces) across multiple OSes

As "Random Indefinite System Hang and crash - Fedora Discussion" and "Random Indefinite System Hang and crash - #8 by rokejulianlockhart - Fedora Discussion" explain:

    1. Problem

      My display apparently freezes at random when I’ve been using anything in fullscreen for a while. This can be a Steam game, or even (yesterday) a video in Firefox. I thought that this was due to package corruption in my old openSUSE Tumbleweed installation. However, this appears to occur on #Fedora-KDE #Workstation-WG #F40 too, barely a day after installation.

      Nothing appears in journalctl -b -1 because the system does not even respond to SysRq commands, necessitating that I use the motherboard’s reset switch, so I have no useful logging. Leaving the system alone (at the times when SysRq doesn’t work) appears to cause it to hang indefinitely.

      I am confident that this relates to my hardware somehow.

    2. Cause

      It appears to coincide with new hardware. However, a kernel change could well have occurred when I reinstalled to my new hardware, too:

      is all which I know changed when this first occurred.

    3. Related Issues

      1. Prism Launcher consistently kills X, sometimes Wayland too. · Issue #2139 · PrismLauncher/PrismLauncher · GitHub
      2. Where to report XWayland crash? The cause matters, right? - Applications - openSUSE Forums
      3. I’ve not reported it yet to the Mesa issue tracker on GitLab [1], but I intend to.

    4. Workarounds

      Not rendering anything in fullscreen appears to work.

  1. After it stopped occurring in War Thunder a few months ago, I assumed that this problem had been remediated. However, a week ago, it occurred once in WT, albeit not since. It worried me, but not enough to consider the problem recurrent.

    However, I just tried Prism Launcher for the first time since I reported this, and it occurred in almost the same way, as https://imgur.com/a/b6Cpbg1 demonstrates:

    Notably, though, this time everything except the display server remained interactive (because it hung and then didn’t reboot; it just kept refreshing the display to a more corrupted state every 5 s), so I was able to use the command from Revision - Super User:

    skill -KILL -u RokeJulianLockhart
    

    as root on tty3 (as I’ve explained at "Update available" notification for shell - #5 by beedellrokejulianloc - snapd - snapcraft.io) and then logged in again, without issue. I’m submitting this comment on the same boot.

    I’m confident that this must be an issue with some fundamental OS software, be it the compositor (KWin) or the GPU drivers (Nouveau, I think), but I have no idea how to diagnose it, and it’s rendering almost all Linux-based OSes unusable for a significant number of games on my hardware.

    Is

    the correct course of action?


    Relevantly, as those videos demonstrate, I need to rename this post to “Random (previously indefinite) system hang and crash”, but can’t due to its age.
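
For reference, the recovery step quoted above (killing every process owned by the affected user from another TTY) can be sketched with `pkill`, which the procps `skill` man page itself recommends over `skill`. This is only a sketch under assumptions: the function name `kill_user_processes` and the `DRY_RUN` switch are hypothetical and exist purely for illustration, and the sketch defaults to printing the command instead of running it.

```shell
#!/bin/sh
# Sketch: end a hung graphical session by killing all of one user's processes.
# Intended to be run as root from another TTY (tty3 in the quoted post).
# kill_user_processes and DRY_RUN are hypothetical names used only in this sketch.

kill_user_processes() {
    target_user="$1"
    if [ -z "$target_user" ]; then
        echo "usage: kill_user_processes USER" >&2
        return 2
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only show what would be executed.
        echo "pkill -KILL -u $target_user"
    else
        # Send SIGKILL to every process whose effective user ID matches USER.
        pkill -KILL -u "$target_user"
    fi
}
```

Calling `kill_user_processes RokeJulianLockhart` prints the command; `DRY_RUN=0 kill_user_processes RokeJulianLockhart` would actually run it, ending the session without rebooting.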

I’m asking for help here because although this first occurred on openSUSE Tumbleweed after a system update, it’s remained since I installed Fedora 40 as its replacement. This leads me to believe that it’s upstream somewhere, but I have little idea where.

After a lot of deliberation, I’ve concluded that this is the most likely place to ask, considering that KWin under Wayland acts as both compositor and display server, and because it’s the primary commonality between the OSes I’ve used thus far.


  1. Issues · Mesa / mesa · GitLab ↩︎

I’ve also reported this as a bug at 488989 – "amdgpu DRM/radeonsi Mesa: X and Wayland die when using Prism Launcher or War Thunder, sometimes disabling user input entirely", because all indications point to it being one. I’m mostly asking here for help with workarounds and diagnosis methods.
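
As a starting point for diagnosis, on the occasions when the system stays responsive enough to shut down cleanly, something like the following might capture what the kernel and KWin logged during the affected boot. It is a rough sketch, not a verified procedure: the `run` helper, the `DRY_RUN` switch, and the function name are hypothetical, and it assumes that the journal is persistent and that systemd-coredump is enabled.

```shell
#!/bin/sh
# Sketch: gather logs from the previous boot for attaching to a bug report.
# run, DRY_RUN and collect_previous_boot_logs are hypothetical names for this sketch.

run() {
    # Print the command instead of executing it unless DRY_RUN=0.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

collect_previous_boot_logs() {
    # Kernel messages from the previous boot (-b -1), where GPU driver errors would appear.
    run journalctl -k -b -1 --no-pager
    # Messages logged by the kwin_wayland process during that boot.
    run journalctl -b -1 --no-pager _COMM=kwin_wayland
    # Crash dumps of kwin_wayland recorded by systemd-coredump, if any exist.
    run coredumpctl --no-pager list kwin_wayland
}
```

By default the function only prints the three commands; `DRY_RUN=0 collect_previous_boot_logs > previous-boot.log` would run them and save the output. After a hard hang that required the reset switch, the previous-boot journal may simply be empty, as described in the quoted post.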

Considering how visually similar this is to the display corruption depicted in Ubuntu GNOME - Corrupted screen at boot...HELP! - Linux - Neowin, perhaps this issue lies even further upstream. I’ve been putting off reporting to Mesa, despite the openSUSE devs suggesting it, because I wanted to go through the downstream/devolution process correctly, but I’m not sure I can tolerate a partially unusable GPU for almost a year.

Per comment #3 on the bugs.kde.org counterpart:

I just installed Windows 11 and played Prism. It didn’t crash for the 6 hours I played. This is undoubtedly a software regression in the graphical stacks utilized by Fedora and OSTW (of course, with the commonality of KDE Plasma 5 and 6 accounting for most of that stack, being the DE).

I’m gonna keep most new findings there, since this is evidently a regression.