Ideas for better Focus stealing prevention

aguzinski · March 28, 2025, 1:16pm

I’m not really satisfied with the focus stealing prevention settings. On medium, windows will steal the focus while I’m typing, and on high, applications that I just started from another application (e.g. steam) open in the background.

So I thought about how I expect this to work and came up with this idea to improve it:

fn steal(
    current,      // currently active window
    new,          // new window
    start,        // time when new.application was started
    kbd_timeout,  // user-configurable maximum time since the last keyboard activity, in ms (default: 1500 ms?)
) {
    if new.is_child_window_of(current) {
        focus(new)
    } else if time_of_last_keyboard_activity() < start {
        focus(new)         // no typing since launch: I'm waiting for the window
    } else if now() - time_of_last_keyboard_activity() > kbd_timeout {
        focus(new)         // I stopped typing a while ago
    } else if new.is_error_message() {
        put_to_front(new)  // only make it visible, but don't interrupt my typing
    }
    // otherwise: I'm still typing, so the new window opens without focus
}

Reasoning:

* Child windows of the current application were likely just requested. There are only corner cases where they are not expected to steal focus from their parent, and I can’t think of a way to tell those apart anyway.
* If the last keyboard activity was before the new application was started, I’m waiting for the window to appear.
* kbd_timeout: If I’m in the middle of typing, I don’t want to be interrupted.
* If the new window is an (important?) error message and I’m typing, I want to see it, but would like to finish typing.
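
For anyone who wants to play with these rules, here is a minimal Rust sketch of the decision, following the reasoning list above. The names (`Decision`, `NewWindow`, `decide`) are made up for illustration and are not KWin API. It also assumes the window manager can somehow tell that a window is an error message, which may not be true in practice.

```rust
use std::time::{Duration, Instant};

/// What the window manager does with a newly opened window.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    /// Give keyboard focus to the new window.
    Focus,
    /// Show the window on top, but leave focus where it is.
    RaiseOnly,
    /// Open the window without focus and without raising it.
    KeepBackground,
}

/// Hypothetical description of the new window (not a real KWin type).
pub struct NewWindow {
    pub is_child_of_active: bool,
    pub is_error_message: bool,
    /// When the application owning this window was started.
    pub app_started: Instant,
}

pub fn decide(
    new: &NewWindow,
    last_key: Instant,
    now: Instant,
    kbd_timeout: Duration,
) -> Decision {
    if new.is_child_of_active {
        // Child windows were most likely just requested by the user.
        Decision::Focus
    } else if last_key < new.app_started {
        // No typing since launch: the user is waiting for the window.
        Decision::Focus
    } else if now.saturating_duration_since(last_key) > kbd_timeout {
        // The user stopped typing a while ago.
        Decision::Focus
    } else if new.is_error_message {
        // The user is typing: show the error, but do not interrupt.
        Decision::RaiseOnly
    } else {
        Decision::KeepBackground
    }
}
```

Passing `now` and `last_key` in as parameters (instead of reading a clock inside the function) keeps the rule easy to test with made-up timestamps.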

I hope the pseudocode makes my intent clear. Please ask if not.

Is all this feasible to implement in kwin? Any other ideas?

skyfishgoo · March 28, 2025, 2:04pm

good idea

with logic like this, there wouldn’t even need to be a focus stealing setting in plasma… it wouldn’t even be an issue.

no idea how feasible it is either but it’s worth discussing.

pallaswept · March 28, 2025, 4:24pm

Funny enough, the only issue I have with focus stealing right now, is that I can’t use Firefox’s cool new ‘automatically Picture-in-Picture video when changing tab’ because the PiP window steals focus from the browser tab I just switched to, so every ctrl+tab would also need an alt+tab chasing it. I’d hardly call viewing video in the browser a corner case.

This practically never happens for me. Are you using X11 by any chance?

What are the current algorithms anyway?

crossroads · March 29, 2025, 8:51am

Same issues I have. Plus, in Plasma 6, in some cases launcher and krunner are not activated in “High” settings, which makes them useless.

Yes it is X11, but I don’t need Wayland, and it doesn’t look good in my distro. Though I plan to change it soon.

shorberg · March 29, 2025, 9:58am

I like your idea.

And I did some further brainstorming. How about adding a user setting for situations where the algorithm can’t be sure, the situation is ambiguous, or, as you have in the final branch there, an error message appears?

When unclear what to focus:
* Keep old focus
* Focus new

I’m leaning toward “Focus new” being the best default, but that way each user could decide for themselves which behavior they prefer in the edge cases.
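
As a rough sketch of how such a setting could plug into the algorithm (all names here are hypothetical, nothing like this exists in KWin as far as I know): the algorithm returns a three-way verdict, and the user setting is consulted only for the unclear case.

```rust
/// User setting for the cases the algorithm cannot decide.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WhenUnclear {
    KeepOldFocus,
    FocusNew,
}

impl Default for WhenUnclear {
    /// "Focus new" as the suggested default.
    fn default() -> Self {
        WhenUnclear::FocusNew
    }
}

/// Result of the (hypothetical) focus algorithm for a new window.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    FocusNew,
    KeepOldFocus,
    Unclear,
}

/// Returns true if the new window should receive focus.
pub fn should_focus_new(verdict: Verdict, setting: WhenUnclear) -> bool {
    match verdict {
        Verdict::FocusNew => true,
        Verdict::KeepOldFocus => false,
        // Only the ambiguous case is left to the user's preference.
        Verdict::Unclear => setting == WhenUnclear::FocusNew,
    }
}
```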

aguzinski · March 30, 2025, 7:16am

Funny enough, the only issue I have with focus stealing right now, is that I can’t use Firefox’s cool new ‘automatically Picture-in-Picture video when changing tab’ because the PiP window steals focus from the browser tab I just switched to, so every ctrl+tab would also need an alt+tab chasing it. I’d hardly call viewing video in the browser a corner case.

Hm... that sounds more like a bug in Firefox to me. It should be able to control focus changes between its own windows.

On medium, windows will steal the focus while I’m typing

This practically never happens for me. Are you using X11 by any chance?

Nope, I’m on Wayland. This happens when I start something that needs a while to open, but continue working instead of waiting until the window pops up.

What are the current algorithms anyway?

That’s a good question. I hope one of the kwin developers will join in and explain.

Same issues I have. Plus, in Plasma 6, in some cases launcher and krunner are not activated in “High” settings, which makes them useless.

I never had that problem - maybe this only happens on X11?

[..] How about adding a user setting for situations where the algorithm can’t be sure, the situation is ambiguous, or, as you have in the final branch there, an error message appears? [..]

You’re right - this should be configurable. I’m not sure if there are any ambiguous situations, though.

shorberg · March 30, 2025, 8:32am

I was thinking along the lines of the above: for a human it might be mostly obvious what to do, but it might not be for the algorithm.

pallaswept · March 30, 2025, 10:32am

Maybe it is, but the context here is important. Your primary point of reasoning which I quoted above that quote there (my emphasis):

child windows of the current application were likely just requested. There are only corner cases where they are not expected to steal focus from their parent

The PiP window in a browser is one example where they are not expected to ‘steal’ focus from their parent.

Another example that comes to mind is 3D editors, opening a new viewer. IDK, my point is, I’m not really sure that reasoning will hold. It doesn’t take long to imagine examples where it’s not true. It’s just kinda funny that one of those examples is the only problem I have right now.

Just a tip: this conversation will get really confusing shortly if we continue to conflate the terms ‘taking’ focus and ‘stealing’ focus as just always ‘stealing’.

You said in OP

But this

The oddity there isn’t focus stealing, because you launched the application, you gave it focus. It’s how you’re able to send input elsewhere when it should have focus, that’s the oddity. You should have to click on your original window to shift focus away from the application you just launched, to be able to keep typing, and focussing the original window should have caused the new one to appear in the background.

It seems like maybe you’re trying to fix something when the problem is actually something else, but I’m not sure. It might help if we had some way to actually reproduce this behaviour. Could you give an example that others could try?

aguzinski · March 30, 2025, 3:40pm

[..] The PiP window in a browser is one example where they are not expected to ‘steal’ focus from their parent. [..]

It would be interesting to know how exactly the window protocols work. I’m assuming that every new window requests focus (no further information provided), and then the window manager decides what to do. In that case, the window manager should simply focus the new window.

If the new window should not have the focus, the application needs to handle this - either by moving focus back to its main window (I think this is possible - it kind of would make the main window a modal dialog of its child), or by passing all input events from the child to the main window. In that case it’s a bug in the application because it’s not doing that. But what if the protocols are different? Is kwin not acting on some information? I really don’t know.

Just a tip: this conversation will get really confusing shortly if we continue to conflate the terms ‘taking’ focus and ‘stealing’ focus as just always ‘stealing’.

&

The oddity there isn’t focus stealing, because you launched the application, you gave it focus.

I adopted the term stealing because the relevant setting is called “Focus stealing prevention”.
But what really is the definition here?

When I start the application, the window is not there, so when it finally pops up, focus moves to it. I think this could be interpreted as either - depending on how the user feels about it.

It might help if we had some way to actually reproduce this behaviour. Could you give an example that others could try?

One example for me is steam, as it needs quite a while before opening its main window - especially when it decides to run an update.
So let’s say I’m coding, but soon I’ll be done and would like to play a game after that. Then I will start steam via krunner and go back to work.

Another one is if I start some GUI application I’m developing via RustRover. It may take a while to compile, so I can do some other stuff while waiting.

In both cases, I’d like to keep focus where I am and switch to the new window later.

pallaswept · March 31, 2025, 3:10am

It’s pretty much in line with the usual idea of giving/taking/stealing, and the ‘rightful owner’ is defined by user interaction (clicking, hovering, typing, etc)

That’s a bug. When you start the application, the application should have focus from that moment that you clicked the icon or pressed enter. Even if it hasn’t created a window yet. Imagine you start some app which takes keyboard input and never creates a window. It should be given focus immediately, so you can type, and that app would see the keypresses, right? So why doesn’t this one?

You’re giving the application focus, because you started it, so it’s not ‘stealing’… But why (how?!) steam isn’t taking focus immediately, and somehow allows you to type into another window, without taking focus away from steam (so its windows start in the background and don’t steal) is the weird thing.

Put another way: The problem here isn’t when the steam window appears. It’s before that. The problem is when it doesn’t give focus to steam immediately. It seems like this is normal behaviour because I can repro it right here with kate for example.

If I start kate using krunner or Application Launcher widget, I press enter or click, and start spamming keys real quick, I can get a few keypresses into this window before the kate one appears.

Kate should have focus at that moment that I press enter to start it, and I should have to click in this window to give it focus, so that I can type into it, and that should mean that focus stealing prevention will cause kate’s windows to be created in the background.

Because kate isn’t given focus immediately, I can’t take it away by clicking here to type. This is the real problem here.

To put this into the context you gave: Let’s say you’re coding and you’ll be finished soon. You start steam. Steam should have focus now, because of your interaction which started it. So when its window appears, it should still have focus. But you want to keep typing, so you click in your IDE, and keep typing. Now, your IDE has focus, because of your interaction to switch to it, and focus stealing prevention should ensure that steam windows which appear afterwards, without any interaction, do not steal that focus (stealing, not giving, because you gave focus to something else, the IDE, by clicking on it, and not to steam, because you didn’t click on it).

The ‘where I am’ here, is (or, should be) the newly started application. It’s not your IDE, you interacted with KDE to start krunner, thus giving it focus, and used it to start a new app, giving that app focus. Focus should only be returned to the original app (your IDE) if you cancel out of there.

But somehow, focus is not being given to the application when you interacted with it (started it) but is being ‘stored up for later’, or something. Focus is being returned from krunner/kickoff, to the app which had focus previously, your IDE, when it should be given to the application you just launched from it. That’s the problem.
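
To make the model I’m describing concrete, here is a toy Rust sketch (purely illustrative: `FocusModel` and friends are invented names, and this is not how kwin is implemented). Launching an app hands focus to it immediately, even before it has a window. Clicking another window is the only thing that takes focus away again.

```rust
/// Who currently owns keyboard focus in this toy model.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FocusHolder {
    /// An existing window, identified by name.
    Window(String),
    /// An application the user just launched that has no window yet.
    PendingApp(String),
}

pub struct FocusModel {
    holder: FocusHolder,
}

impl FocusModel {
    pub fn new(active_window: &str) -> Self {
        FocusModel { holder: FocusHolder::Window(active_window.to_string()) }
    }

    /// The user launches `app` (e.g. from krunner): focus moves to it at once.
    pub fn launch(&mut self, app: &str) {
        self.holder = FocusHolder::PendingApp(app.to_string());
    }

    /// The user clicks an existing window: an explicit focus change.
    pub fn click(&mut self, window: &str) {
        self.holder = FocusHolder::Window(window.to_string());
    }

    /// `app` opens its first window. Returns true if that window gets focus.
    pub fn window_mapped(&mut self, app: &str, window: &str) -> bool {
        if self.holder == FocusHolder::PendingApp(app.to_string()) {
            self.holder = FocusHolder::Window(window.to_string());
            true
        } else {
            // The user moved on, so the window opens in the background.
            false
        }
    }

    pub fn holder(&self) -> &FocusHolder {
        &self.holder
    }
}
```

In this model, "start steam, then click the IDE and keep typing" leaves the IDE as the holder, so the steam window that appears later has nothing to steal.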

You can see why I originally suggested that there be some understanding of the existing algorithm. You’re trying to fix something, and I’d like to help (because focus stealing sucks!) but neither of us has a solid idea what’s broken or how, so we’re not really in a position to try and fix it yet. We’ll get there!

johnandmegh · April 5, 2025, 11:37pm

For what it’s worth, the discussion on this bug report seems related: 488060 – Focus stealing prevention window rules are not enforced for the native Wayland apps

aguzinski · April 6, 2025, 9:26am

@pallaswept: Off-topic, but it occurred to me that I might have you at a disadvantage (assuming you use the same username on GitHub). I’m kermitfrog over there. And if you’re wondering about UIO - I haven’t forgotten or given up on it, but my head needed a serious break from that project.

It’s pretty much in line with the usual idea of giving/taking/stealing, and the ‘rightful owner’ is defined by user interaction (clicking, hovering, typing, etc)

Ok. Let’s go with that.

When I start the application, the window is not there, so when it finally pops up, focus moves to it.

That’s a bug. When you start the application, the application should have focus from that moment that you clicked the icon or pressed enter. Even if it hasn’t created a window yet. Imagine you start some app which takes keyboard input and never creates a window.

This is difficult to imagine. I can think only of a few possibilities for such an app:

* TUI - this runs in a terminal emulator, so focus is handled there.
* Input remapping / daemon that handles global shortcuts / short-lived script that does something based on the next keys you press. These accept input and don’t have a window. But they also don’t have focus!
* Something that does not give feedback to the user. Terrible UX - let’s never do this!
* Something that gives feedback on an external device that is independent of the window manager. This one might be valid - but how do you handle focus changes?

In my mind, focus is tied to a UI. Without a UI, you don’t have focus. You can grab the input directly, but that is not the same, as it circumvents the distribution of input events that the user can control through the window manager.

It should be given focus immediately, so you can type, and that app would see the keypresses, right?

I disagree on this one.
I think what you mean is that the user will expect the app they started to accept input immediately - and there is some merit to this from the UX point of view, but I also see problems.
Without a window giving feedback, I don’t know what that input does and that is bad UX.

Because kate isn’t given focus immediately, I can’t take it away by clicking here to type. This is the real problem here.

I think I get your point, but feel that this is not practical.

The problem is that when the application starts, we can’t predict how many windows it will open and when. It might not open any windows and not accept user input at all. This has to be decided at runtime. So before the window is actually opened, it’s not clear if it even wants focus, so it can’t get it.
Kate could immediately open a placeholder window to allow switching away, but that does not change behaviour for other applications.

For what it’s worth, the discussion on this bug report seems related: 488060 – Focus stealing prevention window rules are not enforced for the native Wayland apps

Thanks. I think it’s related in the way that what we’re discussing here might make the buggy functionality (per-window focus stealing settings) obsolete for most use cases.

pallaswept · April 19, 2025, 1:28pm

Oh hey totallynotkermit! Nice to see you again. Small world!
Sorry I took forever to reply, I’m not really visiting these (or any) forums any more, just popped by to get this done.
Anyway, this is all in good hands without my input, so I guess this is all just for conversation and contemplation’s sake.

Agreed! Such an app might be pretty awful. For purposes of conversation we might imagine an app that sends output to a printer or blinks keyboard lights or plays sounds or something silly, but it wasn’t really intended as an example of a real application, but as a demonstration of why the logic will fail if we don’t do it this way.

More of a thought experiment. The point is that the time between starting the app and opening a window is unbounded and unknown, so we kinda have to treat it like it’s possibly infinite if we want things to work (and also possibly zero and everywhere in between, of course).

Sorry, what I meant (I worded it badly) was that the app should receive input as soon as it’s ready. Like, once you start an application, and let’s assume that application doesn’t have any delay and is ready immediately, and you start typing immediately, you expect that typing to go into that app.

This is real simple if we’re talking about an app that starts in 5ms, because we press enter and bam, it’s there. But for an app that takes 5s (like steam), what can we do? Do we wait until it is ready? Well, that obviously sucks, as you’ve noticed. Otherwise what, wait some weird amount of time in between when we start it and when it is ready, and then swap? Swap at some time after it’s ready? Then we would have a whole new problem. Obviously the only choice is to give focus immediately.

This is the kind of thought experiment I was doing earlier. This is a good one.

Yeh, maybe our app opens a window in 2 seconds, or 5, or two windows, or maybe it renders a 3D model for 20 minutes and then creates a window to show it, or maybe it just never shows a window at all.

In any of those cases it’s fine if it is given focus immediately because it could still be accepting keyboard input; and as above, probably best that way because a delay of giving it focus until the window finally does appear, will feel like stealing.

And if it has a window but doesn’t accept input, it should still take focus so we can send WM commands to it, like closing or minimising the window…

But what about when it doesn’t (yet, or forever) have a window AND does not take keyboard input? I wouldn’t want that to immediately take focus if I run sleep 9000

But in my experiments here, it already doesn’t. An app like that effectively runs as a systemd service in its own unit containment with disconnected standard IO and stays that way until it ends, never taking focus.

I guess a more direct way of saying this is: we should not wait until the output (window) is ready, before we direct input to (focus) the application; because the application might be ready for input even if there is no output; and even if it’s not ready, the experience of a ‘delayed focus’ will be pseudo-focus-stealing as you’ve reported here; and even if it never will be ready, plasma can never know that. So we need to send input to the app immediately, window or no…

This works correctly for all apps with or without windows OR with or without input AND those which have either but delay in using them. BUT if you run an app with no UI at all, it will still get focus - which makes perfect sense because the system can never know beforehand, whether the UI is never coming or just delayed.

At the moment, it appears to be assuming there will be no UI (it doesn’t shift focus) until it sees one, which strikes me as kinda weird since it’s so rare and basically operator error to actually run something with no UI which doesn’t return in a very short time (I mean, one should not run a service from krunner, that’s for systemd), and there’s no way to avoid the behaviour you observed unless it’s done the other way, assuming the UI is coming, and shifting focus accordingly.

Anyway, sorry, that was a bit long. Kinda got a few gears grinding in my brain here courtesy of your “what about no UI at all” thought experiment. I guess this whole message is just an explanation of my answer to that: “I don’t think we care about that”.

You’ve presented a really interesting conundrum because logically there’s no other choice here - we either have to endure focus stealing as you’ve reported, or we have to endure that if we run a service from our UI then our previously active app will lose focus and we will have to focus it manually. I think the latter is the best choice.

Always nice to talk with you.

aguzinski · May 1, 2025, 11:04am

No problem - I sometimes need a long time to answer as well.

You’ve presented a really interesting conundrum because logically there’s no other choice here - we either have to endure focus stealing as you’ve reported, or we have to endure that if we run a service from our UI then our previously active app will lose focus and we will have to focus it manually. I think the latter is the best choice.

We’ll have to disagree on this one then. I totally think that focus stealing is the lesser evil here.

Always nice to talk with you.

Same here ^^

pallaswept · May 2, 2025, 5:15am

I found that equal parts surprising and confusing. Try as I might, I can’t imagine why anyone would ever run a service from the UI, let alone care more about doing that than about the bug you’ve reported here? I feel like I’m missing something.

To be clear, when I say “run a service from the UI”, I’m not talking about sudo systemctl start exampled. That should work fine. So would exampled --daemonize, because they both exit after the daemon starts. I’m talking about just opening krunner and typing in exampled and leaving it running forever. For a more realistic example, this would be starting krunner and typing in pipewire and selecting ‘Run pipewire’ from the Command Line section. (PS don’t do that, I just tried it to insanity check it, and bad things happened.)

I’m sitting here racking my brain about this, trying to imagine why someone would want to do this and the more I think about it the more I realise it’s really unsupported behaviour in lots of ways. For example, this hypothetical service started from the UI, would be in entirely the wrong cgroup for a service. The OOM killer would treat it inappropriately for a service. The CPU scheduler would too. (all because of the way it is started it’s indistinguishable from a foreground app) As it is now, we have this feature non-working, AND focus stealing.

Don’t get me wrong, if that is an important or common (or even desirable in any way) feature for some reason then, so be it, but… is it?

aguzinski · May 2, 2025, 6:56am

There are 2 reasons why I prefer focus stealing. One of them is that I feel the bother of having to manually click the application I was just using is greater than that of a stolen focus - at least if we employ the solution I proposed.

The other is that it’s not just directly started services we’re talking about. There are also:

* applications that start as a tray icon
* background applications (e.g. scripts) that perform some operation in the background
* anything started from a shell
* services started by GUI applications
* …

There are also services that run in userspace, like most of the input remappers, although these would usually be started by one of the methods mentioned above.

If we’re just talking about using the focus-immediately behaviour on stuff started from krunner or the menu, it might work well enough to be ok (but I’d still prefer focus stealing). But what about the rest?
On a system like Android, which had established mechanisms to distinguish foreground and background applications right from the start (or at least as far back as I remember), this might be achievable. But transforming Linux to that point is a LOT of work and at least involves patching all GUI applications to announce themselves as foreground.

krake · May 2, 2025, 8:17am

A service is unlikely to open a window and thus not even be involved in the focus chain.

Services that do have the capability of opening a window will often do so on request by a client. Which will in most cases have a window and be able to transfer focus via activation token if so desired.

There might be services that can open windows but either lack the capability to transfer such a token or really open them without a client request.
In those cases it is IMHO better to not give them focus automatically but wait for the user to decide when to interact.
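
To illustrate the token idea, here is a deliberately simplified Rust model. It is loosely inspired by how activation tokens work on Wayland (the xdg-activation protocol), but the `Compositor` type and its methods are invented for this post and are not KWin’s API. The real protocol has more details, such as input serials and token expiry, that are left out here.

```rust
use std::collections::HashSet;

/// Toy compositor that only tracks focus and outstanding tokens.
pub struct Compositor {
    focused: String,
    valid_tokens: HashSet<u64>,
    next_token: u64,
}

impl Compositor {
    pub fn new(focused: &str) -> Self {
        Compositor {
            focused: focused.to_string(),
            valid_tokens: HashSet::new(),
            next_token: 1,
        }
    }

    /// Only the client that currently has focus may obtain a token
    /// (a simplifying assumption of this sketch).
    pub fn request_token(&mut self, client: &str) -> Option<u64> {
        if client != self.focused.as_str() {
            return None;
        }
        let token = self.next_token;
        self.next_token += 1;
        self.valid_tokens.insert(token);
        Some(token)
    }

    /// A window asks to be activated. Returns true if focus was transferred.
    pub fn activate(&mut self, window: &str, token: Option<u64>) -> bool {
        if let Some(t) = token {
            // A token is single-use: it is removed when it is redeemed.
            if self.valid_tokens.remove(&t) {
                self.focused = window.to_string();
                return true;
            }
        }
        // No token or an unknown token: leave focus with the user's window.
        false
    }

    pub fn focused(&self) -> &str {
        &self.focused
    }
}
```

A client with focus requests a token and passes it to the service, and the service presents it when its window appears. A window that shows up without a valid token simply does not get focus, which matches the last case above.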