Questions and X11 support

I rely heavily on X11 features (mouse control, window metadata, etc.) for accessibility. One script I use controls a touchscreen monitor that supports HDR, but I cannot use HDR because X11 has no HDR support. I also use xdotool extensively for automating tasks on my system (pointer return on a key press or after using the touchscreen, etc.). I cannot live without these features, so until kwin_wayland fully supports them (and others), Wayland is a hard no for me.
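For readers unfamiliar with "pointer return", here is a minimal sketch in Python of what such a helper might look like. It assumes xdotool is installed and the session is X11, and it uses only `xdotool getmouselocation --shell` and `xdotool mousemove`. The function names are hypothetical and this is not the poster's actual script.

```python
import subprocess


def parse_mouse_location(shell_output: str) -> tuple[int, int]:
    """Extract X and Y from the KEY=VALUE lines printed by
    `xdotool getmouselocation --shell`."""
    values = {}
    for line in shell_output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return int(values["X"]), int(values["Y"])


def save_pointer() -> tuple[int, int]:
    """Ask xdotool where the pointer currently is (X11 only)."""
    result = subprocess.run(
        ["xdotool", "getmouselocation", "--shell"],
        capture_output=True, text=True, check=True,
    )
    return parse_mouse_location(result.stdout)


def restore_pointer(position: tuple[int, int]) -> None:
    """Move the pointer back to a previously saved position."""
    x, y = position
    subprocess.run(["xdotool", "mousemove", str(x), str(y)], check=True)
```

A hotkey script would call `save_pointer()` before the touchscreen is used and `restore_pointer()` afterwards. Any Wayland replacement would need an equivalent for both calls.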

TL;DR: I use just about everything in xdotool in some way, shape, or form, and I simply cannot live without that extra functionality. (What is the roadmap for a compatibility mode for this?)

PS: NVIDIA users still use X11 AFAIK, due to shoddy Wayland support (you wouldn’t exclude parts of your userbase, would you?)

Maybe one approach would be to look into KWin Scripting.

This should have the same (or at least very similar) capabilities on X11 and Wayland.

Yeah, KWin scripts exist, but at best you’d have to reinvent a lot. There are some good reasons why a lot of the features got removed - but the window manager does have access to most of them.

I got curious about this (I’m pretty sure my own eyesight will crash within a decade) and, while doing some searches, I stumbled on kdotool (I can’t post links yet, but look up jinliu/kdotool on GitHub). It wasn’t an obvious search result, so I thought I’d mention it to you.

FWIW, as long as your GPU is supported by the nvidia-open drivers, it works with Wayland very well, by the way. Which is probably bad news - it’s not great to be the last one stuck on a platform that’s getting progressively abandoned.

kdotool is very useful and powerful.

Are these capable of what you do with xdotool?

  • ydotool - Generic Linux command-line automation tool
  • wtype - xdotool type for wayland
  • wlrctl - A command line utility for miscellaneous wlroots Wayland extensions.
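To make the comparison concrete, here is a rough Python sketch of how a script might choose between these tools for the one task they share with `xdotool type`: typing a string. It is only an illustration under assumptions: `build_type_command` and `detect_available` are made-up names, ydotool normally needs its `ydotoold` daemon running, and whether wtype works depends on the compositor supporting the protocol it relies on.

```python
import shutil

# Tools that offer a "type this text" command, tried in this order on Wayland.
WAYLAND_TYPERS = ("wtype", "ydotool")


def detect_available() -> set[str]:
    """Return the subset of known typing tools found on PATH."""
    candidates = ("xdotool",) + WAYLAND_TYPERS
    return {tool for tool in candidates if shutil.which(tool)}


def build_type_command(text: str, session_type: str, available: set[str]):
    """Build the argv list that types `text`, or None if no tool fits.

    session_type is expected to be the value of XDG_SESSION_TYPE
    ("x11" or "wayland").
    """
    if session_type == "x11" and "xdotool" in available:
        return ["xdotool", "type", text]
    if session_type == "wayland":
        if "wtype" in available:
            return ["wtype", text]
        if "ydotool" in available:
            return ["ydotool", "type", text]
    return None
```

The returned list can be handed to `subprocess.run`. Note that this covers text input only. Window queries and pointer control are separate features that these tools handle differently or not at all.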

About x.org compat… I think KDE will not abandon X11 as quickly as Ubuntu did yesterday, but my impression is that kwin-x11 support will be minimal since there aren’t that many devs willing to work on it. Idk of any official X11 deadline for KDE.

The main advantage is that it would work on X11 and Wayland.
So it could be implemented and tested while still on X11 and then likely just reused once the person switches to Wayland.

This would be true for both direct usage of the API and indirectly through kdotool.
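As a small illustration of the "write once on X11, reuse on Wayland" idea, here is a hedged Python sketch that reads the active window's title through kdotool when it is installed and falls back to xdotool otherwise. It assumes both tools accept the `getactivewindow` and `getwindowname` subcommands (kdotool mirrors xdotool's names for these). The helper names are hypothetical.

```python
import shutil
import subprocess

# kdotool is tried first because it talks to KWin on both X11 and Wayland.
PREFERRED_TOOLS = ("kdotool", "xdotool")


def pick_window_tool(available):
    """Return the first preferred tool present in `available`, else None."""
    for tool in PREFERRED_TOOLS:
        if tool in available:
            return tool
    return None


def _run(argv):
    """Run a command and return its trimmed standard output."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def active_window_title():
    """Return the title of the focused window using whichever tool exists."""
    available = {tool for tool in PREFERRED_TOOLS if shutil.which(tool)}
    tool = pick_window_tool(available)
    if tool is None:
        raise RuntimeError("neither kdotool nor xdotool found on PATH")
    window_id = _run([tool, "getactivewindow"])
    return _run([tool, "getwindowname", window_id])
```

A script built on `active_window_title()` could be tested on an X11 Plasma session today and, in principle, keep working after a switch to Wayland as long as kdotool is installed.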

I came to browse here because I too will require X11 into the future. xdotool is good, but I refer the OP to Actiona, which is dated but powerful if you dig deep. You can call xdotool through an Actiona command. If you add another layer through Albert, you get an elegant tool chain in which an Albert extension (which you can add) drives Actiona, and xdotool if you need it. I am even experimenting with an AI tool driving this chain so that an AI agent can navigate complex UI landscapes. AI interaction with Ubuntu seems to have been overlooked by the decision makers in ditching X11. An AI agent coaching new Ubuntu users, for example?

A tool that drives your whole session in a similar way you could, would probably be using the remote desktop portal.

Essentially like someone using a remote desktop tool like TeamViewer to interact with your session.

No, that is not what I have in mind. Consider a complex UI like Blender, or any other, such as Azure. Forget YouTube tutorials and dumps of static screenshots to be interpreted. Instead, an AI agent mentors the desktop user on how to navigate the Blender landscape. The user asks the AI agent, through a structured protocol (not chat), to, say, build a ??? blueprint with these parameters, and the AI agent (trained in Blender's nuances) drives Blender by downloading a navigation script that a local app runs to make Blender produce a result. That is why X11 is needed, in my mind. The protocol is roughly mapped out so far.

A human tutor would watch the student via desktop sharing and give hints, potentially control mouse and/or keyboard from time to time.

An AI tutor could do the same, no?

No need for human tutor other than orchestrating multiple “AI Agent tutors”.

Sure, that’s what I said.

Just suggesting that an AI tutor could use the same approach as a human tutor would to interact with the student’s session.

As long as you’re using X11, you might find AutoKey useful. It doesn’t add much to what you’re already doing, but it makes managing hotkeys and scripts more convenient. We are also testing a Wayland version.

It might join the symphony orchestra of UI tools to be conducted, but we (and the AI agent) also need to target images and x,y coordinates to conduct the music in a script. I’ll recruit it into the orchestra to experiment. Take a look at Sikulix to get some more ideas from another angle.

I looked at Sikulix years ago, but never actually used it. It appears to be almost unmaintained now.

I’m not entirely clear what your project is, but it sounds pretty interesting. If some of it becomes public facing, I’d like to hear about it.

Nor am I, on occasion. I have been flagged in the past for posting visual metaphors that are OT ideas, and I trust that this “brainstorms” category has a more open, flag-free community. I shall indeed post a proof of concept when I have time, which is shared with other duties. I have the hosting sandbox ready. I have the agents ready for experiments. The paperman mode is feasible.

Postscript: Today this McKinsey report, one of many, popped into my in-tray. It is the way forward. Prepare for an agentic and hybrid AI model | McKinsey

I was able to get all my needs sorted on Wayland (I found the options in the Settings app).
I would appreciate it if the Wayland-only options were visible but grayed out, so users could know that something was available but not usable.