Hi everyone,
I’ve built an open-source MCP server called kwin-mcp that enables AI agents to perform full GUI automation on KDE Plasma 6 desktops. It runs inside completely isolated KWin Wayland sessions — no windows appear on your host display, no input leaks out.
How it works
Each session creates three layers of isolation:
- A private D-Bus bus via dbus-run-session
- A virtual KWin Wayland compositor via kwin_wayland --virtual
- Input injection scoped to that compositor via KWin’s EIS D-Bus interface
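To make the first two layers concrete, here is a minimal Python sketch of how such a session could be launched. It only uses the two commands named above; the function names are hypothetical, and the real server most likely passes extra options (socket name, resolution, environment) that are not shown here.

```python
import shutil
import subprocess


def build_session_command() -> list[str]:
    """Command line for a private D-Bus bus wrapping a virtual KWin compositor."""
    # dbus-run-session starts a fresh session bus and runs the given program
    # inside it; kwin_wayland --virtual renders to a virtual (headless) output.
    return ["dbus-run-session", "--", "kwin_wayland", "--virtual"]


def start_isolated_session() -> subprocess.Popen:
    """Start the isolated session (hypothetical helper, not kwin-mcp's API)."""
    cmd = build_session_command()
    # Fail early with a clear message if either binary is missing.
    missing = [exe for exe in ("dbus-run-session", "kwin_wayland")
               if shutil.which(exe) is None]
    if missing:
        raise FileNotFoundError(f"required executables not found: {missing}")
    return subprocess.Popen(cmd)
```

Because the compositor is a child of the private bus, stopping the returned process tears down the whole session at once.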
The server then exposes 29 tools through MCP (Model Context Protocol): mouse input, keyboard, multi-touch gestures, screenshots via ScreenShot2 D-Bus, accessibility tree queries via AT-SPI2, clipboard, window management, and generic D-Bus calls.
Why KWin?
KWin is the only Wayland compositor I’m aware of that exposes an EIS (Emulated Input Server) interface via D-Bus. This is critical because it provides a clean path for input injection without triggering XDG RemoteDesktop portal authorization dialogs. Since we own the isolated session, we can bypass the portal entirely.
What you can do with it
- AI-driven GUI testing: Let Claude Code or other MCP clients interact with KDE applications
- Headless automation: Run GUI workflows in CI/CD without a display
- Accessibility inspection: Query the AT-SPI2 widget tree programmatically
Real-world use case: E2E testing a KDE Plasma dock
I’m actually using kwin-mcp myself to develop krema, a dock for KDE Plasma 6. I develop it with Claude Code, and kwin-mcp handles automated E2E testing — launching the dock in an isolated KWin session, clicking icons, verifying window previews, testing drag-and-drop, all without touching my actual desktop.
[Image: krema running on my desktop]
This is the kind of workflow kwin-mcp enables: write a KDE app, and let an AI agent test its GUI automatically in an isolated environment.
Screenshots & performance
Screenshot capture runs at ~30-70ms per frame through KWin’s ScreenShot2 D-Bus interface. Any action tool accepts a screenshot_after_ms parameter for burst frame capture — you can capture multiple frames after a click to observe UI transitions without extra round-trips.
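As a rough illustration, an MCP client calls a tool with a JSON-RPC 2.0 "tools/call" request. The sketch below builds one in Python. Only screenshot_after_ms comes from kwin-mcp itself; the tool name mouse_click, the x/y arguments, and the list-of-delays value are assumptions for illustration, so check the tool schemas the server reports for the real names and types.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical click that also asks for frames after the action.
# The tool name, coordinates, and delay values are illustrative assumptions.
request = build_tool_call(
    1,
    "mouse_click",
    {"x": 200, "y": 120, "screenshot_after_ms": [0, 100, 300]},
)
print(json.dumps(request, indent=2))
```

The point of the design is that the click and the follow-up frames travel in a single request, so the agent does not need separate screenshot calls to watch an animation.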
Installation
pip install kwin-mcp
# or
uv tool install kwin-mcp
Requirements: KDE Plasma 6+, Python 3.12+, at-spi2-core
- GitHub: isac322/kwin-mcp
- PyPI: kwin-mcp
Limitations
- KDE Plasma 6+ only — relies on KWin-specific D-Bus interfaces
- US QWERTY layout for direct keyboard_type; Unicode input works via wtype
- AT-SPI2 coverage varies by application
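To illustrate the keyboard limitation, here is a small sketch of how a caller might decide which input path a string needs. It assumes that "typable on US QWERTY" means printable ASCII plus space, tab, and newline; that rule and the function name are my own simplification, not kwin-mcp's actual logic.

```python
import string

# Characters assumed to be reachable on a US QWERTY layout (an assumption).
DIRECT_CHARS = frozenset(
    string.ascii_letters + string.digits + string.punctuation + " \t\n"
)


def choose_input_path(text: str) -> str:
    """Return 'keyboard_type' for plain US QWERTY text, else 'wtype'."""
    if all(ch in DIRECT_CHARS for ch in text):
        return "keyboard_type"
    # Anything outside the layout (accents, CJK, emoji) needs the Unicode path.
    return "wtype"


print(choose_input_path("hello, world"))
print(choose_input_path("héllo"))
```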
I’d love feedback from anyone who works on KWin internals or has experience with the EIS protocol. Are there plans to make EIS more broadly available across KDE applications?
