I’m trying to write a Python script that recognizes objects in images of specific windows/areas on the screen. My idea is to get the raw image data through the org.kde.KWin.ScreenShot2 interface (or is there a better way?) and pass it to the machine learning part. However, when looping through screenshots, the script occasionally gets stuck at the line f.read(expected_size). I added a breakpoint just before that line, and the D-Bus call does return a dbus.Dictionary (containing format/height/scale/stride, etc., as described in the KWin source file).
I then checked the Qt documentation: QImage::Format_ARGB32_Premultiplied is indeed 4 bytes/pixel, so the read size should be 4 * width * height bytes. I also tried reading the data byte by byte, but then the script runs very slowly (~80fps → ~30fps).
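For reference, the size arithmetic can be sketched as below. This is only an illustration: the helper name `expected_byte_count` is made up, and it assumes the returned dictionary exposes `width`, `height`, and `stride` as values convertible to `int`. When a stride (bytes per row) is reported it is used directly, in case rows are padded; otherwise the sketch falls back to 4 bytes per pixel.

```python
def expected_byte_count(meta: dict) -> int:
    """Return how many bytes a 4-bytes-per-pixel image should occupy."""
    width = int(meta.get("width", 0))
    height = int(meta.get("height", 0))
    # Prefer the reported stride (bytes per row); fall back to 4 bytes per pixel.
    stride = int(meta.get("stride", 4 * width))
    return stride * height


# Hypothetical metadata, not taken from a real capture.
print(expected_byte_count({"width": 2, "height": 3}))
print(expected_byte_count({"width": 2, "height": 3, "stride": 16}))
```

If the two numbers differ for a real capture, a read sized from `4 * width * height` would ask for fewer bytes than were actually written.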
Am I missing some detail about D-Bus? Or should I change the way the fd object is read in Python?
Thanks in advance
import os

import dbus


class AreaMonitor:
    def __init__(self, interval_ms: int = 1000):
        self.bus = dbus.SessionBus()
        self.kwin_interface = dbus.Interface(
            self.bus.get_object(
                bus_name="org.kde.KWin.ScreenShot2",
                object_path="/org/kde/KWin/ScreenShot2",
            ),
            dbus_interface="org.kde.KWin.ScreenShot2",
        )
        self.interval = interval_ms / 1000.0
        self._capture_options = {
            "include-cursor": dbus.Boolean(False),
            "native-resolution": dbus.Boolean(True),
        }

    def capture_area(self, x: int, y: int, width: int, height: int):
        read_fd, write_fd = os.pipe()
        try:
            write_fd_dbus = dbus.types.UnixFd(write_fd)
            results = self.kwin_interface.CaptureArea(
                dbus.Int32(x), dbus.Int32(y),
                dbus.UInt32(width), dbus.UInt32(height),
                self._capture_options,
                write_fd_dbus,
            )
        except Exception:
            # The call failed, so nothing will read from the pipe.
            os.close(read_fd)
            raise
        finally:
            # Close our copy of the write end exactly once, on success or failure.
            os.close(write_fd)
        image_width = results.get("width", 0)
        image_height = results.get("height", 0)
        expected_size = image_width * image_height * 4
        with os.fdopen(read_fd, 'rb') as f:
            # This is the read that occasionally gets stuck.
            image_data = f.read(expected_size)
        return image_data
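One variation I could try for the read step is reading from the raw descriptor in larger chunks and stopping at EOF, instead of either a single `f.read(expected_size)` or byte-by-byte reads. The sketch below is only an assumption about what might help, not a confirmed fix: `read_exact` is a hypothetical helper name, and the demo uses a local pipe in place of the KWin side.

```python
import os


def read_exact(fd: int, size: int, chunk_size: int = 1 << 16) -> bytes:
    """Read up to `size` bytes from `fd`, stopping early at EOF."""
    buf = bytearray()
    while len(buf) < size:
        chunk = os.read(fd, min(chunk_size, size - len(buf)))
        if not chunk:  # EOF: every write end of the pipe has been closed
            break
        buf += chunk
    return bytes(buf)


# Demo with a local pipe standing in for the KWin side.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"\x00" * 32)  # pretend this is a tiny image
os.close(write_fd)                # no writer left, so EOF follows the data
data = read_exact(read_fd, 64)    # deliberately asks for more than was written
os.close(read_fd)
print(len(data))
```

Like `f.read`, this still blocks if fewer than `size` bytes arrive while some write end of the pipe stays open. The dbus-python documentation describes `UnixFd` as keeping its own `dup()` of the descriptor, so it may be worth checking whether `write_fd_dbus` is what keeps the write end open in this process.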