I made a (separate) tool for labeling detected faces

Hi digiKam enthusiasts!

Some background: I use the two ML models built into digiKam to detect faces and then recognize the people. This gives me suggestions for new faces that belong to an existing person, but also a whole bunch of faces that it didn’t recognize at all. I have to go through those manually and tell it what the right label is.

I find the UX for labeling detected faces (not for adding new face rectangles to images, I’m not doing that and can’t comment on that) to be quite cumbersome. It’s not fundamentally wrong or broken, but there are a number of little issues that make it rather difficult for me to use. I could file bugs in the bug tracker, but getting all of them fixed would probably take a while. To highlight only some of them:

  • Whenever I confirm a suggestion at the top of a person’s page, it takes about 1s to save that change (and I have quite a fast and modern machine). Maybe it’s because I have so many images and faces.
  • Sometimes, the view additionally jumps, i.e. scrolls down a bit. I suspect that when the image I just labeled happens to be near the viewport afterwards (as it got removed from the “unconfirmed” area and added to the “confirmed” area below), the UI scrolls to it. So after clicking the checkmark button, I need to wait 1s, scroll up, and then repeat for the next image. This prevents me from just hammering the mouse button when a bunch of suggestions in a row are correct. Of course I could use Shift+click selections and then confirm, but those are different acrobatics from what I do to confirm individual faces, so I have to move my hands around for that.
  • When I enter a different name for one of the faces, that name becomes the default suggestion for all the remaining faces, even though the AI detected them to be the originally suggested person. So I have to type the original person’s name again to continue.
  • In general, I always have to type when the currently suggested person isn’t the right one, even though there are only a handful of different people in my photos. Two of them happen to have similar names, so I have to type a bunch of characters before it selects the right one (or use the mouse after typing).
  • When I set up a keyboard shortcut for a person label, it’s assigned to the image as a whole and not to the detected face region, so it’s pointless.
  • I can’t undo, so I have to be careful and can’t just click and type as fast as I can.
  • When I’m not sure who the person is, I open the image full screen, but then I first have to go find the face on the image. It would be nice if it could be highlighted.
  • Generally, I have to switch back and forth between keyboard and mouse a lot.

In the meantime, I have built a separate tool that hooks into the digiKam SQLite database and performs only that one job: face labeling. It can confirm face suggestions, assign other people to those faces, or ignore faces. It’s much faster because it mainly uses keyboard input, preloads things, and batch-saves them later (which also allows for more flexible undo).
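
To illustrate the batch-save idea, here is a minimal Python sketch. It is not the tool's actual code: the `faces` table, its columns, and the `PendingLabels` class are made-up stand-ins for illustration, not digiKam's real schema. Decisions are only queued in memory, so undo is just removing the last queue entry, and everything is written in one transaction at the end.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class PendingLabels:
    """Queue of face-label decisions, held in memory until flushed."""

    # Each entry is a (face_id, person) tuple. Hypothetical structure.
    actions: list = field(default_factory=list)

    def label(self, face_id, person):
        """Record a decision without touching the database yet."""
        self.actions.append((face_id, person))

    def undo(self):
        """Drop the most recent unsaved decision, if there is one."""
        return self.actions.pop() if self.actions else None

    def flush(self, conn):
        """Write all queued decisions in a single transaction."""
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "UPDATE faces SET person = ?, confirmed = 1 WHERE id = ?",
                [(person, face_id) for face_id, person in self.actions],
            )
        saved = len(self.actions)
        self.actions.clear()
        return saved


# Toy in-memory database standing in for the real one (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE faces ("
    "id INTEGER PRIMARY KEY, person TEXT, confirmed INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO faces (id, person) VALUES (?, ?)",
    [(1, "Alice"), (2, "Alice"), (3, None)],
)
conn.commit()

pending = PendingLabels()
pending.label(1, "Alice")  # confirm the suggestion
pending.label(2, "Bob")    # assign a different person
pending.label(3, "Alice")  # a mistake...
pending.undo()             # ...taken back before anything is saved
saved = pending.flush(conn)
rows = conn.execute(
    "SELECT id, person, confirmed FROM faces ORDER BY id"
).fetchall()
```

Because nothing is written until `flush`, each keypress only touches an in-memory list, which is what keeps labeling responsive in a design like this.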

I’d like to share it here so that (1) others who have a couple thousand faces to label can be more efficient too, and (2) perhaps it can serve as inspiration for upcoming digiKam features. Not that many things would need to be improved in digiKam to provide a similar UX, but I’m not sure how difficult they are to change.

The forum didn’t allow me to post links, so:

bitbucket org / Philipp91 / digikam-facelabeler

And the undo feature was already requested in bug ID 298160.