Spectacle - Consolidated UX for Content Extraction?

Spectacle recently added OCR, but QR/barcode reading, which is a similar feature, takes an entirely different UX approach.

Could these maybe get unified under the same flow?

For example: OCR adds an “Extract Text” button. Maybe that button could have a dropdown for extracting content, letting the user pick between OCR and code (QR/barcode) reading.

Note that the bottom toolbar would actually be horizontally shorter with “Extract ⌄” than with “Extract Text”, which seemed to be an important UX consideration during the OCR implementation.

A QR code should be easily found and understandable, I think.
If only a QR code is selected, it should just have a “Scan QR” button and the “Extract Text” button need not be shown.
If a region with QR code + text is selected, then both buttons could appear.
If only text is selected, only “Extract Text” appears.
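
The three cases above can be sketched as a small function. This is only an illustration, not Spectacle code: `SelectionContent` and `visibleButtons` are hypothetical names, and it assumes a detection pass has already run on the selected region before the toolbar is drawn.

```cpp
#include <string>
#include <vector>

// Hypothetical summary of what a detection pass found in the selected region.
struct SelectionContent {
    bool hasText = false;  // OCR found some text
    bool hasCode = false;  // a QR code or barcode was decoded
};

// Returns the labels of the toolbar buttons to show for this selection.
// Nothing detected means no extraction button at all.
std::vector<std::string> visibleButtons(const SelectionContent &content)
{
    std::vector<std::string> buttons;
    if (content.hasCode) {
        buttons.push_back("Scan QR");
    }
    if (content.hasText) {
        buttons.push_back("Extract Text");
    }
    return buttons;
}
```

A selection with both text and a code would get both buttons, which is the case where toolbar width becomes a concern.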

I agree; however, there is no more space for extra horizontal buttons, as described in the thread I linked. That’s why reusing the “Extract Text” button seems like the next best approach.

I believe a more streamlined process would involve the following: when a user presses on “Accept,” the standard screenshot preview/edit window would open. Within this window, all identified text and/or QR codes in the image would be highlighted, offering interactive options to either open the QR code link or select and copy the text.

This approach would eliminate the need for an “Extract Text” button altogether, as it currently opens the Preview/Edit window anyway.

Implementing this type of functionality (similar to the Windows Snipping Tool) is highly unlikely in the FOSS space… :sweat_smile:
In fact, does Tesseract (the backend) even support something like this?

Haven’t seen the codebase yet, but in theory it should definitely be possible.

OK, so yes, Tesseract doesn’t support QR code scanning, but I found something similar to what I mentioned earlier that uses ZXing for the QR code scanning part:

It has some bugs, but I opened corresponding issues for the GitHub repo. Something similar could be implemented in the Spectacle app natively without this wrapper, but closer to my idea where the text would be visible in the Preview/Edit window of Spectacle itself. Right now this wrapper opens a window with the text and the scanned QR code info all in the same text block, with buttons to copy the text, save it as a file, or save the screenshot.

I modified the spectacle-ocr-screenshot project and got this result:

[Animated GIF showing the result]

As you can see, Tesseract is not reliable: sometimes it detects icons as symbols or letters, and other times it doesn’t detect text at all. I get the same results with Spectacle’s “Extract Text” feature, so Tesseract itself is the problem. Otherwise, this is exactly what I was proposing. If Tesseract could offer better OCR, then this is definitely doable.

Guys, I think you’re missing the point of the thread. Let me clarify so it doesn’t get derailed further: my proposal is about UI/UX changes only. OCR and QR/barcode recognition are features that already work in Spectacle (OCR requires tesseract and code reading requires prison). The only problem is that they offer radically different UI/UX, and my only point is to make them work in a similar way, while respecting certain UI constraints described by the devs.

Yes, sorry, I overcomplicated things. While what I proposed makes it clearer what was found, in the current version of Spectacle the text or QR code is extracted only after pressing the “Extract Text” button, so the easiest solution would possibly be to just rename the “Extract Text” button to something like “Smart Scan”. It is a bit unusual that after pressing “Extract Text” the text gets copied to the clipboard, while the QR code content is only offered for copying.
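
To illustrate what consistent handling behind a single “Smart Scan” button could look like, here is a rough sketch. None of this is Spectacle’s actual code: `ContentKind`, `ExtractedItem`, and `proposals` are made-up names, and the behavior shown (nothing is copied until the user picks an entry) is just one possible choice.

```cpp
#include <string>
#include <vector>

// Hypothetical kinds of content a single scan could return.
enum class ContentKind { Text, Code };

// One extracted result: what it is and what it contains.
struct ExtractedItem {
    ContentKind kind;
    std::string content;
};

// Builds one user-facing proposal per item, so text and codes go through
// the same flow instead of one being auto-copied and the other offered.
std::vector<std::string> proposals(const std::vector<ExtractedItem> &items)
{
    std::vector<std::string> result;
    for (const auto &item : items) {
        const char *label = (item.kind == ContentKind::Text)
            ? "Copy text: "
            : "Copy code content: ";
        result.push_back(label + item.content);
    }
    return result;
}
```

The opposite choice (auto-copy everything when only one item is found) would be equally consistent; the point is only that both kinds go through the same path.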

Wow, this is actually impressive.

Thanks. I have since gotten better results with:

SetPageSegMode(tesseract::PSM_SPARSE_TEXT);

Though I have found that RapidOCR or PaddleOCR work a lot better than Tesseract.