I’m new to digiKam. I’m using version 8.5 on Windows 11 with the MySQL Internal database.
My machine is a Ryzen 9 7900X3D with 64 GB RAM and an Nvidia 4090, and I’m using Western Digital Black SN850X 2 TB and 4 TB NVMe drives. The image collection is on its own physical drive.
I have 286,583 images that I need to organize, and some images have many, many copies. I’m attempting to search for duplicates with the similarity range set from 90 to 100.
The problem I’m having is that it took 20 hours for the duplicate search to get to 4% complete.
In the Windows Task Manager, memory use is under 2 GB, the CPU is under 10%, and the disk is under 0.5% while I’m searching for duplicates.
I’ve tried a different computer, different drives, reinstalling, changing the database type, deleting the database and starting fresh (a few times) and updating the fingerprints repeatedly.
The odd thing is, when I right-click on an image and choose “Find Similar” with the range set from 50 to 100, it only takes a few seconds.
Can anyone give me any suggestions as to what I can do to speed things up?
*UPDATE 1:
Duplicate search for 1 hr at 100 to 100 similarity has reached 1%.
*UPDATE 2:
Duplicate search for 9.5 hrs at 100 to 100 similarity has reached 26%.
*UPDATE 3:
The entire program crashed at some point and I’ve had to start over.
I have recently investigated the “Find Duplicates” functionality in digiKam. It uses some advanced techniques to find “similar” but not necessarily identical images, which is a very hard problem to solve if you think about it. digiKam calculates a Haar matrix for every image. I can’t begin to explain the math, since I barely understand it myself, but think of it as turning the image into a small set of numbers in a pretty concise way that ignores tiny details.

Now you can compare two images by comparing those numbers. The trouble is that you don’t know in advance which images to compare, so I think digiKam ends up comparing every image to every other image (!), which is very slow. On a constrained set of images this is quick and works well, but it isn’t really designed to run on an entire image collection. Let me know if this helps; I don’t work on the project, but I coincidentally got interested in how it works and looked through the source code a bit.
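To put numbers on the “every image to every other image” part: with 286,583 images, an all-pairs search is n(n−1)/2 ≈ 41 billion fingerprint comparisons, so even at a million comparisons per second that alone is over 11 hours, before any database round-trips. Here is a minimal Python sketch of the general idea as I understand it (my own simplification, not digiKam’s actual code; the fingerprint size, the 0.90 threshold, and all the names are made up for illustration):

```python
# Minimal sketch (my own assumptions, not digiKam's actual code): each
# image is condensed into a small binary fingerprint via a Haar-style
# averaging step, and the duplicate search then compares every
# fingerprint against every other one.
import itertools
import numpy as np

def haar_fingerprint(gray: np.ndarray, size: int = 16) -> np.ndarray:
    """Condense a 2-D grayscale image into a short bit signature."""
    h, w = gray.shape
    # Crop so the image divides evenly, then block-average to size x size.
    g = gray[:h - h % size, :w - w % size].astype(float)
    g = g.reshape(size, g.shape[0] // size, size, g.shape[1] // size).mean(axis=(1, 3))
    # One Haar averaging step per axis: keep only the low-frequency
    # averages, which is what makes tiny details drop out.
    g = (g[0::2, :] + g[1::2, :]) / 2.0
    g = (g[:, 0::2] + g[:, 1::2]) / 2.0
    # Binarize against the median so fingerprints are cheap to compare.
    return (g > np.median(g)).ravel()

def similarity(fp_a: np.ndarray, fp_b: np.ndarray) -> float:
    """Fraction of matching bits, loosely like the 0-100 similarity scale."""
    return float(np.mean(fp_a == fp_b))

# Toy collection: random arrays stand in for decoded image files.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(480, 640)) for _ in range(200)]
fingerprints = [haar_fingerprint(img) for img in images]

# The slow part: every image against every other one, n*(n-1)/2 pairs.
pairs = 0
for i, j in itertools.combinations(range(len(fingerprints)), 2):
    if similarity(fingerprints[i], fingerprints[j]) >= 0.90:
        pass  # would be reported as a duplicate candidate
    pairs += 1
print(f"{pairs} comparisons for {len(fingerprints)} images")
```

For 200 images that is already 19,900 comparisons, and the pair count grows quadratically, which would explain why the full-collection search crawls while a single “Find Similar” (one image against all the others, only n comparisons) finishes in seconds.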
Hello.
I’m new to digiKam too, but maybe I have a clue to help you.
Do you run the search for duplicates on the real full-size source images, or on the digiKam-generated thumbnails?
IMHO, if you run it on the source images, that may explain why it takes so long to “recognize” similarity, since the inspected source/original pictures may be large files. So maybe try the search on the generated thumbnails instead? Thumbnails are much smaller than the originals (so they may be quicker to inspect for the recognition process, even with a high accuracy requirement), and each thumbnail is linked to its original full-size image. Result: when you erase a thumbnail, it removes the linked original file too. You see what I mean?
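To illustrate the size argument with a rough sketch (my own illustration, not a digiKam benchmark; I don’t actually know which pixel data digiKam reads when it fingerprints): the reduction work scales with the number of pixels, so a 256×256 thumbnail is a couple of orders of magnitude cheaper to process than a 24-megapixel original:

```python
# Rough illustration (my own, not a digiKam benchmark): the same
# block-averaging step on a 24 MP "original" versus a 256x256 "thumbnail".
import time
import numpy as np

def reduce_to_grid(gray: np.ndarray, size: int = 16) -> np.ndarray:
    """Block-average an image down to a size x size grid of means."""
    h, w = gray.shape
    g = gray[:h - h % size, :w - w % size].astype(float)
    return g.reshape(size, g.shape[0] // size, size, g.shape[1] // size).mean(axis=(1, 3))

rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(4000, 6000))   # stand-in for a 24 MP original
thumbnail = rng.integers(0, 256, size=(256, 256))    # stand-in for a thumbnail

for name, img in [("original", original), ("thumbnail", thumbnail)]:
    t0 = time.perf_counter()
    reduce_to_grid(img)
    print(f"{name}: {(time.perf_counter() - t0) * 1000:.1f} ms")
```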