Trying to understand the logic behind file search/Baloo

Hi, everyone. After being a GNOME user since the beginning of time, I’m trying to get into KDE for a change.

My workflow relies heavily on searching through heaps of PDF files on one of my hard drives, which is mounted in my home folder as “big-drive”. The drive holds over 14 TB of data, but I really only want to index my Documents folder, which is not that big in comparison (the folder is symlinked to ~/Documents).

I just can’t hope to have every possible reference I might need open at every moment, and sometimes I can only remember the reference by author name or subject, so I would expect to hit the search, type “J. Doe”, “geometry”, etc. and find every single paper, book or whatever other PDF with J. Doe as author name inside this folder.

As an example, here I am trying to search for “Petersen”, “Riemannian Geometry”, which I know is a file with this name inside the Documents folder.

The actual PDF file is named “(Graduate Texts in Mathematics) Peter Petersen - Riemannian Geometry.pdf”; however, as you can see, searching for “Peter”, “Petersen”, “Riemannian” or “Geometry” gives nothing. The results are actually always the same. Not only that, I am inside the actual folder where the file is, after manually digging for it, and Dolphin just cannot find it by searching inside the folder.

However, as you can see, it finds J. Conway’s book as a result, I believe because it is also searching file contents: each Springer GTM book has a list of all books in the collection, and Petersen’s book is in the list. But of course, this information is useless if it can’t find the file itself.

This is just an example. There are also many other PDF files named “Riemannian Geometry” in the same folders where it found the results above, but as far as Baloo is concerned, they apparently don’t exist.

I find this all very strange. What kind of file names do I need to have in order to be able to find them with a regular search? Should I maybe bind mount the drive instead of symlinking the folder?

I’ve tried to:

  • Remove all the other folders from indexing, leaving only this one.

  • Disable, purge and re-enable Baloo with balooctl.

  • Index the whole home folder. The extreme solution.

Doesn’t matter, the results are always the same.

PS.: Almost forgot to mention, but this is Fedora 41 updated a couple of days ago when the last major update dropped, so I believe I have the latest version of everything I could possibly have in this context.

It would be better to add the actual path to the folder rather than go through your /home/documents folder.

The path is ~/big-drive/Documents. All my drives are mounted in home.

Not a Baloo expert by any means, but FWIW I’ve never been steered wrong by KFind as a file location tool - I have it mapped to Super/Win+F, so Super+F, *electric*2024* in the “Named:” box finds all my electric bills from 2024, for example :slight_smile:

I also like that workflow because if I want to open with an alternative app, open the containing folder, etc. it’s easy to do all of that with one or multiple of the files that were found, based on how KFind presents its results list.
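
For anyone unfamiliar with the wildcard style used in the “Named:” box above, roughly the same kind of match can be tried from a terminal. This is only a sketch, not what KFind runs internally: find_named is a made-up helper name, and it relies on the standard find options -L (follow symlinks), -type f and -iname (case-insensitive wildcard match on the file name).

```shell
# Hypothetical helper (not part of KFind or Baloo): list regular files under
# a folder whose names match a shell-style wildcard, ignoring case.
# -L makes find follow symlinks, so a symlinked folder is searched too.
find_named() {
    find -L "$1" -type f -iname "$2"
}

# Example call (path and pattern are placeholders):
#   find_named ~/Documents '*petersen*riemannian*'
```

If a call like this lists the file while the indexed search does not, that would suggest the problem is on the indexing side rather than with the file name itself.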

KFind just can’t find anything for me. It just gives up immediately.

Is that the path you have shown in the file search settings?

Or do you have it searching your ~/Documents folder?

What does balooctl6 status show?

Yes. The ~/Documents folder is just a symlink to this folder in the drive.

The result of balooctl:

Baloo File Indexer is running
Indexer state: Idle
Total files indexed: 65,963
Files waiting for content indexing: 0
Files failed to index: 0
Current size of index is 1.50 GiB

I am not sure baloo follows symlinks.
I believe that’s by design to prevent circular indexing.
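
If symlinks are the concern, the real location behind one can be checked before deciding which folder to list for indexing. A minimal sketch, assuming GNU coreutils readlink is available; real_folder is just an illustrative name, not a Baloo command:

```shell
# Hypothetical helper: print the real path behind a possibly symlinked folder.
# readlink -f resolves every symlink in the path (GNU coreutils).
real_folder() {
    readlink -f -- "$1"
}

# Example call (path is a placeholder):
#   real_folder ~/Documents
```

The path it prints is the real folder, which can then be compared against what the include-folder list shows.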

What is the output of these commands?

balooctl6 config list includeFolders

balooctl6 config list excludeFolders

balooctl6 config list excludeFilters | grep pdf

balooctl6 config list excludeMimetypes | grep pdf

balooctl6 failed

I prefer to disable baloo and use Recoll. The flatpak version works very well for me.

Yes, that’s why I indexed the actual folder, not Documents.

:upside_down_face: