Baloo_file_extractor running and eating up CPU and memory

I have a fairly recent install of Debian 12.1 (about to update to 12.2), and I find that baloo_file_extractor is eating up a full CPU core and >20GB of RAM, as well as periodically thrashing my SSD. On a fresh reboot, balooctl status shows “Baloo is currently disabled. To enable, please run balooctl enable”, so why is it even up and running?

How do I get it to stop eating up all my memory and thrashing my SSD? It’s causing performance issues when I just try to use my machine, and it’s making my SSD heat up and SMART complain.

EDIT: Ohhhh, I was running balooctl as root (I was looking at SMART data). When I run it as my normal user, it shows Baloo is actually running. It has a huge number of files indexed, and content waiting for indexing. I think I know which dir might be the issue; let me try to stop baloo from indexing it.

EDIT2: I added a directory to the Not Indexed list and rebooted my computer as it said I needed to, but balooctl status says it’s still indexing files in there.

The Indexing line shows a file in /home/drizzt/Calibre Library. And then here is the output for exclude folders. So what gives?!

$ balooctl config show excludeFolders
/home/drizzt/Documents/Calibre Library/
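One thing that can be checked from the two outputs above: the file being indexed is under /home/drizzt/Calibre Library, while the excluded folder is /home/drizzt/Documents/Calibre Library/. If that is not just a typo in the post, the exclusion would not cover that file. Below is a minimal sketch of the check, assuming an exclusion behaves like a plain path-prefix match. The helper is_under and the file name book.epub are made up for illustration; neither is part of balooctl.

```shell
#!/bin/sh
# is_under FILE DIR: hypothetical helper, prints "yes" if FILE lies inside DIR.
# Assumption: an excluded folder only covers paths underneath it (prefix match).
is_under() {
    dir=${2%/}                  # drop one trailing slash, if present
    case "$1" in
        "$dir"/*) echo yes ;;   # FILE starts with DIR/
        *)        echo no ;;
    esac
}

# book.epub is a placeholder file name, not taken from the thread.
is_under "/home/drizzt/Calibre Library/book.epub" "/home/drizzt/Documents/Calibre Library/"
is_under "/home/drizzt/Documents/Calibre Library/book.epub" "/home/drizzt/Documents/Calibre Library/"
```

If the first call reports "no" for the real paths, the folder given to the Not Indexed list may simply not be the one Baloo is working through.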

So baloo is running “as you” and the “balooctl status” which gave you “disabled” was when you were root…

It sounds like baloo has made a list of “all the files” it thinks it should index (before you put in the exceptions) and is working through them in batches of 40, then the next 40, and so on. I think it’s best to start again from scratch: kill the running baloo_file and run “balooctl purge”.
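As a rough sketch, that sequence could look like the following. It should be run as your normal user, not as root. The run wrapper and the DRY_RUN switch are additions for illustration, so that nothing is killed or purged until you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of the "start again from scratch" sequence described above.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reset_baloo() {
    run pkill -x baloo_file   # stop the running indexer process
    run balooctl purge        # remove the existing index database
}

reset_baloo
```

Afterwards, “balooctl status” (again as the normal user) shows whether indexing has started over.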

Baloo’s been fixed to use less memory (that was about 6 months back), but you still need to worry about swap. At a guess, if you were using > 20GB RAM, you were overflowing into swap, and that punishes the SSD.

I had similar problems with Baloo and followed this guide

That fixed those problems with Baloo.
So I deleted ~/.local/share/baloo/index and ~/.local/share/baloo/index-lock (overdone, maybe, or not).

For some reason the above did not work when I did mine, heh.
So I manually deleted the two files, index and index-lock, and that fixed it!

I checked “balooctl status” the next day: wonderful, no “failed to index”. If Baloo is kept up to date after its first few runs, it is a great thing to have in the toolbox.
Baloo can still be a pita even now. Many weeks after fixing it, it will still occasionally go and index a multitude of files. Why did it wait? No idea. At least there are no more “failed to index” messages.

My post on this

Hope this also helps

Yeah, seems to be doing OK now.

This isn’t true for me; I guess Debian 12.1/12.2 is very far behind. baloo_file is currently sitting at about 6.1GB of memory usage, which seems like an excessively large amount of RAM, and the index says it’s 8.8GB.

Being behind is sorta a Debian thing, sometimes unfortunately. I’m tempted to switch to testing, might do that when I have a free weekend without too much going on.

Although, that ticket you linked said Frameworks v5.68 was the fix, and the About this System says I’m running KDE Frameworks v5.103.0, so I’d think I’d have the fix in place.

I thought the same thing, and since removing the two files, index and index-lock, things are going better. Why? No idea.

After having problems with a rolling distro (updates = breakages), I decided that testing was too close to rolling for me to bother with,
so I tried MX21 in April this year, then installed MX23 in September with a few personal mods, and it has had a good run. The biggest difference is the abundance of tools for backups and for creating your own system as an ISO with everything supplied.

That’s going back a bit; it was talking about a fix in 5.68 and said that you needed to reindex to leave behind the corrupted data. The tail end of the bug mentions it catching people as they upgrade from Debian 10 to 11…

All the same, yes, there’s been a handful of notable changes to baloo in the last 6 months: a fix for index corruption issues, a cap on the amount of RAM it uses (for people with systemd), a fix to split up what was a single massive transaction into smaller chunks when baloo does its initial scan for “all the new and changed files”, and a necessary fix for systems running BTRFS with many subvolumes.

Ah, I see. Well, must be that cap on RAM usage that I’m hitting.

Although strangely, right now I’m seeing it at only about 124MB. I’m wondering if it’s because I loaded up the VM and used up a bunch of RAM, and baloo_file had most of its memory usage as some kind of “soft” usage, such that the kernel could take back the memory for another application as needed.

I think that’s the case, baloo pulls pages into memory as and when it needs them. They can be dropped when memory is tight. Think of it like read caching…

When baloo wants to write something however, it sets up a transaction and the “changed pages” are flagged dirty and cannot be dropped - until the transaction is committed. For that time baloo can require/demand “more” memory.
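One way to look at this split on Linux is the per-process breakdown in /proc/PID/status, where RssFile counts resident file-backed pages and RssAnon counts anonymous memory. This is a rough sketch, assuming a kernel recent enough to report those fields (4.5 or later). rss_breakdown is a hypothetical helper name, and the demo call inspects the current shell rather than Baloo.

```shell
#!/bin/sh
# rss_breakdown PID: print the resident-memory split for one process,
# as reported by the kernel in /proc/PID/status (values are in kB).
rss_breakdown() {
    grep -E '^(VmRSS|RssAnon|RssFile|RssShmem):' "/proc/$1/status"
}

# Demo on the current shell. For Baloo, pass the PID of baloo_file instead,
# for example the one printed by: pgrep -x baloo_file
rss_breakdown "$$"
```

If most of baloo_file’s VmRSS shows up under RssFile, that would fit the “read caching” picture above; a large RssAnon figure would point more towards memory that cannot simply be dropped.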