Since no one else seems to be as reliant on spell check as I am, I went on a deep dive and came up with the following writeup of my adventure.
Hopefully it benefits someone besides me.
Spell Check – Dictionary Maintenance
When a user adds a word via the right-click “Add to Dictionary…” menu option, there is no single place where these additions are saved. Spell check dictionaries in Linux, and Kubuntu in particular, are spread across various directories in a user’s home folder. Each one is associated with the application the user was using when they added the word.
The result is often a confusing array of user dictionaries that may or may not contain an added word, depending on which application is in use. Coordinating these is a manual process, which is outlined here.
Tools Needed
- Meld for comparing and merging files
- Kate for viewing and manipulating file contents
- Dolphin for file management and navigating the directory tree
- sort command line tool for alphabetizing a file full of words and removing duplicates
- find (with its -printf option), sort, and tail commands for finding user dictionaries
Identifying User Dictionaries
The applications where I am most likely to add words are the browser Firefox and the editor Kate, since that is where I do most of my typing, followed by my preferred markdown editor, Typora. Other sources of new words could be any of the office suites installed on my machine, like LibreOffice, Onlyoffice, or WPS2019, or even code editors like VSCodium, though programming language terms tend to be technical and thus poor candidates for merging with the rest.
The first step is to identify the file where added words are saved for each of the applications you consider the major sources of new words. To this end, a handy one-liner that lists the 10 most recently changed files under the current directory (so run it from your home folder) is:
find . -type f -printf "%T@ %p\n" | sort -n | tail -10
Run this immediately after using the “Add to dictionary…” feature in an application and you should be able to identify the file that was changed. Make note of this location and/or make a bookmark in Dolphin so you can easily get back to it in the future.
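If the last-10 list gets crowded with browser cache files, a marker file can narrow things down. The sketch below is only one way to do it: the function names and the default marker path /tmp/dictionary-watch.marker are my own placeholders, not anything the applications provide. It relies on find's -newer test, which matches files modified after the marker.

```shell
# Create (or refresh) a timestamp marker. Run this just BEFORE
# using "Add to Dictionary" in the application.
start_dictionary_watch() {
    marker="${1:-/tmp/dictionary-watch.marker}"   # hypothetical default path
    touch "$marker"
}

# List regular files under a directory that were modified after the
# marker was created. Run this just AFTER adding the word.
list_changed_since_watch() {
    search_dir="${1:-$HOME}"
    marker="${2:-/tmp/dictionary-watch.marker}"
    find "$search_dir" -type f -newer "$marker" 2>/dev/null
}
```

Typical use would be start_dictionary_watch, add a word in the application, then list_changed_since_watch. The shorter the gap between the two, the fewer unrelated files show up.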
Dictionary File Differences
The two main dictionary files that interest me are fortunately stored as plain text without a header, just a simple list of words. This makes it easy to sort them, remove duplicates, and cut and paste into other dictionary files as needed. It’s also possible to link these files together so they act as one, where additions from one application appear in the other after a logout.
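Linking two plain-text dictionaries could look roughly like the sketch below. It is a sketch under assumptions, not a tested recipe for any particular application: link_dictionaries is a name I made up, both files are assumed to be header-less word lists, and the application is assumed to follow a symlink rather than replace it with a fresh file on save. Sandboxed apps (snap, flatpak) may not be able to follow a link that points outside their own folders, so it may be safer to keep the real file inside the sandboxed app's directory and point the other one at it.

```shell
# Merge two header-less word lists into "primary", then replace
# "secondary" with a symlink to it. Use an absolute path for primary
# so the link resolves no matter where it is read from.
link_dictionaries() {
    primary="$1"
    secondary="$2"
    merged="$(mktemp)"

    # Keep a backup of the file that is about to become a link.
    cp "$secondary" "$secondary.bak" || return 1

    # Combine both lists, sorted and without duplicates.
    sort -u "$primary" "$secondary" > "$merged" || return 1

    # Write into the existing primary file (keeps its permissions).
    cat "$merged" > "$primary"
    rm -f "$merged"

    # Replace secondary with a symbolic link to primary.
    ln -sf "$primary" "$secondary"
}
```

Close both applications before running it, and check after the next "Add to Dictionary" that the link is still a link (ls -l shows an arrow) and has not been overwritten.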
When a dictionary file uses a header of any kind (even if it’s just a numerical count of the words), you cannot simply sort it without consequence, because the header line would get sorted in with the words. The purpose of the numerical count is unclear to me, though, since a mismatch between the count and the number of words does not always seem to matter.
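For a file with a count header, one possible approach is to set the first line aside, sort the rest, and write a fresh count back. The sketch below assumes the header is exactly one line holding the word count and nothing else. That is an assumption to check against your own file before trusting it, and sort_counted_dictionary is just an illustrative name.

```shell
# Sort a dictionary whose FIRST LINE is a word count (assumed format).
# The old count is discarded and recalculated from the sorted words.
sort_counted_dictionary() {
    dict="$1"
    body="$(mktemp)"

    # Everything after line 1 is treated as the word list;
    # blank lines are dropped, duplicates removed.
    tail -n +2 "$dict" | grep -v '^$' | LC_ALL=C sort -u > "$body"

    # Write the new count followed by the sorted words.
    { wc -l < "$body" | tr -d ' '; cat "$body"; } > "$dict.new"
    mv "$dict.new" "$dict"
    rm -f "$body"
}
```

Make a copy of the dictionary before running it. LC_ALL=C gives a plain byte-order sort, so capitalized words end up ahead of lowercase ones.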
The outlier in my case was Typora, which stores its added words in .json format. Adding words to this file requires a bit of reformatting using Find and Replace before a cut and paste operation can be done. For this operation, make a copy of the source dictionary first in case the find and replace work goes sideways.
Another consideration is that not all of these applications will necessarily use the same starting point for their standard dictionary, so some may already contain words that need to be added to others.
Comparing and Merging Dictionaries
The plain text (no header) formats are the easiest to maintain, since tools like Meld can be used to zipper together separate collections that may or may not have significant overlap. However, Meld works best when the files are ordered for easy visual reference. To that end, a sort command is useful to create a common starting point:
sort -u /path/to/dictionary -o /path/to/dictionary
This sorts the file alphabetically, removes any duplicate entries (-u), and writes the result back to the original file (-o). With both dictionaries similarly prepared, they are ready for the compare operation and synchronization if desired.
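For a quick command-line check before opening Meld, the comm tool can list the words that one dictionary has and another lacks. This is a sketch for header-less plain-text files only, and words_missing_from is a name I picked for illustration. It works on sorted temporary copies, so neither original is modified.

```shell
# Print words that appear in the first dictionary but not the second.
# Both files are assumed to be plain word lists, one word per line.
words_missing_from() {
    have="$1"
    lacks="$2"
    a="$(mktemp)"
    b="$(mktemp)"

    # comm needs both inputs sorted the same way, so sort copies
    # under one fixed locale.
    LC_ALL=C sort -u "$have" > "$a"
    LC_ALL=C sort -u "$lacks" > "$b"

    # -2 hides lines unique to the second file, -3 hides common lines,
    # leaving only the lines unique to the first file.
    LC_ALL=C comm -23 "$a" "$b"

    rm -f "$a" "$b"
}
```

Running it once in each direction shows what each file is missing compared with the other.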
An easy way to check the completeness of a dictionary with Kate is to make sure Kate’s dictionary has the most comprehensive list of words, log out and back in to ensure the dictionary files are re-read, then open one of the other dictionary files in Kate. Any missing words will be underlined and easy to spot. This is especially helpful when the dictionary is in .json format, where a Meld comparison would be useless.
Location examples
Kate (and most KDE apps)
~/.hunspell_en_US
Firefox (snap)
~/snap/firefox/common/.mozilla/firefox/[user].default/persdict.dat
Onlyoffice (flatpak)
~/.var/app/org.onlyoffice.desktopeditors/data/onlyoffice/desktopeditors/data/dictionaries/all/all.dic