Linux Fu: Stopping a Runaway

Apr 14, 2025 - 18:26

The best kind of Hackaday posts are the ones where there was some insurmountable problem with an elegant solution devised through deep analysis of the problem and creativity. This is not one of those posts. I’m sure you are familiar with bit rot. You know, something works for a long time and then, for no apparent reason, stops working. Well, that has been biting me, and lacking the time for the creative, elegant solution, I decided to attack it with a virtual chainsaw.

It all started with a 2022 Linux Fu about using autokey.

The Problem

I use autokey to give me emacs-style keystrokes in Web browsers and certain other programs. It intercepts keystrokes and translates them into other keystrokes. The problem is, the current Linux community hates autokey. Well, that’s not strictly true. They just love Wayland more. One reason I won’t switch from X11 is that I haven’t found a way to do something like I do with autokey. But since most of the powers-that-be have decided that X11 is bad and Wayland is good, X11 development is starting to show cracks.

In particular, autokey isn’t in the normal repositories for my distro anymore (KDE Neon). Of course, I’ve installed the latest version myself. I’m perfectly capable of doing that or even building from source. But lately, I’ve noticed my computer hangs, especially after sleeping for a long time. Also, after a long time, I notice that autokey just quits working. It is running but not working and I have to restart it. The memory consumption seems high when this happens.

You know how it is. Your system has quirks; you just live with them for a while. But eventually those paper cuts add up. I finally decided I needed to tackle the issue. But I don’t really have time to go debug autokey, especially when it takes hours for the problem to manifest.

The Chainsaw

I’ll say it upfront: Finding the memory leak would be the right thing to do. Build with debug symbols. Run the code and probe it when the problem comes up. Try to figure out what combination of X11, evdev, and whatever other hocus pocus it uses is causing this glitch.

But who’s got time for that? I decided that instead of launching autokey directly, I’d launch a wrapper script. I had already removed autokey from the KDE session so that I don’t start it myself and then have the system restart it as well. Now I run the wrapper instead of autokey.

So what does the wrapper do? It watches the memory consumption of autokey. Sure enough, it goes up just a little bit all the time. When the script sees it go over a threshold it kills it and restarts it. It also restarts if autokey dies, but I rarely see that.

What’s Memory Mean?

The problem is, how do you determine how much memory a process is using? Is it the amount of physical pages it has? The virtual space? What about shared libraries? In this case, I don’t really care as long as I have a number that is rising all the time that I can watch.

The /proc file system has a directory for each PID and there’s a ton of info in there. One of the files in that directory is an accounting of memory. If you look at /proc/$PID/smaps for some program you’ll see something like this:

00400000-00420000 r--p 00000000 fd:0e 238814592 /usr/bin/python3.12
Size: 128 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 128 kB
Pss: 25 kB
Pss_Dirty: 0 kB
Shared_Clean: 128 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 128 kB
Anonymous: 0 kB
KSM: 0 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
THPeligible: 0
VmFlags: rd mr mw me sd 
00420000-00703000 r-xp 00020000 fd:0e 238814592 /usr/bin/python3.12
Size: 2956 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 2944 kB
Pss: 595 kB
Pss_Dirty: 0 kB
Shared_Clean: 2944 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
. . .

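Note that there is a section for each memory mapping (executables, shared objects, and so on) along with lots of information. You can add together all the PSS (proportional set size) numbers like this (among other ways). The grep also picks up the Pss_Dirty and SwapPss lines, so the total overstates the true PSS, but for watching a trend that doesn’t matter: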


cat /proc/$PID/smaps | grep -i pss | awk '{Total+=$2} END { print Total}'
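If you would rather have a figure that counts only the Pss: lines and leaves out Pss_Dirty and SwapPss, a stricter match is one option. This is a minimal sketch, not what the wrapper below uses; the name pss_total_kb is made up for illustration, and the totals it prints will be lower than the ones quoted in this post.

```shell
#!/bin/bash
# Sum only the "Pss:" lines of an smaps-style file and print the total in kB.
# pss_total_kb is a hypothetical helper name, not part of the original script.
pss_total_kb() {
   # $1 is a path such as /proc/1234/smaps
   # "+ 0" makes awk print 0 instead of an empty line when nothing matches
   awk '/^Pss:/ { total += $2 } END { print total + 0 }' "$1"
}
```

You would call it as pss_total_kb /proc/$PID/smaps. On kernels that provide /proc/$PID/smaps_rollup, pointing the same function at that file should give the already-summed value with less parsing.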

Building the Chainsaw

So armed with that code, it is pretty easy to just run the program, see if it is eating up too much memory, and restart it if it is. I also threw in some optional debugging code.

#!/bin/bash
#- Run autokey, kill it if it gets too big
#- what's too big? $MLIMIT (in kB, the unit smaps reports)
MLIMIT=500000
#- how often to check (seconds)
POLL=10

#- Print debug info if you want
function pdebug {
#- uncomment the echo if you want debugging
#- echo "$@"
   :   # no-op: bash rejects a function body that holds only comments
}

while true   # do forever
do
   PID=$(pgrep -o autokey-qt)  # find autokey (oldest match if there are several)
   pdebug "PID" "$PID"
   if [ -n "$PID" ]   # if it is there
   then
      # get the memory size (0 if the process vanished in the meantime)
      PSS=$(grep -i pss "/proc/$PID/smaps" 2>/dev/null | awk '{Total+=$2} END { print Total+0 }')
      pdebug "PSS" "$PSS"
      echo "$PSS" >>/tmp/autokey-current.log
      # too big?
      if [ "$PSS" -gt "$MLIMIT" ]
      then
         pdebug "Kill"
         echo Killed >>/tmp/autokey-current.log
         # save old log before we start another
         cp /tmp/autokey-current.log "/tmp/autokey-$PID.log"
         kill "$PID"
         PID=
         sleep 2
      fi
   fi
   if [ -z "$PID" ]
   then
      # if died, relaunch
      pdebug "Launch"
      # start a fresh log, then send autokey's stdout and stderr to it;
      # the redirections must come before the & or they never apply to autokey
      : >/tmp/autokey-current.log
      autokey-qt >>/tmp/autokey-current.log 2>&1 &
   fi
   pdebug "Sleep"
   sleep $POLL
done

In practice, you’ll probably want to remove the cp command that saves the old log, but while troubleshooting, it is good to see how often the process is killed. Running this once with a very large limit showed me that the PSS total starts at about 140,000 kB and creeps up with every 10-second poll. So when it gets to 500,000 kB, it is done. That seems to work well. Obviously, you’d adjust the numbers for whatever you are doing.
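To pick a limit with a little more confidence, you can estimate how fast the number climbs from one of the saved logs. The following is a rough sketch, assuming the log holds one reading per line plus the occasional non-numeric line such as "Killed"; pss_growth is a hypothetical helper name, not something from the script above.

```shell
#!/bin/bash
# Estimate the average growth per sample (in kB) from a log of numeric readings.
# pss_growth is a hypothetical helper; non-numeric lines such as "Killed" are skipped.
pss_growth() {
   awk '/^[0-9]+$/ { if (n == 0) first = $1; last = $1; n++ }
        END { if (n > 1) printf "%d\n", (last - first) / (n - 1); else print 0 }' "$1"
}
```

Dividing the headroom (the limit minus the starting reading) by that growth figure and multiplying by the poll interval gives a ballpark for how many seconds pass between restarts. It is only an average, so a leak that comes in bursts will not follow it closely.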

Bad Chainsaw

There are lots of ways this could have been done. A systemd timer, for example. Maybe even a cgroup. But this works, and took just a few minutes. Sure, a chainsaw is a lot to just cut a 2×4, but then again, it will go through it like a hot knife through butter.
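For the curious, the cgroup route might look something like the sketch below, which prints a systemd user unit that caps memory and restarts on any exit. This is untested against autokey itself and rests on several assumptions: the function name make_autokey_unit and the 500M default are invented here, the session must be on cgroup v2 with the memory controller available to user units, and the unit must inherit a usable DISPLAY. Note also that MemoryMax counts the whole cgroup's memory, which is not the same number as the summed PSS the wrapper watches.

```shell
#!/bin/bash
# Print a systemd user unit that restarts autokey-qt whenever it exits,
# including when the kernel kills it for exceeding the cgroup memory cap.
# make_autokey_unit and the 500M default are illustrative assumptions.
make_autokey_unit() {
   local limit="${1:-500M}"
   cat <<EOF
[Unit]
Description=autokey with a memory cap

[Service]
ExecStart=/usr/bin/env autokey-qt
MemoryMax=$limit
Restart=always
RestartSec=2
EOF
}
```

To try it, you could write the output to ~/.config/systemd/user/autokey.service, run systemctl --user daemon-reload, and then systemctl --user start autokey.service. The appeal is that the kernel enforces the limit and systemd handles the restart, so there is no polling loop at all.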

I did consider just killing autokey periodically and restarting it. The problem is I work odd hours sometimes, and that means I’d have to do something like tie it to the screensaver. But I agree there are dozens of ways to do this, including to quit using autokey. What would your solution be? Let us know in the comments. Have you ever resorted to a trick this dirty?