Last May I wrote about how I was keeping alive a persistent SSH tunnel back to my home network. Specifically, it connects to an iMac Pro that I use as a home media/automation server. I frequently need to screen share, view a web service hosted on it, or just access its command line. It also serves as our build server at work. An SSH tunnel is basically a poor man's VPN. And I feel more secure about using that than just outright opening up ports in my firewall.
Anyway, nine months later, it's proven to be a great solution. However, there is an occasional quirk.
The launchd job that controls the tunnel is set to automatically restart the process if it dies. And, sure, when it does die, launchd does the right thing and restarts it. But in some situations the tunnel won't actually terminate when it fails - it just hangs or becomes unresponsive. I was able to reproduce the issue a few times by making my WiFi flake out. So, I turned off the iMac's wireless and hardwired it into my router. That was a big improvement, but I still saw the occasional oddness and found myself locked out a number of times when trying to connect back from work, the coffee shop, etc.
Thankfully, before I wasted too much time debugging, I recognized this problem for what it was: a giant rabbit hole.
I knew damned well that I could absolutely lose (and enjoy!) a whole evening or entire weekend running tests, debugging, and trying to figure out the cause of the intermittent failures. I've done it a thousand times with other piddly little problems. But, as Oliver Wood might say, "I am wiser now." I've got better ways to spend my time. So, let's just fix it.
My thinking went like this:
- When the network goes down, the tunnel will close, the ssh process will quit, and launchd will keep restarting it until it connects again. That part works fine.
- The problem is figuring out when the connection has stalled or experienced some other type of unresponsive fuck-up. How do I detect that?
- Me! I'm a smart human. It should be me detecting the issue instead of wasting hours trying to find a way to make the computer catch it. When I try and connect to the tunnel and it doesn't work, that means something is broken. And since I only care if it's broken when I'm actually trying to use it, that's when I need to fix it. Automatically detecting any and all failures simply isn't necessary.
- So, when I notice the tunnel is down, how do I tell my iMac to kill the process and let launchd restart it when I don't have access to the machine in the first place? I could just kill the process automatically on a set schedule. But that might interrupt me when I was actually using it. Also, if it were down, I'd have to wait until the next interval for it to come back up. I need to send a command while remote and without having access to the machine - preferably in a way that I can also do from my phone.
- <Spidey sense starts tingling>
- Hazel! Dropbox! And a small shell script! My three favorite tools.
First, let's figure out how to kill the existing (stalled) ssh process. From my original blog post, you can see that launchd runs this command to start the tunnel:
/usr/bin/ssh -N imacvpn
So, we can figure out the pid of that process by grepping for the imacvpn keyword like this:
ps -ax | grep imacvpn | head -n1 | cut -f1 -d " "
That's a four-part command:
- First, it lists all the running processes on the machine.
- Filter to just those that contain the name of the SSH tunnel. That will typically return two results. The real SSH tunnel command and also the grep command itself.
- So, grab just the first line which will be (should always be?) the real tunnel process.
- Split that line of text using a space as the delimiter. The first token will be the pid which we'll use to identify and kill the tunnel.
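One wrinkle with that last step is worth a quick sketch. As far as I can tell, ps right-aligns the pid column, so a short pid shows up with leading spaces, and splitting on a single space then makes the first token empty. The two sample lines below are made up (the pids and times are illustrative only), and the awk variant is my own assumption about a workaround, not part of the original command.

```shell
#!/bin/sh
# Made-up `ps -ax` lines; the PIDs and times are illustrative only.
long_pid_line='51234 ??         0:00.12 /usr/bin/ssh -N imacvpn'
short_pid_line='  812 ??         0:00.12 /usr/bin/ssh -N imacvpn'

# The last stage of the command above: split on single spaces, take field 1.
pid_with_cut() { printf '%s\n' "$1" | cut -f1 -d " "; }

# Assumed alternative: awk splits on runs of whitespace instead.
pid_with_awk() { printf '%s\n' "$1" | awk '{print $1}'; }

pid_with_cut "$long_pid_line"    # prints 51234
pid_with_cut "$short_pid_line"   # prints an empty line
pid_with_awk "$short_pid_line"   # prints 812
```

If the pid ever comes back blank, that padding is the first thing I'd check.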
With the pid in hand, we can then kill the process by passing that output to kill. That'll find the ID of the SSH tunnel process, kill it, then launchd will take over and start it back up again.
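The exact kill command isn't reproduced here, so what follows is only a rough sketch of what such a script could look like, not the original. It reuses the grep and head stages from above, swaps cut for awk so padded pid columns still parse (my assumption), and hides the actual kill behind a made-up "run" argument so that poking at the functions never kills anything by accident.

```shell
#!/bin/sh
# Hypothetical restart script. This is an assumed reconstruction,
# not the script from the original post.

TUNNEL_KEYWORD="imacvpn"   # the name used in `/usr/bin/ssh -N imacvpn`

# Read `ps -ax` style text on stdin and print the first token of the
# first line that mentions the keyword.
find_tunnel_pid() {
  grep "$TUNNEL_KEYWORD" | head -n1 | awk '{print $1}'
}

# Kill the tunnel if a pid was found; launchd is expected to restart it.
restart_tunnel() {
  pid=$(ps -ax | find_tunnel_pid)
  if [ -n "$pid" ]; then
    kill "$pid"
  else
    echo "no $TUNNEL_KEYWORD process found" >&2
    return 1
  fi
}

# Only act when called with "run" (a made-up convention for this sketch).
if [ "${1:-}" = "run" ]; then
  restart_tunnel
fi
```

Saved under a name like restart-tunnel.sh (also made up), it would be invoked as `sh restart-tunnel.sh run`.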
Great. But how to make that script run on command? That's where Hazel and Dropbox come in.
If you don't know Hazel, it's a Mac utility from Noodlesoft that watches folders and runs rules against the files in them. What we're going to do is tell Hazel to monitor a specific folder inside my ~/Dropbox/. Whenever a new file is added to it, Hazel will detect the change, delete the file (since we don't actually care about the file), and then run the above shell script.
Here's how the configured Hazel rule looks:
Basically, we're using Dropbox as a way to trigger a filesystem change that Hazel will detect and run whatever command we ask it to.
It's great because if I'm on my computer at work or my laptop in a coffee shop, I can just save a blank text file into that folder. Dropbox will sync it to my iMac. Hazel will see the change and run the script.
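From a machine with a terminal, saving that blank file can be a one-liner. Here's a minimal sketch, assuming the watched folder is called restart-tunnel (a name I made up, so substitute whatever folder the Hazel rule actually monitors) and that Dropbox syncs to ~/Dropbox on the remote machine too.

```shell
#!/bin/sh
# Hypothetical folder name; point this at the folder the Hazel rule watches.
TRIGGER_DIR="${TRIGGER_DIR:-$HOME/Dropbox/restart-tunnel}"

# Drop an empty, uniquely named file into the watched folder. The name
# only needs to be new, since the rule deletes the file after seeing it.
request_restart() {
  touch "$TRIGGER_DIR/restart-$(date +%s).txt"
}
```

Calling `request_restart` creates the file, Dropbox syncs it, and the rest happens on the iMac.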
And if I'm somewhere with just my phone, I've found the easiest thing to do is open up Photos.app and save a photo into Dropbox.app using Files.app via the system share sheet.
So that's the dumb solution to my problem that I came up with. I'm still not sure what the actual technical reason is for the SSH tunnel occasionally crapping out on me. I'm just glad I'm to the point in my nerd existence where I can be happy applying a fix and not caring about the real underlying issues that don't concern me.
After reading this post, Leo Kennis wrote to me:
A slightly safer option would be to change the command to:
ps -ax | grep imacvpn | grep -v grep | cut -f1 -d " "
The part with grep -v is a negative grep (find all lines that do not match "grep").
This way it doesn't matter which of the two results is listed first.
That's one of my favorite things about shell scripts - there's a million approaches to every problem. The above is a great way to make sure the correct pid is chosen. Thanks, Leo!
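Leo's point about ordering is easy to check against canned input. This is a small sketch with made-up ps lines where the grep process happens to be listed first. The function names are mine, purely for illustration.

```shell
#!/bin/sh
# Made-up `ps -ax` lines with the grep process listed first, which is
# the case where taking the first line picks the wrong pid.
sample='51301 ttys000    0:00.00 grep imacvpn
51234 ??         0:00.12 /usr/bin/ssh -N imacvpn'

# Original approach: take whichever match comes first.
first_match()  { grep imacvpn | head -n1 | cut -f1 -d " "; }

# Leo's approach: drop any line that mentions grep.
without_grep() { grep imacvpn | grep -v grep | cut -f1 -d " "; }

printf '%s\n' "$sample" | first_match     # prints 51301 (the grep process)
printf '%s\n' "$sample" | without_grep    # prints 51234 (the tunnel)
```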