Moving back to Google – just a little bit

I’ve been hosting my company’s email with FastMail since 2008. They’re amazing. But my personal email had been with Gmail since the service was in beta in 2004. (And everything before Gmail lost to time and bit rot. Sigh.)

Around five years ago, I started getting nervous about having so much of my online identity tied to an address that I was essentially borrowing and had no real control over. I was never worried about Google losing any of my data, but I had heard countless horror stories of Google’s AI flagging an account for some type of violation and locking out the user with no recourse.

If I ever lost access to my primary email account, I’d be dead.

So I began the rather annoying process of moving all of my online accounts over to use a new address at a domain I control. FastMail imported everything from my old Gmail and Google Calendar account, and with the help of 1Password, I was able to methodically switch my email everywhere else over the course of a few weeks.

I’ve been using my new address full-time for the last five years and now get only two or three non-spam emails a month to my old Gmail account.

Soon after switching emails, I began to question my other dependencies on Google. I started worrying about all the data they, along with Facebook, were collecting on me. I was also concerned about how I was playing a part in their monopoly over the web as a whole.

So I switched to using Duck.com as my full-time search engine. And I gave up Chrome in favor of Firefox. I even tried using Apple Maps as much as possible. In short, even if the alternative service wasn’t on par with their bigger competitor, I felt it was worthwhile to give them my support to encourage a more balanced ecosystem.

The switch mostly went well. I felt like the search results I got with Duck.com were good enough. I only had to fall back to Google for the occasional technical query. Firefox also made great strides on macOS during that time with its Quantum project. And Apple Maps, despite all the awful reviews online, worked just fine navigating around Nashville for me.

But over the last year I’ve started, slowly, coming back to Google’s services.

It all started with Google Photos. I (mostly with the help of my own backup strategies) trust iCloud with my family’s photo archives. But Apple just makes it too inconvenient to use with a partner. Because of the way iCloud is siloed per user, my library is completely walled off from my wife’s. That means I can’t see photos of my kids that she takes. And she can’t see mine. Google Photos supports connecting your library with another person’s. (While that’s a super useful feature, we don’t do that. For our workflow, it’s easier just to sign into my Google account in Google Photos on my wife’s phone so everything funnels into one primary account.)

And while the AI-powered search in Apple’s Photos.app is mostly good, it’s limited by Apple’s privacy stance and what can be processed on-device. The result is that it can’t even begin to compete with the ways I’m able to slice, dice, sort, and organize my photos with Google.

Is Google using the faces and location data in my photos to train their robot overlords? Most definitely. Do I care? Yes. But is it enough to outweigh the benefits I get from their otherwise amazing offering that I pay $10/month for? For me, no.

Added to that is the degradation in quality I’ve seen in Duck.com’s search results since last year. I’m not sure what changed under the hood, but I found myself having to repeat searches in Google so frequently that I just gave up and made Google my default choice in January.

I’ve been a paying customer of Dropbox since 2008 (or 2009?). But between the $10/month I pay Google for extra photo storage (2TB, which I share with my wife’s Google account) and the $10/month I pay for extra iCloud storage (also shared with my wife), it just didn’t make sense to keep paying for Dropbox as well when I could use Drive instead. And you know what? After using Drive for the last six months I’ve found that it’s really quite nice. Especially with the added benefits of everything integrating with Docs and Spreadsheets and their very capable (but decidedly non-iOS and ugly!) mobile apps.

Further, although not really that important, I’ve also migrated my calendars from FastMail back to Google Calendar simply because every other service in the world that wants to integrate with my calendar data (and that I want to give permission to) supports Google’s protocol but not standard CalDAV. It’s a shame, but I’ve decided to make my life easier and just go with it rather than wall myself off by taking a principled stand for open data.

What does this all mean?

I still use Firefox. I stick with Apple Maps when possible. But I’ve slowly moved back to Google’s services in cases where they’re so far ahead of the competition I just can’t help it, which has created a bit of a halo effect with their complementary services.

And in a most decidedly un-Googly turn of events, customers of their Google One extra-storage plans can now talk to a Real Live Human if something goes wrong. That gives me much more confidence in my precious data’s longevity with them. The lack of any such recourse is what drove me away from Gmail in the first place.

Dammit, Google. I don’t trust you. But I can’t quit you, either.

Finder Folder Actions not being triggered when files are added with rsync

A couple weeks ago I wrote about how I was automatically capturing the photos and videos my kids’ daycare emails to me and importing them into Photos.app. The major pieces of that script worked fine – parsing the emails, downloading the images, and then rsync’ing them down to my Mac every hour.

But what was failing was the Finder Folder Action I set up that was supposed to import the files into Photos.app whenever new ones were added to that folder.

For some reason, the Folder Action would only occasionally fire. Maybe for one out of every ten items. Sometimes, if I navigated to the folder in the Finder, the action would kick in and import everything. But sometimes not.

All I can think is that because the files were being added to the folder via rsync – using some BSD-like filesystem APIs instead of the higher-level macOS ones – the Folder Action was never being triggered. Again, it occasionally worked, but mostly failed. So I could be entirely wrong about all of this.

Anyway, I rewrote the whole thing to just run an AppleScript every hour via cron, which handles the whole process itself. Since making that change it’s been working perfectly.
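
The cron side of that is a single line. Something like this (the script name and path are placeholders for wherever you keep the AppleScript):

0 * * * * /usr/bin/osascript "$HOME/Scripts/import-daycare-photos.applescript"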

Maybe someone reading this will find it helpful.

Fixing Broken Backblaze B2 Scripts when Run From cron

Just a quick note for my future self and anyone else who might be running into this problem.

Last week I migrated all of my backups off of Amazon S3 and rsync.net to Backblaze B2. The cost savings are enormous – especially for a small business like mine. And the server-to-server transfer speeds using their b2 Python script, while not as fast as using a raw rsync connection, are quite a bit quicker than using S3.

Before committing to B2, I gave it a really thorough test by seeding it with 350,000 files totaling 450GB. The whole process took about eight hours coming from my primary Linode server in Atlanta. I was quite pleased.

Anyway, after testing all of my scripts, I put them into cron and ignored them for the next few days assuming they’d “just work”. But when I went back to check on them, I found every one had been failing silently.

At first I thought maybe the b2 command wasn’t found in $PATH when running via cron for some reason, but that wasn’t it. Next I double-checked that b2 was using the correct credentials I had previously authorized it with by hand. Nope.

Turns out, b2 was throwing this Python exception.

Creating a Pipfile for this project...
Creating a virtualenv for this project...
Traceback (most recent call last):
  File "/usr/local/bin/pew", line 7, in <module>
    from pew.pew import pew
  File "/usr/local/lib/python2.7/site-packages/pew/__init__.py", line 11, in <module>
    from . import pew
  File "/usr/local/lib/python2.7/site-packages/pew/pew.py", line 36, in <module>
    from pew._utils import (check_call, invoke, expandpath, own, env_bin_dir,
  File "/usr/local/lib/python2.7/site-packages/pew/_utils.py", line 22, in <module>
    encoding = locale.getlocale()[1] or 'ascii'
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/locale.py", line 564, in getlocale
    return _parse_localename(localename)
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/locale.py", line 477, in _parse_localename
    raise ValueError, 'unknown locale: %s' % localename
ValueError: unknown locale: UTF-8

I’m hardly a Python expert, and I’ve traditionally had nothing but problems anytime I’ve had to do anything with pip, so this didn’t surprise me. What did surprise me was that this error was happening both locally on my Mac (10.14.4) and on my remote Ubuntu 18.04 box.

After some googling I found this bug in pipenv. The solution is to add the following to your b2 scripts that are run by cron:

export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8

And that fixed it.
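
In practice, that just means each cron-run b2 script now starts with those two exports before doing its work. A stripped-down sketch (the bucket name and source path are placeholders):

#!/bin/sh
# Force a sane locale so the b2 CLI's Python doesn't choke under cron.
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8

# Placeholder bucket and path – adjust to taste.
b2 sync /var/backups/ b2://your-backup-bucket/backups/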

I know macOS ships with a mostly-broken installation of Python, but the latest Ubuntu LTS? Anyway, if this is common Python/pip knowledge, at least now I know, too.

A Faster Way to Create Multiple Tasks in OmniFocus (with all sorts of details!) Using Drafts.app

Following up on my previous post about using Drafts to create new GitHub issues, here’s another action I built and use all the time.

This allows you to create multiple tasks in OmniFocus with defer dates, due dates, and tags in one step.

It does this by parsing a compact, easy-to-write syntax that I’ve adopted from other OmniFocus actions and tweaked to my liking, then converting it into TaskPaper format, which can be “pasted” into OmniFocus in one go. This removes the need to confirm each individual task separately.

Yes, you could also do this by writing your tasks in TaskPaper format directly, but I find its syntax (while innovative!) a bit cumbersome for quick entry. The format this action uses isn’t as feature-rich, but it does everything I need and with less typing.

Instructions:

Each line in your draft becomes a new task in OmniFocus, with the exception of “global” tags and dates, which I’ll describe later.

Each task goes on its own line and looks like this:

Some task title @defer-date !due-date #tag1 #tag2 --An optional note

The defer date, due date, tags, and note are all optional. If you use them, the only requirement is that they come AFTER the task’s title and the “--note contents” must be LAST.

The defer and due dates support any syntax/format that OmniFocus can parse. This means you can write them as @today, @tomorrow, @3d, @5w, etc. If you want to use a date format that includes characters other than letters, numbers, and a dash (-), you’ll need to enclose it in parentheses like this: @(May 5, 2019) or !(6/21/2020).

Global Defer/Due Dates:

By default, tasks will only be assigned defer/due dates that are on the same line as the task title. However, if you add a new line that begins with a @ or ! then that defer or due date will be applied to ALL tasks without their own explicitly assigned date.

Global Tags:

Similarly, if you create a new line with a #, then that tag will be added to ALL tasks. If a task already has tags assigned to it, then the global tag(s) will be combined with the other tags.

Full Featured (and contrived) Example:

Write presentation !Friday #work
Research Mother's Day gifts @1w !(5/12/2019) --Flowers are boring
Asparagus #shopping
#personal
@2d
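
For reference, behind the scenes that example gets converted into TaskPaper text along these lines before being handed to OmniFocus (a rough sketch – the action’s exact output may differ slightly):

- Write presentation @due(Friday) @defer(2d) @tags(work, personal)
- Research Mother's Day gifts @defer(1w) @due(5/12/2019) @tags(personal)
	Flowers are boring
- Asparagus @defer(2d) @tags(shopping, personal)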

You can install the action into your own Drafts.app from the action directory.

Creating New GitHub Issues From Drafts.app

After last week’s post about how to create a GitHub issue with image attachments from an email, I thought I’d try to speed up how quickly and easily I can create new issues that don’t come from customer emails – i.e., the ones that just randomly occur to me.

Drafts is my preferred way of capturing text and ideas on Mac and iOS and then doing something with it. It has tons of scripts (actions) to do just about anything, and you can write your own if you need something custom.

So, after a quick look through GitHub’s API docs, I put together this script for Drafts.

It fetches your most recently active repos, presents them in a dialog prompt to pick one, and then creates a new issue in that repo using the contents of the current draft. Simple. Fast. Awesome. And a lot easier than trying to navigate GitHub’s mobile website.
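
Under the hood it’s really just two REST calls. Expressed as curl (the token, username, and repo below are placeholders), they look roughly like this:

# List your most recently active repos.
curl -H "Authorization: token YOUR_GITHUB_TOKEN" \
  "https://api.github.com/user/repos?sort=pushed&per_page=30"

# Create a new issue in the chosen repo from the draft's contents.
curl -X POST -H "Authorization: token YOUR_GITHUB_TOKEN" \
  -d '{"title": "First line of the draft", "body": "The rest of the draft"}' \
  "https://api.github.com/repos/YOUR_USERNAME/YOUR_REPO/issues"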

You can install the action into your own Drafts.app from the action directory.

Backing Up Shared iCloud Photo Albums and Where to Find Them on Disk

In my quest to backup ALL THE THINGS, I turned my attention earlier this week to the shared iCloud Photo Albums my friends and family use to pass around photos and videos of our kids.

All of the items in my iCloud library (and my wife’s library) are combined and backed up to Google Photos automatically. For better or worse, Google Photos is the “source of truth” that contains all of our archives and is sorted into albums. It’s the backup I’d use to restore if iCloud ever goes belly-up. (And I have a redundant backup of Google Photos itself in case Google ever loses my data.) And the actual Photos.app library on my iMac is backed up to Backblaze for good measure, too. So the photos we take are covered.

But there are a ton of great memories of our kids snapped by other people. Those only reside in the shared iCloud photo streams. How do I back those up?

Ideally, Photos.app on Mac (or iOS) would have a preference to automatically import shared items taken by other people – and then those would feed into Google Photos. But that doesn’t exist. I could manually save new items to my library as they’re shared, but that’s error-prone and not scalable.

Also, what about the 2,000+ previously shared photos? I thought I would be clever and just select all on my Mac and drag them into my main library, but after doing a few quick tests I realized Photos.app isn’t smart enough to avoid duplicating the photos I originally took and shared when importing. (This is likely because Apple scales down shared items and strips out their metadata.) And there’s no way to sort by “other people” or build a smart album of “photos taken by other people” to filter out your own images when importing.

So, I decided to do some digging.

The first step was to locate the shared albums on disk. I searched my main Photos Library.photoslibrary bundle, but couldn’t find them inside. A quick glance through ~/Library/Application Support/ didn’t turn up any obvious hiding places either. That’s when I fired up DaisyDisk to search for large (10GB+) folders.

Success!

For my own reference and for anyone else who comes across this post after googling unsuccessfully, iCloud’s shared photo albums are stored here:

~/Library/Containers/com.apple.cloudphotosd/Data/Library/Application Support/com.apple.cloudphotosd/services/com.apple.photo.icloud.sharedstreams/assets/

Each shared album is inside that folder and given a UUID-based folder name. And inside each album, every shared photo/video sits inside its own UUID-named folder. It’s quite impenetrable and obviously not meant for users to poke around in, but the programmer in me understands why it is this way.

At the top level is a Core Data database. I thought I might get clever and explore it to see if I could extract the metadata of the shared items and use it to write a “smart” backup script (one that perhaps imports other people’s photos directly into Photos.app) instead of just taking the brute-force approach and backing up the entire album as a dumb blob. But I haven’t had enough time yet to investigate.

So until I find the time to build that “smart” approach, I’m going about it the dumb way and nightly syncing everything to B2. It’s not ideal, but it covers my needs for now.
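
Something like this in cron does the trick (assuming the b2 CLI is installed and authorized; the bucket name is a placeholder):

0 2 * * * b2 sync "$HOME/Library/Containers/com.apple.cloudphotosd/Data/Library/Application Support/com.apple.cloudphotosd/services/com.apple.photo.icloud.sharedstreams/assets/" b2://your-backup-bucket/shared-albums/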

Creating GitHub Issues (with image attachments!) From an Email

I’m very meticulous about logging all of the feedback I receive from my customers. Whether it’s a bug report or a feature request, I want all of that information captured in a single place where I can plan and act on it. For me, that place is the Issues section in my app’s GitHub repo.

Normally, when I get a customer email, my workflow is to reply back to them with any clarification I need, and then once we’ve finished with any back and forth, create a new GitHub issue with the relevant info from their email and a note reminding me to email them back when their issue is resolved.

This takes all of a minute to do. But it still means opening a browser, navigating to the repo, clicking on “Issues”, then “New Issue”, and copying and pasting the email details. Further, if the user supplied any screenshots, I have to save those out of the email and upload them to GitHub as well. Like I said, it only takes a minute or so, but it adds unnecessary friction.

Today I decided to automate all of that.

I use the fantastic Postmark service to send all of my company’s transactional emails. They also have an equally awesome inbound service that will parse any emails you forward to them and POST the details as a JSON blob to your webhook.

So, I created a forwarding rule in Fastmail to forward any emails sent to [email protected] and [email protected] to my secret Postmark inbound address.

Postmark receives the forwarded email and POSTs the data to my server, which runs a small PHP script (embedded below) that downloads any image attachments contained in the email and creates a new GitHub issue in the appropriate repo with the contents of the email and image attachments.

It all works great! What used to be a slightly annoying process to do a couple times a day, now takes all of three seconds in my email client – whether I’m at my desktop or out and about on my phone.

Maybe you’ll find this script helpful.

Fixing a Broken Service With a Tiny Bit of Automation

This post is a nice, unintentional follow-up to yesterday’s one about backing up all of my family’s photos and home videos. Anyway…

My kids go to a fantastic daycare. My wife and I couldn’t be happier. The teachers are wonderful, they love our children, and our kids adore them, too. But, the third-party service the school uses to communicate with parents is absolute horseshit.

I won’t say what the service is because I don’t want to give them free publicity or maybe even alert them to what I’m doing, but if you have daycare-aged children, you probably know it. All the schools use it.

All of the teachers carry around iPads in the classroom. They use this third-party app to check-in / check-out the children, capture photos and videos throughout the day, record what they ate for lunch and how long they napped, and (if your child is young enough) document their diaper changes. At the end of the day, after we sign them out of school, my wife and I get an automated email from the service with a summary of each kid’s day. But what we look forward to most are the photos/videos they take of our kids that get sent to us as they happen. When you’re slogging through a boring day at the office, seeing a happy picture of your kid on the playground with their friends is awesome.

Now, let me be clear. The service works. Mostly. I mean, it functions adequately. But it’s a horrorshow of app / website design.

It looks like something straight out of 2009-era iPhone development. It’s difficult to use. Crashes frequently. And from what the teachers have told me, the educator version isn’t any better.

Luckily, you don’t have to use their app. You can opt in to get all the updates and photos sent to you via email, which is what my wife and I do. The HTML emails they send have never rendered properly in any email client – desktop or web – that I’ve tried, but that’s fine. They may not be pleasant to look at, but I can read the information in them.

My biggest gripe is that we often want to save any particularly good photos of our kids and share them with the grandparents. You can’t save the photo out of the email, because the embedded image is cropped to a square for some strange reason. You first need to tap on the image to load the full version in a browser and download it from there. Fine. But, any photo that contains any child in addition to your kid – like a group shot with a friend – is displayed with a transparent div on top of it so you can’t download it (at least on a mobile device) for privacy reasons. Look, I get it. Some parents might not want other parents unintentionally posting photos of their kids to social media. But it’s still annoying. It just forces us to take – and then crop – a screenshot. Also, the videos, which are often the best ones, can’t be downloaded at all.

Last night I got frustrated enough to finally do something about this.

I use Postmark to send all of my company’s transactional emails. They’re fantastic for sending emails, but one feature they offer that I’ve never taken advantage of is handling inbound emails.

You can forward any email to a secret address they provide you, and they’ll parse the email and POST all of its information as a helpful JSON object to whatever URL you specify.
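
Abridged, the JSON payload they POST looks roughly like this (there are more fields, but these are the ones that matter here):

{
  "FromFull": { "Email": "sender@example.com", "Name": "Sender Name" },
  "Subject": "Today's photos",
  "TextBody": "Plain-text version of the email...",
  "HtmlBody": "<html>...</html>",
  "Attachments": [
    { "Name": "photo.jpg", "ContentType": "image/jpeg", "Content": "base64-encoded data..." }
  ]
}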

So, I set up a webhook in their control panel pointing to a PHP script on my web server. Then, I told Fastmail to forward all emails from the daycare service to my secret Postmark email address. You can see where this is going, can’t you?

When they send a new email to my server, the PHP script finds the link in the email’s HTML content that points to the full version on the service’s website. It then downloads that web page, parses out the URL to the full image, downloads that, and saves it into a folder on my server. This works for videos, too.

The PHP script I wrote is specific to the service our daycare uses, but if you’re curious, here it is…

That’s the first step.

Next, my iMac at home runs a script every hour to download any new photos or videos from my server and puts them in a folder inside my Mac’s “Pictures” folder. When that happens, a folder action I built with Automator automatically imports them into Apple’s Photos.app, where they’re synced to all of my mobile devices and iCloud. Soon after that, Google Photos on my iPhone will detect the new items and archive them in Google’s cloud, where they’re backed-up and made available on my wife’s phone as well.
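
The hourly pull on the Mac side is just rsync in the crontab (the server name and folder paths are placeholders):

0 * * * * /usr/bin/rsync -az user@my-web-server:daycare-photos/ "$HOME/Pictures/Daycare/"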

Here’s a photo of the Automator action. It couldn’t be simpler – just one step…

The result? We get to see all of our kids’ photos as they happen, in the nice Photos app on our phones – rather than digging through the service’s crappy emails. And, sharing the pictures with the rest of our family is a one-tap process – even for the videos which previously weren’t available at all!

Backing Up Everything (Again)

This will take a while. Bear with me.

I’m obsessive about backing up my data. I don’t want to take the chance of ever losing anything important. But that doesn’t mean I’m a data hoarder. I like to think I’m pragmatic about it. And I don’t trust anyone else to do it for me.

From around 2006 to 2012, I kept a Mac mini attached to our TV with a Drobo hanging off the back. It had all our downloaded movies on it. And every night it would automatically download the latest releases of our favorite TV shows from Usenet so my wife and I could watch them with Plex the next day. It worked great, and all the media files were stored redundantly across multiple hard drives with tons of storage space. (Would it survive a house fire? No. But files like that weren’t critical.) But with the rise of streaming services and useful pay-to-watch stores like iTunes, now I’d rather just pay someone else to handle all of that for me. So, I don’t keep any media files like that locally any longer.

But my email? My financial and business documents? My family’s photo and home video archive? I’m really obsessive about that.

For most of my computing life, all of that data was small enough to fit on my laptop or desktop’s hard drive. In college, I remember burning a CD (not a DVD) every few months with all of my school work, source code, and photos on it for safekeeping. The internet wasn’t yet fast enough to make backing up to a cloud (were clouds even a thing back then?) feasible, so as my data grew I just cloned everything nightly to a spare drive using SuperDuper and Time Machine. It worked for the most part. Sure, I still worried about my house catching fire and destroying my backups, but there really wasn’t an alternative other than occasionally taking one of the backup drives to work or a friend’s house.

But then the internet got fast, really fast, and syncing everything to the cloud became easy and affordable. I was a beta user of Gmail back in 2004. I was an early paid subscriber of Dropbox, starting around 2008. All of my data was stored in their services and fully available on every computer and – eventually – mobile device. At the time, I thought I had reached peak-backup.

I was wrong.

Now we have too much data. My email is around 20GB. My family’s photo library is approaching 500GB. That’s more data than will fit on my laptop’s puny SSD. It will fit on my iMac, but it leaves precious little space for anything else. I could connect external drives, but that gets messy and further complicates my local backup routine. (Yes, Backblaze is a good, potential solution to that.)

Another problem is that most of our data now is either created directly in the cloud (email, Google Docs, etc) or is immediately sent to it (iPhone photos uploaded to iCloud and/or Google Photos), bypassing my local storage. If you trust Google (or Apple) to keep your data safe and backed up, that’s great. I don’t. I’ve heard too many horror stories about one of Google’s automated AI systems flagging an account and locking out the user. And with no way to contact an actual human, you’re dead in the water along with all your data. Especially if you lose access to your primary email account, which is the key to all your other online accounts.

So, I need a way to backup my newly created cloud data, too. This is getting complicated.

First step. My email. This is easy. Five years ago I set up new email addresses for my personal and business accounts with Fastmail. They’re amazing. I imported my 10+ years worth of email from Google (sadly, my pre-2004 college email and personal accounts are lost to the ether), set up a forwarding rule in Gmail, and with the help of 1Password, changed all of my online services to use my new email. It took about a month to switch everything over, but now the only email coming to my old Gmail address is spam. Fastmail keeps redundant backups of my email. And I have full IMAP copies available on multiple computers in case they don’t. And if something ever goes wrong, I pay Fastmail every month and can call up a live human to talk to – unlike Google, where the advertisers are the customers and I’m the product.

Source code. I’m a paying GitHub customer. Everything’s stored and backed up there. But still, what if they screw up? I ran a small, self-hosted server with GitLab on it for a while instead of GitHub and set it to back up all my code nightly to S3. That worked great. But, I like GitHub’s UI and feature set better. Plus, it’s one less server I have to manage. So, where do I mirror my code to? (Much of my code is checked out locally on my computer, but not all of it.)

Back in 2006, my boss at the web agency I was working at told me about rsync.net. They provide you with a non-interactive Unix shell account that you can pipe data to over SFTP, rsync, or any other standard Unix tool. You pay by the GB/month, and they scale to petabyte sizes for customers who need that. So, I signed up and used them to back up all of my svn (remember svn?) repos. With the rise of git and the switch to GitHub, I cancelled my account and mostly forgot about them.

But, aha!, I now have new data storage problems. Rsync.net could be a great solution again. So, I re-signed up and set up my primary web server to mirror all of my GitHub repos over to them each night. Here’s the script I’m using…
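
In rough outline (the GitHub credentials, mirror directory, and rsync.net host below are all placeholders), it does something like this:

#!/bin/sh
# Mirror every GitHub repo locally, then push the bare mirrors to rsync.net.
GH_USER="your-github-username"
GH_TOKEN="your-personal-access-token"
MIRROR_DIR="$HOME/github-mirror"

mkdir -p "$MIRROR_DIR" && cd "$MIRROR_DIR" || exit 1

# Grab the clone URLs for the first 100 repos from the GitHub API.
# (Private repos would also need SSH keys or the token embedded in the clone URL.)
curl -s -u "$GH_USER:$GH_TOKEN" "https://api.github.com/user/repos?per_page=100" |
  grep -o '"clone_url": *"[^"]*"' | cut -d'"' -f4 |
  while read -r url; do
    name=$(basename "$url" .git)
    if [ -d "$name.git" ]; then
      git -C "$name.git" remote update --prune
    else
      git clone --mirror "$url" "$name.git"
    fi
  done

# Ship the mirrors offsite.
rsync -az "$MIRROR_DIR/" user@rsync-net-host:github-mirror/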

Next up, important documents. Traditionally, I’ve kept everything that would normally go in my Mac’s “Documents” folder in my Dropbox account. That worked great for a long time. But once I started paying Google for extra storage space for Google Photos (more on that later), it felt silly to keep paying Dropbox as well. So, after 10+ years as a paid subscriber, I downgraded to a free account and moved everything into Google Drive. Sure, it’s not as nice as Dropbox, but it works and saves me $10 a month.

Like I said above, I mostly trust Google, but not entirely. So, let’s sync my Google Drive’s contents to rsync.net, too. Edit your Mac’s crontab to add this line…

30 * * * * /usr/bin/rsync -avz /Users/thall/Google\ Drive/ [email protected]:google-drive

Also, I keep all of the really important paperwork that would normally be in a fire safe in my garage in a DEVONthink library so I can search the contents of my PDFs. It’s synced automatically with iCloud and available across my mobile devices. But still, better back that up, too.

45 * * * * /usr/bin/rsync -avz /Users/thall/FireSafe.dtBase2 [email protected]:

So, that’s all of my data except for the big one – my family’s photo and home video archives.

For a long time I kept all my family’s archives in Dropbox. I even made an iOS app dedicated to browsing your library. I could have stuck everything in Apple’s Photos.app where it’s available on my devices via iCloud, but that’s tied to my Apple ID. My wife wouldn’t be able to see those photos. Plus, any photos she took on her phone would get stored in her iCloud account and not synced with the main family archive. So, we used the Dropbox app, signed-in to my account, to backup our phones’ photos.

But, like I said earlier, our photo and video library became too big to comfortably fit in Dropbox. Plus, Google Photos had just been released and it was amazing. Do I like the thought of Google’s AI robots churning through my photos and possibly using that data to sell me advertisements? No. But, their machine-learning expertise and big-data solutions make it really hard to resist. So, I spent a week and moved everything out of Dropbox into Google Photos.

Now everything is sorted into albums, by date, and searchable on any device. I can literally type into their search box “all photos of my wife’s grandmother taken in front of the Golden Gate bridge” and Google returns exactly what I’m looking for. It’s wonderful.

My wife’s phone has the Google Photos app installed with my account on it so every photo she takes gets stored in a shared account we can both access and view on all our devices.

But what’s the recurring theme of this blog post? That’s right. I don’t fully trust any cloud provider to be the only source of my data. Someone clever said “the cloud is just someone else’s computer.” That’s exactly correct. If your data isn’t in at least two different places, it’s not really backed up.

But how do I backup my 500GB+ of photos that are already in Google’s cloud? And then how do I keep new items recently added synced as well?

As usual, I tried to find a way to make it work with rsync.net. I found a great open-source project called rclone. It’s a command line tool that shuffles your files between cloud providers or any SFTP server with lots of configurable options and granularity.

First off, even if rclone does do what I need, I can’t just run it on my Mac. My internet is too slow for the initial backup. I need to use it on one of my servers so I have a fast data center to data center connection between Google and rsync.net.

Getting it setup on one of my Ubuntu servers at Linode was a simple bash one-liner. Configuring it to then work with my Google and rsync.net accounts was just a matter of running their easy-to-use configuration wizard.

Note: rclone doesn’t support a connection to Google Photos. Instead, you need to login to Google Drive on the web and enable the “Automatically put your Google Photos into a folder in My Drive” option in Settings. (And also tell your Google Backup & Sync Mac app not to sync that folder locally – unless you have the space available – I don’t.) Then, rclone can access your Google Photos data via a special folder in your Drive account.

With everything configured, I ran a few connection tests and it all worked as expected. So, I naively ran this command thinking it would sync everything if I let it run long enough:

rclone copy -P "GoogleDrive:Google Photos" rsync:GooglePhotos

Things started out fine. But eventually, due to Google API rate limits, it was quickly throttled to 300KB/sec. That would have taken MONTHS to transfer my data. And, the connection entirely stalled out after about an hour. I even configured rclone to use my own, private Google OAuth keys, but with the same result. So, I needed a better way to do the initial import.

Google offers their Takeout service. It lets you download an archive of ALL your data from any of their services. I requested an archive of my Google Photos account and eight hours later they emailed me to let me know it was ready. Click the email link to their website, boom. Ten 50GB .tgz files. Now what to do with them?

I can’t download them to my Mac and re-upload them – that’s too slow. Instead, I’ll just grab the download URLs and use curl on my server to get them, extract them, and sync them over.

I don’t have enough room on my primary web server – plus I don’t want to saturate my traffic for any customers visiting my website. So, spin up a new Linode, attach a 500GB network volume, and we’re in business. Right? Nope.

The download links are protected behind my Google account (that’s great!) so I need a web browser to authenticate. Back on my Mac, fire up Charles Proxy and begin the downloads in Safari. Once they start, cancel them. Go to Charles, find the final GET connection, and right-click to copy the request as a curl command including all of the authentication headers and cookies. Paste that command into my server’s Terminal window and watch my 500GB archive download at 150MB(!!)/sec.

(Turns out, extracting all of those huge .tgz files took longer than actually downloading them.)

Finally, rsync everything over to my backup server.
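
That last step is nothing fancy (assuming the archives unpack into the usual Takeout/Google Photos/ layout; the paths and rsync.net host are placeholders):

cd /mnt/google-photos-restore
for f in takeout-*.tgz; do tar xzf "$f"; done
rsync -avz --progress "Takeout/Google Photos/" user@rsync-net-host:GooglePhotos/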

And that’s where I currently am right now. Waiting on 500GB worth of photos and videos to stream across the internet from Linode in Atlanta to rsync.net in Denver. It looks like I have about six more hours to go. Once that’s done, the initial seed of my Google Photos backup will be complete. Next, I need a way to backup anything that gets added in the future.

Between the two of us, my wife and I take about 5 to 10 photos a day. Mostly of our kids. Holidays and special events may produce a bunch more at once, but that’s sporadic. All I need to do is sync the last 24 hours worth of new data once every night.

rclone is the perfect tool for this job. It supports a “--max-age=24h” option that will only grab the latest items, so it will comfortably fit within Google’s API rate limits. Once again, set up a cron job on my server like so:

0 0 * * * rclone copy --max-age=24h "GoogleDrive:Google Photos" rsync:GooglePhotos

And, that’s it. I think I’m done. Really, this time.

All of my important data – backed up to multiple storage providers – and available on all of my and my family’s devices. At least until the whole situation changes yet again.

A few more notes:

All of my web server configuration files are stored in git. As are all of my websites’ actual files. But I still run an hourly cron job to back up all of “/var/www” and “/etc/apache2/sites-available” to rsync.net since it’s actually such a small amount of data. This lets me run one command to re-sync everything in the event I need to move to a new server, without having to clone a ton of individual git repos. (I know I need to learn a better devops technique with reproducible deployments like Ansible, Puppet, or whatever the cool tech is these days. But everything I do is just a standard LAMP stack – no containers, only one or two actual servers – so spinning up a new machine is really just a click in the Linode control panel, a couple of apt-get commands, and dropping my PHP files into a directory.)
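
That cron job amounts to something like this (the rsync.net host is a placeholder):

15 * * * * rsync -az /var/www /etc/apache2/sites-available user@rsync-net-host:server-config/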

My databases are mysqldump’d every hour, versioned, and archived in S3.
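
A sketch of that kind of job (the bucket name is a placeholder; S3’s bucket versioning can cover the “versioned” part):

#!/bin/sh
# Hourly from cron: dump all databases, compress, and archive to S3.
# Assumes MySQL credentials in ~/.my.cnf and the aws CLI configured.
STAMP=$(date +%Y%m%d-%H)
mysqldump --all-databases --single-transaction | gzip > "/var/backups/mysql/all-$STAMP.sql.gz"
aws s3 cp "/var/backups/mysql/all-$STAMP.sql.gz" "s3://your-backup-bucket/mysql/"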

All of the source code on my Mac is checked out into a single parent directory in my home folder. It gets rsync’d offsite every hour, just in case. Think of it as a poor man’s Time Machine in case git fails me.

I do a lot of work in The Omni Group’s apps – OmniFocus, OmniOutliner, and OmniGraffle. All of those documents are stored in their free WebDAV sync service and mirrored on my Mac and mobile devices.

All of my music purchases have gone through iTunes since that store debuted however many years ago. I can always re-download my purchases (probably?). Non-iTunes music ripped from CDs long ago, and my huge collection of live music, is stored in iTunes Match for a yearly fee. A few years ago when I made the switch to streaming music services and mostly stopped buying new albums, I archived all of my mp3s in Amazon S3 as a backup. I need to set a reminder to upload any new music I’ve acquired as a recurring task once a year or so.

Also, I have Backblaze running on my desktop and laptop doing its thing. So yeah. I guess that’s yet another layer of redundancy.

A Simple, Open-Source URL Shortener

tl;dr One evening last week, I built pretty much the simplest URL shortening service possible. It’s fast, opinionated, keeps track of click-thru stats, and does everything I need. It’s all self-contained in a single PHP script (and .htaccess file). No dependencies, no frameworks to install, etc. Just upload the file to your web server and you’re done. Maybe you’ll find it useful, too.

Anyway…

I run a small software company which sells macOS and iOS software. Part of my day-to-day in running the business is replying to customer support questions – over email and, sometimes, SMS/chat. I often need to reply to my customers with long URLs to support documents or supply them with custom-URL-scheme links which they can click on to deep-link them into a specific area of an app.

Long and non-standard URLs can often break once sent to a customer or subsequently forwarded around. I’ve used traditional link shortening services before (like bit.ly, etc), but always worried about my URLs expiring or breaking if the third-party shortening service goes out of business or makes a system change. Even if I upgraded to a paid plan which supports using a custom domain name that I own, I’m still not fully in control of my data.

So, I looked around for open-source URL shortening projects which I could install on my own web server and bend to my will. I found quite a few, but most were either outdated or overly-complex with tons of dependencies on various web frameworks, libraries, etc. I wanted something that would play nicely with a standard LAMP stack so I could drop it onto one of my web servers without having to boot up an entirely new VPS just to avoid port 80/443 conflicts with Apache. Out of the question was anything requiring a dumb, container-based (I see you, Docker) solution just to get started. Nice-to-haves would be offering basic click-thru statistics and an easy way to script the service into my existing business tools and workflows.

Admittedly, I only spent about an hour looking around, but I didn’t find anything that met my needs. So, I spent an evening hacking together this project to do exactly what I wanted, in the simplest way possible, and without any significant dependencies. The result is a branded URL shortening service I can use with my customers that’s simple to use and also integrates with my company’s existing support tools (because of its URL-based API and (optional) JSON responses – see below).

Requirements

  • Apache2 with mod_rewrite enabled
  • PHP 5.4+ or 7+
  • A recent version of MySQL

Install

  1. Clone this repo into the top-level directory of your website on a PHP enabled Apache2 server.
  2. Import database.sql into a MySQL database.
  3. Edit the database settings at the top of index.php. You may also edit additional settings such as the length of the generated short URLs or the characters allowed in them, or set a password to prevent anyone from creating links or viewing statistics about them.

Note: This project relies on the mod_rewrite rules contained in the .htaccess file. Some web servers (on shared web hosts for example) may not always process .htaccess files by default. If you’re getting 404 errors when trying to use the service, this is probably why. You’ll need to contact your server administrator to enable .htaccess files. Here’s more information about the topic if you’re technically inclined.
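
If you manage the server yourself, enabling .htaccess files usually just means allowing overrides in the site’s Apache config (the directory path below is a placeholder):

<Directory /var/www/your-site>
    AllowOverride All
</Directory>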

Creating a New Short Link

To create a new short link, just append the full URL you want to shorten to the end of the domain name you installed this project onto. For example, if your shortening service is hosted at https://example.com and you want to shorten the URL https://some-website.com, go to https://example.com/https://some-website.com. If all goes well, a plain-text shortened URL will be displayed. Visiting that shortened URL will redirect you to the original URL.

Possibly of interest to app developers like myself: The shortening service also supports URLs of any scheme – not just HTTP and HTTPS. This means you can shorten URLs like app://whatever, where app:// is the URL scheme belonging to your mobile/desktop software. This is useful for deep-linking customers directly into your app.

iOS Users: If you have Apple’s Shortcuts.app installed on your device, you can click this link to import a ready-made shortcut that will let you automatically shorten the URL on your iOS clipboard and replace it with the generated short link.

Viewing Click-Thru Statistics

All visits to your shortened links are tracked. No personally identifiable user information is logged, however. You can view a summary of your recent link activity by going to /stats/ on the domain hosting your link shortener.

You can click the “View Stats” link to view more detailed statistics about a specific short link.

Password Protecting Link Creation

If you don’t want to leave your shortening service wide-open for anyone to create a new link, you can optionally set a password by assigning a value to the $pw_create variable at the top of index.php. You will then need to pass in that password as part of the URL when creating a new link like so:

Create link with no password set: http://example.com/http://domain.com

Create link with password set: http://example.com/your-password/http://domain.com

Password Protecting Stats

Your stats pages can also be password protected. Just set the $pw_stats variable at the top of the index.php file.

Viewing stats with no password set: http://example.com/stats

Viewing stats with password set: http://example.com/stats/your-password

A Kinda-Sorta JSON API

This project aims to be as simple-to-use as possible by making all commands and interactions go through a simple URL-based API which returns plain-text or HTML. However, if you’re looking to run a script against the shortening service, you can do so. Just pass along Accept: application/json in your HTTP headers and the service will return all of its output as JSON data – including the stats pages.
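
For example, from the command line:

curl -H "Accept: application/json" https://example.com/https://some-website.com
curl -H "Accept: application/json" https://example.com/stats/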

Contributions / Pull Requests / Bug Reports

Bug fixes, new features, and improvements are welcome from anyone. Feel free to open an issue or submit a pull request.

I consider the current state of the project to be feature-complete for my needs and am not looking to add additional features with heavy dependencies or that complicate the simple install process. That said, I’m more than happy to look at any new features or changes you think would make the project better. Feel free to get in touch.