Switching from GitHub to GitLab
I've been a happy paying customer of GitHub since early 2009. But yesterday, for a few different reasons, I deleted all of my private repositories and moved them over to a self-hosted installation of GitLab. I didn't make that decision lightly, as I've been very happy with GitHub for the last five years, but here's why...
First, I've started working on a new Mac app. Every time I start a new project, unless it's open source, I create a new private repo for it on GitHub. This project happened to be my 21st private repository on GitHub. If you're familiar with their pricing structure, you'll know they charge based on how many private projects you have. $22 a month will get you twenty repos. But as soon as you create that twenty-first one, you graduate onto the $50 a month plan. Maybe if I were actually hosting 50 repositories with GitHub I'd be willing to pay that much, but for the foreseeable future I'm going to be in the low twenties, and $50 a month is just too much. It's a shame they don't just outright charge you a dollar per month per project.
The second reason is an issue I've been mulling over for quite a while. I love the cloud. I love having my data in the cloud. But some of it is so precious, in this case my code, that I want to know exactly how it's being taken care of and looked after. While I have no reason to doubt GitHub has plenty of backups in place, I have no way of really knowing for sure how safe my code is. Hosting it myself has its inherent risks, too, but at least I can have full ownership of my data and be certain of the backup strategies in place. This also dovetails nicely with the pleasure nerds like myself get in doing a job themselves. Whether that's hosting your own email (which I'm not crazy enough to do), managing your own web server (yes, please), or automating your own digital backups, there's a sick pleasure to be had in doing a job yourself and doing it well.
A final reason for switching away from GitHub was the uneasy feeling I got watching the story of Julie Ann Horvath unfold last week. I didn't like the idea of my money going to a company that seemed so fundamentally broken. Since then, GitHub has taken forceful, actionable steps to correct the issue, but it still worried me.
So those are my three and a half reasons for moving my private repos away from GitHub. If you agree with me, or if you have your own reasons for wanting to move away, what follows is a brain dump of the steps I took towards getting moved over and situated happily on a GitLab installation.
First off, if you've never heard of GitLab, go take a look through their website. It's a Rails app that is shamefully funny in how closely they've copied the look and feel and functionality of GitHub. Everything from the activity timeline, to pull requests, to user and team access roles, to issue tracking, to shareable git-backed gists. It's all very nicely implemented. Many open source projects start off strong and can later falter when the creators get bored. But I feel fairly confident in GitLab as their community open source version is based off an enterprise product they sell and do support for. Quite a few businesses are using GitLab as a GitHub replacement in situations where their code needs to remain on site.
So, where are we going to host it? My initial thought was to boot up a new virtual server with Rackspace, which is where I host all of my business servers. Rackspace is great. A little expensive, but the customer support makes up for it. Their minimum monthly price for a 512 MB server, which is all we'll need, is around $10 a month. I was about to create the server when I decided to finally take a look at DigitalOcean. They're the new hotness in cloud hosting and have a reputation for being extremely inexpensive. (Bonus points: they offer two-factor authentication on their user accounts, which is something Rackspace still lacks.) Poking around, I found I could get a comparable 512 MB server with DigitalOcean for a flat $5 a month. But what really sealed the deal is they offer one-click installs of various server apps - WordPress, etc. I wasn't looking forward to the fairly intensive setup that GitLab requires, but amazingly, GitLab is one of DigitalOcean's one-click installs.
True to their word, I had a ready-to-go GitLab server up and running in less than a minute after clicking the "create" button. All that remained was fine tuning everything to my needs.
The first step upon getting a new cloud server is to secure it. I always follow the steps outlined in this guide. It does a good job of locking everything down and only takes about five minutes to follow.
Of note, when you get to the section about enabling ufw (the firewall), DigitalOcean boxes don't come with everything you need installed. I had to run the following command before setting up ufw...
sudo apt-get install linux-image-$(uname -r)
Another note, and this is just personal preference, I also modify my ssh port to be something non-standard. That can be changed in...
Also, while the user-facing side of GitLab is great, I have no idea how security-conscious they are. I'd hate for an unpatched security hole in their web app to expose any of my private code. One way to mitigate that chance is to lock down web traffic to the specific IP addresses you'll be accessing it from. Your home, your office, etc. With ufw it's just a quick...
sudo ufw allow from your-ip-address to any port 80
for each of your IPs.
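If you have more than a couple of addresses, a small loop saves some typing. This is only a sketch: the two addresses below are made-up placeholders from the documentation ranges, and the function name is my own. It prints the rules rather than running them, so you can look them over first.

```shell
#!/bin/sh
# Hypothetical addresses (documentation ranges); substitute your own.
ALLOWED_IPS="203.0.113.10 198.51.100.7"

# Print one ufw rule per address so the list can be reviewed before running.
print_ufw_rules() {
    for ip in $ALLOWED_IPS; do
        printf 'sudo ufw allow from %s to any port 80\n' "$ip"
    done
}

print_ufw_rules
# To apply the rules instead of just printing them:
#   print_ufw_rules | sh
```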
Once you've gotten the security taken care of, you can move on to configuring GitLab. Most of the hard work is already done for you by DigitalOcean. You'll just need to fill in the appropriate values in...
Then restart GitLab with...
sudo service gitlab restart
With all that done, the next step is moving your repositories from GitHub to GitLab. (I'm sure there is a better direct git-to-git way of doing what follows, but this was the simplest solution for my needs.) For each of your repos, do a clean mirror to your Desktop to make sure you've got everything.
git clone --mirror git@github.com:username/repo-name.git
Then, cd into the repo directory and...
git remote add gitlab ssh://firstname.lastname@example.org:22/username/repo.git
git push -f --tags gitlab 'refs/heads/*:refs/heads/*'
In that final git push, the refs/heads/* refspec pushes every branch and --tags pushes all of your tags, so nothing is left behind.
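With twenty-odd repos, the clone-and-push steps above could be wrapped in a function and run in a loop. What follows is a rough sketch rather than what I actually ran: the function name, the temporary directory, and the --quiet flags are my own additions, and both URLs are whatever you pass in.

```shell
#!/bin/sh
# migrate_repo SRC_URL DST_URL
# Mirror-clone SRC_URL into a temporary directory, then push every branch
# and every tag to DST_URL. The destination repository must already exist.
migrate_repo() {
    src=$1
    dst=$2
    workdir=$(mktemp -d) || return 1

    git clone --quiet --mirror "$src" "$workdir/repo.git" || return 1
    git -C "$workdir/repo.git" remote add gitlab "$dst" || return 1
    # Quote the refspec so the shell never tries to expand the asterisks.
    git -C "$workdir/repo.git" push --quiet -f --tags gitlab \
        'refs/heads/*:refs/heads/*' || return 1

    # Remove the temporary mirror once the push has succeeded.
    rm -rf "$workdir"
}

# Usage, one call per repository (substitute your own source and destination):
#   migrate_repo SRC_URL DST_URL
```

It stops at the first failing step and leaves the temporary mirror in place, so a half-finished migration can be inspected before anything is deleted from GitHub.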
Once done, you can safely delete your repo from GitHub.
The last step is making sure you have rolling backups of your GitLab installation and repositories in place. I looked into piecing together my own backup script until I realized GitLab already has a rake backup task available that stores everything into a single tar file. Perfect. I can then just upload that to S3 for safekeeping. To do that, we'll be using s3cmd to handle the uploads.
sudo apt-get install s3cmd
Configure it with...
Then, create a script in your git user's home directory called backup.sh containing...
cd /home/git/gitlab && PATH=/usr/local/bin:/usr/bin:/bin bundle exec rake gitlab:backup:create RAILS_ENV=production
s3cmd put tmp/backups/`ls tmp/backups/ | grep -i -E '\.tar$' | tail -1` s3://bucket-name/git/
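The backtick expression in the s3cmd line is what picks out the newest backup. Pulled out into a function (the name is mine, purely for illustration), it looks like the sketch below. It assumes the backup file names start with a timestamp of equal length, so that the last name in sort order is also the most recent one.

```shell
#!/bin/sh
# latest_backup DIR
# Print the name of the last .tar file in DIR, in ls sort order.
latest_backup() {
    # List the directory, keep names ending in .tar, take the final entry.
    ls "$1" | grep -i -E '\.tar$' | tail -1
}

# Usage, from inside /home/git/gitlab:
#   s3cmd put "tmp/backups/$(latest_backup tmp/backups)" s3://bucket-name/git/
```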
Set up cron to run that script once a day and you're good.
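For the cron part, something along these lines should do. The 3 a.m. schedule and the /home/git/backup.sh path are assumptions on my part (adjust both to taste), and the helper name is made up. It reads the current crontab on stdin and adds the backup line only if it isn't already there, so running it twice is harmless.

```shell
#!/bin/sh
# Assumed schedule and path: every day at 03:00, script at /home/git/backup.sh.
CRON_LINE='0 3 * * * /home/git/backup.sh'

# Read an existing crontab on stdin and print it back out, appending
# CRON_LINE unless an identical line is already present.
add_backup_job() {
    awk -v line="$CRON_LINE" '
        { print }
        $0 == line { found = 1 }
        END { if (!found) print line }
    '
}

# Usage (run as the git user):
#   crontab -l 2>/dev/null | add_backup_job | crontab -
```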