Arq

I've had a few emails asking me to explain in a little more depth how I'm using Arq to back up a portion of my files to Amazon Glacier, as explained briefly in The Backup Strategy.

The files I've got backed up to Amazon are, in my opinion, the safest, most secure, and most important files in my entire backup system. Should a major disaster strike my machine, my house, my city, or even my country - these files will remain safe, stored remotely on Amazon's servers in multiple facilities and on multiple devices within each facility.

This gives me peace of mind knowing that any kind of theft, fire, or natural disaster could strike and my most important and precious data would be available for download to a new machine.

Amazon S3 vs Amazon Glacier

Amazon S3 was launched on March 14, 2006. A little over six years later, on August 21, 2012 - Amazon Glacier launched. This gave customers a choice for their backup storage, depending on how they use their data and how often they need to access it. Both services redundantly store data in multiple facilities and on multiple devices within each facility.

In a nutshell - S3 is excellent for data which is accessed frequently. Glacier is a more long-term, hands-off solution. Naturally, Glacier is the cheaper option of the two when it comes to storage costs.

Amazon S3 is designed to provide 99.999999999% durability and 99.99% availability of objects over a given year. According to Amazon, S3's design aims to provide scalability, high availability, and low latency at commodity costs. It's designed to withstand the concurrent loss of 2 data centres without losing your data.
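To put that durability figure in perspective, here's a rough back-of-the-envelope sketch. It assumes each object is lost independently at the quoted annual rate, which is my simplification rather than anything Amazon guarantees, and the 10,000-file library is a made-up example.

```python
from fractions import Fraction

# Durability figure quoted above: 99.999999999% per object per year.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost in one year.

    Assumes every object is lost independently at the same rate,
    which is a simplification for illustration only.
    """
    return object_count * (1 - DURABILITY)


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years before a single object is expected to be lost."""
    return 1 / expected_annual_losses(object_count)


if __name__ == "__main__":
    files = 10_000  # hypothetical library size
    print(float(expected_annual_losses(files)))
    print(int(years_per_expected_loss(files)))
```

Under those assumptions, a library of 10,000 files would expect to lose a single file roughly once every 10 million years.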

Amazon Glacier is an extremely low-cost, pay-as-you-go storage service that can cost as little as $0.01 per gigabyte per month. It offers the same 99.999999999% durability. In order to keep costs low, Amazon Glacier is optimised for data that is infrequently accessed. Once you initiate a retrieval from Glacier, it typically takes 3-5 hours before the data is available, and Amazon charges for retrieving large amounts of data from Glacier.
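As a quick sanity check on what that rate means in practice, here's a small sketch that estimates the storage bill. It only uses the $0.01 per gigabyte figure quoted above and deliberately ignores retrieval and request charges; the 250 GB backup size is a hypothetical example, and real pricing varies by region and over time.

```python
from decimal import Decimal

# Rate quoted above. Treat it as an illustrative figure, not current pricing.
PRICE_PER_GB_MONTH = Decimal("0.01")
CENT = Decimal("0.01")


def monthly_storage_cost(size_gb) -> Decimal:
    """Storage-only cost per month in dollars, rounded to the cent."""
    return (Decimal(str(size_gb)) * PRICE_PER_GB_MONTH).quantize(CENT)


def yearly_storage_cost(size_gb) -> Decimal:
    """Storage-only cost per year in dollars (12 x the unrounded monthly rate)."""
    return (Decimal(str(size_gb)) * PRICE_PER_GB_MONTH * 12).quantize(CENT)


if __name__ == "__main__":
    backup_size_gb = 250  # hypothetical backup size
    print(monthly_storage_cost(backup_size_gb))
    print(yearly_storage_cost(backup_size_gb))
```

At that rate, 250 GB works out to $2.50 a month, or $30.00 a year, before any retrieval fees.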

How Arq Works

Arq is a fantastic menu-bar application made by Haystack Software which acts as a window to your Amazon S3 / Glacier account and provides an easy-to-use interface to manage the folders and files that you want to back up to the cloud.

Arq essentially interlinks with your own Amazon Web Services storage account. You're able to encrypt your backups with a password, if you're that way inclined. Encryption is performed locally on your machine rather than using Amazon's server-side encryption.

Arq stores backups in an open, documented format. Haystack Software provide an open-source command-line utility called arq_restore that's hosted at GitHub. This gives me peace of mind knowing that if Haystack Software stopped developing Arq, I'd still have access to my data.

If you're replacing your machine with something new, you simply install Arq on the new Mac and you can adopt an 'old' backup set. This means when upgrading your machine, you don't have to go through the initial backup process again. Arq will continue to back up your files periodically at your set time interval, just as it did on the old machine.

There's nothing to think about. The only preferences you're left to play with are the backup schedule (hourly, daily at a pre-defined time, or manually) and a budget for S3 storage, which pre-defines the maximum amount you want to spend monthly. Arq will automatically purge old backups (think Time Machine) to keep you within this set budget.
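To illustrate the general idea of budget-based purging, here's a minimal sketch. This is not Arq's actual algorithm, which I haven't seen; the `Snapshot` class, the `purge_to_budget` function, and the price used in the example are all hypothetical, and each snapshot is treated as if it occupied its own separate space.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Snapshot:
    """A hypothetical stand-in for one stored backup."""
    taken_on: date
    size_gb: float


def purge_to_budget(snapshots, monthly_budget, price_per_gb_month):
    """Return the snapshots to keep, dropping the oldest first until under budget."""
    max_gb = monthly_budget / price_per_gb_month
    kept = sorted(snapshots, key=lambda s: s.taken_on)  # oldest first
    total_gb = sum(s.size_gb for s in kept)
    # Always keep the most recent snapshot, even if it alone exceeds the budget.
    while len(kept) > 1 and total_gb > max_gb:
        oldest = kept.pop(0)
        total_gb -= oldest.size_gb
    return kept


if __name__ == "__main__":
    history = [
        Snapshot(date(2014, 1, 1), 40),
        Snapshot(date(2014, 2, 1), 30),
        Snapshot(date(2014, 3, 1), 20),
    ]
    # Hypothetical numbers: a $6 monthly budget at $0.10 per GB per month.
    for snap in purge_to_budget(history, monthly_budget=6, price_per_gb_month=0.10):
        print(snap.taken_on, snap.size_gb)
```

With those made-up numbers the budget covers about 60 GB, so the oldest 40 GB snapshot is dropped and the two newer ones are kept.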

Once Arq is set up, you can completely forget it's there. It'll periodically do its thing in the background (you can even disable the menubar icon for complete transparency).

How I Use Arq

Arq is essentially an easy interface to my Amazon Web Services account. Because it's designed for the sole task of backing up, it's much cleaner and more streamlined than accessing the same account through an FTP application like Panic's Transmit.

At this stage, I don't back up all of my local files to Amazon Glacier. This is due to the current DSL connection I'm working with at home. This will change late-2014 once my street is connected to the new fibre network. Then - I'll be backing up 100% of my local files to Amazon for redundancy.

Currently, Arq is set to kick in every hour and back up these files:

  • Dropbox Directory
  • Aperture Library
  • iPhoto Library (Masters)
  • Documents Folder

Arq is smart and works out for itself when drives are connected/unplugged. For example, the Aperture library listed above is stored on my WD Passport drive which is only plugged in from time to time. If Arq picks up that the drive is connected, the backup is updated. If it's unplugged, Arq will simply skip over this portion of the backup without annoying me with any notifications.
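The skip-if-unplugged behaviour can be sketched in a few lines. This is only my guess at the general approach, not how Arq actually does it: the `folders_ready_for_backup` function and the example paths are hypothetical, and the check is simply whether each folder is currently reachable on disk.

```python
from pathlib import Path


def folders_ready_for_backup(folders):
    """Split folders into (ready, skipped) based on whether they exist right now.

    A folder on an unplugged external drive won't exist, so it lands
    in the skipped list instead of raising an error.
    """
    ready, skipped = [], []
    for folder in folders:
        if Path(folder).is_dir():
            ready.append(folder)
        else:
            skipped.append(folder)
    return ready, skipped


if __name__ == "__main__":
    # Hypothetical paths for illustration only.
    backup_set = [
        str(Path.home()),
        "/Volumes/WD Passport/Aperture Library.aplibrary",
    ]
    ready, skipped = folders_ready_for_backup(backup_set)
    print("Backing up:", ready)
    print("Quietly skipping:", skipped)
```

Anything in the skipped list is simply picked up again on the next run, once the drive is back.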

The data stored on Amazon's servers is my absolute worst-case backup. I'd only need to spend the time and a few dollars to access the data if my MacBook Pro was stolen or destroyed, along with my local Time Machine backup and my two external off-site drives.

Arq can be purchased directly from Haystack Software for $39.99. This includes a license to run Arq on one machine. If you want to use it across two machines (desktop + laptop), there's a discounted 2-license bundle for $69.99.