In a recent post, I explained how easy it is to transfer data from Amazon's S3 storage to Google Cloud Storage (GCS). I mean, this is cloud computing, so it should be simple, right? Well, in case my readers ran into problems (like I did), I didn't want to skip over the fact that issues can arise. And big issues they were...
First off, my transfer did not work as I described. In fact, I started writing the blog entry while the transfer was happening, and it was still transferring when I was done. It only appeared as if it were going to be successful. However, when all was said and done, I ended up with errors. To keep a long story short - and to not bore you with the research I had to do - I'll just link to my Stack Overflow post, and let you read it if you're interested.
At the end of the day, I got up to speed with gsutil, a very handy command line utility for talking with Google Storage from the local computer (remember - I'm running Xubuntu, but it should work fine for you Windows folks too).
Some background, though: when I started using S3, my intention was to archive to Glacier to save money, and then only restore to S3 if a disaster ever happened. I would just sync my MacBook to the cloud, and then it would automagically archive to Amazon's cheap, long-term storage. Something went awry in the mix, though, and my files were neither in Glacier nor classified as Standard storage in S3. The storage class, as viewed from S3, was Glacier - but I could not see the files in the Glacier service itself. I started down the path of restoring the files by meticulously right-clicking and restoring from Glacier within the S3 web console, but then I found out that the restored copies would only be available for 3-5 days before going back to Glacier status. On top of that, I found out that the pricing would quickly escalate. So my dream of restoring from Glacier to S3 in bulk and having my Aperture library back up and running within 3 hours should a catastrophe happen was immediately squashed. I guess that's why they say that you should test your backup plan before putting all of your eggs in one basket, right?
At any rate, I got to learn some new command line interface (CLI) options for Linux, which always gets me going. Again, I'll save you from the boredom of explaining all of my research, but suffice it to say that the following command is what I needed to get my local files (from a USB thumb drive) to GCS:
gsutil -m rsync -r -d /media/benmctee/27F4-D3DE/ApertureLibrary.aplibrary gs://photo-archive-benmctee/ApertureLibrary.aplibrary
Let me explain what is going on here:
gsutil - that's the Google Storage Utility, which is part of the Google Cloud SDK. It's very useful, and more intuitive than one would think.
-m - Run operations in parallel (multi-threaded/multi-processing). This allows multiple operations to go on at once when there are a lot of files to be processed. At over 220,000 files in my library, this really sped things up.
rsync - this shouldn't be new to any CLI users out there, since it is named after the classic Linux file mirroring tool. Here it is gsutil's own subcommand, though, so it does not depend on the Linux rsync being installed. It makes the destination match the source, copying over whatever is missing or has changed.
-r - Recursive. This option allows us to dive deep into all the folders.
-d - Delete files in the destination that do not exist in the source (use with caution!)
/media/benmctee.... - This is my local directory on my thumb drive, the source. Remember, the source always comes first and the destination second; for a backup upload, that means local first, then remote. Swap them while using -d, and gsutil will delete local files that are not in the bucket!
gs://photo-archive-benmctee... - This is my GCS bucket, the "remote" location (the destination).
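Because -d can delete things, it may be worth previewing the sync first. gsutil rsync has a -n flag that does a dry run: it reports what would be copied or deleted without changing anything. Here is a minimal sketch of how I might wrap that up. The function names (is_upload_direction, sync_library) and the GSUTIL variable are my own inventions for illustration, not part of gsutil.

```shell
# Sketch only: the function names and the GSUTIL variable are hypothetical helpers.
# GSUTIL lets you point at a different binary; it defaults to plain gsutil.
GSUTIL="${GSUTIL:-gsutil}"

# Succeed only when the source is local and the destination is a gs:// bucket,
# which is the direction a backup upload should run in.
is_upload_direction() {
  case "$1" in
    gs://*) return 1 ;;
  esac
  case "$2" in
    gs://*) return 0 ;;
    *) return 1 ;;
  esac
}

# $1 = mode ("preview" or anything else for a real run)
# $2 = local source directory, $3 = gs:// destination
sync_library() {
  if ! is_upload_direction "$2" "$3"; then
    echo "refusing: expected a local source and a gs:// destination" >&2
    return 1
  fi
  if [ "$1" = "preview" ]; then
    # -n is rsync's dry-run flag: list the planned actions, change nothing.
    "$GSUTIL" -m rsync -r -d -n "$2" "$3"
  else
    "$GSUTIL" -m rsync -r -d "$2" "$3"
  fi
}
```

To use it, run sync_library preview with your own source directory and bucket path first, read through what it says it would delete, and only then run sync_library real with the same two arguments.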
If you want more details on gsutil rsync, check it out on Google's website.
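Since my first attempt only looked successful, a rough sanity check after the sync is to compare how many files are on each side. The sketch below is one way to do that, under a few assumptions: the helper names are made up for this post, the ** wildcard is used so that gsutil ls lists every object under the prefix, and file names are assumed not to contain newlines. Matching counts do not prove the contents are identical, so treat it as a quick smoke test.

```shell
# Sketch only: helper names and the GSUTIL variable are hypothetical.
GSUTIL="${GSUTIL:-gsutil}"

# Count regular files under a local directory.
count_local_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Count objects under a gs:// prefix; ** matches objects at any depth.
count_remote_objects() {
  "$GSUTIL" ls "$1/**" | wc -l | tr -d ' '
}

# $1 = local directory, $2 = gs:// prefix it was synced to
compare_counts() {
  local_count=$(count_local_files "$1")
  remote_count=$(count_remote_objects "$2")
  if [ "$local_count" -eq "$remote_count" ]; then
    echo "OK: $local_count files on both sides"
  else
    echo "MISMATCH: $local_count local vs $remote_count remote"
    return 1
  fi
}
```

Call compare_counts with the same source directory and bucket path you gave to rsync. A mismatch is a hint to rerun the sync and read the error output more carefully.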
This time, I did wait for a successful transfer before making this blog post. If you have never used the Glacier option before, then my first post will hopefully work for you, because that approach is a lot easier and more straightforward. If it doesn't, this should get you going. To install the Google Cloud SDK, which puts gsutil on your computer, head on over to the Google Cloud Platform website. Happy clouding!