SaaS prepper: backing up my 12,000 Flickr photo collection with a Raspberry Pi

Gareth Cronin
7 min read · Feb 13, 2022

My wife and I have been using Flickr as our main photo album and photo storage site since 2007 and we have nearly 12,000 photos up there. Our phones back up to Google Photos, but we also have a Panasonic Lumix mirrorless camera, and Flickr is where we actually curate the photos from all of our devices. We create albums for special events, and one each season for day-to-day snaps.

Flickr was acquired by Yahoo long ago and hasn’t exactly seen heavy investment since, but it is still a great site for those who care more about curating and exhibiting quality photos than about social media. Its video handling is probably the weakest point, but it’s adequate. Flickr also has a really interesting back-story for those interested in startup history: it grew out of the first of Stewart Butterfield’s gaming companies to pivot into something else, the second being Slack!

Every few months I see a scary story questioning Flickr’s future (e.g. this Techspot one from a few months ago) and pondering its demise. When this happens, I wonder whether I should attempt a migration to Google Photos and carry on there, but the thing is we really like Flickr, and it’s only the worst-case scenario that I want to prepare for. It is possible that one of these days Flickr’s owner (SmugMug, since 2018) will announce that it is shutting the platform down and give us a few months to find a new home for our photos, or perhaps it will experience an infrastructure disaster that takes a long time to recover from. In the event of the Flickrpocalypse, I decided I wanted a local copy of our photos, one that I can hold in my hand, not a copy on another cloud service. I figure this makes me a SaaS prepper.

The requirements

  • Back up not just the photos and videos themselves (with their EXIF tags) but also their arrangement into albums, the way they are on Flickr
  • Run unattended and automatically back up any new photos or changes to albums
  • Recover from a failure or restart without repeating a whole lot of backing up unnecessarily (I knew from experience with desktop-based Flickr backup tools that it will run for a long time for 12K photos)
  • It shouldn’t cost me anything on an ongoing basis
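One possible way to meet the third requirement, sketched under assumptions rather than taken from the repo: give every photo a predictable path under the backup root, and skip any file that is already there. The function names and the root/album/file layout are hypothetical.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical layout: <backupRoot>/<albumName>/<fileName>
function targetPath(backupRoot, albumName, fileName) {
  return path.join(backupRoot, albumName, fileName);
}

// A file needs downloading when it is missing or empty
// (an empty file suggests an earlier run was interrupted).
function needsDownload(filePath) {
  try {
    return fs.statSync(filePath).size === 0;
  } catch (err) {
    if (err.code === 'ENOENT') return true; // not there yet
    throw err; // anything else (permissions, I/O) should surface
  }
}
```

A restart then costs one stat call per photo that is already on disk, instead of a fresh download.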

The solution

My solution is a Raspberry Pi running Raspberry Pi OS (formerly Raspbian) with an external USB hard drive, a single-file Node.js script that calls the Flickr API, and a cron job to run the script once a day.

The GitHub repo is here.

The hardware

I have a couple of spare 1TB external USB hard drives. I buy one for Time Machine each time I replace a Mac, and I have unused devices from laptops that were recycled a while ago.

I have an idle Raspberry Pi model 3 lying around from a failed workaround for a Sonos mesh wi-fi problem that I wrote about last year. I did a quick Google search to make sure that it could run Node.js, because that’s my happy place for knocking up quick solutions. I also checked that it would talk to my external USB drive.

Preparing the Raspberry Pi

The Pi was already set up with Raspberry Pi OS from my last project, so I didn’t need to do anything other than enabling SSH to make life a bit easier while I wrote the Node.js app.

Preparing the drive

It’s been a few years since I last partitioned and formatted a physical drive, not to mention a couple of years since I did so with a cloud volume. All this serverless is making me rusty!

I decided to use FAT32 as the filesystem. I needed something that Pi OS can write to reliably, and something that would be easy to read from on a regular laptop should the Flickrpocalypse arrive.

Pi OS comes with parted, the de rigueur tool for partition changes on Linux these days. This tutorial covers the basics. I took these steps:

  • Ran sudo fdisk -l to find the device path; for this drive it was /dev/sda
  • The drive hadn’t auto-mounted when I plugged it into the running Pi, but if it had, I would have needed to run mount to find any mount points and unmount them with umount
  • Ran sudo parted, then entered select /dev/sda to work with the drive
  • Deleted the existing partitions on my external drive with a couple of rm commands
  • Ran mklabel to create a new partition table
  • Ran mkpart in “interactive mode”, i.e. without any arguments. It has a really nice guided setup that asks questions about file system type and the percentage of disk you want to use. I went with a start of 0% and an end of 100% to create one big partition for the backups
  • Ran print to check all was well

The next step was formatting a filesystem in the new partition, which was just a matter of running sudo mkfs.fat /dev/sda1

I like to mount a new volume manually before I add it to fstab, and the /media directory seems to be the place to do it these days. FAT volumes don’t obey Unix-style permission modes, so this combination of options was necessary to give write access to the Pi OS default “pi” user, who has a uid of 1000:

sudo mount -o rw,user,uid=1000,dmask=007,fmask=117 -t vfat /dev/sda1 /media/backup
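The dmask and fmask values in that command are easier to read once the arithmetic is spelled out: the vfat driver starts from full permissions (777) and clears every bit set in the mask, using dmask for directories and fmask for files. A small sketch of that calculation (the helper name is made up for illustration):

```javascript
// Start from rwxrwxrwx (0o777) and clear every bit that is set in the mask.
function effectiveMode(mask) {
  return (0o777 & ~mask).toString(8).padStart(3, '0');
}

console.log('directories:', effectiveMode(0o007)); // dmask=007
console.log('files:      ', effectiveMode(0o117)); // fmask=117
```

So directories come out as 770 and files as 660: the owner (uid 1000, the “pi” user) can read and write everything, files are never marked executable, and other users get nothing.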

For the /etc/fstab entry to ensure my drive would mount on boot, I used the modern approach of identifying the partition by UUID rather than by device path. The command sudo blkid will show the UUID and PARTUUID for each partition. Once I had that, I added the fstab version of the manual mount:

PARTUUID=f3c40bd8-2064-4bbf-88a2-f7c553cdbe8b /media/backup vfat rw,user,uid=1000,dmask=007,fmask=117 0 1

When I mess with fstab I like to then unmount my manual mount (in this case umount /media/backup) and then run mount -a. If there are any errors in fstab, you’ll get a useful error message and can fix it. That is much better than only discovering your fstab is corrupt on the next reboot!

Installing Node.js

I found a how-to for Node installation and followed it. Essentially it’s just a matter of adding the right Node repository to APT and then running sudo apt install -y nodejs

Consuming the Flickr API

Given Flickr’s vintage, I expected to have some difficulty with the API, but I was pleasantly surprised. It’s consistently structured, well documented, and has a playground online to test with. The “flickr-sdk” NPM module is comprehensive and works out of the box. It also returns Promises, so it works nicely with async/await. Requesting a Flickr API key is instant self-service.

For backup purposes, I only needed a handful of API calls:

  • fetch the list of albums with flickr.photosets.getList passing our Flickr NSID (see below)
  • fetch the list of photos in each album with flickr.photosets.getPhotos passing NSID and album ID
  • fetch the info about the photo with flickr.photos.getInfo: this lets us get the title (title._content), figure out whether it’s actually a video (media property), and find out what the file extension is (originalformat property)
  • fetch the available sizes of the photo or video with flickr.photos.getSizes
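To make the getInfo step concrete, here is a minimal sketch of pulling those three properties out of a photo object and turning them into a file name. It is an illustration, not the code from the repo: the helper names are hypothetical, and falling back to the photo id when the title is empty is an assumption of this sketch.

```javascript
// `photo` is the photo object from a flickr.photos.getInfo response.
function describePhoto(photo) {
  return {
    id: photo.id,
    title: photo.title._content,       // display title as shown on Flickr
    isVideo: photo.media === 'video',  // media is 'photo' or 'video'
    extension: photo.originalformat,   // e.g. 'jpg'
  };
}

// Build a file name; fall back to the id when the title is blank
// (that fallback is an assumption of this sketch).
function fileNameFor(photo) {
  const { id, title, extension } = describePhoto(photo);
  const base = title && title.trim() ? title.trim() : String(id);
  return `${base}.${extension}`;
}
```

For a video, the extension of the file that is eventually downloaded may differ from originalformat, so it is worth checking against the URL chosen from getSizes.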

The sizes are quite a list. Photos have a size with the label “Original” which is the original high quality photo. For videos though, the “Original” labelled size is just a reference still. I found I needed to iterate through the sizes from the best quality to the lowest to find one to download. The size then has a “source” property with the URL direct to the file.
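The selection logic can be sketched roughly as follows. The assumptions are that each entry in the sizes list carries label, source, media, width and height fields, and that the video entries are marked with media set to 'video'. Ranking by pixel area is one way of walking from best to worst quality, not necessarily what the repo does.

```javascript
// Pixel area of a size entry; dimensions may arrive as strings or numbers.
const area = (s) => (Number(s.width) || 0) * (Number(s.height) || 0);

// Pick the entry to download from a flickr.photos.getSizes list.
function pickDownloadSize(sizes, isVideo) {
  if (!isVideo) {
    // Photos: the 'Original' entry is the full-quality file.
    return sizes.find((s) => s.label === 'Original') || null;
  }
  // Videos: 'Original' is only a still, so rank the video entries instead.
  const videos = sizes
    .filter((s) => s.media === 'video')
    .sort((a, b) => area(b) - area(a));
  return videos[0] || null;
}
```

The caller would then download the source URL of the returned entry, or log the photo as skipped when the result is null.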

The other catch is the NSID. This is an internal Flickr ID for a user. You can find your own NSID by logging in and then visiting the API explorer method for getting user info.

Downloading from a URL

I started downloading using vanilla Node.js with the https module, but quickly discovered complexity with following redirects on videos. I installed one of my favourite npm packages, the HTTP client SuperAgent, and used its pipe() method to stream the response into a writable stream. SuperAgent magically takes care of all the details.
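For a sense of what SuperAgent is saving you from, this is roughly what redirect handling looks like with only Node’s built-in modules. It is a sketch of the vanilla route, not the code in the repo, and the function names and the limit of five redirects are arbitrary choices.

```javascript
const fs = require('fs');
const http = require('http');
const https = require('https');

// A Location header may be relative, so resolve it against the current URL.
function resolveRedirect(currentUrl, location) {
  return new URL(location, currentUrl).toString();
}

// Download `url` to `destPath`, following up to `maxRedirects` redirects.
function download(url, destPath, maxRedirects = 5) {
  return new Promise((resolve, reject) => {
    const client = url.startsWith('https:') ? https : http;
    client.get(url, (res) => {
      const { statusCode, headers } = res;
      if (statusCode >= 300 && statusCode < 400 && headers.location) {
        res.resume(); // discard the redirect body
        if (maxRedirects <= 0) {
          reject(new Error(`Too many redirects for ${url}`));
          return;
        }
        const next = resolveRedirect(url, headers.location);
        download(next, destPath, maxRedirects - 1).then(resolve, reject);
        return;
      }
      if (statusCode !== 200) {
        res.resume();
        reject(new Error(`Unexpected status ${statusCode} for ${url}`));
        return;
      }
      const file = fs.createWriteStream(destPath);
      res.on('error', reject);
      file.on('error', reject);
      file.on('finish', () => resolve(destPath));
      res.pipe(file);
    }).on('error', reject);
  });
}
```

Calling download(size.source, '/media/backup/Album/photo.jpg') returns a Promise, so it slots into the same async/await flow as the API calls.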

The catch with FAT32

One wrinkle that I haven’t been able to overcome is that my FAT volume, as mounted, won’t accept Unicode characters in file names. My script uses the photos’ names on Flickr, but some of my names have characters with accents, including macrons (e.g. ā), in them, along with a few emoji. Trying to write a file with those characters on the FAT volume results in an error. I considered creating separate metadata files to include the names alongside the images and videos, but that is really annoying, so I sanitised the names by stripping out accents and emoji, and substituting double vowels for letters with macrons. I considered switching to NTFS, but reading about the Linux support for writing to NTFS made me worry it wouldn’t be robust enough.
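The sanitising step might look something like the sketch below. The exact rules in the repo may differ. The function name is hypothetical, and replacing the characters FAT reserves (such as : and ?) with underscores is an extra assumption on top of what is described above.

```javascript
// Macron vowels become doubled vowels, e.g. ā -> aa.
const MACRON_VOWELS = {
  'ā': 'aa', 'ē': 'ee', 'ī': 'ii', 'ō': 'oo', 'ū': 'uu',
  'Ā': 'Aa', 'Ē': 'Ee', 'Ī': 'Ii', 'Ō': 'Oo', 'Ū': 'Uu',
};

function sanitiseName(name) {
  return name
    .normalize('NFC')                                   // compose so macron vowels are single characters
    .replace(/[āēīōūĀĒĪŌŪ]/g, (ch) => MACRON_VOWELS[ch])
    .normalize('NFD')                                   // split other accented letters into base + mark
    .replace(/[\u0300-\u036f]/g, '')                    // drop the combining marks
    .replace(/[^\x20-\x7E]/g, '')                       // drop anything else non-ASCII, emoji included
    .replace(/[\\/:*?"<>|]/g, '_')                      // characters FAT reserves
    .replace(/\s+/g, ' ')
    .trim();
}
```

A name made up entirely of emoji comes back as an empty string, so the caller needs a fallback such as the photo id.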

Bundling for the Pi

My favourite discovery of this project is esbuild. I usually deploy serverless applications using cloud tools, or web apps bundled with create-react-app, so I wasn’t sure what approach to take for a simple server-side app. I didn’t want to deploy a full node_modules directory: I just wanted something that could roll my dependencies up into a single file, and that’s exactly what esbuild does. Once I’d installed the npm package, this line was all that was required to create a file called server.js that can be copied to the Pi and run standalone:

./node_modules/.bin/esbuild index.js --platform=node --bundle --outfile=server.js

Next steps

The tool will retry a whole album, but it won’t retry a single photo failure at the moment, preferring to continue downloading as many others in an album as it can. I need to add a mechanism to track individual photos that haven’t downloaded yet and retry them on the next run.
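One simple shape that mechanism could take, purely as a sketch of an idea that isn’t built yet: keep the ids of failed photos in a small JSON file next to the backup, and read it back at the start of the next run. The file format and function names here are hypothetical.

```javascript
const fs = require('fs');

// Load the ids that failed on a previous run (empty set on the first run).
function loadPending(manifestPath) {
  if (!fs.existsSync(manifestPath)) return new Set();
  return new Set(JSON.parse(fs.readFileSync(manifestPath, 'utf8')));
}

// Persist the set as a JSON array of ids.
function savePending(manifestPath, pending) {
  fs.writeFileSync(manifestPath, JSON.stringify([...pending], null, 2));
}

// Record one download attempt: failures are added, successes are cleared.
function recordResult(pending, photoId, succeeded) {
  if (succeeded) {
    pending.delete(photoId);
  } else {
    pending.add(photoId);
  }
  return pending;
}
```

Each run would start by retrying everything in the set, then carry on with the normal album walk and save the set again at the end.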

I will also add support for orphan photos that don’t appear in an album yet and download these to their own directory.

Gareth Cronin

Technology leader in Auckland, New Zealand: start-up founder, father of two, maker of t-shirts and small software products