SaaS prepper part two: backing up Evernote with a Raspberry Pi
In an earlier story, I talked about my fear of the collapse of SaaS products that I have come to rely on, and explained my offline backup solution for Flickr. The TL;DR is that I have a Raspberry Pi with an external hard drive connected. A couple of times a week, a Cron job kicks off a small Node.js application that interrogates the Flickr API and takes a copy of any new albums or changes to albums. If the end of days comes for Flickr, I’ll still have all my photos neatly organised into albums ready to be migrated to some other platform.
Along with Flickr, Evernote plays a big role in my life and work. I’ve been paying for a premium subscription for over ten years, mainly because it lets me store reasonably large files. Here are just some of the ways I use Evernote regularly:
- Journalling coding and other technical projects, including screenshots, URLs, snippets of code
- Noting big purchases, including serial numbers, and a photo of the invoice and/or receipt
- Evaluating big purchases, making notes comparing different products and services, contact details, quotes
- Storing PDFs of important documents, e.g. contracts and amendments
- Noting how I’ve configured devices and appliances
- Storing recipes (sometimes with photos of printed pages with annotations)
- Keeping long-running to-do lists
- Planning travel (including capturing screenshots of confirmations, addresses, PDF e-receipts)
Like Flickr, Evernote has had some wobbles as a business. Here is some example media coverage from recent years:
- NY Times 2019 “A Unicorn Lost in the Valley, Evernote Blows Up the ‘Fail Fast’ Gospel”
- Profitwell.com 2019 “The Rise, Fall, and Future of Evernote”
- TechCrunch 2018 “Evernote just slashed 54 jobs, or 15 percent of its workforce”
There’s been plenty of improvement to the product in the last few years, and the core value proposition that drew me in the first place still holds up:
- It’s cross-platform with timely synchronisation — it stays in sync across Web, native Mac, and Android
- It works well offline
- It accepts all kinds of attachments inline, including PDF, with embedded viewing
- It supports all kinds of useful formatting: plain text for code and config, headers, tables, bullet lists
Hopefully that all bodes well for the future, but I’m not taking any chances.
The requirements
The requirements are pretty much the same as for my Flickr backup:
- Back up not just the body of the notes, but also any attachments: e.g. images, PDFs, MS docs, CSVs
- Run unattended and automatically back up any new notes or changes to notes
- Recover from a failure or restart without repeating a whole lot of backing up unnecessarily
- It shouldn’t cost me anything on an ongoing basis
The solution
My solution is on the same Raspberry Pi as the Flickr back-up. It’s running Raspberry Pi OS (formerly Raspbian) Linux with an external USB hard drive, and a single Node.js script calling the Evernote API. A Cron job runs the script once a day.
The hardware and OS
I covered the hardware and OS set-up in the Flickr story, so I won’t repeat it here. I’m on the same FAT filesystem for the reasons I explained in that story too. FAT is about as portable as it gets should the dark days of the SaaSpocalypse™️ arrive.
Consuming the Evernote API
Like Flickr, Evernote has a very usable API and a decent developer experience. Some of the getting-started docs are a little rough but they are accurate. The docs site has quick start guides and a full API reference. The docs recommend developing against the Evernote API sandbox before moving to production. The sandbox is a full version of the main application, and it was just a matter of registering a new account at sandbox.evernote.com. This also turned out to be really convenient, because I could create notebooks and notes to represent my edge cases for testing, without my production data getting in the way.
I thought I might have reached the day where I have to come up with a solution for headless applications and OAuth2, but fortune smiled on me again: Evernote has a token-based access system alongside its OAuth2 authentication. Evernote call the approach “developer tokens”. A self-service tool generates a persistent secret that can then be used to authenticate with the API. It’s a single string of about 100 characters that starts with s=. The self-service console also supports revocation.
Some time in the last few years, Evernote have disabled the self-service developer token tool for the production site as a security precaution. It’s not clear on the page hosting the tool how to proceed, but I found a community message that asked anyone wanting to use it to raise a support ticket. When I was ready to use production, I raised a ticket and a representative responded very quickly to say that it had been enabled for me. Once that was done, it worked right away with no other issues.
Evernote have an official JavaScript SDK for NodeJS. The README doesn’t give a developer token authentication example, but I took a punt and supplied the developer token rather than an issued OAuth token to the “token” field in the SDK client config, and it worked:
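A minimal sketch of that kind of client set-up follows. The environment variable names EVERNOTE_DEV_TOKEN and EVERNOTE_SANDBOX and the helper names clientOptions() and createNoteStore() are assumptions made for illustration; only the “token” and “sandbox” client options and getNoteStore() come from the SDK.

```javascript
// Sketch only: EVERNOTE_DEV_TOKEN and EVERNOTE_SANDBOX are hypothetical
// environment variable names, not something the Evernote SDK defines.
function clientOptions(env = process.env) {
  const token = env.EVERNOTE_DEV_TOKEN;
  if (!token) {
    throw new Error('Set EVERNOTE_DEV_TOKEN to your developer token');
  }
  return {
    token, // the developer token, passed where an OAuth access token would go
    sandbox: env.EVERNOTE_SANDBOX !== 'false', // default to the sandbox service
  };
}

function createNoteStore(env = process.env) {
  // Required lazily so clientOptions() can be exercised without the SDK installed.
  const Evernote = require('evernote');
  const client = new Evernote.Client(clientOptions(env));
  return client.getNoteStore();
}
```

Keeping the token in the environment rather than in the script means the same bundle can be pointed at the sandbox or at production without a rebuild.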
Crawling the notes
Enumerating the notebooks and notes was very straightforward. The client configured in the example above can return a “note store” with getNoteStore(). Calling listNotebooks() on the store returns an array of data structures with each notebook identified by name and GUID.
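As a rough illustration, the notebook list can be turned into a GUID-to-name lookup. The helper name notebookNamesByGuid() is made up for this sketch, and it assumes the SDK’s promise-based listNotebooks() with guid and name fields on each notebook.

```javascript
// Build a GUID -> name lookup from whatever listNotebooks() resolves to.
// notebookNamesByGuid is a hypothetical helper, not part of the SDK.
async function notebookNamesByGuid(noteStore) {
  const notebooks = await noteStore.listNotebooks();
  const names = new Map();
  for (const notebook of notebooks) {
    names.set(notebook.guid, notebook.name);
  }
  return names;
}
```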
The note store has a findNotesMetadata() method with a hard limit of 250 notes per call. It includes an offset parameter, so I set up my script to page through the metadata, 250 notes at a time, until I reached the last note. The note metadata has the GUID and name of the note, so fetching the body is just a matter of calling getNote() with the GUID and reading the content property. The content XML conforms to Evernote’s XML schema. Most of it is styled XHTML, but it has a few extensions to support features like attached media.
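The paging loop might look something like the sketch below. The function name listAllNoteMetadata() is hypothetical. It assumes the caller builds the filter and result spec (for example with the SDK’s NoteFilter and NotesMetadataResultSpec classes, with titles requested) and that each page reports startIndex, totalNotes and notes.

```javascript
// Page through note metadata until every note has been seen.
// listAllNoteMetadata is an illustrative helper, not an SDK function.
async function listAllNoteMetadata(noteStore, filter, spec, pageSize = 250) {
  const all = [];
  let offset = 0;
  while (true) {
    const page = await noteStore.findNotesMetadata(filter, offset, pageSize, spec);
    all.push(...page.notes);
    // Advance by what was actually returned, in case the service sends fewer
    // notes than requested.
    offset = page.startIndex + page.notes.length;
    if (page.notes.length === 0 || offset >= page.totalNotes) {
      break;
    }
  }
  return all;
}
```

Stopping on an empty page as well as on totalNotes guards against looping forever if notes are deleted while the crawl is running.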
Each note has a resources property with an array of media referenced in the content. The getResource() method on the note store will return a resource by GUID, including any original filename from the upload, the MIME type, and the raw content.
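Put together, fetching one note and its attachments could be sketched as follows. The helper name fetchNoteWithResources() and the shape of the returned object are invented for this example. The boolean arguments follow the getNote() and getResource() signatures in the Evernote API reference as I understand them, so check them against the SDK version in use.

```javascript
// Fetch a note body plus the raw bytes of each attached resource.
// fetchNoteWithResources is a hypothetical helper, not an SDK function.
async function fetchNoteWithResources(noteStore, noteGuid) {
  // getNote(guid, withContent, withResourcesData,
  //         withResourcesRecognition, withResourcesAlternateData)
  const note = await noteStore.getNote(noteGuid, true, false, false, false);
  const resources = [];
  for (const ref of note.resources || []) {
    // getResource(guid, withData, withRecognition, withAttributes, withAlternateData)
    const resource = await noteStore.getResource(ref.guid, true, false, true, false);
    resources.push({
      fileName: (resource.attributes && resource.attributes.fileName) || null,
      mime: resource.mime,
      body: resource.data.body, // raw attachment bytes
    });
  }
  return { title: note.title, content: note.content, resources };
}
```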
Output
As with my Flickr backup, I wanted the output to be self-describing. I output each note with the name of the note and an .xml file extension. If there are media resources attached, I create a directory with the note name and then dump each attachment either with the original filename, or with a numbered “resource-n” filename, using the second part of the MIME type as the file extension.
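The attachment naming rule can be sketched as a small function. The name resourceFileName(), the 1-based numbering and the fallback for a missing MIME type are assumptions of this sketch rather than details of the actual script.

```javascript
// Choose a file name for an attachment: the original upload name if there is
// one, otherwise "resource-n" with the MIME subtype as the extension.
// Numbering from 1 and the octet-stream fallback are assumptions.
function resourceFileName(resource, index) {
  const original = resource.attributes && resource.attributes.fileName;
  if (original) {
    return original;
  }
  const subtype = (resource.mime || 'application/octet-stream').split('/')[1] || 'bin';
  return `resource-${index + 1}.${subtype}`;
}
```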
As with the Flickr solution, using names with a FAT filesystem means a swathe of sanitising is required to meet FAT’s naming restrictions.
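A rough sketch of that kind of sanitising is shown here. The function name sanitiseForFat(), the underscore replacement, the 100-character cap and the “untitled” fallback are all choices made for this example, not taken from the actual script.

```javascript
// Make a note title safe to use as a file or directory name on FAT.
// The replacement character, length cap and fallback are illustrative choices.
function sanitiseForFat(name, maxLength = 100) {
  const cleaned = String(name)
    .replace(/[\\\/:*?"<>|\x00-\x1f]/g, '_') // characters FAT long names reject
    .replace(/[. ]+$/, '')                   // drop trailing dots and spaces
    .slice(0, maxLength)
    .replace(/[. ]+$/, '');                  // truncation can expose new trailing ones
  return cleaned || 'untitled';
}
```

Two different titles can collapse to the same sanitised name, so a real script would also need some way of telling them apart, such as appending part of the note GUID.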
Side note on Lodash and modern JavaScript
Before “modern” JavaScript I was a big Lodash user. I love Lodash’s fluid syntax and consistent model for functional-style programming. But I don’t like adding dependencies to really small applications, especially if modern JavaScript features have superseded them. The Lodash method that I like to use a lot and still isn’t there is keyBy(). I found this very handy GitHub repo with equivalent terse ES for Lodash methods. It has a one-liner to create a keyBy() without pulling Lodash in:
const keyBy = (array, key) => (array || []).reduce((r, x) => ({ ...r, [key ? x[key] : x]: x }), {});
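As a quick usage sketch, here is the one-liner applied to some made-up notebook records (the GUIDs and names are invented for illustration):

```javascript
const keyBy = (array, key) => (array || []).reduce((r, x) => ({ ...r, [key ? x[key] : x]: x }), {});

// Hypothetical notebook records shaped like GUID/name pairs.
const notebooks = [
  { guid: 'nb-1', name: 'Recipes' },
  { guid: 'nb-2', name: 'Travel' },
];

// Index the array by GUID so a notebook can be looked up directly.
const notebooksByGuid = keyBy(notebooks, 'guid');
console.log(notebooksByGuid['nb-2'].name); // prints "Travel"
```

Because it spreads the accumulator on every step, this version copies the whole object each time. That is fine for a few dozen notebooks but worth swapping for a plain loop on large arrays.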
Deployment
I explained the joys of esbuild in the Flickr story. It makes it a snap to turn a simple Node.js application into a single-file deployment.
Next steps
I’m almost ready for the SaaSpocalypse™️ now. I do intend to explore rendering the Evernote XML content into a format where I can re-embed the media, and including that rendered version in the backup. It’s not essential, since everything is already searchable and readable, but when the dark day comes, it’ll be just that little bit more convenient!