Background and Scope
Data handling and backup can be hard, and everything below is my work-in-progress notes toward a system that works for me.
Being self-employed in technology, I’ve got quite an array of devices, machines, and systems to support even just in my household. Keeping everything backed up and synced in a way that’s sufficiently hands-off and automatic to be reliable, as well as easy to recover from in the event of data loss, is not a simple task. In particular, it’s critical to adopt a 3-2-1 backup strategy: 3 copies of your data, on 2 different types of storage media, with at least 1 copy off-site.
In the simplest “one laptop” case, this is straightforward: have one backup regularly scheduled to a USB hard drive. Maybe every time you sit down at your desk, you plug it in, and Time Machine (Mac) or File History (Windows) takes care of it. Then, on top of that, install your cloud backup provider of choice to regularly take things off-site. Hint: IDrive.com is an excellent deal, and supports client-side encryption to keep everything private. That’s 3 copies of the data (laptop, backup USB disk, and IDrive), 2 different physical media (laptop and USB disk, though IDrive counts too), and 1 off-site copy (IDrive). Why these 3 copies? A second local copy so that when you delete a file accidentally, you can quickly grab the backup, and so that when your laptop gets stolen from the car, you’ve still got everything on your desk. One remote copy so that if your house gets robbed or burns down, your data is still safe off-site – or so that when you delete a file while you’re on vacation, you can grab it from the cloud backup. 3-2-1 provides near-perfect data threat coverage, which is why it’s the gold standard.
My case is a bit trickier – I have a bunch of different machines, LOTS of data, some big chunks that need to get shifted around kind of frequently (big virtual machines I shelve and retrieve for infrequent jobs), and so on. In particular, the simple 3-2-1 case above doesn’t work because “production” data, or the “primary copy” isn’t all in one spot. Some “primary” data lives on removable drives. Some lives on my NAS. Some lives on one laptop, some on another machine.
I’ve chosen to build my data management and backup scheme around a Synology NAS, specifically a DS1019+. Synology DSM offers an immense array of software tools to handle nearly anything: out of the box, you get Photo Station/Video Station/Music Station for media serving, “Media Server” for making that content available via DLNA, Synology Drive as basically a feature-by-feature clone of Google Drive, including both backup and “synced folder” functionality… And that’s without even touching third-party packages. With the more powerful units like the “plus” series my DS1019+ hails from, you also get virtual machine capability and, in particular, native Docker support. With Docker, the entire Hub full of containerized applications is ready to install on your NAS. Interestingly, this means there are two officially-supported and almost equally-easy ways to install Plex on your Synology NAS.
Goals
I’ve got a few different goals I’d like to achieve with my home IT setup:
- A Dropbox-like shared folder, synced between all my machines, with the main files I use day to day. This should have the ability to turn off syncing of individual files or folder trees, so as to keep some heavyweight data off of machines that won’t need it. It should also include version history, so deleted or modified files can be rolled back in time.
- A means of regular system-level backup from each machine to the NAS, such that a complete recovery would be possible from the NAS copy. That is, if I leave the house some morning and my laptop gets stolen from the car, I shouldn’t lose any data. This might be an image-level or file-level backup, though ideally it would have image-level restore capability (see below).
- A means of copying that system-level backup to cloud storage for offsite safe-keeping.
- Data where the primary copy is on removable storage must also be 3-2-1 backed up, first to the NAS and then to cloud storage, and this process should be as automatic as possible. Ideally, this should be set up as immediate sync to NAS as soon as the drive gets connected. The use-case here is my Photos drive where I keep my Lightroom catalog and video to edit. The media library is simply too big to remain on internal storage, and primary storage on the NAS device would be too slow for access.
- Probably some more I’m leaving out, TBD.
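As a sketch of that last removable-drive workflow, a small watcher could poll for the drive’s mount point and kick off a mirror to the NAS when it appears. The paths here are placeholders for my setup, and the rsync call is just one plausible way to do the mirroring – this is a rough sketch, not a finished solution:

```python
import os
import subprocess
import time

# Assumed paths -- adjust to your own drive and NAS mount points.
DRIVE_MOUNT = "/Volumes/Photos"           # the removable photo drive
NAS_TARGET = "/Volumes/NAS/PhotosBackup"  # NAS share mounted via SMB/AFP

def drive_mounted(path: str) -> bool:
    """True once the removable drive's mount point exists."""
    return os.path.ismount(path)

def mirror(src: str, dst: str) -> int:
    """One-way mirror src -> dst via rsync; returns rsync's exit code."""
    return subprocess.call(["rsync", "-a", "--delete", src + "/", dst + "/"])

def watch(poll_seconds: int = 10) -> None:
    """Poll until the drive appears, then kick off a sync to the NAS."""
    while not drive_mounted(DRIVE_MOUNT):
        time.sleep(poll_seconds)
    mirror(DRIVE_MOUNT, NAS_TARGET)
```

In practice you’d want this triggered by the OS’s mount events rather than polling, but the shape is the same: detect the drive, then sync.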
Implementation Notes
Here’s some interesting stuff I’ve come across.
Image Backup vs File Backup
File backup is a scheme where your system is copied to a backup location on a file-by-file basis. For instance, Dropbox syncs your local folder with a cloud folder, and you can see the individual files and folders on either side. Image backup is a scheme where your entire system is boiled down into one monolithic image file on disk (whatever disk that is). You might be able to browse your files, but only with some kind of tool that lets you peer inside the image. Acronis uses such a scheme, backing up your entire system to a .tib image, which contains everything about your system’s disk. You can browse the image, but only going through the Acronis interface. The image file contains sufficient data to restore your entire system exactly as it was at time of backup.
Some schemes adopt a bit of a blend of both, specifically Time Machine. TM is a file-level backup, and doesn’t bother copying a number of system directories easily restored by the macOS installer. Yet, it DOES save everything to a monolithic disk image (or slightly-less-monolithic .sparsebundle), and allows for “exactly as it was” system-level restore. You can browse the backup filesystem using either the Time Machine UI, or simply by mounting the .sparsebundle on your Mac.
Convenient Features of Synology DSM
DSM has come a long way since I started using it, in particular by implementing support for Btrfs. Btrfs is a copy-on-write filesystem, which has huge implications for deduplication and filesystem integrity. It enables a concept called “snapshotting” where, in effect, data is “tagged” at a particular point in time, and any additions/modifications/deletions from that point forward are stored as a delta. Synology DSM uses this to allow for scheduled snapshots, where every hour, say, the entire filesystem is frozen in time without taking up any more disk space than “every unique byte that’s ever been written.” This is a great way to make a kind of pseudo-backup that protects against user error/accident like “rm -r” or accidentally saving a document after deleting half its contents.
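The copy-on-write idea behind snapshots can be sketched in a few lines: if blocks are stored by content hash, a second snapshot that shares most of its blocks with the first costs almost nothing extra. This is a toy illustration of the concept, not how Btrfs is actually implemented:

```python
import hashlib

# A toy content-addressed block store: snapshots share identical blocks,
# so total space used is "every unique block ever written."
store = {}      # block hash -> block bytes
snapshots = []  # each snapshot is a list of block hashes for the volume

def take_snapshot(blocks):
    """Record a point-in-time view of the volume's blocks."""
    refs = []
    for block in blocks:
        h = hashlib.sha256(block).hexdigest()
        store.setdefault(h, block)  # copy-on-write: reuse unchanged blocks
        refs.append(h)
    snapshots.append(refs)
    return refs

v1 = [b"AAAA", b"BBBB", b"CCCC"]
take_snapshot(v1)
v2 = [b"AAAA", b"XXXX", b"CCCC"]  # one block modified since the first snapshot
take_snapshot(v2)
# Two complete snapshots exist, but only 4 unique blocks are stored.
```

Every snapshot is a complete, browsable view of the volume, yet the only storage cost is the changed blocks – which is why hourly snapshots are so cheap.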
Synology Drive Server takes this a step further and offers versioning: if enabled, every file change or deletion creates a new version, the share can be rewound to any point in time or to any version of any file, and recently deleted files can be recovered. Eventually, depending on version retention settings, old versions and long-deleted files are expunged automatically.
Finally, across all shared folders, DSM supports the concept of recycling bins, where any deleted files will be placed in purgatory until some scheduled future deletion.
Keep these features in mind when considering the following.
Synology Universal Search for Deleted Files
Synology Universal Search is cool, since it lets you search for any document name across your entire NAS (at least, within indexed folders), as well as search the contents of certain kinds of files. You might be wondering: with Synology Drive file history and Recycle Bins, can I search for deleted files or old versions? No. Universal Search DOES NOT index recycle bins, and DOES NOT index the Drive file history database. I’ve tested and searched at length, and can’t find a way to make it search in these, even though at least in the case of recycle bins, it’s only a ‘#’ away. Synology support swears Universal Search WILL look in the Drive history database, but I can’t make that work or find a way to enable it, so I’m pretty sure it’s not actually true. Given my test showing Universal Search ignores #recycle directories, that seems even more likely.
Using GoodSync with Synology
Of all “folder sync” utilities, GoodSync is the most robust I’ve come across in terms of feature set. I say “folder sync” utilities because lots of tools do “syncing of files” – Dropbox, Synology Drive, Sync.com, etc. The set of tools that tackles the task “take a folder on the left and a folder on the right and make them equal” is smaller – tools like FreeFileSync, SyncToy, SyncBack, and of course GoodSync.
There are actually a handful of ways to use GoodSync with a Synology NAS – since GS supports all kinds of folder pairs, you can have your pick of access technologies on the NAS side, like FTP, SMB, AFP, and so on. I’ll call “use a normal remote file access scheme on one side” method 1. The other way, method 2, is to use GoodSync Connect on the Synology side. This is a DSM package, available in Package Center, that theoretically improves performance by working around the inherent performance limitations of whatever file access protocol you would otherwise use.
GoodSync and Synology Recycle Bins
One question I have about GS Connect is how it behaves with the recycle bin functionality on the NAS, since that’s quite a useful feature to avoid data loss. If you connect the NAS-side via SMB (or presumably FTP/AFP/etc.), it works just as you’d hope: delete a file on the left side, GS syncs the deletion to the right side (on the NAS), and that file goes into the recycle bin instead of actually getting deleted. There’s one caveat: GoodSync supports ITS OWN recycle bin functionality, INCLUDING versioning, via Options > Save deleted/replaced files to recycle bin. If you enable this, the “deleted” file actually just gets moved into a hidden _gsdata_ folder, and thus never triggers the Synology recycle bin functionality. Watch out for this, since it’s a bit confusing. The upshot of doing it this way is that Universal Search will index things inside the _gsdata_ version of a recycle bin, but NOT the Synology #recycle directory. If you connect the NAS-side via GS Connect, things you delete from the local side of the sync pair get deleted from the NAS without #recycling. So if you’re going to use the GS server on your Synology and you want a recycle bin, you’ll want to use GoodSync’s built-in one.
Acronis True Image Considerations
Acronis True Image is an image-level backup utility that works across Mac and Windows. With respect to Mac clients, it’s a little less ideal than Time Machine because, while you can opt to keep a version history and browse files on the backup, the interface for doing so is far less seamless than TM’s. However, Time Machine backing up to a network location has always been a bit fiddly and error-prone. I’ve never had a network TM backup that didn’t eventually require substantial manual repair, or a complete reset. Mac disk images, from an engineering perspective, are unfortunately a bit old-fashioned and fragile.
One major question with respect to keeping a True Image backup on the NAS is whether I can simply mirror that backup to the cloud, in a sane way, to satisfy the off-site requirement above. It’s probably OK in my case to keep an image-level backup in the cloud AND file-level backups separately, even though that’s duplicated storage: IDrive is, for now, very cheap. What I DON’T want to do is copy the entire .tib image up to the cloud every time it changes. Luckily, IDrive advertises “block-level incremental backups and restores to optimize transfer speed,” which I’d like to test in the case of a .tib backup. We’d expect that, if I make an Acronis backup, then add one file to the desktop, then make an incremental, the .tib file checksum will change, but the IDrive incremental backup of that file should happen almost instantly.
To test this, I set up a macOS VM, and made a True Image backup of it to my NAS. I then set IDrive running on the Synology to back up that directory. True Image, it seems, compresses its backup image, and the initial image size was 8.61GB. I created an IDrive backup set and clicked “go” on the Synology, and used my ISP’s router to measure that, indeed, about that much data got uplinked in 15 minutes or so (9.04GB per the router and my calculation based on “average traffic over the measurement period”).
I then downloaded a beefy Ubuntu 20.04 ISO as a stand-in for “changed a bunch of stuff on the computer” and fired off a new True Image backup. The image file grew from 8.61GB to 11.98GB, or by about 3.37GB – notably more than the 2.27GB the downloaded image takes up, so I’m not sure where the extra came in. The image backup IS set to store versions, so perhaps enough other files changed to require this delta space.
Regardless, I then ran an incremental backup on the same set with IDrive on the Synology. It claimed to be backing up 11.98GB, but the initial few GB blew by very quickly, and looking at the router statistics, only a bit over 2.5GB actually uploaded in the roughly 15 minutes the incremental took. Note that the incremental was quite a bit slower than the initial backup, and I’m not sure if that’s down to processing time or down to “IDrive throttled my upload for whatever reason.” So the long and short of it is, I’ve verified that, indeed, IDrive does a block-level incremental backup, and will only upload the CHANGES in a given file. Thus, IDrive is a good option for replicating image files from NAS storage to cloud storage.
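To see why block-level incrementals help with a big monolithic .tib, here’s a toy model: hash the file in fixed-size blocks and count how many blocks differ between versions. The 4KB block size and the hashing scheme are my assumptions for illustration, not IDrive’s actual algorithm:

```python
import hashlib

BLOCK_SIZE = 4096  # bytes per block; real services pick their own size

def block_hashes(data: bytes):
    """Hash each fixed-size block of a file's contents."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]

def changed_blocks(old: bytes, new: bytes) -> int:
    """Count blocks that differ between two versions (appended blocks count)."""
    old_h, new_h = block_hashes(old), block_hashes(new)
    same = sum(1 for a, b in zip(old_h, new_h) if a == b)
    return len(new_h) - same

# Simulate a 1MB backup image where only a tiny region changes.
v1 = bytes(1024 * 1024)
v2 = bytearray(v1)
v2[500_000:500_010] = b"X" * 10  # touch ~10 bytes in the middle
# Only the single 4KB block covering the edit needs re-uploading,
# even though the whole file's checksum has changed.
```

One wrinkle this toy model skips: True Image compresses its images, so a small logical change can shift bytes and dirty more blocks than you’d expect – consistent with my seeing ~2.5GB uploaded for a ~3.37GB image growth rather than a tiny delta.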
“Acronis True Image” is damaged and can’t be opened (Mac error)
Installing and uninstalling True Image 2020, I ran into an error where I couldn’t reinstall TI from a fresh download. I came across this forum post which led me to the answer: Acronis makes a “cleanup tool for <windows/mac>” which you can run to eliminate all traces of True Image. If you run into the error above, use the cleanup tool, then installation should succeed, which is how things worked out for me. As such, it seems like the cleanup tool is probably the best way to accomplish a clean, full uninstall if you ultimately decide against True Image.
To Be Continued
There’s much more to report here, but in the interest of ever publishing this article I’ll leave it at this for now.
I have a Synology DS-918+ and multiple Macs. I backup the Macs and certain folders on my Synology with Arq to Amazon Glacier. This is in addition to iCloud backups for the Macs.
I don’t bother with Time Machine.
I haven’t come across Arq, I’ll have to have a look. Otherwise, I tend to agree: I LOVE time machine in theory. In practice, it tends to break down a lot. My experiences restoring from a network location are mediocre at best. You REALLY want local storage for restore speed and my old NAS was pretty slow to begin with. Copying a network TM backup to USB storage for the sake of restore isn’t impossible, but it IS tricky. And finally, backing up in-use sparse bundles is a very tricky business itself. You’d think it should just work, but it’s super easy to catch the image in an inconsistent state. And I don’t think I’ve ever had a network TM backup I didn’t EVENTUALLY have to rebuild.
Thank you Alex for your very informative notes, tips, experience, and recommendations regarding Synology NAS backups. I just set up my first NAS on my LAN, a Synology DS720+ with just 2 bays, each with one 2TB SSD, using RAID 1, along with basically all Mac laptops, desktops, etc. So far so good. I’ve been backing up the NAS to local SSDs on a regular basis with good results.
Now it’s time to connect some off-site backup for the NAS. I don’t need to sync, but I do want incremental/differential backups. I’ve seized upon BackBlaze as one possible off-site vendor; Synology also seems to host its own in-house servers, but I don’t know which has a better rating in terms of security, longevity, reliability, etc. I’d think Synology would be a better fit in terms of compatibility; however, BackBlaze also has tremendous support pages on their site dealing with Synology NAS backups, so I would really appreciate some advice on this point.
Next, “Hyper Backup” keeps rearing its head everywhere, but I honestly don’t understand “exactly” what makes it Hyper. Speed, compatibility, reliability, granularity, ease of use, configurability? All of the above? And most important, if I experience a catastrophic failure and have absolutely no other way to restore everything that I had on my NAS to, say, a new MacBook Pro, I’m just not seeing much detailed information even on Synology’s site on how to do this, or how flexible it would be. I’m assuming I’d download a Mac app, which would log me back into Synology’s servers (assuming I go with Synology vs. BackBlaze, that is), and I could then hopefully pick and choose which folders to download to my new laptop/desktop, and they would reappear locally as normal APFS or HFS Mac folders and files?
Hi Again Alex,
I subscribed to the “Odd Engineer” newsletter, but I don’t see this option for your publications specifically. Do you have such an email alert subscription that would notify me specifically of just your publishings?
If so, how or where can I sign up? Or can you please add me to your list.
All the best,
John Hay
Hey John, I’m so flattered! As far as subscribing, I don’t have an email list or anything. I think this blog generates an RSS feed? But honestly I don’t even know. As far as backup: I don’t have personal experience with either Synology’s C2 or BackBlaze, but I think both will support Synology Cloud Sync’s client-side encryption just the same which is kind of all you need to worry about. I ended up choosing neither because of cost, and found iDrive to be the cheapest and most flexible option. iDrive doesn’t use the cloud sync or hyperbackup interfaces like others do; it has its own client app for Synology DSM. I don’t love it, but it IS very flexible – you can choose folders individually to include or exclude, there are plenty of options, and iDrive makes it pretty easy to use your space quota for both DSM backups and individual client backups as well. For instance, this weekend I’m setting up backups so my laptop backs up to the diskstation, that backs up to iDrive, and my laptop also backs up direct to iDrive for reliable continuous backup while I’m away from home. Slightly inefficient but maximally flexible, and iDrive storage is cheap.
Cloud Sync and HyperBackup are subtly different, but I don’t have a lot of experience with the latter since I use iDrive. Cloud Sync is designed to bolt your Synology together with cloud storage. For instance, your network share could be synced with Google Drive so that your laptop at home can work on your Google Drive files without syncing them all locally. Obviously, having the data in multiple places feels like backup, but it’s not necessarily the same use case. HyperBackup can use a bunch of different storage backends, including cloud providers like Google Drive, as an incremental backup target – the ransomware-proof “I need to restore my entire NAS to its state ten days ago” scenario. I think HyperBackup is probably what you want in your case.
Now, when you say you’re regularly backing up to local disks: I imagine that means you’re already using the HyperBackup app to do it?
I have a question: can GoodSync Connect speed things up between local machines, even when I’m not on the same LAN as the NAS? I heard it is P2P.
Tony