April 9th, 2024 - Alexander Mattoni, Co-Founder and Head of Engineering

Reliable Backups in a Multi-Cloud World

Proper backups are universally acknowledged as essential, yet they grow increasingly tedious and prone to error as DevOps complexity escalates. While some managed database services offer automated backup solutions, the scope of your backup requirements is likely to expand as the business and products scale. There's a considerable chance you'll find yourself hosting your own databases or stateful services, a task that can seem daunting in its demand for precision and reliability.

To add to the woes of the modern-day DevOps or site reliability engineer, more and more businesses are choosing to go multi-cloud or hybrid cloud, or in some cases to migrate OFF the cloud to fully on-prem solutions, where the logistical challenges of backing up critical systems truly reveal themselves.

At Cycle.io, we see many unique situations involving data requirements, retention, and security. As a multi/hybrid cloud and LowOps platform, we've had to come up with some pretty interesting solutions for backups and data protection. In this article, I'll dive into the challenges faced with backing up/restoring stateful container data, and how we make it flexible, yet easy to do on the Cycle platform.

The True Value of Backups

It's estimated that by 2025, humanity will be generating a staggering 180-200 Zettabytes (that's 200 TRILLION Gigabytes) of information per year. This means a few things:

  1. Data storage is cheap.
  2. Managing the sheer amount of data is complicated.
  3. Legal and regulatory challenges increase.
  4. Data security requirements increase.

Proper backups mean more now than just making copies of a few database tables. As the lifeblood of modern businesses, your data not only supports day-to-day operations but also informs strategic decisions, powers innovation, and safeguards customer trust. Backups extend far beyond preventing data loss - they are pivotal in ensuring business continuity. It may even be required to keep a certain number or scale of backups for regulatory and compliance purposes.

What do I mean by business continuity? Every business faces threats, and it is necessary to maintain critical business functions both during a disaster and after it has occurred. From ransomware and cyberattacks to natural disasters and even simple mistakes, significant loss of data tends to also mean loss of the business. With a proper backup strategy, it's possible to hedge against disaster scenarios, and at the same time make your data available for processing and analysis, informing growth strategies as the organization scales.

Challenges with Backing Up Data the Right Way

Of course, if creating proper backups were simple, everyone would be doing it. Unfortunately, it's just not that easy to set up, test, and verify a full system backup solution. The more microservices and applications an organization runs, the more difficult it gets.

Challenge 1: Data Sprawl

You've got a lot of data, and it's all over the place. Simply identifying where critical or sensitive data lives can be a complex task in itself. Aggregating everything into a centralized backup store from a myriad of sources, especially across regions and cloud providers (or when merging on-prem solutions), can be overwhelming.

Challenge 2: Compliance and Security

Once you've got the backups, storing them in a secure and compliant way adds another layer of complexity to an already nightmarish logistical problem. Depending on your organization's needs, the backups may require preprocessing to remove personal information, and they may need to be shipped to a secure/trusted location for long-term storage, which poses challenges on the restoration side as well.
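
To make the preprocessing idea concrete, here is a minimal sketch of a filter that redacts email addresses from a text dump as it streams through. The function name `scrub_emails` and the replacement address are made up for illustration, and a single regular expression like this is nowhere near sufficient for real compliance work. It only shows the shape such a scrubbing step could take.

```shell
#!/bin/sh
# Hypothetical scrubbing filter: reads a text dump on stdin and writes it
# to stdout with anything that looks like an email address redacted.

scrub_emails() {
    # -E enables extended regular expressions (GNU and BSD sed).
    # The pattern is deliberately simple and will miss unusual addresses.
    sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/redacted@example.invalid/g'
}

# Example usage (the dump command is a placeholder):
#   some_dump_command | scrub_emails > scrubbed.sql
```

Because the filter works on a stream, it can sit between whatever produces the dump and wherever the backup is stored, without needing a temporary unscrubbed copy on disk.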

Challenge 3: Minimizing Operational Disruption and Recovery Time

Having backups is only half the equation. They are of little use if you can't restore them to a damaged system quickly and efficiently. Restoration is just as complex as backing up, but in reverse.

You also don't want to bring down your operations during a restore, if at all possible. Of course, that gets complicated quickly.

At Cycle, our philosophy is to never trust failover, because failover is rarely tested, and that same philosophy applies to backups. If you aren't regularly testing your backup/restore cycle, it may as well not exist.
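
A regular restore test does not have to be elaborate. The sketch below uses tar as a stand-in for whatever backup and restore commands you actually run: it archives a directory to a stream, restores that stream into a scratch directory, and compares the result against the source. The function name `verify_roundtrip` is hypothetical, and a real check for a database would compare restored records rather than files.

```shell
#!/bin/sh
# Hypothetical round-trip check: back up a directory, restore it into a
# scratch location, and confirm the restored copy matches the source.

verify_roundtrip() {
    # $1 = directory to verify
    src="$1"
    scratch=$(mktemp -d) || return 1

    # Backup side archives to a stream; restore side unpacks from it.
    # Plain sh reports only the last command's status for a pipeline,
    # so the diff below is what actually catches a failed backup side.
    if ! tar -cf - -C "$src" . | tar -xf - -C "$scratch"; then
        echo "verify: restore step failed" >&2
        rm -rf "$scratch"
        return 1
    fi

    # diff -r exits non-zero if any file is missing or differs.
    if diff -r "$src" "$scratch" >/dev/null 2>&1; then
        result=0
    else
        echo "verify: restored data differs from source" >&2
        result=1
    fi

    rm -rf "$scratch"
    return "$result"
}

# Example usage (path is a placeholder):
#   verify_roundtrip /var/lib/myapp/data || echo "backup cannot be trusted"
```

Run on a schedule, a check like this turns "we have backups" into "we restored one recently and it matched".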

Cycle's Approach to Backups

Given the critical nature of backups in DevOps, it made sense for us to build an integrated, cloud-agnostic solution that offered granular flexibility without limiting what it could do. Furthermore, it had to be (relatively) easy to use.

Our solution actually ended up being fairly simple. At Cycle, we love when using basic primitives provides a powerful tool for accomplishing a complex task (just like how Cycle utilizes basic DNS to provide network abstractions), and for backups we were able to do the same.

Generating Backups

In short, Cycle allows us to set two commands in a container config: a backup command and a restore command. Cycle will automate running the backup command inside the stateful container on a predefined interval. The command outputs its backup contents to stdout, and Cycle takes the generated output, compresses it, then sends it to a configured backup integration (such as Backblaze B2). This method solves a few different problems:

  • Cycle cannot make any assumptions about WHAT it is backing up. We have companies running everything from healthcare infrastructure to airline software, and blindly freezing a volume for taking a snapshot could be detrimental.
  • It provides a high level of flexibility, since your command could execute a script inside the container with more complex logic. For example, you could create a custom compressed tar of various files and package those up as well.
  • Since backups don't stop or interfere with the container's main process, they don't need to be scheduled to happen only during periods of reduced demand.
  • Cycle is still able to automate a rather complex process, with user-defined parameters.
  • It's integrated into Cycle and works across any infrastructure connected to the platform, which makes it trivial to create cross-cloud/region backups and aggregate them into a single location.
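
As a rough illustration of the "script with more complex logic" point, a backup command could call something like the script below. This is a sketch, not Cycle's implementation: the function name `backup_dirs` and the paths in the usage comment are hypothetical. It relies only on the contract described above, that whatever the command writes to stdout becomes the backup, which is why log messages go to stderr.

```shell
#!/bin/sh
# Hypothetical backup script: streams a tar archive of the given
# directories to stdout, where the platform captures it as the backup.

backup_dirs() {
    # $1 = base path, remaining args = directories relative to it
    base="$1"
    shift

    # Log to stderr only; anything on stdout ends up inside the backup.
    echo "backup: archiving $* from $base" >&2

    # "-f -" writes the archive to stdout. Compression is optional here,
    # since the captured output is compressed before it is shipped.
    tar -cf - -C "$base" "$@"
}

# Example usage (paths are placeholders):
#   backup_dirs /var/lib/myapp data uploads
```

Keeping stdout clean is the main design constraint: a stray progress message printed there would be stored as part of the archive and could corrupt it.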

Setting up an automated backup/restore routine is extremely simple. It can be done via our Portal/UI, or via API (everything in Cycle can be automated via API).

If you're using the portal, here's how easy it is:

[Image: Backup command configuration]

Restoring Backups

Recovering data from a backup works exactly like taking a backup, but in reverse. When a user chooses to restore a backup on a stateful container, Cycle will execute the restore command inside the container instance and provide the backup on stdin for the process to consume. This provides just as much flexibility for restoration. Using a script, it's possible to get really fancy with exactly how a backup is processed. The instance also stays online during the whole process.
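
Here is one way a "fancier" restore script might look, as a sketch under stated assumptions: the backup arriving on stdin is an uncompressed tar archive, the target is an ordinary directory (not the volume's mount point itself), and the function name `restore_dir` is made up for this example. It unpacks into a staging directory first, so a corrupt backup never overwrites live data.

```shell
#!/bin/sh
# Hypothetical restore script: reads a tar archive from stdin, unpacks it
# into a staging directory, and only replaces the live data if extraction
# succeeded.

restore_dir() {
    # $1 = directory to restore into (replaced on success)
    target="$1"
    staging="${target}.restore.$$"

    mkdir -p "$staging" || return 1

    # "-f -" reads the archive from stdin.
    if ! tar -xf - -C "$staging"; then
        echo "restore: extraction failed, leaving $target untouched" >&2
        rm -rf "$staging"
        return 1
    fi

    # Move the old data aside, swap the restored copy in, then clean up.
    if [ -e "$target" ]; then
        mv "$target" "${target}.old.$$" || return 1
    fi
    mv "$staging" "$target" || return 1
    rm -rf "${target}.old.$$"
    echo "restore: $target replaced from backup" >&2
}

# Example usage (path is a placeholder):
#   restore_dir /var/lib/myapp/data
```

The staging step costs extra disk space for the duration of the restore, which is the trade-off for never leaving the target half-written.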

A seamless restore is one click (or API call) away:

[Image: Backups dash]

Comparing Cycle's Approach to Backups on Kubernetes

As a Kubernetes alternative, Cycle is often compared against Kubernetes and other Kubernetes alternatives. So how does Cycle's backup automation compare to setting it up on one or more k8s clusters?

  • Backing up Kubernetes is extremely involved. While Cycle provides a managed control plane, Kubernetes requires you to maintain the control plane yourself. To be successful on k8s, you need to back up various configuration files, application configurations for apps deployed with kubectl AND via Helm charts, etcd databases, ConfigMaps and Secrets, and any custom resources. This is before we even touch your application volumes, and it is per cluster, meaning you may need to repeat this process several times. If you're doing this in a multi-cloud ecosystem, well…good luck.
  • When configuring backups for your application's persistent volumes, you need to dive deep into the weeds: understanding terms like 'CSI driver', knowing whether or not the driver you chose supports volume snapshots, and working with 'PVCs' when restoring.
  • Restoring a PVC is a kind of 'choose your own pain', depending on which CSI driver you choose. Many times it will require your container to be restarted, or to be able to handle sudden remounts of volumes.
  • You need to configure your off-site backups, and know how to use tools like rsync to ship them off and bring them back. Network failures are now your problem too.
  • If you're using a managed Kubernetes service, you're still responsible for the above, but you may be able to take advantage of something like AWS EBS volumes supported by CSI drivers. Be warned, though: there are many acronyms on that page, it still requires highly technical knowledge of the k8s ecosystem, and most of the above will still apply.

While it's possible to set up robust automated backups with Kubernetes, it's not for the faint of heart. It generally requires one or more people with in-depth knowledge of both Kubernetes and your cloud provider of choice, and it has more moving parts, which means more points of failure.

Drop-In Backup/Restore Scripts For Common Applications On Cycle

Given the ease and flexibility of Cycle's backup system, I'll share some quick one-liners for backup/restore of common stateful applications.

(Note: the double quotes are important. They allow the use of environment variables when the script executes.)

MySQL
Backup Command "mysqldump --all-databases -u root -p$MYSQL_ROOT_PASSWORD"
Restore Command "mysql -u root -p$MYSQL_ROOT_PASSWORD < /dev/stdin"
Postgres
Backup Command "PGPASSWORD=$POSTGRES_PASSWORD pg_dump -U $POSTGRES_USER --clean -Ft $POSTGRES_DB"
Restore Command "PGPASSWORD=$POSTGRES_PASSWORD pg_restore -U $POSTGRES_USER -Ft -d $POSTGRES_DB"
MongoDB
Backup Command "mongodump --archive -u $USERNAME -p $PASSWORD"
Restore Command "mongorestore --archive -u $USERNAME -p $PASSWORD"
Tar
Backup Command "tar czf /dev/stdout -C (path) (directory)"
Restore Command "tar xzf /dev/stdin -C (path)"

Embracing Cloud Agnostic Backups with Cycle

Up until now, multi-cloud/region backups have been a luxury reserved for only the largest teams with dozens of DevOps engineers on staff. It's no wonder. When diving into the rabbit hole of Kubernetes backups, it's easy to become 'lost in the sauce' and spend days, weeks, or months trying to build out a solution. Our goal with Cycle is to bring an automated, robust backup solution to everyone, and not require exorbitant amounts of your precious time to do so. After all, that's why we call Cycle the LowOps platform.
