CouchDB: Backup strategies
Aniello Musella

Publish Date: Aug 22

CouchDB is a solid NoSQL DBMS with some features that, in my opinion, are awesome, such as multi-master synchronization and the HTTP/JSON API for accessing data (see my previous article about CouchDB, where I cover multi-master sync using Docker).

Today, I want to describe some of the possible strategies to back up a CouchDB database.

Manual strategy with tools

You can use command-line backup tools for CouchDB to create backups of your databases and restore them when needed. You can find these tools on GitHub.

Here is an example backup command:

couchdbbackup -d my_database -o my_database_backup.couch --config couchdb_config.json

where the file couchdb_config.json contains the connection details for the server:

{
  "username": "my_username",
  "password": "my_password",
  "url": "http://localhost:5984"
}


To restore, you'll use the following command:

couchdbrestore -d my_database -i my_database_backup.couch --config couchdb_config.json

Incremental backup with tools

With these tools, you can run an incremental backup like this:

couchdbbackup -c couchdb_config.json --incremental

with the following settings in the JSON configuration:

{
  "username": "my_username",
  "password": "my_password",
  "url": "http://localhost:5984",
  "database": "my_database",
  "output": "my_incremental_backup.couch",
  "incremental": true,
  "since": "<last_backup_sequence>"
}

The last backup sequence comes from the database information, which you can retrieve with:

curl -X GET http://localhost:5984/my_database

which returns something like this:

{
  "db_name": "my_database",
  "doc_count": 100,
  "doc_del_count": 0,
  "update_seq": 150,
  "purge_seq": 0,
  "compact_running": false,
  "disk_size": 2048000,
  "data_size": 2048000,
  "instance_start_time": "0",
  "disk_format_version": 0,
  "committed_update_seq": 150
}

The update_seq field holds the sequence number of the last update made to the database, and it is the value used for incremental backups. (On CouchDB 2.x and later it is an opaque string rather than a plain number, so pass it along unchanged.)

When you perform an incremental backup, you specify the update_seq value recorded at your last backup. This tells the backup tool to include only the documents that have been added or modified since that sequence number.
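
As a rough sketch of that bookkeeping, the following shell snippet reads update_seq from the database information and saves it for the next incremental run. The helper names (extract_update_seq, save_last_seq) and the last_seq.txt state file are assumptions made for illustration; they are not part of any backup tool.

```shell
#!/bin/sh
# Hypothetical helpers; names and file locations are illustrative only.

# Read database-info JSON on stdin and print the value of "update_seq".
# Handles both numeric values (150) and quoted opaque strings ("150-g1AAAA").
extract_update_seq() {
  sed -n 's/.*"update_seq"[[:space:]]*:[[:space:]]*"\{0,1\}\([^",}[:space:]]*\)"\{0,1\}.*/\1/p'
}

# Fetch the current update_seq of a database and store it in a state file.
# $1 = server URL (e.g. http://localhost:5984), $2 = database, $3 = state file
save_last_seq() {
  curl -s -X GET "$1/$2" | extract_update_seq > "$3"
}

# Example (requires a reachable CouchDB server):
#   save_last_seq "http://localhost:5984" "my_database" "last_seq.txt"
#   # then copy the content of last_seq.txt into the "since" setting
```

Saving the value right after each successful backup means the next run can pick up from exactly that point.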

Continuous Replication Strategy with CouchDB

I love this strategy because you use CouchDB to back up CouchDB.
To do that, set up continuous replication to another CouchDB instance. This gives you a near real-time copy and keeps the data in the backup instance up to date.
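
A minimal sketch of how such a replication could be started through the HTTP API, by posting a document to the _replicator database. The function names, host names, and database names below are placeholders chosen for illustration, and credentials are left out; adapt them to your own setup.

```shell
#!/bin/sh
# Hypothetical helpers; URLs and database names are placeholders.

# Print a replication document asking CouchDB to copy source to target
# continuously, creating the target database if it does not exist.
# $1 = source database URL, $2 = target database URL
replication_doc() {
  printf '{"source": "%s", "target": "%s", "continuous": true, "create_target": true}\n' "$1" "$2"
}

# Store the document in the _replicator database of the given server,
# which starts the replication job and keeps it across restarts.
# $1 = server URL, $2 = source database URL, $3 = target database URL
start_backup_replication() {
  curl -s -X POST "$1/_replicator" \
    -H "Content-Type: application/json" \
    -d "$(replication_doc "$2" "$3")"
}

# Example (requires two reachable CouchDB servers and suitable credentials):
#   start_backup_replication "http://localhost:5984" \
#     "http://localhost:5984/my_database" \
#     "http://backup-host:5984/my_database_backup"
```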

Snapshot Backups

If you're using CouchDB as a cloud service or on a virtual machine, you can take snapshots of the entire disk. This can be a quick way to back up the entire database state.

Scheduled Backups

You can schedule regular backups using cron jobs or similar scheduling tools. Choose the schedule (daily, weekly, etc.) based on how often your data changes.
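
For instance, a small wrapper script driven by cron might look like the sketch below. It reuses the couchdbbackup command shown earlier in this article; the function names, the script path, and the backup directory are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper script; paths and names are placeholders.

# Build a dated backup file name, e.g. my_database-20240822.couch
# $1 = database name, $2 = date stamp
backup_file_name() {
  printf '%s-%s.couch\n' "$1" "$2"
}

# Run one backup with the couchdbbackup command used earlier in this article.
# $1 = database name, $2 = output directory
run_backup() {
  couchdbbackup -d "$1" -o "$2/$(backup_file_name "$1" "$(date +%Y%m%d)")" \
    --config couchdb_config.json
}

# Example crontab entry running the backup every day at 02:00,
# assuming this script is saved as /usr/local/bin/couchdb_backup.sh
# and ends with a call such as: run_backup my_database /var/backups/couchdb
#   0 2 * * * /usr/local/bin/couchdb_backup.sh
```

Putting the date in the file name keeps each run from overwriting the previous backup.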

Final considerations

The best strategy depends mainly on the architecture CouchDB runs on and the business requirements you must meet.
In general, once you have chosen a backup strategy tailored to your business case, it's advisable to follow best practices such as storing backups offsite, testing your backups, and protecting your data with access controls and encryption.

Suggestions and corrections are welcome.
