In today's blog post I am looking into a way to create a crash-consistent snapshot of a cloud-native application that uses multiple persistent volumes. The native functionality within k8s does deliver snapshotting capabilities, but it lacks support for group snapshotting. This is what Dell adds using a Container Storage Module (CSM). Of course a demo video is also available (see below).
Why do we need a CSM module for snapshotting?
I know, I know: there is already a snapshotting capability in the CSI specification. This functionality allows you to create, mount and delete array-based snapshots. So why do we need an extra module? Well, just consider an application that uses multiple persistent volumes. If you snapshot those volumes one by one using the CSI snapshotting capability, the snapshots will never be taken at exactly the same time, and that is precisely the issue: to snapshot such an application in even a crash-consistent manner, all persistent volumes need to be snapshotted at the EXACT same point in time. And this is where volume groups come in.
Group snapshotting through Dell CSM
The Dell CSM snapshotting module solves this issue. As the underlying array has an understanding of volume groups, why not just add all relevant persistent volumes to a volume group that can in turn be snapshotted as a group? This is what this CSM module accomplishes: it allows persistent volumes to be grouped and snapped together. And out comes a crash-consistent snapshot of an entire app!
Architecture of the CSM Snapshot capability. A number of volumes are present on the array, and they are grouped in a Volume Group (1). The volume group can be snapshotted, and a crash-consistent snapshot is the result (2). Finally, those snapshots can be mounted back as persistent storage into the k8s layer (3).
How volume group snapshots are created
Using a simple piece of YAML we can instruct the k8s cluster to pick a set of volumes, group them together and then create a snapshot of the group, all in one go. Below is an example piece of YAML that does this:
apiVersion: volumegroup.storage.dell.com/v1
kind: DellCsiVolumeGroupSnapshot
metadata:
  name: "vg-snaprun1"
  namespace: "default"
spec:
  driverName: "csi-powerstore.dellemc.com"
  memberReclaimPolicy: "Retain"
  volumesnapshotclass: "powerstore-snapshot"
  timeout: 90
  pvcList:
    - "pg-sql-1"
    - "redis-1"
In this example we use a Dell PowerStore array, but it also works on other Dell storage platforms such as PowerFlex. The pvcList specified here can also be replaced by a pvcLabel, in which case all PVCs carrying a specific label are included in the volume group.
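To make the pvcLabel variant a bit more concrete, here is a minimal sketch of what it could look like. The label value "vgs-snap-label", the resource name "vg-snaprun2" and the assumption that the PVCs carry the label under the key volume-group are illustrative and not taken from the example above, so check the documentation of your CSM version before relying on them.

```yaml
# Sketch only: same resource as above, but selecting PVCs by label
# instead of listing them by name.
apiVersion: volumegroup.storage.dell.com/v1
kind: DellCsiVolumeGroupSnapshot
metadata:
  name: "vg-snaprun2"            # hypothetical name
  namespace: "default"
spec:
  driverName: "csi-powerstore.dellemc.com"
  memberReclaimPolicy: "Retain"
  volumesnapshotclass: "powerstore-snapshot"
  timeout: 90
  pvcLabel: "vgs-snap-label"     # hypothetical label value
---
# Each PVC that should join the group would then carry the matching label.
# The label key "volume-group" is an assumption; the rest of the PVC
# (spec, storage class, size) is omitted here.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "pg-sql-1"
  namespace: "default"
  labels:
    volume-group: "vgs-snap-label"
```

The advantage of the label approach is that new volumes join the group simply by being labelled, without having to edit the snapshot definition.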
By simply applying this piece of YAML we can now create a volume group, add the relevant persistent volumes and take a group snapshot of them.
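Once the group snapshot exists, its member snapshots can be mounted back as persistent storage (step 3 in the architecture above). The sketch below uses the standard Kubernetes way of creating a PVC from a VolumeSnapshot. The PVC name, storage class name, requested size and snapshot name are all placeholders. In particular, the names of the member snapshots are generated for you, so look them up first (for example with kubectl get volumesnapshot) and fill in the real one.

```yaml
# Sketch only: restore one member of the group snapshot into a new PVC.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "pg-sql-1-restored"        # hypothetical name for the restored volume
  namespace: "default"
spec:
  storageClassName: "powerstore"   # placeholder: use your own storage class
  dataSource:
    name: "member-snapshot-name"   # placeholder: name of the member VolumeSnapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi                 # placeholder: at least the size of the source volume
```

Repeating this for every member snapshot (here, the ones for pg-sql-1 and redis-1) gives you a full set of volumes that all reflect the same point in time.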
If you want to watch this story in a demo video, check it out below!