Validating Dell CSI for YOUR unique environment

As we see more flavors of Linux, more flavors of k8s and more flavors of k8s platforms, and then add Dell’s storage portfolio to the mix, just how many different combinations would you need to validate? Yes, you are right: the number would probably get close to “infinity”. So we may need a different approach, and the Dell Cert-CSI application may be a very good place to start.

Linux Flavors

Starting at the bottom: Linux flavors. We all know (and love?) the well-known flavors like Ubuntu, Debian, CentOS, RHEL, Fedora, OpenSUSE, SLES… I could go on and easily name 10. Or more.

And then we have the “container optimized” versions like Alpine, CoreOS, RancherOS, PhotonOS etc.

Most customers today, from my view, are using Red Hat OpenShift (so using CoreOS) or SUSE Rancher (so using SUSE Linux Enterprise Micro). It might make sense to support those last ones, but supporting “all” would certainly be next to impossible.

k8s and k8s platform flavors

On top of your favorite flavor of Linux will sit a version of k8s. This could be something simple like vanilla / upstream k8s or k3s. These mostly manually installed versions are quite basic and will run on almost any flavor of Linux.

It could also be something more “beefy” like Red Hat OpenShift or SUSE Rancher with RKE2. In these cases the flavor of Linux and k8s is usually more or less predetermined, limiting the number of combinations to be validated.

Storage Architectures: Why Dell has different storage platforms

Dell does not really believe in “one platform to rule them all”. Dell has a storage PORTFOLIO, and for good reason: Some storage architectures are good at most things, but don’t excel in any. Others excel in certain areas, but are less efficient in others. There is something for everyone here, and I think Dell is one of the few vendors that actually has products in all four of the storage architecture categories I know:

Architecture 1 – Dual controller, scale up disks (aka “clustered architecture”)
This architecture is the most popular. It is relatively easy to build (although some even cut that short by building the architecture active/passive) and relatively cheap. These are “the Swiss army knives” of storage: They do many things pretty well. Examples of arrays in this category are Dell Unity XT and PowerStore (although PowerStore has some smart features that make it lean toward being a hybrid with architectures 2 and 3).

Architecture 2 – Tightly coupled, scale out
Arrays in this category are true performance monsters with exceptionally high uptimes. They have multiple controllers (typically more than 2), and all controllers share caching, disks and IO bandwidth with each other. An example of an array in this category is the Dell PowerMax series.

Architecture 3 – Loosely coupled, scale out nodes
This architecture is, I would say, the second most popular. Like architecture 1 it is relatively easy to build, and it has lots of benefits. It was pushed mainly by public cloud providers and the birth of HCI; taking nodes with local disks and connecting them together to form shared storage is basically what this architecture is all about. Examples of arrays in this category are VMware vSAN, Dell PowerScale and Dell PowerFlex. Especially the last two scale to near-infinity regarding capacity and performance. Scary stuff 😉

Architecture 4 – Distributed, shared nothing
Somewhat similar to architecture 3, but distributed. Data gets distributed in a lazy, non-transactional way: it will become consistent across the distributed sites (often spread across the entire world!), but only at SOME point in time. In other words, it is eventually consistent. Typically these arrays deliver object storage. An example of such an array is Dell ObjectScale.

Only so much testing you can do in a quarter…

I digress. The general idea is that all these k8s (platform) flavors, installed on all these Linux flavors, need support on all the different Dell storage platforms… Wow: That would easily add up to multiple hundreds of different environments to build, maintain and run tests against. Every quarter…
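
To put a rough number on that, here is a small back-of-the-envelope sketch. The three lists are just names picked from this post, and the "three k8s releases in play" figure is purely my assumption; none of this is an actual Dell support matrix. It also counts every pairing, including ones that would never exist in practice (OpenShift brings its own Linux, for example), so treat it as an upper bound for this small sample.

```python
from itertools import product

# Illustrative picks from the names mentioned in this post, NOT a support matrix.
linux_flavors = ["Ubuntu", "Debian", "RHEL", "SLES", "CoreOS", "SLE Micro"]
k8s_flavors = ["upstream k8s", "k3s", "OpenShift", "Rancher RKE2"]
storage_platforms = ["Unity XT", "PowerStore", "PowerMax", "PowerScale", "PowerFlex"]

# Every (Linux, k8s, storage) triple is one environment to build and test.
environments = list(product(linux_flavors, k8s_flavors, storage_platforms))
print(len(environments))  # 6 * 4 * 5 = 120

# Assumption: three k8s releases need coverage at any given time.
k8s_releases_in_play = 3
total = len(environments) * k8s_releases_in_play
print(total)  # 120 * 3 = 360
```

Even with this short list we are already well into the hundreds, and every extra Linux flavor or storage platform multiplies the total again.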

Time for a different approach.

Enter Dell “Cert-CSI”

Dell cert-csi is open source, and you can find it on GitHub. The idea is that you take your hardware, you install your favorite flavor of Linux, your favorite shape or form of k8s, the CSI driver of the Dell array you want to test against… OR just use what you have today. Next you run this tool. The good news: If it passes 100%, Dell will support your specific configuration!

How’s that for size? Up till now we have always validated “by customer demand”. It’s not hard to figure out you need to validate the Red Hat OpenShift or SUSE Rancher platform, but what about k3s on Debian? Exactly. The ability to add support for any unique combination of k8s and Linux flavor would be a great leap forward.

Running the Cert-CSI test suite

To test things, I took my older k3s-on-Debian setup, which has a CSI driver connected to a Unity array. I updated the CSI driver and the Unity to latest-and-greatest. These are the versions I am now running:

k3s version: 1.24.7+k3s1
Linux: Debian 10 “Buster”
CSI driver: Dell Unity CSI driver 2.8.0
Unity array version: 5.3.1.0.5.008

I took the test suite, adjusted the yaml config file to exclude RAWBLOCK, RWX and RWOP modes, and ran the test on the iSCSI storageClass:

storageClasses:
  - name: unity-iscsi # storage-class-name
    minSize: 5Gi # minimal size for your sc
    rawBlock: false # is Raw Block supported
    expansion: true # is volume expansion supported
    clone: true # is volume cloning supported
    snapshot: true # is volume snapshotting supported
    RWX: false # is ReadWriteMany volume access mode supported for non RawBlock volumes
    volumeHealth: false # set this to enable the execution of the VolumeHealthMetricsSuite.
    VGS: false # set this to enable the execution of the VolumeGroupSnapSuite.
    RWOP: false # set this to enable the execution of the MultiAttachSuite
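
Since a run takes a while, it can be worth double-checking which capabilities are switched on before you start. The snippet below is a hypothetical helper of my own, not part of cert-csi: it copies the flags from the config above and uses a naive line-based parser that only understands flat "key: true/false" lines. It is not a real YAML parser, so for anything beyond this simple fragment you would want a proper one.

```python
# Hypothetical helper, not part of cert-csi: summarise the boolean flags
# of one storage class entry. The fragment mirrors the config in this post.
CONFIG_FRAGMENT = """\
storageClasses:
  - name: unity-iscsi
    minSize: 5Gi
    rawBlock: false
    expansion: true
    clone: true
    snapshot: true
    RWX: false
    volumeHealth: false
    VGS: false
    RWOP: false
"""

def parse_flags(text):
    """Collect 'key: true/false' pairs from a flat YAML-like fragment."""
    flags = {}
    for line in text.splitlines():
        # Drop trailing comments, indentation and a leading list dash.
        line = line.split("#", 1)[0].strip().lstrip("- ").strip()
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if value in ("true", "false"):
            flags[key] = (value == "true")
    return flags

flags = parse_flags(CONFIG_FRAGMENT)
enabled = [key for key, on in flags.items() if on]
disabled = [key for key, on in flags.items() if not on]
print("enabled: ", ", ".join(enabled))
print("disabled:", ", ".join(disabled))
```

For my config this lists expansion, clone and snapshot as enabled, which matches what I intended to test on the iSCSI storageClass.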

After the test had been running for a while, I checked back and found this:

[2023-12-13 06:40:21]  INFO Avg time of a run:   963.35s
[2023-12-13 06:40:21]  INFO Avg time of a del:   41.72s
[2023-12-13 06:40:21]  INFO Avg time of all:     1005.19s
[2023-12-13 06:40:21]  INFO During this run 100.0% of suites succeeded
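
If you would rather not eyeball the log, the last line can be checked from a script. This is a minimal sketch that assumes the summary line looks exactly like the one in my output above; whether that wording stays the same across cert-csi versions is an assumption on my part, and the function name is made up for this example.

```python
import re

# Log excerpt copied from the run in this post.
LOG = """\
[2023-12-13 06:40:21]  INFO Avg time of a run:   963.35s
[2023-12-13 06:40:21]  INFO Avg time of a del:   41.72s
[2023-12-13 06:40:21]  INFO Avg time of all:     1005.19s
[2023-12-13 06:40:21]  INFO During this run 100.0% of suites succeeded
"""

# Assumed format of the summary line, based only on the excerpt above.
SUCCESS_RE = re.compile(r"During this run (\d+(?:\.\d+)?)% of suites succeeded")

def success_rate(log_text):
    """Return the reported suite success percentage, or None if absent."""
    match = SUCCESS_RE.search(log_text)
    return float(match.group(1)) if match else None

rate = success_rate(LOG)
all_passed = rate is not None and rate == 100.0
print(f"success rate: {rate}%, all suites passed: {all_passed}")
```

Returning None when the line is missing keeps an aborted or truncated run from being mistaken for a pass.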

SUCCESS!! Using cert-csi I got the “easy route” to a community-supported config, which you can find on GitHub here. The cert-csi test suite also draws up detailed reports. The “tabulated” report gives a nice overview of the test results:

Cert-CSI Report

CERT-CSI RESULTS

2023-12-13 06:40:21
unity-vsa01-iscsi
🗙 0 ✔ 6
