Exploring Kadalu Storage in k3d Cluster - GlusterFS
2021-Aug-12 • Tags: kubernetes, kadalu
Just as fewer lines of code aren't always an optimization, a lengthy post need not be a complex one to follow 😄 .
For this blog post I'll stick strictly to the 'Exploring' part of the title and go through how we can learn by experimenting and observing the outcome at each stage.
Although it's a bit easier for me to say that, considering I know most of the inner workings (not actual gluster per se 😉 ), I'll try my best not to deviate from simplicity and so will not be presenting any code walk-through.
Let's address this first: 'GlusterFS' is a feature-rich, distributed and scalable network filesystem widely adopted by enterprises and individual users, so I couldn't possibly explain how gluster works without cutting some corners. I'll expand on gluster-side workings only when it's absolutely necessary.
Previously we looked into the Operator and CSI components; this post fills the gap between those two and completes the series.
Introduction §
From the perspective of an application developer/user, they just need persistent storage when they ask for it and don't bother about how and where that storage is served.
We'll go from creating a kubernetes cluster all the way to fulfilling the user's request for persistent storage. Before we start off, let's look at some terminology needed to understand the functionality mentioned in the rest of the post.
Terminologies §
1. Gluster Deployment:
- Internal: The Kadalu operator takes care of the gluster volume life cycle and deploys containerized gluster server/client pods
- External: The user manages gluster volume creation and deletion; the gluster server typically resides outside of kubernetes, and internal gluster is used only as a client (~fuse mount) in this case
2. Storage:
- Storage Pool: Combination of gluster bricks served from internal gluster from which Persistent Volumes (PVs) are provided to end user
- Volumes: Used to signify PVs in this blog post
- Gluster Volume: Represents external gluster volume managed by user
3. Quota:
- Simple Quota: Used by internal gluster and is a part of kadalu's fork of upstream glusterfs. You can refer to this RFC to learn more
- Kadalu Quotad: Typically runs as a daemon process on the external gluster server to help internal simple quota set xattrs correctly. This only works for gluster volumes of non-distributed type
- Gluster Quota: Kadalu can delegate quota operations to gluster native quota when an SSH Key Pair is added in Kadalu namespace before deploying the operator
4. Kadalu Format:
- native: Default option for `kadalu_format` in the storage pool config. In this format, each volume (~pvc) is created as a fuse-subdir and thus supports volume expansion as well (a minimal config sketch follows this list)
- non-native: When `kadalu_format` is set to `non-native`, the whole storage pool can be used only for a single pvc and so no expansion is possible (unless you hack on the underlying bricks 😅 )
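As an illustration of where that flag lives, below is a minimal sketch of a pool config opting out of the default. The exact placement of kadalu_format under spec is my assumption based on the CR shown later in this post, so verify against the kadalu docs before using it.
# Sketch only: 'kadalu_format' placement is assumed, not copied from the docs
apiVersion: kadalu-operator.storage/v1alpha1
kind: KadaluStorage
metadata:
  name: replica1-nonnative
spec:
  type: Replica1
  kadalu_format: non-native   # whole pool backs a single PVC; no expansion
  storage:
    - node: k3d-test-agent-0
      device: /dev/sdc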
Now we can proceed with creating the kubernetes cluster using k3d. Kadalu can create a storage pool backed by raw devices, xfs-mounted paths, or any available PVC in the k8s cluster. For more info please refer to the kadalu docs.
I'll be using 3 devices from my host, and for the rest of the post the server and agent docker containers created by k3d can be considered k8s master and worker nodes respectively.
At the time of publishing this post, the last commit to the kadalu repo is b8e9c5f, the latest release is 0.8.4, and the features mentioned here, if not released already, will be part of the next release.
K3D Cluster §
As simple as it gets, k3d makes it very easy to create a kubernetes cluster running k3s. I created mine with the below command; the snippet is from here:
# Number of worker nodes
agents=3
# Pods need a shared mount
mkdir -p /tmp/k3d/kubelet/pods
# Create k3d test cluster
k3d cluster create test -a $agents \
-v /tmp/k3d/kubelet/pods:/var/lib/kubelet/pods:shared \
-v /dev/sdc:/dev/sdc -v /dev/sdd:/dev/sdd \
-v /dev/sde:/dev/sde \
-v ~/.k3d/registries.yaml:/etc/rancher/k3s/registries.yaml \
--k3s-server-arg "--kube-apiserver-arg=feature-gates=EphemeralContainers=true" \
--k3s-server-arg --disable=local-storage
We can see a single server (control-plane) node with three agents.
1 -> kubectl get ns
2 NAME STATUS AGE
3 default Active 3m39s
4 kube-system Active 3m39s
5 kube-public Active 3m39s
6 kube-node-lease Active 3m39s
7
8 -> kubectl get nodes -o wide
9 NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
10 k3d-test-server-0 Ready control-plane,master 3m39s v1.21.2+k3s1 172.18.0.2 <none> Unknown 5.12.14-300.fc34.x86_64 containerd://1.4.4-k3s2
11 k3d-test-agent-0 Ready <none> 3m39s v1.21.2+k3s1 172.18.0.3 <none> Unknown 5.12.14-300.fc34.x86_64 containerd://1.4.4-k3s2
12 k3d-test-agent-1 Ready <none> 3m39s v1.21.2+k3s1 172.18.0.4 <none> Unknown 5.12.14-300.fc34.x86_64 containerd://1.4.4-k3s2
13 k3d-test-agent-2 Ready <none> 3m39s v1.21.2+k3s1 172.18.0.6 <none> Unknown 5.12.14-300.fc34.x86_64 containerd://1.4.4-k3s2
14
15 -> kubectl version --short=true
16 Client Version: v1.20.2
17 Server Version: v1.21.2+k3s1
Kadalu Setup §
We'll create a secret before deploying kadalu; it'll be used with external gluster towards the end of the post.
1 -> kubectl create namespace kadalu
2 namespace/kadalu created
3
4 -> kubectl create secret generic glusterquota-ssh-secret --from-literal=glusterquota-ssh-username=root --from-file=ssh-privatekey=/root/.ssh/id_rsa -n kadalu
5 secret/glusterquota-ssh-secret created
6
7 -> kubectl config set-context --current --namespace=kadalu
8
9 -> kubectl get all
10 No resources found in kadalu namespace.
11
12 -> kubectl get csidrivers
13 No resources found
There are no CSIDrivers currently deployed (above, line 12)
1 # Warning can be ignored here
2 -> curl -s https://raw.githubusercontent.com/kadalu/kadalu/devel/manifests/kadalu-operator.yaml | sed 's/"no"/"yes"/' | kubectl apply -f -
3 Warning: resource namespaces/kadalu is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
4 namespace/kadalu configured
5 serviceaccount/kadalu-operator created
6 serviceaccount/kadalu-csi-nodeplugin created
7 serviceaccount/kadalu-csi-provisioner created
8 serviceaccount/kadalu-server-sa created
9 customresourcedefinition.apiextensions.k8s.io/kadalustorages.kadalu-operator.storage created
10 clusterrole.rbac.authorization.k8s.io/pod-exec created
11 clusterrole.rbac.authorization.k8s.io/kadalu-operator created
12 clusterrolebinding.rbac.authorization.k8s.io/kadalu-operator created
13 deployment.apps/operator created
14
15 -> curl -s https://raw.githubusercontent.com/kadalu/kadalu/devel/manifests/csi-nodeplugin.yaml | sed 's/"no"/"yes"/' | kubectl apply -f -
16 clusterrole.rbac.authorization.k8s.io/kadalu-csi-nodeplugin created
17 clusterrolebinding.rbac.authorization.k8s.io/kadalu-csi-nodeplugin created
18 daemonset.apps/kadalu-csi-nodeplugin created
19
20 -> kubectl get all
21 NAME READY STATUS RESTARTS AGE
22 pod/operator-88bd4784c-4ldbv 1/1 Running 0 115s
23 pod/kadalu-csi-provisioner-0 5/5 Running 0 110s
24 pod/kadalu-csi-nodeplugin-pzbf7 3/3 Running 0 95s
25 pod/kadalu-csi-nodeplugin-chc7d 3/3 Running 0 95s
26 pod/kadalu-csi-nodeplugin-lf5ml 3/3 Running 0 95s
27 pod/kadalu-csi-nodeplugin-6hlw9 3/3 Running 0 95s
28
29 NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
30 daemonset.apps/kadalu-csi-nodeplugin 4 4 4 4 4 <none> 95s
31
32 NAME READY UP-TO-DATE AVAILABLE AGE
33 deployment.apps/operator 1/1 1 1 115s
34
35 NAME DESIRED CURRENT READY AGE
36 replicaset.apps/operator-88bd4784c 1 1 1 115s
37
38 NAME READY AGE
39 statefulset.apps/kadalu-csi-provisioner 1/1 110s
Installation is split into two manifests to ease upgrades, though there's still room for improvement here. A typical kubernetes deployment can be seen above: the operator reconciled the state and created the provisioner along with all the necessary RBAC permissions. Later, the nodeplugin is deployed as a daemonset which runs on every node. The use of `sed` in the above listing is just to turn on verbose logging.
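If you want to confirm what that `sed` actually flipped, the operator deployment's env can be inspected as shown below; I'm not assuming a specific variable name here, the command just prints whatever is set.
# Print the env of the operator container (variable names may differ across releases)
kubectl -n kadalu get deploy operator -o jsonpath='{.spec.template.spec.containers[0].env}'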
Let's see what else is deployed and look at resources which are important to us.
1 -> kubectl get csidriver
2 NAME ATTACHREQUIRED PODINFOONMOUNT STORAGECAPACITY TOKENREQUESTS REQUIRESREPUBLISH MODES AGE
3 kadalu true false false <unset> false Persistent 72s
4
5 -> kubectl describe cm kadalu-info
6 Name: kadalu-info
7 Namespace: kadalu
8 Labels: <none>
9 Annotations: <none>
10
11 Data
12 ====
13 uid:
14 ----
15 f6689df0-c4e3-4ecb-a9d4-d788f0edd487
16 volumes:
17 ----
18
19 Events: <none>
20
21 -> kubectl get sc
22 No resources found
23
24 -> kubectl get kds
25 No resources found in kadalu namespace.
At this point the csidriver (line 1) and the `kadalu-info` config map (line 5) are easily the most important resources. `PODINFOONMOUNT` should've been `true`, which would deliver more info in gRPC calls to the kadalu CSI driver.
With the release of kubernetes v1.22, the GVK (Group, Version, Kind) of many important resources needs to be updated, and the above will be fixed as part of that.
Internal gluster uses neither `glusterd` nor the `gluster` CLI. The operator fills up `kadalu-info` dynamically while performing operations on the storage pool, and internal gluster reads `kadalu-info` to construct the necessary volfiles.
We can see no storage classes (line 21) and no kds (short for kadalustorages, line 24) yet.
By the end of this stage, if we face any issues or any of the above pods aren't in ready state, the logs of the operator, followed by the `kadalu-provisioner` container in the `kadalu-csi-provisioner-0` pod and the `kadalu-nodeplugin` container in the `kadalu-csi-nodeplugin-*` pods, need to be looked at to find the issue.
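A minimal sketch of pulling those logs in one go, using the resource names from the listings above:
# Operator first, then the provisioner, then one of the nodeplugin pods (picked from the daemonset)
kubectl logs deploy/operator -n kadalu --tail=50
kubectl logs kadalu-csi-provisioner-0 -c kadalu-provisioner -n kadalu --tail=50
kubectl logs daemonset/kadalu-csi-nodeplugin -c kadalu-nodeplugin -n kadalu --tail=50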
Storage Operations §
We'll deploy internal gluster as well as connect to an external gluster, and perform volume operations on both.
Internal Gluster §
All operations performed here are described in detail, so that by the time we reach the external gluster ops we'll have a good understanding of how things work internally.
Pool Creation §
We'll deploy a storage pool of `Replica3` type and look at all the resources that are created. Internal gluster supports storage pools of the below types:
- Pure distribute (`Replica1`)
- `Replica2`, `Replica3`
- `Disperse`
If we supply a multiple of the required number of disks (two times, for example), a distributed flavour of the storage pool is created.
Note: It's recommended to create an XFS filesystem on the disks before supplying them to Kadalu. This ensures that the XFS version on the host is always compatible with the shared disks.
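A minimal sketch of that preparation, assuming the same three devices mounted into k3d earlier (destructive, so double-check the device names):
# Create an XFS filesystem on each device from the host before handing it to Kadalu
sudo mkfs.xfs -f /dev/sdc
sudo mkfs.xfs -f /dev/sdd
sudo mkfs.xfs -f /dev/sde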
Please refer to the inline comments in the below listing.
1 # Our intention is to create a storage pool of type `Replica3` using devices
2 # spread across three nodes.
3 -> bat --plain ../storage-config-device.yaml | tee /dev/tty | kubectl apply -f -
4 ---
5 apiVersion: kadalu-operator.storage/v1alpha1
6 kind: KadaluStorage
7 metadata:
8 name: replica3
9 spec:
10 type: Replica3
11 storage:
12 - node: k3d-test-agent-0
13 device: /dev/sdc
14 - node: k3d-test-agent-1
15 device: /dev/sdd
16 - node: k3d-test-agent-2
17 device: /dev/sde
18 kadalustorage.kadalu-operator.storage/replica3 created
19
20 # One `server` pod per `device` (~brick) is created and a Headless Service is deployed with it.
21 -> kubectl get all -l app.kubernetes.io/component=server -o wide
22 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
23 pod/server-replica3-0-0 1/1 Running 0 20s 10.42.1.15 k3d-test-agent-0 <none> <none>
24 pod/server-replica3-1-0 1/1 Running 0 20s 10.42.2.11 k3d-test-agent-1 <none> <none>
25 pod/server-replica3-2-0 1/1 Running 0 19s 10.42.3.8 k3d-test-agent-2 <none> <none>
26
27 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
28 service/replica3 ClusterIP None <none> 24007/TCP 18s app.kubernetes.io/component=server,app.kubernetes.io/name=server,app.kubernetes.io/part-of=kadalu
29
30 NAME READY AGE CONTAINERS IMAGES
31 statefulset.apps/server-replica3-0 1/1 20s server docker.io/kadalu/kadalu-server:devel
32 statefulset.apps/server-replica3-1 1/1 20s server docker.io/kadalu/kadalu-server:devel
33 statefulset.apps/server-replica3-2 1/1 19s server docker.io/kadalu/kadalu-server:devel
34
35 # We can contact each `server` pod at below mentioned `Endpoints` or resolve to them using
36 # their DNS name and k8s will internally handle the dns query
37 -> kubectl describe svc
38 Name: replica3
39 Namespace: kadalu
40 Labels: app.kubernetes.io/component=server
41 app.kubernetes.io/name=replica3-service
42 app.kubernetes.io/part-of=kadalu
43 Annotations: <none>
44 Selector: app.kubernetes.io/component=server,app.kubernetes.io/name=server,app.kubernetes.io/part-of=kadalu
45 Type: ClusterIP
46 IP Family Policy: SingleStack
47 IP Families: IPv4
48 IP: None
49 IPs: None
50 Port: brickport 24007/TCP
51 TargetPort: 24007/TCP
52 Endpoints: 10.42.1.15:24007,10.42.2.11:24007,10.42.3.8:24007
53 Session Affinity: None
54 Events: <none>
55
56 # Below are the backend 'data' bricks; observe that an xfs filesystem is created
57 # on the supplied device and mounted at '/bricks/<storage-pool-name>/data'
58 # please observe the kadalu-info configmap in the next listing to find this path
59 -> for i in 0 1 2; do kubectl exec -it server-replica3-$i-0 -- sh -c 'hostname; df -hT | grep bricks; ls -lR /bricks/replica3/data'; done;
60 server-replica3-0-0
61 /dev/sdc xfs 10G 104M 9.9G 2% /bricks/replica3/data
62 /bricks/replica3/data:
63 total 0
64 drwxr-xr-x. 3 root root 24 Aug 11 05:38 brick
65
66 /bricks/replica3/data/brick:
67 total 0
68 server-replica3-1-0
69 /dev/sdd xfs 10G 104M 9.9G 2% /bricks/replica3/data
70 /bricks/replica3/data:
71 total 0
72 drwxr-xr-x. 3 root root 24 Aug 11 05:38 brick
73
74 /bricks/replica3/data/brick:
75 total 0
76 server-replica3-2-0
77 /dev/sde xfs 10G 104M 9.9G 2% /bricks/replica3/data
78 /bricks/replica3/data:
79 total 0
80 drwxr-xr-x. 3 root root 24 Aug 11 05:38 brick
81
82 /bricks/replica3/data/brick:
83 total 0
84
85 # After creation of storage pool, above pods & services are deployed and a directory
86 # structure is created in backend brick but nothing can be seen in `provisioner` and
87 # `nodeplugin` yet. Just showing `secret-volume` below which holds SSH Key Pair info
88 # of external gluster
89 -> kubectl exec -it kadalu-csi-provisioner-0 -c kadalu-provisioner -- sh -c 'df -h | grep -P secret'
90 tmpfs 3.9G 8.0K 3.9G 1% /etc/secret-volume
91 tmpfs 3.9G 12K 3.9G 1% /run/secrets/kubernetes.io/serviceaccount
92
93 -> kubectl exec -it kadalu-csi-provisioner-0 -c kadalu-provisioner -- sh -c 'df -h | grep kadalu'
94 command terminated with exit code 1
As we didn't touch either `provisioner` or `nodeplugin` yet, we should still look for any errors in the operator and server logs. Before moving further, please confirm access to the `server` pods using `busybox` in the `provisioner` pod.
1 # Ping should be successful and port `24007` should be reachable
2 -> kubectl exec -it sts/kadalu-csi-provisioner -c kadalu-logging -- sh -c 'ping -c 5 server-replica3-0-0.replica3; nc -zv server-replica3-0-0.replica3 24007'
3 PING server-replica3-0-0.replica3 (10.42.1.15): 56 data bytes
4 64 bytes from 10.42.1.15: seq=0 ttl=62 time=13.834 ms
5 64 bytes from 10.42.1.15: seq=1 ttl=62 time=0.319 ms
6 64 bytes from 10.42.1.15: seq=2 ttl=62 time=0.350 ms
7 64 bytes from 10.42.1.15: seq=3 ttl=62 time=0.286 ms
8 64 bytes from 10.42.1.15: seq=4 ttl=62 time=0.311 ms
9
10 --- server-replica3-0-0.replica3 ping statistics ---
11 5 packets transmitted, 5 packets received, 0% packet loss
12 round-trip min/avg/max = 0.286/3.020/13.834 ms
13 server-replica3-0-0.replica3 (10.42.1.15:24007) open
Please refer to this workaround if you face any issues contacting the `server` pods, especially if you deployed kubernetes on VMware machines.
Next stop: finding out how the `server` pod is able to pick up the correct devices and perform the necessary operations on them. For simplicity, I'm not showing the gluster process in any of the pods, so whenever there's a mount (of type XFS) on a `server` pod or a mount (of type fuse.glusterfs) on any of the `provisioner`/`nodeplugin`/app pods, consider that `glusterfs` is running as a daemon in those containers.
Read about the magic here and here, and then it won't be magic anymore 😂 .
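If you'd like to verify that for yourself, a quick check is to look for the glusterfs processes inside the pods; this assumes `ps` is available in the images (otherwise peek at /proc/*/cmdline), and the client-side processes only show up once a fuse mount exists.
# '[g]luster' avoids grep matching itself
kubectl exec -it server-replica3-0-0 -- sh -c 'ps ax | grep [g]luster'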
1 # This here, I'd say is the most important piece gluing operator, server and csi pods.
2 # Right off the bat you can see all the required info needed to create a volfile which
3 # when supplied to the glusterfs binary does what it does best, pooling storage and
4 # serving it under a single namespace
5
6 # Json formatted with 'python -mjson.tool' for readability
7 -> kubectl describe cm kadalu-info
8 Name: kadalu-info
9 Namespace: kadalu
10 Labels: <none>
11 Annotations: <none>
12
13 Data
14 ====
15 volumes:
16 ----
17
18 replica3.info:
19 ----
20 {
21 "namespace": "kadalu",
22 "kadalu_version": "devel",
23 "volname": "replica3",
24 "volume_id": "5e39a614-fa66-11eb-a07e-56a8d556e557",
25 "kadalu_format": "native",
26 "type": "Replica3",
27 "pvReclaimPolicy": "delete",
28 "bricks": [
29 {
30 "brick_path": "/bricks/replica3/data/brick",
31 "kube_hostname": "k3d-test-agent-0",
32 "node": "server-replica3-0-0.replica3",
33 "node_id": "node-0",
34 "host_brick_path": "",
35 "brick_device": "/dev/sdc",
36 "pvc_name": "",
37 "brick_device_dir": "",
38 "decommissioned": "",
39 "brick_index": 0
40 },
41 {
42 "brick_path": "/bricks/replica3/data/brick",
43 "kube_hostname": "k3d-test-agent-1",
44 "node": "server-replica3-1-0.replica3",
45 "node_id": "node-1",
46 "host_brick_path": "",
47 "brick_device": "/dev/sdd",
48 "pvc_name": "",
49 "brick_device_dir": "",
50 "decommissioned": "",
51 "brick_index": 1
52 },
53 {
54 "brick_path": "/bricks/replica3/data/brick",
55 "kube_hostname": "k3d-test-agent-2",
56 "node": "server-replica3-2-0.replica3",
57 "node_id": "node-2",
58 "host_brick_path": "",
59 "brick_device": "/dev/sde",
60 "pvc_name": "",
61 "brick_device_dir": "",
62 "decommissioned": "",
63 "brick_index": 2
64 }
65 ],
66 "disperse": {
67 "data": 0,
68 "redundancy": 0
69 },
70 "options": {}
71 }
72
73 uid:
74 ----
75 f6689df0-c4e3-4ecb-a9d4-d788f0edd487
76 Events: <none>
Volume Operations §
We have a storage pool available and ready to be used. However, if we look around carefully, we'll find two more important resources were created, one being a `storageClass`, the de-facto way in kubernetes of carving out dynamic PVCs.
1 # Moment of truth, this is what we finally want: a 'storageClass' to create PVs
2 # Look at 'ALLOWVOLUMEEXPANSION'; it's set to 'true' as we set 'kadalu_format' to 'native' in the storage pool config
3 # Observe the naming: the storage pool is named 'replica3' and it deploys an 'sc' named 'kadalu.replica3'
4 -> kubectl get kds,sc
5 NAME AGE
6 kadalustorage.kadalu-operator.storage/replica3 31m
7
8 NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
9 storageclass.storage.k8s.io/kadalu.replica3 kadalu Delete Immediate true 31m
10
11 # As expected no PV or PVC yet
12 -> kubectl get pv,pvc
13 No resources found
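To dig a bit into how the storageClass refers back to the storage pool, it can be dumped as yaml. I'm not assuming specific parameter keys here; whatever the operator set will simply show up in the output.
# Inspect the generated storageClass and its parameters
kubectl get sc kadalu.replica3 -o yaml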
Even at this stage, if there are any issues, the operator logs need to be referred to. On to the next stage: carving a PVC from the storage pool.
We can create a PVC from the `kadalu.replica3` storageClass; let's observe what resources get created and what changes happen in the CSI pods.
1 # Ask for a 1Gi pvc from 'kadalu.replica3' storageClass
2 -> bat --plain ../sample-pvc.yaml -r :13 | tee /dev/tty | kubectl apply -f -
3 # File: sample-pvc.yaml
4 ---
5 kind: PersistentVolumeClaim
6 apiVersion: v1
7 metadata:
8 name: replica3-pvc
9 spec:
10 storageClassName: kadalu.replica3
11 accessModes:
12 - ReadWriteMany
13 resources:
14 requests:
15 storage: 1Gi
16 persistentvolumeclaim/replica3-pvc created
17
18 # PVC is created and a PV is bound to the dynamically created PVC
19 # Observe 'RECLAIM POLICY', currently it's 'Delete'; if we want to retain the PVC when a delete request is
20 # received on it, the storage pool needs to be created with 'pvReclaimPolicy' as 'archive'
21 -> kubectl get pv,pvc
22 NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
23 persistentvolume/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7 1Gi RWX Delete Bound kadalu/replica3-pvc kadalu.replica3 42s
24
25 NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
26 persistentvolumeclaim/replica3-pvc Bound pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7 1Gi RWX kadalu.replica3 43s
27
28 # Now things start to change in 'provisioner': the sidecar (csi-provisioner) listens to the k8s api-server
29 # and sends a gRPC call to 'kadalu-provisioner' for creation of the PVC
30 -> kubectl exec -it kadalu-csi-provisioner-0 -c kadalu-provisioner -- bash
31 root@kadalu-csi-provisioner-0:/# df -hT | grep kadalu
32 kadalu:replica3 fuse.glusterfs 10G 207M 9.8G 3% /mnt/replica3
33 root@kadalu-csi-provisioner-0:/# ls /mnt/replica3/
34 info stat.db subvol
35 root@kadalu-csi-provisioner-0:/# ls /mnt/replica3/subvol/c3/02/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7/
36 root@kadalu-csi-provisioner-0:/# cat /mnt/replica3/info/subvol/c3/02/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7.json
37 {"size": 1073741824, "path_prefix": "subvol/c3/02"}root@kadalu-csi-provisioner-0:/#
Referring to the above listing we can deduce:
- Except for the `glusterfs` process in the `server` pod, the `glusterfs` process acts as a client (~mount) in the other (provisioner/nodeplugin/app) pods
- A filesystem `kadalu:<storage-pool>` of type `fuse.glusterfs` is mounted on `/mnt/<storage-pool>`
- The storage pool has three entries:
  - info: Holds subvol info and mirrors the subvol directory structure, but at the leaf it has pvc info (in json format)
  - stat.db: Holds pvc entries and summary tables which belong to the storage pool, used in pretty interesting ways in kubectl_kadalu among other places (see the peek just after this list)
  - subvol: At the leaf contains the actual PVC directory, which is currently empty as expected
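If you're curious about stat.db, my understanding is that it's a SQLite database; assuming that, and assuming the sqlite3 CLI is available in the provisioner image (both assumptions worth verifying), a quick peek at its schema would look like this:
# Both the SQLite format and the presence of the sqlite3 binary are assumptions here
kubectl exec -it kadalu-csi-provisioner-0 -c kadalu-provisioner -- sqlite3 /mnt/replica3/stat.db '.tables'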
Most of the operations happen in the `kadalu-csi-provisioner-0` pod, so its logs should be checked for errors if a PVC is stuck in pending state. If those logs are clean and the issue still persists, the `server` pod logs should be analysed.
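When that happens, a couple of quick checks usually narrow things down; the PVC's own events tend to name the failing step before you even open the pod logs.
# Events on the PVC, then the provisioner container logs
kubectl describe pvc replica3-pvc
kubectl logs kadalu-csi-provisioner-0 -c kadalu-provisioner --tail=50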
Sample App §
Now that we've got a PVC, let's use it in an app pod, observe the changes that happen (predominantly in the nodeplugin) and find out all the places we can access the data from.
1 # Normal 'busybox' but with a volume mount referring to above created PVC
2 -> bat --plain ../sample-app.yaml -r :22 | tee /dev/tty | kubectl apply -f -
3 # File: sample-app.yaml
4 ---
5 apiVersion: v1
6 kind: Pod
7 metadata:
8 name: replica3-pvc-pod
9 spec:
10 containers:
11 - name: replica3-pvc-pod
12 image: busybox
13 imagePullPolicy: IfNotPresent
14 command:
15 - '/bin/tail'
16 - '-f'
17 - '/dev/null'
18 volumeMounts:
19 - mountPath: '/mnt/replica3-pvc'
20 name: replica3-pvc
21 volumes:
22 - name: replica3-pvc
23 persistentVolumeClaim:
24 claimName: replica3-pvc
25 pod/replica3-pvc-pod created
26
27 # Pod is scheduled on node 'k3d-test-server-0', let's find nodeplugin which is running on that node
28 -> kubectl get pods replica3-pvc-pod -o wide
29 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
30 replica3-pvc-pod 1/1 Running 0 50s 10.42.0.8 k3d-test-server-0 <none> <none>
31
32 # We are interested in below nodeplugin
33 -> kubectl get pods -o wide | grep k3d-test-server-0
34 kadalu-csi-nodeplugin-6hlw9 3/3 Running 0 95m 10.42.0.6 k3d-test-server-0 <none> <none>
35 replica3-pvc-pod 1/1 Running 0 20m 10.42.0.8 k3d-test-server-0 <none> <none>
36
37 # Wait a sec, why does the nodeplugin also have the whole storage pool mounted, not just the PVC?
38 # Remember, we use the fuse-subdir functionality of glusterfs to surface only the required PVC to the app pod
39 -> kubectl exec -it kadalu-csi-nodeplugin-6hlw9 -c kadalu-nodeplugin -- bash
40 root@kadalu-csi-nodeplugin-6hlw9:/# df -hT | grep kadalu
41 kadalu:replica3 fuse.glusterfs 10G 207M 9.8G 3% /mnt/replica3
42 root@kadalu-csi-nodeplugin-6hlw9:/# ls /mnt/replica3/
43 info stat.db subvol
44 root@kadalu-csi-nodeplugin-6hlw9:/# ls /mnt/replica3/subvol/c3/02/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7/
45 root@kadalu-csi-nodeplugin-6hlw9:/# cat /mnt/replica3/info/subvol/c3/02/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7.json
46 {"size": 1073741824, "path_prefix": "subvol/c3/02"}root@kadalu-csi-nodeplugin-6hlw9:/#
47
48 # Before tracing how the pvc is surfaced to the app pod, let's write some data to the mount
49 # Ok, the nodeplugin sees the whole storage pool but the app pod sees only 1Gi with the same filesystem name?
50 # We'll look at how that is possible in the next listing
51 # Anyway, we can see an increase of 100M, and it is reflected in the nodeplugin as well
52 -> k exec -it replica3-pvc-pod -- sh
53 / # df -hT | grep kadalu
54 kadalu:replica3 fuse.glusterfs 1.0G 0 1.0G 0% /mnt/replica3-pvc
55 / # cd /mnt/replica3-pvc/
56 /mnt/replica3-pvc # cat /dev/urandom | tr -dc [:space:][:print:] | head -c 100m > 100Mfile;
57 /mnt/replica3-pvc # df -h | grep kadalu
58 kadalu:replica3 1.0G 100.0M 924.0M 10% /mnt/replica3-pvc
59
60 -> kubectl exec -it kadalu-csi-nodeplugin-6hlw9 -c kadalu-nodeplugin -- sh -c 'df -hT | grep kadalu'
61 kadalu:replica3 fuse.glusterfs 10G 307M 9.7G 3% /mnt/replica3
Well, most resources have an associated uid, so let's start from there to find out what kubelet did for replica3-pvc-pod to have access to only 1Gi of the storage pool, and the role of the nodeplugin in it as well.
1 # We get the uids and hunt for volumes mounted in pods by traversing the corresponding 'kubelet' volumes directory
2 -> kubectl get pods -o=custom-columns='NAME:.metadata.name,NODE:.spec.nodeName,UID:.metadata.uid' | grep -P 'provisioner|nodeplugin-6hlw9|replica3-pvc'
3 kadalu-csi-provisioner-0 k3d-test-agent-2 c92d4fa9-90cf-46a5-8bd6-93aeebaca6a9
4 kadalu-csi-nodeplugin-6hlw9 k3d-test-server-0 98166ddd-5157-4572-ba6c-0a577efd6127
5 replica3-pvc-pod k3d-test-server-0 4ad3ba8c-457e-4b81-9404-b793967165b2
6
7 -> kubectl get pvc
8 NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
9 replica3-pvc Bound pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7 1Gi RWX kadalu.replica3 3h10m
10
11 # Entries in nodes running 'nodeplugin' and 'provisioner' pods are not interesting
12 # They are self-explanatory and 'projected' just holds access keys for api-server
13 -> docker exec -it k3d-test-agent-2 sh -c 'ls /var/lib/kubelet/pods/c92d4fa9-90cf-46a5-8bd6-93aeebaca6a9/volumes'
14 kubernetes.io~configmap kubernetes.io~projected
15 kubernetes.io~empty-dir kubernetes.io~secret
16
17 -> docker exec -it k3d-test-server-0 sh -c 'ls /var/lib/kubelet/pods/98166ddd-5157-4572-ba6c-0a577efd6127/volumes'
18 kubernetes.io~configmap kubernetes.io~empty-dir kubernetes.io~projected
19
20 # Entry 'kubernetes.io~csi' is interesting for us and this uid belongs to app pod (replica3-pvc-pod)
21 -> docker exec -it k3d-test-server-0 sh -c 'ls /var/lib/kubelet/pods/4ad3ba8c-457e-4b81-9404-b793967165b2/volumes'
22 kubernetes.io~csi kubernetes.io~projected
23
24 # We (the nodeplugin running on the node where 'replica3-pvc-pod' is to be scheduled) received a request to mount the pvc at 'target_path'
25 # 'kubelet' waits for the nodeplugin to respond and, after confirming that the mount is available, the app pod starts on that node
26 # Btw, the 'nodeplugin' mounts only the PVC subdir, which is limited to 1Gi by simple-quota and surfaced to 'replica3-pvc-pod'
27 -> kubectl logs kadalu-csi-nodeplugin-6hlw9 -c kadalu-nodeplugin | sed -n '/Received/,/Mounted PV/p'
28 [2021-08-11 06:51:41,627] DEBUG [nodeserver - 71:NodePublishVolume] - Received a valid mount request request=volume_id: "pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7"
29 target_path: "/var/lib/kubelet/pods/4ad3ba8c-457e-4b81-9404-b793967165b2/volumes/kubernetes.io~csi/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7/mount"
30 [...]
31 volume_context {
32 key: "path"
33 value: "subvol/c3/02/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7"
34 }
35 volume_context {
36 key: "pvtype"
37 value: "subvol"
38 }
39 [...]
40 volume_context {
41 key: "type"
42 value: "Replica3"
43 }
44 voltype=Replica3 hostvol=replica3 pvpath=subvol/c3/02/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7 pvtype=subvol
45 [2021-08-11 06:51:41,769] DEBUG [nodeserver - 104:NodePublishVolume] - Mounted Hosting Volume pv=pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7 hostvol=replica3 mntdir=/mnt/replica3
46 [2021-08-11 06:51:41,841] INFO [nodeserver - 113:NodePublishVolume] - Mounted PV volume=pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7 pvpath=subvol/c3/02/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7 pvtype=subvol hostvol=replica3 duration_seconds=0.21625757217407227
All of the above is done by the `kadalu-nodeplugin` container in the `kadalu-csi-nodeplugin-*` pod, so its logs should be referred to for any errors.
We can access the `100Mfile` written above from the below locations, though please note it isn't intended to be accessed outside of the app pod, and definitely not from the backend bricks; if the storage pool is of type `disperse`, you can't read the data directly from a brick at all.
1 # From app pod 'replica3-pvc-pod'
2 -> kubectl exec -it replica3-pvc-pod -- sh -c 'ls -lh /mnt/replica3-pvc'
3 total 100M
4 -rw-r--r-- 1 root root 100.0M Aug 11 07:22 100Mfile
5
6 # From the node where app pod is running
7 -> docker exec -it k3d-test-server-0 sh -c 'ls -lh /var/lib/kubelet/pods/4ad3ba8c-457e-4b81-9404-b793967165b2/volumes/kubernetes.io~csi/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7/mount'
8 total 100M
9 -rw-r--r-- 1 0 0 100M Aug 11 07:22 100Mfile
10
11 # From nodeplugin on the node where app pod is running
12 -> kubectl exec -it kadalu-csi-nodeplugin-6hlw9 -c kadalu-nodeplugin -- sh -c 'ls -lh /mnt/replica3/subvol/c3/02/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7'
13 total 100M
14 -rw-r--r--. 1 root root 100M Aug 11 07:22 100Mfile
15
16 # From provisioner, all quota related operations are performed in provisioner pod
17 -> kubectl exec -it kadalu-csi-provisioner-0 -c kadalu-provisioner -- sh -c 'ls -lh /mnt/replica3/subvol/c3/02/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7'
18 total 100M
19 -rw-r--r--. 1 root root 100M Aug 11 07:22 100Mfile
20
21 # Lastly from any of the backend bricks, you can't read data if pool is of type `disperse`
22 -> kubectl exec -it server-replica3-0-0 -- sh -c 'ls -lh /bricks/replica3/data/brick/subvol/c3/02/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7'
23 total 100M
24 -rw-r--r--. 2 root root 100M Aug 11 07:22 100Mfile
Let's perform one last operation before looking into external gluster, i.e., expand the PVC from 1Gi to 2Gi.
1 # As of now, app pod sees only 1Gi of size
2 -> kubectl exec -it replica3-pvc-pod -- sh -c 'df -h /mnt/replica3-pvc'
3 Filesystem Size Used Available Use% Mounted on
4 kadalu:replica3 1.0G 100.0M 924.0M 10% /mnt/replica3-pvc
5
6 # Just change 'storage' to 2Gi
7 -> bat --plain ../sample-pvc.yaml -r :13 | k apply -f -
8 # File: sample-pvc.yaml
9 ---
10 kind: PersistentVolumeClaim
11 apiVersion: v1
12 metadata:
13 name: replica3-pvc
14 spec:
15 storageClassName: kadalu.replica3
16 accessModes:
17 - ReadWriteMany
18 resources:
19 requests:
20 storage: 2Gi
21 persistentvolumeclaim/replica3-pvc configured
22
23 # As the operation involved is only manipulating quota, it'll be near instant
24 # for the updated 'Size' to reflect in the app pod
25 -> kubectl exec -it replica3-pvc-pod -- sh -c 'df -h /mnt/replica3-pvc'
26 Filesystem Size Used Available Use% Mounted on
27 kadalu:replica3 2.0G 100.0M 1.9G 5% /mnt/replica3-pvc
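Editing the manifest and re-applying is what I did above; the same resize request can also be made with a one-liner patch, which results in the same update to the PVC object:
# Bump the requested size on the live PVC directly
kubectl patch pvc replica3-pvc -n kadalu --type merge -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}'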
External Gluster §
I created a one-brick volume on an external gluster cluster to demo kadalu's capabilities. We'll be moving relatively fast as most of the ground is already covered.
1 # Make sure volume is started and reachable from outside the cluster
2 # We just need below info:
3 # gluster_host: 10.x.x.x
4 # gluster_volname: dist
5 -> ssh ext-gluster 'gluster volume info'
6
7 Volume Name: dist
8 Type: Distribute
9 Volume ID: 09f25449-c7d8-4904-9293-a45b848221ac
10 Status: Started
11 Snapshot Count: 0
12 Number of Bricks: 1
13 Transport-type: tcp
14 Bricks:
15 Brick1: 10.x.x.x:/bricks/brick1/dist
16 Options Reconfigured:
17 storage.fips-mode-rchecksum: on
18 transport.address-family: inet
19 nfs.disable: on
20
21 # Enable quota on gluster volume to be able to use in kadalu native format
22 -> ssh ext-gluster 'gluster volume quota dist enable'
23 volume quota : success
24
25 -> kubectl get kds,sc
26 NAME AGE
27 kadalustorage.kadalu-operator.storage/replica3 4h35m
28
29 NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
30 storageclass.storage.k8s.io/kadalu.replica3 kadalu Delete Immediate true 4h35m
31
32 -> kubectl exec -it kadalu-csi-provisioner-0 -c kadalu-provisioner -- sh -c 'df -hT | grep kadalu'
33 kadalu:replica3 fuse.glusterfs 10G 307M 9.7G 3% /mnt/replica3
Pool Creation §
I'll continue using the same setup as above to show that external and internal gluster can be used side by side without any issues.
1 # We are using same CRD for both internal and external gluster, operator will validate parameters based on
2 # storage pool type, below is of type 'External'
3 -> bat --plain ../storage-config-external.yaml | tee /dev/tty | kubectl apply -f -
4 ---
5 apiVersion: kadalu-operator.storage/v1alpha1
6 kind: KadaluStorage
7 metadata:
8 name: ext-conf
9 spec:
10 type: External
11 details:
12 gluster_host: 10.x.x.x
13 gluster_volname: dist
14 gluster_options: log-level=DEBUG
15 kadalustorage.kadalu-operator.storage/ext-conf created
16
17 # As we saw earlier, 'kadalu.' is prepended to the applied CR name
18 # However, please note 'ALLOWVOLUMEEXPANSION' is set to true only if we are using kadalu_format as 'native'
19 # and an SSH key pair secret for accessing the external cluster is mounted in the provisioner pod
20 -> kubectl get kds,sc
21 NAME AGE
22 kadalustorage.kadalu-operator.storage/replica3 4h37m
23 kadalustorage.kadalu-operator.storage/ext-conf 41s
24
25 NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
26 storageclass.storage.k8s.io/kadalu.replica3 kadalu Delete Immediate true 4h37m
27 storageclass.storage.k8s.io/kadalu.ext-conf kadalu Delete Immediate true 36s
No `server` pods are created; the operator just tries to connect to the external gluster cluster and fetch the volfile. On success, it fills the `kadalu-info` config map, which is later used in the `provisioner` and `nodeplugin` pods.
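If the pool gets stuck at this stage, it's worth reusing the earlier connectivity trick to confirm that the external gluster host is reachable from inside the cluster; glusterd's management port is 24007, and 10.x.x.x stands in for your own gluster_host.
kubectl exec -it sts/kadalu-csi-provisioner -c kadalu-logging -- sh -c 'nc -zv 10.x.x.x 24007'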
The below info is added to the config map (existing entries are trimmed in the listing)
ext-conf.info:
----
{
"volname": "ext-conf",
"volume_id": "19ce687a-fa8e-11eb-a07e-56a8d556e557",
"type": "External",
"pvReclaimPolicy": "delete",
"kadalu_format": "native",
"gluster_hosts": "10.x.x.x",
"gluster_volname": "dist",
"gluster_options": "log-level=DEBUG"
}
Volume Operations §
Let's create a 1Gi PVC from the 'External' storage pool and look at the pv & pvc.
1 -> bat --plain ../sample-pvc.yaml -r 14:25 | tee /dev/tty | kubectl apply -f -
2 ---
3 kind: PersistentVolumeClaim
4 apiVersion: v1
5 metadata:
6 name: ext-pvc
7 spec:
8 storageClassName: kadalu.ext-conf
9 accessModes:
10 - ReadWriteMany
11 resources:
12 requests:
13 storage: 1Gi
14 persistentvolumeclaim/ext-pvc created
15
16 -> kubectl get pv,pvc
17 NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
18 persistentvolume/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7 2Gi RWX Delete Bound kadalu/replica3-pvc kadalu.replica3 4h3m
19 persistentvolume/pvc-958bcfeb-65ff-4b41-8c36-6762a0e255f8 1Gi RWX Delete Bound kadalu/ext-pvc kadalu.ext-conf 19s
20
21 NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
22 persistentvolumeclaim/replica3-pvc Bound pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7 2Gi RWX kadalu.replica3 4h3m
23 persistentvolumeclaim/ext-pvc Bound pvc-958bcfeb-65ff-4b41-8c36-6762a0e255f8 1Gi RWX kadalu.ext-conf 38s
Mounts similar to the ones seen earlier will appear in the `provisioner` pod when a create-PVC request is fulfilled; however, let's see what happens in external gluster.
1 # Similar directory structure as discussed earlier
2 -> ssh ext-gluster 'ls -l /bricks/brick1/dist/'
3 total 20
4 drwxr-xr-x. 3 root root 20 Aug 11 15:54 info
5 -rw-r--r--. 2 root root 20480 Aug 11 15:54 stat.db
6 drwxr-xr-x. 3 root root 16 Aug 11 15:54 subvol
7
8 -> ssh ext-gluster 'cat /bricks/brick1/dist/info/subvol/b0/e9/pvc-958bcfeb-65ff-4b41-8c36-6762a0e255f8.json '
9 {"size": 1073741824, "path_prefix": "subvol/b0/e9"}
10
11 -> ssh ext-gluster 'ls -l /bricks/brick1/dist/subvol/b0/e9/pvc-958bcfeb-65ff-4b41-8c36-6762a0e255f8'
12 total 0
Sample App §
Let's deploy a sample pod with the busybox image to use 'ext-pvc'. To cut the explanation short, you can see for yourself that, apart from storage pool creation, kadalu tries to provide a similar interface for both the internally auto-managed and the externally self-managed gluster cluster.
The below listings are provided to compare and contrast against the previous set of operations.
-> bat --plain ../sample-app.yaml -r 23:43 | tee /dev/tty | kubectl apply -f -
---
apiVersion: v1
kind: Pod
metadata:
name: ext-pvc-pod
spec:
containers:
- name: ext-pvc-pod
image: busybox
imagePullPolicy: IfNotPresent
command:
- '/bin/tail'
- '-f'
- '/dev/null'
volumeMounts:
- mountPath: '/mnt/ext-pvc'
name: ext-pvc
volumes:
- name: ext-pvc
persistentVolumeClaim:
claimName: ext-pvc
pod/ext-pvc-pod created
-> kubectl get pods ext-pvc-pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ext-pvc-pod 1/1 Running 0 34s 10.42.2.12 k3d-test-agent-1 <none> <none>
-> kubectl exec -it ext-pvc-pod -- sh
/ # df -h /mnt/ext-pvc
Filesystem Size Used Available Use% Mounted on
10.x.x.x:dist 972.8M 0 972.8M 0% /mnt/ext-pvc
/ # cd /mnt/ext-pvc/
/mnt/ext-pvc # cat /dev/urandom | tr -dc [:space:][:print:] | head -c 100m > 100Mfile
/mnt/ext-pvc # ls -lh
total 100M
-rw-r--r-- 1 root root 100.0M Aug 11 10:34 100Mfile
-> ssh ext-gluster 'ls -lh /bricks/brick1/dist/subvol/b0/e9/pvc-958bcfeb-65ff-4b41-8c36-6762a0e255f8'
total 100M
-rw-r--r--. 2 root root 100M Aug 11 16:04 100Mfile
Let's expand the PVC from 1Gi to 2Gi. Reiterating the pre-requisite: PVC expansion for external storage pools can only be performed if `kadalu_format` is `native` and an SSH key pair is available for Kadalu to delegate quota operations to the external gluster cluster.
-> bat --plain ../sample-pvc.yaml -r 14:25 | tee /dev/tty | kubectl apply -f -
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: ext-pvc
spec:
storageClassName: kadalu.ext-conf
accessModes:
- ReadWriteMany
resources:
requests:
storage: 2Gi
persistentvolumeclaim/ext-pvc configured
-> kubectl exec -it ext-pvc-pod -- sh -c 'df -h | grep ext-pvc'
10.x.x.x:dist 1.9G 100.0M 1.8G 5% /mnt/ext-pvc
As the operations are the same, the places to look for logs when things go wrong are also the same at each stage. For easy reference, we can have a look at the sidecar container logs as well, to get a feel for how communication happens between our CSI driver (via csi.sock), the sidecar containers and the kubernetes API.
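Sidecar container names can vary between kadalu releases, so rather than guessing them, list the containers first and then tail the one you're interested in; the csi-provisioner name below is only a hypothetical example.
# Print the container names in the provisioner pod, then tail a sidecar of your choice
kubectl get pod kadalu-csi-provisioner-0 -o jsonpath='{.spec.containers[*].name}{"\n"}'
kubectl logs kadalu-csi-provisioner-0 -c csi-provisioner --tail=20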
Kadalu Cleanup §
We've almost reached the end; let's take a step back and also look into cleaning up kadalu storage.
# Obviously, we need to delete app pods if no longer needed; however the PVC will stay intact
# in case the app pod needs to access the data again (as container restarts may happen)
-> kubectl delete pod replica3-pvc-pod ext-pvc-pod
pod "replica3-pvc-pod" deleted
pod "ext-pvc-pod" deleted
-> kubectl get sc,pvc,pv
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
storageclass.storage.k8s.io/kadalu.replica3 kadalu Delete Immediate true 5h5m
storageclass.storage.k8s.io/kadalu.ext-conf kadalu Delete Immediate true 21m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/replica3-pvc Bound pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7 2Gi RWX kadalu.replica3 4h22m
persistentvolumeclaim/ext-pvc Bound pvc-958bcfeb-65ff-4b41-8c36-6762a0e255f8 2Gi RWX kadalu.ext-conf 20m
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-dcbb7812-a5a2-402a-8ffd-2a0020a6ecc7 2Gi RWX Delete Bound kadalu/replica3-pvc kadalu.replica3 4h22m
persistentvolume/pvc-958bcfeb-65ff-4b41-8c36-6762a0e255f8 2Gi RWX Delete Bound kadalu/ext-pvc kadalu.ext-conf 20m
# Important: If you try to delete a 'kds' resource while a PVC from that storage pool is still in use,
# kubectl may state that the resource is deleted but in reality it won't be.
# We need to re-apply the same storage pool config (the operator reconciles the state), delete any
# remaining PVCs, and then the storage pool can be deleted
-> kubectl delete pvc --all
persistentvolumeclaim "replica3-pvc" deleted
persistentvolumeclaim "ext-pvc" deleted
As a general note, if you didn't create a resource, say the `storageClass` or the `kadalu-info` config map, you shouldn't be deleting it, and currently it's not always guaranteed that the operator can reconcile from such a state.
We can confirm that after deletion of the PVCs the pvc directories are gone, but note that the surrounding structure stays intact; this could be enhanced to remove the structure when no PVC is being served at the leaf directory 😄
-> ssh ext-gluster 'ls -R /bricks/brick1/dist'
/bricks/brick1/dist:
info
stat.db
subvol
/bricks/brick1/dist/info:
subvol
/bricks/brick1/dist/info/subvol:
b0
/bricks/brick1/dist/info/subvol/b0:
e9
/bricks/brick1/dist/info/subvol/b0/e9:
/bricks/brick1/dist/subvol:
b0
/bricks/brick1/dist/subvol/b0:
e9
/bricks/brick1/dist/subvol/b0/e9:
-> kubectl exec -t kadalu-csi-provisioner-0 -c kadalu-provisioner -- sh -c 'ls -R /mnt/replica3'
/mnt/replica3:
info
stat.db
subvol
/mnt/replica3/info:
subvol
/mnt/replica3/info/subvol:
c3
/mnt/replica3/info/subvol/c3:
02
/mnt/replica3/info/subvol/c3/02:
/mnt/replica3/subvol:
c3
/mnt/replica3/subvol/c3:
02
/mnt/replica3/subvol/c3/02:
We didn't delete the `kds` (storage pool) resources yet, so the `kadalu-info` config map still holds the details needed to carve another PVC if requested. Most of the data is trimmed as it'll be the same as before.
# Json formatted with 'python -mjson.tool' for readability
-> kubectl describe cm kadalu-info
Name: kadalu-info
Namespace: kadalu
Labels: <none>
Annotations: <none>
Data
====
ext-conf.info:
----
{
"volname": "ext-conf",
"volume_id": "19ce687a-fa8e-11eb-a07e-56a8d556e557",
[...]
}
replica3.info:
----
{
"namespace": "kadalu",
"kadalu_version": "devel",
"volname": "replica3",
"volume_id": "5e39a614-fa66-11eb-a07e-56a8d556e557",
"kadalu_format": "native",
"type": "Replica3",
"pvReclaimPolicy": "delete",
"bricks": [
[...]
],
"disperse": {
"data": 0,
"redundancy": 0
},
"options": {}
}
uid:
----
f6689df0-c4e3-4ecb-a9d4-d788f0edd487
volumes:
----
Events: <none>
When we delete the `kds` (storage pool) resources as well, we are left with only the resources deployed by the operator and csi-nodeplugin manifests.
-> kubectl delete kds --all
kadalustorage.kadalu-operator.storage "replica3" deleted
kadalustorage.kadalu-operator.storage "ext-conf" deleted
-> kubectl get all
NAME READY STATUS RESTARTS AGE
pod/operator-88bd4784c-4ldbv 1/1 Running 0 5h23m
pod/kadalu-csi-provisioner-0 5/5 Running 0 5h23m
pod/kadalu-csi-nodeplugin-pzbf7 3/3 Running 0 5h22m
pod/kadalu-csi-nodeplugin-chc7d 3/3 Running 0 5h22m
pod/kadalu-csi-nodeplugin-lf5ml 3/3 Running 0 5h22m
pod/kadalu-csi-nodeplugin-6hlw9 3/3 Running 0 5h22m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kadalu-csi-nodeplugin 4 4 4 4 4 <none> 5h22m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/operator 1/1 1 1 5h23m
NAME DESIRED CURRENT READY AGE
replicaset.apps/operator-88bd4784c 1 1 1 5h23m
NAME READY AGE
statefulset.apps/kadalu-csi-provisioner 1/1 5h23m
-> kubectl describe cm kadalu-info
Name: kadalu-info
Namespace: kadalu
Labels: <none>
Annotations: <none>
Data
====
uid:
----
f6689df0-c4e3-4ecb-a9d4-d788f0edd487
volumes:
----
Events: <none>
Supply the manifests which deployed the csi-nodeplugin, the operator and all dependent RBAC roles etc. to delete them, as below:
-> curl -s https://raw.githubusercontent.com/kadalu/kadalu/devel/manifests/csi-nodeplugin.yaml | sed 's/"no"/"yes"/' | kubectl delete -f -
clusterrole.rbac.authorization.k8s.io "kadalu-csi-nodeplugin" deleted
clusterrolebinding.rbac.authorization.k8s.io "kadalu-csi-nodeplugin" deleted
daemonset.apps "kadalu-csi-nodeplugin" deleted
-> curl -s https://raw.githubusercontent.com/kadalu/kadalu/devel/manifests/kadalu-operator.yaml | sed 's/"no"/"yes"/' | kubectl delete -f -
namespace "kadalu" deleted
serviceaccount "kadalu-operator" deleted
serviceaccount "kadalu-csi-nodeplugin" deleted
serviceaccount "kadalu-csi-provisioner" deleted
serviceaccount "kadalu-server-sa" deleted
customresourcedefinition.apiextensions.k8s.io "kadalustorages.kadalu-operator.storage" deleted
clusterrole.rbac.authorization.k8s.io "pod-exec" deleted
clusterrole.rbac.authorization.k8s.io "kadalu-operator" deleted
clusterrolebinding.rbac.authorization.k8s.io "kadalu-operator" deleted
deployment.apps "operator" deleted
Finally, if we don't want a staged cluster cleanup, or the cluster state is inconsistent and no fixes are found, we can run the below cleanup script after deleting app pods in the kadalu namespace, if there are any.
-> curl -s https://raw.githubusercontent.com/kadalu/kadalu/devel/extras/scripts/cleanup | bash
Error from server (NotFound): statefulsets.apps "kadalu-csi-provisioner" not found
[...]
Error from server (NotFound): namespaces "kadalu" not found
Miscellaneous §
We've only covered the general use case(s) up to now; there are other specialized features (considering the whole kadalu project) which come in handy when needed, and below is just a mention of them. Please refer to the docs or raise an issue for more info.
- Decommissioning of storage pool bricks
- Re-creating storage pool based on volume id
- Migration from Heketi (in progress)
If you are willing to contribute to the project, below are some of the ideas to get you started:
- Usage of internal gluster outside of CSI as a general NAS solution
- Enabling CSI Snapshots
- Support for SMB and windows workloads
- Support for `volumeMode: Block`
- Enhance CI infra
- Enhance Prometheus monitoring
Unlike previous posts, I intentionally left out debugging/code walk-throughs as they may not be of particular interest from a user/admin perspective. However, please reach out by any means (or simply comment here) for more info.
Summary §
Let's recap what we've discussed so far; I hope you'll be better equipped to approach some of the CSI solutions after a couple of reads of this post.
- Operator responds to events on the kadalustorages CRD
- Captures the required info into the `kadalu-info` config map
- Brings up `server` pods or connects to external gluster if required and writes the directory structure on the brick
- All containers which require volfiles fill pre-made jinja templates with `kadalu-info` (or ask for the volfile in case of external gluster) and start the brick/mount process
- Sidecars in the CSI pods listen to the api-server and delegate requests to the already deployed CSI pods
- Based on the invoked gRPC call, the process running in the CSI pods fulfills the request
And this brings closure to the series Exploring Kadalu Storage, unless something exciting happens or I get new requests to blog about kadalu specifics.