
Exploring Kadalu Storage in k3d Cluster - CSI Driver

2021-Mar-25 • Tags: kubernetes, kadalu, csi

In the previous article we set up a k3d cluster and discussed a typical workflow. We'll be utilising those concepts to deploy a CSI Driver on the k3d cluster and perform minimal operations while exploring Kadalu storage as Persistent Storage via CSI.

Even though I'll be concentrating on the Kadalu CSI Driver component in this blog post, it in itself has many moving parts. Because of that, I'll be making cross references rather than re-iterating the details, and add extra context only when it's needed. On that note, let's get started.

Introduction §

In short, Kadalu is a lightweight persistent storage solution for container workloads built around Gluster, and Kadalu storage can be published to various Container Orchestrators (Kubernetes, RKE, OpenShift, MicroK8s).

If you have a running Kubernetes cluster and want to deploy Kadalu storage, please refer to the quick-start in the docs. However, this blog post deals with local testing/development with k3d, and it's a bit involved to deploy any CSI storage on a docker-based environment alone, so please follow along.

You can use devices, directory paths, or persistent volumes to act as the underlying storage for Gluster. We'll reserve all the minute details around the Operator and Gluster storage in containers for a later post and concentrate on the CSI Driver for now.

If you are feeling adventurous and just want a script to set up and tear down a k3d cluster with Kadalu storage, please refer to this script, but it carries a huge disclaimer: do not run it without checking what it does, or else your devices (sdc, sdd, sde) will get formatted. ⚠️

Kindly raise a GitHub issue if any of the processes stated here results in an error.

Kadalu in k3d cluster §

Storage systems in Kubernetes need a bi-directional mount to the underlying host, and in our case we also need a shared directory (for storing secret tokens) mapped from k3d to the host system.

Please create the cluster with the commands below. I strongly recommend going through the previous article to get to know about the local container registry, importing images into the k3d cluster, etc.:

# I'll be using below directories for gluster storage
-> df -h | grep /mnt
/dev/sdc                             10G  104M  9.9G   2% /mnt/sdc
/dev/sdd                             10G  104M  9.9G   2% /mnt/sdd
/dev/sde                             10G  104M  9.9G   2% /mnt/sde

# Make a dir to be used for shared mount
-> mkdir -p /tmp/k3d/kubelet/pods

# My local registry (optional, if not used remove corresponding arg while creating the cluster)
-> bat ~/.k3d/registries.yaml  --plain
mirrors:
  "registry.localhost:5000":
    endpoint:
      - "http://registry.localhost:5000"

# Create a k3d cluster with volume mounts and local registry
-> k3d cluster create test -a 3 -v /tmp/k3d/kubelet/pods:/var/lib/kubelet/pods:shared \
-v /mnt/sdc:/mnt/sdc -v /mnt/sdd:/mnt/sdd -v /mnt/sde:/mnt/sde \
-v ~/.k3d/registries.yaml:/etc/rancher/k3s/registries.yaml
[...]
INFO[0000] Created volume 'k3d-test-images'
INFO[0001] Creating node 'k3d-test-server-0'
[...]
INFO[0044] Starting helpers...
INFO[0044] Starting Node 'k3d-test-serverlb'
[...]
kubectl cluster-info

# Deploy kadalu operator with setting 'verbose' to 'yes'
-> curl -s https://raw.githubusercontent.com/kadalu/kadalu/devel/manifests/kadalu-operator.yaml \
| sed 's/"no"/"yes"/' | kubectl apply -f -

Once the kadalu operator is deployed, it reconciles the state as per the config: it deploys the nodeplugin as a DaemonSet, the provisioner (~controller) as a StatefulSet, and watches the CRD for creating Kadalu storage, among other things.
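
If you want to see those objects for yourself, a quick check along the lines below works (output omitted; exact names can differ slightly between releases):

# Workloads created by the operator in the kadalu namespace
-> kubectl get daemonset,statefulset -n kadalu

# The CRD the operator watches for storage definitions
-> kubectl get crd | grep -i kadalu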

Things to take note of:

  1. You can refer to the above-stated script for importing local docker images into the k3d cluster before deploying the operator.
  2. For installing the operator through helm, please refer to GitHub.
  3. At the time of this writing, HEAD on the devel branch is at commit 9fe6ad4.

Verify that all the pods are deployed and in the Running state in the kadalu namespace. You can install kubectx and kubens for easy navigation across contexts and namespaces.
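
As an aside, once kubens is installed you can switch the default namespace so the -n kadalu flag can be dropped from subsequent commands; purely a convenience, not a requirement:

# Make 'kadalu' the default namespace for kubectl in the current context
-> kubens kadalu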

-> kubectl get pods -n kadalu -o wide
NAME                          READY   STATUS    RESTARTS   AGE   IP          NODE                NOMINATED NODE   READINESS GATES
operator-88bd4784c-bkzlt      1/1     Running   0          23m   10.42.0.5   k3d-test-server-0   <none>           <none>
kadalu-csi-nodeplugin-8ttmk   3/3     Running   0          23m   10.42.3.3   k3d-test-agent-2    <none>           <none>
kadalu-csi-nodeplugin-fv57x   3/3     Running   0          23m   10.42.1.5   k3d-test-agent-0    <none>           <none>
kadalu-csi-nodeplugin-ngfm2   3/3     Running   0          23m   10.42.2.4   k3d-test-agent-1    <none>           <none>
kadalu-csi-nodeplugin-7qwhm   3/3     Running   0          23m   10.42.0.6   k3d-test-server-0   <none>           <none>
kadalu-csi-provisioner-0      5/5     Running   0          23m   10.42.3.4   k3d-test-agent-2    <none>           <none>

# Using mounted volumes for creating storage pool
-> bat ../storage-config-path.yaml --plain; kubectl apply -f ../storage-config-path.yaml
---
apiVersion: kadalu-operator.storage/v1alpha1
kind: KadaluStorage
metadata:
  name: replica3
spec:
  type: Replica3
  storage:
    - node: k3d-test-agent-0
      path: /mnt/sdc
    - node: k3d-test-agent-1
      path: /mnt/sdd
    - node: k3d-test-agent-2
      path: /mnt/sde
kadalustorage.kadalu-operator.storage/replica3 created

# Verify server pods are up and running
-> kubectl get pods -l app.kubernetes.io/component=server
NAME                  READY   STATUS    RESTARTS   AGE
server-replica3-1-0   1/1     Running   0          4m28s
server-replica3-2-0   1/1     Running   0          4m27s
server-replica3-0-0   1/1     Running   0          4m29s

The end: you can follow the official docs for creating PVs and PVCs from the kadalu.replica3 storage class created above and use them in app pods, and comfortably skip what follows next, or continue if you want to know about debugging the Kadalu CSI Driver (or running a debug container in general).
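
For reference, a minimal PVC against the kadalu.replica3 storage class looks roughly like the below; the name and size here simply match the claim used later in this post, adjust them to your needs:

-> kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc2g
spec:
  storageClassName: kadalu.replica3
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
EOF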

Debugging Kadalu CSI Driver §

I read a couple of blog posts discussing debugging a (python) application running in a container, however they didn't fit my needs well (either they are editor dependent or time consuming 😕).

I'm not saying the methods shared here are superior, however they make my workflow a tad easier than the cycle of making changes to source code, committing the docker container and re-deploying, or running a server accessible to the editor and debugging the code that way.

We have one server (master) and three agents (workers) in our k3d cluster. To ease things, you could get away with running a single server node and debugging your application there. However, I'm more interested in simulating a user environment as much as possible, hence the distributed setup.

Prerequisite (or Good to know info) §

It helps to be familiar with the following before proceeding:

  1. CSI volume plugins and how a driver implements them
  2. The Python debugger (pdb / breakpoint())
  3. kubectl cp and kubectl port-forward
  4. Miscellaneous tooling used below (socat, csc, nc)

Alright, now that we've got hold of the basics, on to the problem statement, implementation, and debugging the code.

Note: I can't possibly go through every minute detail; please start a discussion in the comments section if anything needs more context.

Problem Statement §

I've created a PVC and mounted it in a container. We just need to return the PVC volume name with the minimum required fields as per the proto definition to satisfy the ListVolumes RPC.

# PVC which is hosted on above created Kadalu Replica3 storage class
-> kubectl get pvc
NAME    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
pvc2g   Bound    pvc-02072076-468a-43b2-bf40-b33ae6978e19   2Gi        RWX            kadalu.replica3   23h

# Pod which is currently using `pvc2g` claim
-> kubectl describe pvc pvc2g | grep Used
Used By:       pod1

Not that it's tough to implement; just to not lengthen this article, we'll customize our RPC client call to return all the volumes without tokenizing (pagination). Before proceeding with the implementation, knowing how Kadalu storage provisions a PVC would be helpful.
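
To get a feel for it, you can peek inside the provisioner container: the hosting volume is mounted under /mnt/<hostvol> and per-PVC metadata is kept as JSON files under its info directory (replica3 below comes from our storage config; adjust if yours differs):

# List the per-PVC info files on the hosting volume
-> kubectl exec -it kadalu-csi-provisioner-0 -c kadalu-provisioner -- \
   find /mnt/replica3/info -name '*.json'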

Typical Implementation §

Note that most, if not all, of the RPC calls should be idempotent, and all the methods that implement them should internally try to reach the required state or log and fail with an error.

One of the commits de-coupled the code at the process level, which enabled separation of concerns between monitoring state and reconciling the process to the required state, without which the steps followed in the rest of this article would not be possible.

Before we proceed further, let's invoke the ListVolumes method with no code change and then arrive at the solution. We'll deploy socat as a DaemonSet on all k3d agents, which exposes the CSI UDS (Unix domain socket) as a TCP connection, and use csc to connect to the TCP port.
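
If you don't have csc handy, it ships as part of the rexray/gocsi project; one way to install it, assuming a local Go toolchain, is:

# Install the csc CLI (a simple CSI client)
-> go install github.com/rexray/gocsi/csc@latest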

As the provisioner pod uses an emptyDir, we need to access its socket differently: a Deployment with 1 replica, scheduled on the node where the provisioner pod is running.

Important: Thanks to one of the recommended approaches, packaging all services in a single binary, we could get away without the extra Deployment on the provisioner pod's node. The downside is that when we access Controller Services, the log messages end up in the Node Service pods. For clarity, I'm using a separate pod for accessing the provisioner's csi.sock file.

A pod manifest exists in the repo; however, below is a modified form:

-> bat tests/test-csi/sanity-debug.yaml --plain
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: kadalu
  name: sanity-ds
  labels:
    name: sanity-ds
spec:
  selector:
    matchLabels:
      name: sanity-ds
  template:
    metadata:
      labels:
        name: sanity-ds
    spec:
      containers:
        - name: socat
          image: alpine/socat:1.0.5
          args:
            - tcp-listen:10000,fork,reuseaddr
            - unix-connect:/plugin/csi.sock
          volumeMounts:
            - name: csi-sock
              mountPath: /plugin/csi.sock
      volumes:
        - name: csi-sock
          hostPath:
            path: /var/lib/kubelet/plugins_registry/kadalu/csi.sock
            type: Socket
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: kadalu
  name: sanity-dp
  labels:
    name: sanity-dp
spec:
  replicas: 1
  selector:
    matchLabels:
      name: sanity-dp
  template:
    metadata:
      labels:
        name: sanity-dp
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/name
                    operator: In
                    values:
                      - kadalu-csi-provisioner
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: socat
          image: alpine/socat:1.0.5
          args:
            - tcp-listen:10001,fork,reuseaddr
            - unix-connect:/plugin/csi.sock
          volumeMounts:
            - name: csi-sock
              mountPath: /plugin/csi.sock
      volumes:
        - name: csi-sock
          hostPath:
            # UID of the POD should be replaced before deployment
            path: '/var/lib/kubelet/pods/POD_UID/volumes/kubernetes.io~empty-dir/socket-dir/csi.sock'
            type: Socket

Deploy the pods after replacing POD_UID in the YAML manifest:

# Store Provisioner Pod UID
-> POD_UID=$(kubectl get pods kadalu-csi-provisioner-0 -o jsonpath={'.metadata.uid'})

# Applying and verifying the manifest
-> sed "s/POD_UID/$POD_UID/" tests/test-csi/sanity-debug.yaml | kubectl apply -f -
daemonset.apps/sanity-ds created
deployment.apps/sanity-dp created

# Pods after reaching Ready state (Sanitized output)
-> kubectl get pods --sort-by='{.spec.nodeName}' \
-o=custom-columns='NAME:.metadata.name,NODE:.spec.nodeName' | grep -P 'sanity|csi|NODE'
NAME                          NODE
kadalu-csi-nodeplugin-fv57x   k3d-test-agent-0
sanity-ds-6mxxc               k3d-test-agent-0

sanity-ds-qtz6d               k3d-test-agent-1
kadalu-csi-nodeplugin-ngfm2   k3d-test-agent-1

sanity-ds-z6f5s               k3d-test-agent-2
kadalu-csi-provisioner-0      k3d-test-agent-2
sanity-dp-67cc596d6c-xknf7    k3d-test-agent-2
kadalu-csi-nodeplugin-8ttmk   k3d-test-agent-2

sanity-ds-2khrd               k3d-test-server-0
kadalu-csi-nodeplugin-7qwhm   k3d-test-server-0

You can see from the above output that we have access to csi.sock from every k3d node, exposed via a sanity pod on port 10000 (10001 for accessing the Controller Server). All we have to do is port-forward the exposed port and access it with csc.

Here we are port-forwarding from the pod sanity-dp-67cc596d6c-xknf7 so that we can talk to the controller service deployed in the kadalu-csi-provisioner-0 pod.

# In one pane, run a 'kubectl port-forward'
-> kubectl port-forward pods/sanity-dp-67cc596d6c-xknf7 :10001
Forwarding from 127.0.0.1:41289 -> 10001
Forwarding from [::1]:41289 -> 10001

# In another pane, run `nc` to keep the above port-forwarding alive
-> while true; do nc -vz 127.0.0.1 41289 ; sleep 15 ; done

# In another pane, finally, we can use the TCP connection to talk to our CSI Controller server
-> csc identity plugin-info -e tcp://127.0.0.1:41289
"kadalu"        "devel"

-> csc controller get-capabilities -e tcp://127.0.0.1:41289
&{type:CREATE_DELETE_VOLUME }
&{type:LIST_VOLUMES }
&{type:EXPAND_VOLUME }

# What we want to implement
-> csc controller list-volumes -e tcp://127.0.0.1:41289
Failed to serialize response

Please use -h,--help for more information

# Logs from provisioner when above is run
-> kubectl logs kadalu-csi-provisioner-0 kadalu-provisioner | tail
[2021-03-25 07:09:53,332] ERROR [_common - 88:_transform] - Exception serializing message!
Traceback (most recent call last):
  File "/kadalu/lib/python3.8/site-packages/grpc/_common.py", line 86, in _transform
    return transformer(message)
TypeError: descriptor 'SerializeToString' for 'google.protobuf.pyext._message.CMessage' objects doesn't apply to a 'NoneType' object

A couple of points to take note of from the above: the controller advertises the LIST_VOLUMES capability, but the ListVolumes method isn't implemented yet, so it returns None and gRPC fails to serialize a NoneType response, which is exactly the traceback we see in the provisioner logs.

Note: By the time you read this post, the bug may already be fixed; however, our main aim here is the process of debugging the CSI Driver.

I cloned the Kadalu repo and implemented a quick and dirty method definition for ListVolumes; below is the code snippet:

# csi/controllerserver.py
# ...
def ListVolumes(self, request, context):
    # Return list of all volumes (pvc's) in every hostvol

    errmsg = ''
    pvcs = []

    try:
        # Mount hostvol, walk through directories and return pvcs
        for volume in get_pv_hosting_volumes({}):
            hvol = volume['name']
            mntdir = os.path.join(HOSTVOL_MOUNTDIR, hvol)
            mount_glusterfs(volume, mntdir)
            json_files = glob.glob(os.path.join(mntdir, 'info', '**',
                                                '*.json'),
                                   recursive=True)
            pvcs.extend([
                name[name.find('pvc'):name.find('.json')]
                for name in json_files
            ])
    except Exception as excep:
        errmsg = str(excep)

    if not pvcs or errmsg:
        errmsg = errmsg or "Unable to find pvcs"
        logging.error("ERROR: %s", errmsg)
        context.set_details(errmsg)
        context.set_code(grpc.StatusCode.NOT_FOUND)
        return csi_pb2.ListVolumesResponse()

    logging.info(logf("Got list of volumes", pvcs=pvcs))
    return csi_pb2.ListVolumesResponse(entries=[{
        "volume": {
            "volume_id": pvc
        }
    } for pvc in pvcs])
# ...

Debugging or testing the changes §

Now that we have the method implemented, we'll copy the corresponding source file into the container and kill main.py (which registers all CSI services). The reconciler (start.py) observes the process's absence and re-runs main.py as a subprocess, which picks up our modified Python source file.

# Copy the src file into kadalu-provisioner
-> kubectl cp csi/controllerserver.py kadalu-csi-provisioner-0:/kadalu/controllerserver.py -c kadalu-provisioner

# Processes running in provisioner container
-> kubectl exec -it kadalu-csi-provisioner-0 -c kadalu-provisioner -- sh -c 'ps -ef | grep python'
root           1       0  0 Mar23 ?      00:00:24 python3 /kadalu/start.py
root           8       1  0 Mar23 ?      00:01:13 python3 /kadalu/main.py
root           9       1  0 Mar23 ?      00:00:32 python3 /kadalu/exporter.py
root      246800       0  0 10:33 pts/3  00:00:00 sh -c ps -ef | grep python
root      246808  246800  0 10:33 pts/3  00:00:00 grep python

# Init process is `start.py` and it runs `main.py` and `exporter.py` as subprocesses,
# monitors them and tries its best to keep them running.
# Killing `main.py` will be signalled to `start.py` and it will be re-run again
-> kubectl exec -it kadalu-csi-provisioner-0 -c kadalu-provisioner -- sh -c 'kill 8'

# `main.py` is run again and got a PID 246855; as methods from `csi/controllerserver.py` are
# imported in `main.py`, it'll call the above modified method
-> kubectl exec -it kadalu-csi-provisioner-0 -c kadalu-provisioner -- sh -c 'ps -ef | grep python'
root           1       0  0 Mar23 ?      00:00:24 python3 /kadalu/start.py
root           9       1  0 Mar23 ?      00:00:32 python3 /kadalu/exporter.py
root      246855       1  3 10:33 ?      00:00:00 python3 /kadalu/main.py
root      246897       0  0 10:33 pts/3  00:00:00 sh -c ps -ef | grep python
root      246904  246897  0 10:33 pts/3  00:00:00 grep python

If, in a hurry, you call the csc client again for ListVolumes using the same TCP port, you'll be treated to a 'connection closed' message (because the connection is actually closed upon killing the process):

-> csc identity plugin-info -e tcp://127.0.0.1:41289
connection closed

Please use -h,--help for more information

As we have deployed the socat pods using Deployment and DaemonSet kinds, we can, at worst, delete the pod to be presented with a fresh TCP connection, or we can simply repeat the same steps (port-forward and nc) before using csc again.
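
A rough sketch of that recovery, assuming the labels from the sanity manifest above (the replacement pod gets a new name, and port-forward picks a new local port):

# Recreate the socat pod backing the controller-side socket
-> kubectl delete pod -l name=sanity-dp -n kadalu

# Port-forward again once the replacement pod is Ready
-> kubectl port-forward deploy/sanity-dp :10001 -n kadalu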

# Tada! I did get it correct in first try itself :)
-> csc controller list-volumes -e tcp://127.0.0.1:46171
"pvc-02072076-468a-43b2-bf40-b33ae6978e19"      0

# Validation
-> kubectl get pvc
NAME    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
pvc2g   Bound    pvc-02072076-468a-43b2-bf40-b33ae6978e19   2Gi        RWX            kadalu.replica3   32h

# Logs from provisioner upon calling above RPC method
-> k logs kadalu-csi-provisioner-0 kadalu-provisioner | tail
TypeError: descriptor 'SerializeToString' for 'google.protobuf.pyext._message.CMessage' objects doesn't apply to a 'NoneType' object
Latest consumption on /mnt/replica3/subvol/9a/3a/pvc-02072076-468a-43b2-bf40-b33ae6978e19 : 0
Latest consumption on /mnt/replica3/subvol/9a/3a/pvc-02072076-468a-43b2-bf40-b33ae6978e19 : 0
Latest consumption on /mnt/replica3/subvol/9a/3a/pvc-02072076-468a-43b2-bf40-b33ae6978e19 : 0
[2021-03-25 10:33:39,051] INFO [kadalulib - 369:monitor_proc] - Restarted Process name=csi
[2021-03-25 10:33:39,403] DEBUG [volumeutils - 812:mount_glusterfs] - Already mounted mount=/mnt/replica3
[2021-03-25 10:33:39,404] INFO [main - 36:mount_storage] - Volume is mounted successfully hvol=replica3
[2021-03-25 10:33:39,417] INFO [main - 56:main] - Server started
[2021-03-25 10:36:37,664] DEBUG [volumeutils - 812:mount_glusterfs] - Already mounted mount=/mnt/replica3
[2021-03-25 10:36:37,709] INFO [controllerserver - 345:ListVolumes] - Got list of volumes pvcs=['pvc-02072076-468a-43b2-bf40-b33ae6978e19']

Unfortunately, setting a breakpoint() in a gRPC context results in a bdb.BdbQuit error when attached to the TTY of the container. We'll go through using the breakpoint() feature in a subsequent post which supports it; below is the brief process:

  1. Wherever we want to pause the execution, just introduce the breakpoint() function in the src file, perform the cp and restart of the socat pod, and perform the operation which triggers the breakpoint.
  2. The execution will be paused at the breakpoint; attach to the container of the DaemonSet/StatefulSet/Deployment using a command similar to:
# Target can be 'ds'/'sts'/'deploy' kinds
-> kubectl attach sts/kadalu-csi-provisioner -c kadalu-provisioner -it
Unable to use a TTY - container kadalu-provisioner did not allocate one
If you don't see a command prompt, try pressing enter.

[...]

If we want to test/kill main.py where it is the init process, the container itself will be killed and replaced with a new pod, so the modified code will not come into effect.

In such cases we need to (docker) commit the container after cp'ing the code changes, retag and push it to the local registry (remember, the k3d cluster can access the local registry), and change/edit/patch the image in the YAML manifests. (We'll go through this scenario as well in later posts 😃)
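
A rough sketch of that cycle using the local registry from earlier; the image tag and source container here are purely illustrative, only the kadalu-provisioner container name is taken from the manifests we've been using:

# Commit (or rebuild) the modified image, tag it for the local registry and push
-> docker commit <modified-container-id> registry.localhost:5000/kadalu-csi:debug
-> docker push registry.localhost:5000/kadalu-csi:debug

# Point the provisioner statefulset at the new image; the pod gets recreated with our changes
-> kubectl -n kadalu set image sts/kadalu-csi-provisioner \
   kadalu-provisioner=registry.localhost:5000/kadalu-csi:debug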

Caveats and tips §

Summary §

If you give this post a couple of reads, the gist is simple: expose the CSI driver's Unix socket over TCP with socat, talk to it using csc, and iterate on the driver code by copying modified files into the running container and letting start.py restart the service.

Cleanup of the cluster: If you have followed the previous article and the current post, you can delete the entire k3d cluster without any trace by following the steps below:
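
First, delete the cluster itself (assuming it's still named test, as created above):

# Remove the k3d cluster and its nodes
-> k3d cluster delete test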

# Lazily unmount any kubelet pod mounts left behind on the host (stale entries show up in 'df -ha' but not 'df -h')
-> diff <(df -ha | grep pods | awk '{print $NF}') <(df -h | grep pods | awk '{print $NF}') \
| awk '{print $2}' | xargs umount -l

# Some housekeeping for docker (Don't run these without knowing what they do)
-> docker rmi $(docker images -f "dangling=true" -q)
-> docker volume prune -f
-> docker volume rm $(docker volume ls -qf dangling=true)

As stated earlier, the script for setup and teardown of the k3d cluster is available here; you have been warned, don't run it without checking what it does.

It may seem that we have covered a lot of ground, but I had to intentionally drop some details. I'll continue exploring other components of Kadalu storage in later posts and add any points missed in the current post. Stay tuned 👀

Send an email for any comments. Kudos for making it to the end. Thanks!