ROSA, AWS, EBS and why would you do that?
Engineers doing things they know they shouldn't


This article starts with Rhys Powell saying that you really shouldn't do this but, in an unusual twist, Philipp Bergsmann says why not?! More on that later...


Our usual role as Black Belts is to work with customers: helping them understand things, helping them position managed OpenShift, or getting technical difficulties removed. Occasionally we like to work with some of our hyperscaler partners to get hands-on with tech or to get a refresher on things; as we all know, the tech world is always changing, so constant learning is a good thing.

A recent occurrence of this was when the EMEA Black Belt team had a "state of storage review" with some of our friends at AWS. We were very lucky to have Tom Tasker and Rod Wilson come in and give us a talk. Storage is a big thing in AWS, and both of them were able to dig into a lot of topics, especially things our customers often ask about, meaning we are better informed to help them all moving forward. They truly did an excellent job of presenting so much information, but one part on EBS storage brought around some conversation, an idea, a test and then this blog post.

It's often forgotten that you can change parts of your EBS volume, such as the type and the IOPS. This is something that gets considered more often with EC2 instances, getting the right performance vs cost for the workload. Yet in our OpenShift world we forget this, as storage is abstracted away. Yes, we define our storage classes; yes, they hook into the storage in the cloud; and then they are forgotten about.
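For reference, this kind of change doesn't even need the console; a minimal CLI sketch, where the volume ID and values are placeholders:

# Retype an existing EBS volume in place; no detach or downtime required
aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --volume-type gp3 --iops 4000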

It was at this point that discussion commenced. Points were raised that this wouldn't work or would break things, with counterpoints that it wouldn't do a thing, as the cluster won't know and won't care since it's abstracted away.

Would changing the Volume type on the AWS side cause any issues to the persistent volume provisioned through the cluster?

We start off with a simple cluster. In this instance it was a ROSA HCP cluster running in eu-west-2, chosen for the speed of standing one up and because London is the nearest region. Nothing else was changed; the default, out-of-the-box storage classes, gp3-csi (default) and gp2-csi, were kept.
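You can confirm what's there with a quick look at the storage classes:

# List the storage classes that ship with the cluster
oc get storageclass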

CSI, or Container Storage Interface, is the abstraction that allows Kubernetes to use the block and file storage systems that exist in the underlying infrastructure. In this case we were using the AWS EBS CSI driver, which allows the EBS volume to be configured as it is requested. Those configurations are, generally, set in the storage class. In normal running, a Persistent Volume Claim (PVC) is made by an application; if there is a Persistent Volume (PV) that matches by name it will get reused, and if not, the PV gets created using the storage class settings as supplied.
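To illustrate where those configurations live, a storage class for the EBS CSI driver looks something like the sketch below. This is not one of the ROSA defaults; the name and parameter values are made up:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-custom              # hypothetical name, not a ROSA default
provisioner: ebs.csi.aws.com    # the AWS EBS CSI driver
parameters:
  type: gp3                     # EBS volume type requested at creation
  iops: "4000"                  # per-volume IOPS (gp3 only)
  throughput: "250"             # MiB/s (gp3 only)
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true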


We have our cluster, we know how the CSI works, we now need some code.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: rhys-claim
  namespace: volume-fun
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: gp2-csi
  volumeMode: Filesystem        

The project (namespace) called volume-fun has already been created (with oc new-project volume-fun, for example), and as mentioned earlier, the gp2-csi class is something that comes with ROSA out of the box. We can now call for the creation of the PVC:

oc create -f test_pvc.yaml        
Screenshot: the ROSA HCP cluster showing the PVC called rhys-claim, generated from the code shown above

The claim has been created and is now in the Pending state, waiting for the first consumer of it to be created.
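That Pending state comes from the storage class's volume binding mode, which you can check directly (assuming the default ROSA class):

# WaitForFirstConsumer means nothing is provisioned until a pod asks for it
oc get storageclass gp2-csi -o jsonpath='{.volumeBindingMode}'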

Screenshot: the events on the claim; the last message highlights that it is waiting for something to use it before the volume gets created
All I need is for a container to use me!


Right now the cluster is aware that it will need a volume, should a container request it, but it won't create one until that first request is made. We now need a container that will make the request. For this test we also need to be doing something on that mounted claim, so we can confirm it is working and being used, and then see what happens if we do change things.

Not wanting to make things too complex, the decision was to use dd to write a file of random data of a decent size. The container will mount the volume and we then test the IOPS through the writing of this file. Changing IOPS was chosen as it is the easiest and most significant change between the differing storage options. To see what's happening we can leverage CloudWatch metrics against the volume.

The Dockerfile code

# Red Hat UBI 9 minimal base image
FROM registry.access.redhat.com/ubi9-minimal:9.4-949.1717074713
# Mount point for the persistent volume
RUN mkdir /data1
# Write 100 x 512M blocks of random data to the volume, then exit
CMD ["dd", "if=/dev/urandom", "of=/data1/file.out", "bs=512M", "count=100"]

Build the container and push it to a container registry

podman build -t quay.io/rhpowell_mobb/volume_fun  .
podman push quay.io/rhpowell_mobb/volume_fun        

Deploy the container to the cluster; this will then cause the claim to kick off the provisioning process.

apiVersion: v1
kind: Pod
metadata:
  name: iowrite
  labels:
    app: iowrite
  namespace: volume-fun
spec:
  containers:
    - name: iowrite
      image: "quay.io/rhpowell_mobb/volume_fun:latest"
      volumeMounts:
        - mountPath: /data1
          name: data1
  volumes:
    - name: data1
      persistentVolumeClaim:
        claimName: rhys-claim        

As you can see, the code creates a pod; the pod has the container called iowrite, which was just built and pushed to Quay. This container has a volume mount called data1, which matches the folder that was part of the command in the container. That volume mount is attached to the volume that matches the PVC claim name.

Deploy the container

oc apply -f pv-pod.yaml        

This will give us a warning as, for clarity, we have been a little lazy and not set a number of security settings, but the container will run.
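If you would rather avoid the warning, a minimal sketch of the security settings that keep the restricted profile happy, added under the container in the pod spec above, would look something like this:

      securityContext:
        allowPrivilegeEscalation: false
        runAsNonRoot: true
        capabilities:
          drop: ["ALL"]
        seccompProfile:
          type: RuntimeDefault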

Screenshot: the newly deployed pod and container up and running

We also now see that the volume has been created, and created from the rhys-claim.

Screenshot: all of the persistent volumes that are part of the cluster, including the now created rhys-claim volume
Third claim down is the one for our PVC
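Rather than eyeballing the console, the matching volume on the AWS side can also be found from the PV itself; the volume handle on a CSI-provisioned PV is the EBS volume ID (the PV name below is a placeholder):

# Prints the AWS EBS volume ID backing the PV
oc get pv pvc-0123abcd-0000-0000-0000-000000000000 -o jsonpath='{.spec.csi.volumeHandle}'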

We now need to look for this volume in the CloudWatch metrics. We take the read ops and the write ops, reduce the period to 1 second, and add a new maths field, m1 + m2, which gives us the total IOPS. This isn't strictly necessary, as we only want to confirm changes and see if things go wrong, and the app is all write and not really read, but it's good to have everything covered just in case. As our job only writes 1/2G blocks 100 times, the container will stop once it has completed, and the cluster will automatically restart the pod once it dies. We should see a flat line, with a dip every time the process completes, the container dies and then gets restarted.
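The same m1 + m2 total can be pulled over the CLI with metric math; a sketch, with placeholder volume ID and times:

aws cloudwatch get-metric-data \
  --start-time 2024-06-01T00:00:00Z --end-time 2024-06-01T01:00:00Z \
  --metric-data-queries '[
    {"Id":"m1","ReturnData":false,"MetricStat":{"Stat":"Sum","Period":60,
      "Metric":{"Namespace":"AWS/EBS","MetricName":"VolumeReadOps",
        "Dimensions":[{"Name":"VolumeId","Value":"vol-0123456789abcdef0"}]}}},
    {"Id":"m2","ReturnData":false,"MetricStat":{"Stat":"Sum","Period":60,
      "Metric":{"Namespace":"AWS/EBS","MetricName":"VolumeWriteOps",
        "Dimensions":[{"Name":"VolumeId","Value":"vol-0123456789abcdef0"}]}}},
    {"Id":"total","Expression":"m1+m2","Label":"Total IOPS"}
  ]'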

CloudWatch metrics showing the IOPS against the volume


We now need to play with the volume! This is as simple as selecting the volume in the AWS console and then hitting modify. The choice here was to go to the extreme: from gp2 running at 300 IOPS we are making the leap to io2 and throwing 30,000 IOPS at it. After changing the settings, we just hit modify. No restarting, no disconnecting, nothing other than modify, and AWS takes care of it all in the background with no interruption to the storage as it's running.
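The progress of the change can also be followed from the CLI (the volume ID is a placeholder):

# Shows the state and percentage progress of an in-flight modification
aws ec2 describe-volumes-modifications --volume-ids vol-0123456789abcdef0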

Screenshot: all of the volumes in the AWS console; the message at the top shows the volume used for rhys-claim has just been modified, with text showing it is undergoing modification

We watch as it goes through the modifying process until it hits 100%. We then take a look at the metrics and see that the IOPS has instantly jumped up and that the task is taking far less time to complete than it did previously.

The same CloudWatch metrics as previously, now showing the effect the changes to the volume have made on the IOPS

So all is good... Just to make sure it wasn't a fluke, the log files of the container and the CSI driver were checked and no errors were shown; the pod was even deleted and recreated, and nothing showed up to cause an issue.
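For anyone repeating the test, the checks were along these lines. The CSI driver namespace and deployment names below are from a recent OpenShift release and may vary:

# Container logs: dd should report its usual summary, with no I/O errors
oc logs iowrite -n volume-fun
# Namespace events: look for failed attach or mount messages
oc get events -n volume-fun --sort-by=.lastTimestamp
# EBS CSI controller logs
oc logs deployment/aws-ebs-csi-driver-controller -n openshift-cluster-csi-drivers -c csi-driver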

What did we prove?

The CSI driver doesn't look at what the actual volume type is after creation. Be warned, this has only been tested on AWS.

Is this good? Well, as Philipp points out, and his argument for giving it a try, it is an opportunity to test theories around the causes of bottlenecks, or to find that even gp3 might be over-provisioned and you can drop down to good old magnetic storage. If you have huge volumes, this could considerably optimise your costs once applied across a number of environments. While Rhys says no, you shouldn't, he really understands where Philipp is coming from: it is a great opportunity to test, but it goes against one of the very key principles that has been pushed for years, no manual changes. So do it, but make sure you plan to make the needed changes properly right away.

Conclusions

The abstraction works, and the ability to change disk types could be useful. We will leave it up to you to decide if you should or shouldn't do it. Finally, all learning is fun, as it helps with deeper understanding, and if this has even slightly sparked an interest in running managed OpenShift, reach out to Philipp Bergsmann or Rhys Powell, as they mostly do serious stuff!


If you want the code to play with this yourself, it can be found here.


Thank you for the collaboration!
