Provision Object Storage via Ceph
Olaniyi Odeleye (MBA)
Introduction
Rook and Ceph provide a platform that exposes three storage types: block storage, shared filesystems, and object storage.
In this documentation, we will focus on object storage configuration and how we can access the object bucket for write and read operations. Object storage exposes an S3 API to the storage cluster for applications to put and get data.
Prerequisites
This guide assumes a Rook-Ceph cluster has been set up as explained here.
Configure an Object Store
Rook has the ability to either deploy an object store in Kubernetes or to connect to an external RGW service. Most commonly, the object store will be configured locally by Rook. Alternatively, if you have an existing Ceph cluster with Rados Gateways, see this link on how to consume it from Rook.
Create a Local Object Store
The below manifest will create a CephObjectStore that starts the RGW service in the cluster with an S3 API.
NOTE: This manifest requires at least 3 BlueStore OSDs, with each OSD located on a different node.
The OSDs must be located on different nodes because the failureDomain is set to host, and the erasureCoded chunk settings require at least 3 different OSDs (2 dataChunks + 1 codingChunks).
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: niyez-bm-store
  namespace: rook-ceph
spec:
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
  dataPool:
    failureDomain: host
    erasureCoded:
      dataChunks: 2
      codingChunks: 1
  preservePoolsOnDelete: true
  gateway:
    sslCertificateRef:
    port: 80
    # securePort: 443
    instances: 1
  healthCheck:
    bucket:
      disabled: false
      interval: 60s
After the CephObjectStore is created, the Rook operator will then create all the pools and other resources necessary to start the service. This may take a minute to complete.
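You can also check the CephObjectStore resource itself; once the operator has reconciled it, the store should be listed:
kubectl -n rook-ceph get cephobjectstore niyez-bm-store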
The manifest above is stored in a repo that ArgoCD references through a kustomization.yaml file; a sketch of such an ArgoCD Application follows the resource list below.
resources:
- ceph-object-store.yaml
- ceph-bucket-storageclass.yaml
- object-bucket-claim.yaml
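For reference, an ArgoCD Application pointing at such a repo might look like the sketch below; the repoURL, path, and Application name here are placeholders, not the actual repo.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: rook-ceph-object-storage   # hypothetical Application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/your/manifests-repo.git  # placeholder repo
    targetRevision: main
    path: rook-ceph/object-storage  # placeholder path containing kustomization.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: rook-ceph
  syncPolicy:
    automated:
      prune: false
      selfHeal: true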
# To confirm the object store is configured, wait for the rgw pod to start
kubectl -n rook-ceph get pod -l app=rook-ceph-rgw
Connect to the Object Store
You can now find the service through which the store is accessed:
kubectl -n rook-ceph get svc -l app=rook-ceph-rgw
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rook-ceph-rgw-niyez-bm-store ClusterIP 10.42.15.8 <none> 80/TCP 4d23h
Any pod from your cluster can now access this endpoint:
$ curl 10.42.15.8:80
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="https://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
It is also possible to use the internally registered DNS name:
curl rook-ceph-rgw-niyez-bm-store.rook-ceph:80
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="https://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
The DNS name is created with the following schema: rook-ceph-rgw-$STORE_NAME.$NAMESPACE.
Create a Bucket
Now that the object store is configured, we need to create a bucket where a client can read and write objects. A bucket can be created by defining a storageClass, similar to the pattern used by block and file storage. First, we define the storageClass that will allow object clients to create a bucket. The storageClass defines the object storage system, the bucket retention policy, and other properties required by the administrator.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-bucket
  # labels:
  #   aws-s3/object
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.ceph.rook.io/bucket
parameters:
  objectStoreName: niyez-bm-store
  objectStoreNamespace: rook-ceph
  region: us-east-1
  # bucketName: {}
reclaimPolicy: Delete
If you’ve deployed the Rook operator in a namespace other than rook-ceph, change the prefix in the provisioner to match the namespace you used. For example, if the Rook operator is running in the namespace my-namespace, the provisioner value should be my-namespace.ceph.rook.io/bucket. Once the StorageClass manifest file is ready, you can sync it with ArgoCD for deployment to the Kubernetes cluster.
Based on this storageClass, an object client can now request a bucket by creating an Object Bucket Claim (OBC). When the OBC is created, the Rook-Ceph bucket provisioner will create a new bucket. Notice that the OBC references the storageClass that was created above.
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ceph-bucket
  # namespace: rook-ceph
spec:
  # bucketName: {}
  generateBucketName: loki-chunks-ceph-bkt
  storageClassName: rook-ceph-bucket
  # additionalConfig:
  #   maxObjects: "1000"
  #   maxSize: "2G"
To get the bucketName from within the cluster, run the command below to get the OBC name, and then use -o yaml to retrieve the bucketName.
olaniyi@k8-bastion:~$ kubectl get objectbucketclaims.objectbucket.io -n rook-ceph
Output:
NAME AGE
ceph-bucket 4d23h
Use the OBC name above to get the bucketName:
olaniyi@k8-bastion:~$ kubectl get objectbucketclaims.objectbucket.io ceph-bucket -n rook-ceph -o yaml
Output:
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"objectbucket.io/v1alpha1","kind":"ObjectBucketClaim","metadata":{"annotations":{},"labels":{"app.kubernetes.io/instance":"rook-ceph-pool-sc"},"name":"ceph-bucket","namespace":"rook-ceph"},"spec":{"generateBucketName":"loki-chunks-ceph-bkt","storageClassName":"rook-ceph-bucket"}}
  creationTimestamp: "2021-11-03T12:54:51Z"
  finalizers:
  - objectbucket.io/finalizer
  generation: 4
  labels:
    app.kubernetes.io/instance: rook-ceph-pool-sc
    bucket-provisioner: rook-ceph.ceph.rook.io-bucket
  name: ceph-bucket
  namespace: rook-ceph
  resourceVersion: "6052"
  uid: e0d54581-1212-424a-b19d-92626ae7693e
spec:
  bucketName: loki-chunks-ceph-bkt-83d40a78-4598-457a-80b5-e73cec2a18e9
  generateBucketName: loki-chunks-ceph-bkt
  objectBucketName: obc-rook-ceph-ceph-bucket
  storageClassName: rook-ceph-bucket
status:
  phase: Bound
You can see from the output above that the bucketName has been populated: a random suffix was appended to the name we supplied for generateBucketName (loki-chunks-ceph-bkt), yielding loki-chunks-ceph-bkt-83d40a78-4598-457a-80b5-e73cec2a18e9. Randomly generating the bucketName guarantees a unique bucket name and adds a measure of security.
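As a shortcut, the generated bucketName can also be read directly with a jsonpath query:
kubectl -n rook-ceph get objectbucketclaim ceph-bucket -o jsonpath='{.spec.bucketName}'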
Now that the claim is created, the operator will create the bucket as well as generate other artifacts to enable access to the bucket. A Secret and a ConfigMap are created with the same name as the OBC and in the same namespace. The Secret contains credentials used by the application pod to access the bucket. The ConfigMap contains bucket endpoint information and is also consumed by the pod. See the Object Bucket Claim Documentation for more details on the CephObjectBucketClaims.
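For example, to inspect these artifacts (both share the OBC's name, ceph-bucket, in the same namespace):
kubectl -n rook-ceph get secret ceph-bucket -o yaml
kubectl -n rook-ceph get configmap ceph-bucket -o yaml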
Consume the Object Storage
Now that we have the object store configured and a bucket created, we can consume the object storage from an S3 client. We will test the connection to the CephObjectStore and upload to and download from it. We will need the Rook toolbox to accomplish this task.
Deploy Rook Toolbox
The Rook toolbox is a container with common tools used for Rook debugging and testing. The toolbox is based on CentOS, so more tools of your choosing can be easily installed with yum.
The toolbox can be run in two modes, interactive or as a one-time job; we focus only on the interactive mode in this documentation.
Prerequisite: Before running the toolbox you should have a running Rook cluster deployed.
Interactive Toolbox
The rook toolbox can run as a deployment in a Kubernetes cluster where we can connect and run arbitrary Ceph commands.
Save the tools spec as toolbox.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rook-ceph-tools
  namespace: rook-ceph
  labels:
    app: rook-ceph-tools
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rook-ceph-tools
  template:
    metadata:
      labels:
        app: rook-ceph-tools
    spec:
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: rook-ceph-tools
        image: rook/ceph:v1.6.10
        command: ["/tini"]
        args: ["-g", "--", "/usr/local/bin/toolbox.sh"]
        imagePullPolicy: IfNotPresent
        env:
        - name: ROOK_CEPH_USERNAME
          valueFrom:
            secretKeyRef:
              name: rook-ceph-mon
              key: ceph-username
        - name: ROOK_CEPH_SECRET
          valueFrom:
            secretKeyRef:
              name: rook-ceph-mon
              key: ceph-secret
        volumeMounts:
        - mountPath: /etc/ceph
          name: ceph-config
        - name: mon-endpoint-volume
          mountPath: /etc/rook
      volumes:
      - name: mon-endpoint-volume
        configMap:
          name: rook-ceph-mon-endpoints
          items:
          - key: data
            path: mon-endpoints
      - name: ceph-config
        emptyDir: {}
      tolerations:
      - key: "node.kubernetes.io/unreachable"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 5
Launch the rook-ceph-tools pod:
kubectl create -f toolbox.yaml
Wait for the toolbox pod to pull its container image and reach the Running state:
kubectl -n rook-ceph rollout status deploy/rook-ceph-tools
Once the rook-ceph-tools pod is running, you can connect to it with:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
All available tools in the toolbox are ready for your troubleshooting needs. For example, you can run ceph status to check the health of the cluster.
Connection Environment Variables
To simplify the S3 client commands, we will set four environment variables for use by our client (i.e., inside the toolbox). See below for retrieving the variables for a bucket created by an ObjectBucketClaim.
The following commands extract key pieces of information from the Secret and ConfigMap:
# The ConfigMap, Secret, and OBC will be in the default namespace if no specific namespace is mentioned
export AWS_HOST=$(kubectl -n rook-ceph get cm ceph-bucket -o jsonpath='{.data.BUCKET_HOST}')
export AWS_ACCESS_KEY_ID=$(kubectl -n rook-ceph get secret ceph-bucket -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 --decode)
export AWS_SECRET_ACCESS_KEY=$(kubectl -n rook-ceph get secret ceph-bucket -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 --decode)
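The commands above extract three of the variables; AWS_ENDPOINT can be assembled the same way (a sketch, assuming the standard BUCKET_HOST and BUCKET_PORT keys in the OBC ConfigMap):
export AWS_ENDPOINT=$(kubectl -n rook-ceph get cm ceph-bucket -o jsonpath='{.data.BUCKET_HOST}'):$(kubectl -n rook-ceph get cm ceph-bucket -o jsonpath='{.data.BUCKET_PORT}')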
The variables for the user generated in this example might be:
export AWS_HOST=rook-ceph-rgw-niyez-bm-store.rook-ceph.svc
export AWS_ENDPOINT=10.42.15.8:80
export AWS_ACCESS_KEY_ID=694QTIG4NGD2FGSUZMTX
export AWS_SECRET_ACCESS_KEY=lpIDn8OCqg0mnYsgoBye4y55cX082WNscu0atpzp
Install s3cmd
To test the CephObjectStore we will install the s3cmd tool into the toolbox pod.
yum --assumeyes install s3cmd
PUT or GET an object
Upload a file to the newly created bucket
echo "Hello Rook" > /tmp/rookObj
s3cmd put /tmp/rookObj --no-ssl --host=${AWS_HOST} --host-bucket= s3://loki-chunks-ceph-bkt-83d40a78-4598-457a-80b5-e73cec2a18e9
Download and verify the file from the bucket
s3cmd get s3://loki-chunks-ceph-bkt-83d40a78-4598-457a-80b5-e73cec2a18e9/rookObj /tmp/rookObj-download --no-ssl --host=${AWS_HOST} --host-bucket=
cat /tmp/rookObj-download
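You can also list the bucket to confirm the object landed, using the same connection flags:
s3cmd ls s3://loki-chunks-ceph-bkt-83d40a78-4598-457a-80b5-e73cec2a18e9 --no-ssl --host=${AWS_HOST} --host-bucket=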
When you are done testing with the toolbox, you can remove the toolbox deployment:
kubectl -n rook-ceph delete deploy/rook-ceph-tools