Kubernetes Homelab: Setting up Storage (Rook and Ceph)

Introduction

In this post, I’ll walk you through setting up storage for a Kubernetes homelab using Rook and Ceph. This setup provides scalable, resilient storage that can simulate production-grade environments for learning and testing.

Hardware Setup

  • Kubernetes Engine: Rancher Kubernetes Engine 2 (RKE2)
  • Master Node: 1 node (control plane)
  • Worker Nodes: 3 nodes
  • Storage: Each worker node has a 100 GB unpartitioned disk (used by Ceph for storage)
  • Environment: Kubernetes homelab running on virtual machines

Why Rook and Ceph?

When deploying and managing Kubernetes clusters, one of the common challenges is how to manage persistent storage. Containers alone are not well suited to applications that require persistent storage, especially when that storage needs to survive restarts. This is where storage solutions like Rook and Ceph come into play.

Rook

Rook is an open-source storage orchestrator that automates the deployment and management of Ceph to provide self-managing, self-scaling, and self-healing storage services.

Ceph

Ceph is a distributed storage system that provides file, block and object storage. In a typical Ceph cluster, data is distributed across multiple nodes in a way that ensures redundancy and high availability. Ceph automatically replicates data and manages failures, making it ideal for mission-critical applications.

Architecture Overview

Rook Ceph Architecture

  • Rook Operator: Manages the lifecycle of the Ceph cluster inside Kubernetes.
  • CSI Plugins: Handle the mounting and unmounting of volumes to pods.
  • Provisioner: Automatically creates new Ceph volumes when PVCs are made.
  • OSD (Object Storage Daemon): Physically stores the data and handles replication.
  • mgr: Manages the overall Ceph cluster, handles monitoring, and exposes the Ceph Dashboard.
  • mon: Keeps track of the cluster's health and makes sure all parts of the storage system agree on the current state.
  • rook-ceph-exporter: Exports Ceph metrics for Prometheus and monitoring.
  • rbdplugin: Manages block storage (RBD) and allows Kubernetes to mount block devices as persistent volumes.

Prerequisites

  • Kubernetes Version: v1.28 through v1.32 are supported.
  • When installing a Ceph cluster, allocate storage (HDD or SSD) without creating any partitions; Ceph manages the raw drives and combines them into a single storage pool. I have allocated a 100 GB HDD on each of the three worker nodes (see the disk check below).
  • Before installation, lvm2 must be installed on every Kubernetes node:
sudo yum install -y lvm2        # CentOS or RHEL based systems
sudo apt-get install -y lvm2    # Ubuntu or Debian based systems
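To confirm that the disk Ceph will consume is unpartitioned and carries no filesystem, you can inspect it with lsblk. The device name /dev/sdb below is only an assumption; substitute whatever your extra disk shows up as.

lsblk -f /dev/sdb    # a raw disk shows no child partitions and an empty FSTYPE column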

Installation

Before installing, clone the Rook Git repository, which contains all the example configurations.

git clone --single-branch --branch master https://github.com/rook/rook.git

Navigate to the examples directory.

cd rook/deploy/examples

Step 1: Install CRDs and the Rook Operator

kubectl create -f crds.yaml -f common.yaml -f operator.yaml
  • crds.yaml: contains all the custom resource definitions (CRDs) required by Rook.
  • common.yaml: contains RBAC resources, namespace definitions, service accounts, and other shared resources.
  • operator.yaml: deploys the Rook operator, the brain of the Rook-Ceph setup; it manages the lifecycle of the Ceph components.

Note: Ceph defaults to three mons. If you have fewer than three worker nodes, edit cluster.yaml and change the mon count (.spec.mon.count) to 2 or 1, depending on your node count; three is recommended for a production cluster.
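Here is a sketch of the relevant portion of cluster.yaml; the surrounding fields are omitted and the values shown are only illustrative.

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  mon:
    count: 3                     # reduce to 2 or 1 if you have fewer than three worker nodes
    allowMultiplePerNode: false  # keep each mon on a separate node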

Once the operator is up and running (see the check below), we can create the Ceph cluster.
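A quick way to confirm the operator pod is running before moving on; the label selector here assumes the default labels from operator.yaml.

kubectl -n rook-ceph get pods -l app=rook-ceph-operator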

Step 2: Install Ceph cluster

kubectl create -f cluster.yaml

cluster.yaml targets bare-metal Kubernetes. If you are running a test environment (such as minikube) or a cloud environment, apply one of the alternative manifests instead, for example cluster-test.yaml or cluster-on-pvc.yaml.
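You can watch the mon, mgr, and OSD pods come up while the cluster is being provisioned:

kubectl -n rook-ceph get pods -w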

Optional but important: also install toolbox.yaml so that you can query the status of the Ceph cluster.

Step 3: Install toolbox

kubectl apply -f toolbox.yaml

To check the health of the Ceph cluster, open a shell inside the tools pod (substitute your own pod name):

kubectl -n rook-ceph exec -it rook-ceph-tools-b48d79d8b-89m8v -- bash
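If you would rather not look up the pod name, you can also exec into the toolbox deployment directly, assuming the default deployment name rook-ceph-tools:

kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash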

Inside the toolbox, use ceph status to check the overall status of the cluster.

bash-4.4$ ceph status
  cluster:
    id:     ff587402-5139-40a2-ac2a-f6a0de371dc6
    health: HEALTH_WARN
            mons a,b,c are low on available space

  services:
    mon: 3 daemons, quorum a,b,c (age 40m)
    mgr: a(active, since 36m), standbys: b
    osd: 3 osds: 3 up (since 40m), 3 in (since 2d)

  data:
    pools:   2 pools, 33 pgs
    objects: 621 objects, 2.1 GiB
    usage:   4.2 GiB used, 296 GiB / 300 GiB avail
    pgs:     33 active+clean

  io:
    client:   14 KiB/s wr, 0 op/s rd, 0 op/s wr

You can also inspect the OSDs with ceph osd status.

bash-4.4$ ceph osd status
ID  HOST          USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
 0  workernode1  1456M  98.5G      0        0       0        0   exists,up
 1  workernode2  1147M  98.8G      0     6553       0        0   exists,up
 2  workernode3  1787M  98.2G      0        0       0        0   exists,up

To check raw storage and pool usage, use ceph df.

bash-4.4$ ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    300 GiB  296 GiB  4.3 GiB   4.3 GiB       1.43
TOTAL  300 GiB  296 GiB  4.3 GiB   4.3 GiB       1.43

--- POOLS ---
POOL         ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
.mgr          1    1  577 KiB        2  1.7 MiB      0     93 GiB
homelabpool   2   32  2.1 GiB      626  4.1 GiB   1.45    140 GiB

Creating Storage Classes

Rook-Ceph can provide three types of storage: block, filesystem, and object. For each type, we first need to create a Ceph pool and a StorageClass.

A StorageClass tells Kubernetes what type of Ceph storage to provision and how to interact with Rook so that PVs are provisioned automatically when a PVC is created.

A Ceph Pool is a logical grouping of storage resources in a Ceph cluster.

Block Storage

Navigate to rook/deploy/examples/csi/rbd.

cd rook/deploy/examples/csi/rbd

Apply the storageclass.yaml

kubectl apply -f storageclass.yaml

This file contains two sections:

  • a CephBlockPool named replicapool, and
  • a StorageClass named rook-ceph-block.

The replicated size of the CephBlockPool is 3, so the data behind any PVC you create in the cluster is stored as three replicas.

spec:
  failureDomain: host
  replicated:
    size: 3
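Once the pool and StorageClass exist, a PVC can request block storage from them. This is a minimal sketch: the claim name and size are only examples, while the StorageClass name rook-ceph-block comes from the manifest above.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc                       # example name, choose your own
spec:
  accessModes:
    - ReadWriteOnce                   # RBD block volumes are attached to a single node at a time
  resources:
    requests:
      storage: 5Gi
  storageClassName: rook-ceph-block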

Filesystem Storage

To create filesystem storage, navigate to rook/deploy/examples/csi/cephfs. Note that the CephFS StorageClass references a CephFilesystem (the Rook examples define one in filesystem.yaml under deploy/examples), so create that first if you have not already.

cd rook/deploy/examples/csi/cephfs

Apply storageclass.yaml

kubectl apply -f storageclass.yaml
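A CephFS-backed PVC can then be created in much the same way as the block example. This is a sketch: the claim name and size are illustrative, and the StorageClass name rook-cephfs assumes the default name from the Rook CephFS example.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc                    # example name, choose your own
spec:
  accessModes:
    - ReadWriteMany                   # CephFS volumes can be shared by multiple pods
  resources:
    requests:
      storage: 2Gi
  storageClassName: rook-cephfs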

Object Storage

Finally, to create object storage, navigate back to rook/deploy/examples

cd rook/deploy/examples

and apply the object.yaml file, which defines a CephObjectStore.

kubectl apply -f object.yaml
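To actually consume the object store from applications, one option is an ObjectBucketClaim. The sketch below assumes you have also applied the bucket StorageClass from the Rook examples (for instance storageclass-bucket-delete.yaml, which creates a StorageClass named rook-ceph-bucket); the claim and bucket names are only examples.

apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: homelab-bucket                # example name, choose your own
spec:
  generateBucketName: homelab         # prefix for the generated S3 bucket name
  storageClassName: rook-ceph-bucket  # assumes the bucket StorageClass from the examples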

Troubleshooting

  • If the lvm2 package is not installed, the OSD pods will fail and the mon pods will not be created, so make sure it is installed on every node.
  • If the csi-rbdplugin pod is stuck on the message Still connecting to unix:///csi/csi.sock, run modprobe rbd on all nodes to load the RBD kernel module (see the snippet below).
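A minimal sketch for loading the module now and on every boot, assuming a systemd-based distribution:

sudo modprobe rbd                                  # load the RBD kernel module immediately
echo rbd | sudo tee /etc/modules-load.d/rbd.conf   # load it automatically at boot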

Summary

Setting up Rook and Ceph in a Kubernetes homelab provides a powerful way to simulate production-grade storage environments. With proper hardware and configuration, you can create scalable, self-healing, and persistent storage for your workloads. By following this guide, you deployed Rook, configured a Ceph cluster, and enabled block, filesystem, and object storage—all with built-in redundancy and monitoring support. This setup not only enhances your Kubernetes lab experience but also helps you build hands-on skills for real-world scenarios.

GitHub: abhinav1015 / Devops-Homelab

This repository contains the configuration files and deployment manifests for the DevOps tools and infrastructure used in my Kubernetes homelab (RKE2, ArgoCD, GitLab, Cert-Manager, Rook & Ceph, MetalLB), managed with GitOps via ArgoCD.