One of the less straightforward aspects of running applications in Kubernetes is the stateful data store.
I will be showing two relational database patterns in Azure Kubernetes Service (AKS). Containers are the fundamental building blocks, and they are stateless by default. That is, a container can lose its data when it is terminated or fails. Stateful components in Kubernetes therefore need more design, setup and management effort, and this is very much the case for relational databases. The two persistent database approaches for Azure Kubernetes are:
- In-cluster database
  - MySQL containerized database in a Kubernetes pod
  - Persistent Storage provisioned as Azure Managed Disks
- Azure PaaS Database
  - External to the Kubernetes cluster
  - With managed Identity for
  - Networking via a private link
I will show a containerized MySQL database using the demo "Deploying WordPress and MySQL with Persistent Volumes". This is representative of a standard web application with a relational database that is not highly available.
I have set up the WordPress and MySQL application as public facing. The WordPress deployment also uses Azure Managed Disks as a file store, but the focus of this article is the MySQL database.
Azure Virtual Machine Scale Sets
- Each virtual machine instance in the scale set is a worker node in the Kubernetes cluster. The nodes host the Kubernetes pods, which are ephemeral. Persistent data therefore needs to be managed outside of the pod.
One typical approach to setting up persistent storage in an Azure cloud environment is to use Azure Managed Disks. When the MySQL pod is terminated and recreated, the disk holding the data can be re-attached to the new pod. A managed disk is attached to only one VM node, and only the pods on that node can read from or write to it. As a result, there is relatively tighter security and lower latency between the VM node and the Azure Managed Disk.
For more details of Azure Managed Disks, here is my brief summary:
- Highly durable and available
  - 99.999% availability
  - Integration with availability sets, which ensures that the disks of VMs in an availability set are sufficiently isolated from each other to avoid a single point of failure
- Server-side encryption (SSE) performed by the storage service
- Azure Disk Encryption (ADE), which can be enabled on the OS and data disks
- Azure Backup
  - Create a backup job with time-based backups and backup retention policies
- Role-based access control
  - Assign specific permissions for a managed disk to one or more users. Managed disks expose a variety of operations, including read, write (create/update), delete, and retrieving a shared access signature (SAS) URI for the disk
For complete details on Azure Managed Disks, read here.
Kubernetes Objects View
In a detailed view with respect to Kubernetes objects, we have the following:
- Pod – The MySQL database pod
- Persistent Volume Claim – A request for storage made by a user or application. With dynamic provisioning, the persistent volume is created on demand rather than prepared in advance by an administrator.
- Storage Class – The type or “class” of storage made available by the cloud infrastructure.
- Persistent Volume – The abstraction of a piece of storage resource.
For full details of what these are and their properties, you can read Kubernetes Persistent Volumes.
The following is the sequence of operations in provisioning this containerized database and its persistent storage.
- The pod holding the MySQL containerized application defines a volume that refers to a specific Persistent Volume Claim. This is specified in the deployment manifest YAML.
- The Persistent Volume Claim (PVC) specifies the access mode, the amount of storage and the storage class. It is essentially an assertion or “request” for the specified storage. The type of storage is selected through the storage class.
- The storage class used is built into Azure Kubernetes Service and is named “managed-premium”, which is a premium SSD Azure Managed Disk.
- If a persistent volume matching the PVC “request” already exists, the PVC binds to it. Otherwise, Kubernetes provisions the persistent volume object along with the Azure Managed Disk.
- The Azure Managed Disk is created and attached to the VM node that hosts the pod of the containerized database.
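To make the relationship between the claim and the deployment concrete, here is a minimal sketch that builds both manifests as Python dictionaries and prints them as a JSON `List`, which `kubectl apply -f` accepts in the same way as YAML. The object names, the `mysql:8.0` image, the `mysql-pass` secret, the 20Gi size and the `Recreate` strategy are my assumptions for illustration and are not taken from the demo itself.

```python
import json

# Hypothetical names for illustration only.
CLAIM_NAME = "mysql-pv-claim"
STORAGE_CLASS = "managed-premium"  # AKS built-in class for premium SSD managed disks


def build_pvc(name, storage_class, size):
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume is mounted read/write by a single node.
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


def build_mysql_deployment(claim_name):
    """Return a single-replica MySQL Deployment that mounts the given claim."""
    container = {
        "name": "mysql",
        "image": "mysql:8.0",  # assumed image tag
        "env": [{
            "name": "MYSQL_ROOT_PASSWORD",
            # Assumes a Secret named mysql-pass with a key named password.
            "valueFrom": {"secretKeyRef": {"name": "mysql-pass", "key": "password"}},
        }],
        "ports": [{"containerPort": 3306}],
        "volumeMounts": [
            {"name": "mysql-persistent-storage", "mountPath": "/var/lib/mysql"}
        ],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "mysql"},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": "mysql"}},
            # Recreate stops the old pod before starting the new one,
            # so the single-attach disk is released first.
            "strategy": {"type": "Recreate"},
            "template": {
                "metadata": {"labels": {"app": "mysql"}},
                "spec": {
                    "containers": [container],
                    # The pod volume refers to the claim by name.
                    "volumes": [{
                        "name": "mysql-persistent-storage",
                        "persistentVolumeClaim": {"claimName": claim_name},
                    }],
                },
            },
        },
    }


def render_manifests():
    """Serialize both objects as one JSON List document."""
    items = [
        build_pvc(CLAIM_NAME, STORAGE_CLASS, "20Gi"),
        build_mysql_deployment(CLAIM_NAME),
    ]
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2)


if __name__ == "__main__":
    print(render_manifests())
```

The only link between the two objects is the claim name: the deployment's `claimName` must match the PVC's `metadata.name`. Saving the output to a file and applying it with `kubectl apply -f` would be one way to try it, assuming the secret already exists in the cluster.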
From a DevOps engineering or developer perspective, this database pattern only requires you to create the persistent volume claim YAML file and reference it in your deployment YAML. This is the dynamic provisioning approach and is likely the most typical of the available approaches. There are many pieces to arrange and dots to connect, but I hope this gives you a simplified starting point without having to read all the documentation and every possible option.
For a demo and further explanation, see my YouTube video.
Next, I will be looking at a .NET Core web app connecting to an Azure SQL Database with Managed Identity. Blog post coming soon.
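The "bind if it exists, otherwise provision" decision behind dynamic provisioning can be illustrated with a small toy model. This is only a sketch of the idea, not how the Kubernetes controllers are implemented; the class names, the generated volume names and the matching rules are simplified assumptions of mine.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PersistentVolume:
    name: str
    storage_class: str
    capacity_gi: int
    claimed_by: Optional[str] = None  # name of the claim bound to this volume


@dataclass
class Claim:
    name: str
    storage_class: str
    request_gi: int


@dataclass
class ToyCluster:
    """Hypothetical stand-in for the cluster's volume bookkeeping."""
    volumes: list = field(default_factory=list)
    provisioned: int = 0  # how many volumes (disks) were created on demand

    def bind(self, claim: Claim) -> PersistentVolume:
        # 1. The claim is already bound: reuse the same volume, e.g. after
        #    the pod was terminated and recreated.
        for pv in self.volumes:
            if pv.claimed_by == claim.name:
                return pv
        # 2. An unbound volume of the right class and size exists: bind to it.
        for pv in self.volumes:
            if (pv.claimed_by is None
                    and pv.storage_class == claim.storage_class
                    and pv.capacity_gi >= claim.request_gi):
                pv.claimed_by = claim.name
                return pv
        # 3. Nothing matches: provision a new volume for the claim, which in
        #    AKS corresponds to creating an Azure Managed Disk.
        self.provisioned += 1
        pv = PersistentVolume(
            name=f"pv-{self.provisioned}",
            storage_class=claim.storage_class,
            capacity_gi=claim.request_gi,
            claimed_by=claim.name,
        )
        self.volumes.append(pv)
        return pv


if __name__ == "__main__":
    cluster = ToyCluster()
    claim = Claim("mysql-pv-claim", "managed-premium", 20)
    first = cluster.bind(claim)   # provisions a new volume
    second = cluster.bind(claim)  # returns the same volume again
    print(first.name, first is second, cluster.provisioned)
```

The point of the model is that the claim, not the pod, owns the relationship to the storage, which is why recreating the pod does not create a new disk.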