Transforming a single-tenant system to support multi-tenancy with Kubernetes
Here’s a shout-out to Rajula Vineet Reddy & Francisco Borges Aurindo Barros of CERN, who gave a very interesting talk at KubeCon 2023 on how they manage multiple Drupal sites using a Kubernetes operator.
This talk highlighted a problem our customers often encounter and showed a Kubernetes path to solving it.
Single Tenant, Multi-tenant: What is the tenant issue?
A system is often developed with one “customer” in mind – an organization, a department, etc.
It makes sense and simplifies things.
But then, when the time comes to move to a SaaS model to serve multiple organizations (a.k.a. tenants), the team behind the system faces data and security challenges.
At that point, these questions typically come up:
– How do we make sure data from separate tenants is not mixed, even in cases of bugs?
– How do we separate users, roles, and admins?
– How do we ensure a bad actor in one tenant will not affect others?
Etc.
And refactoring a single-tenant system to support multi-tenancy can be a real headache.
The challenge the CERN experts presented is a great example in itself – the system they needed to make multi-tenant is the Drupal CMS. Certainly, they did not intend to rewrite it…
Namespaces to the rescue
Kubernetes presents a relatively simple mechanism to solve this issue – namespaces.
If we have an easy and straightforward deployment method, such as a Helm chart, that installs the entire single-tenant system, including all services and databases, within its own namespace, then we can deploy one copy of the system per tenant, each in a separate namespace.
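As a rough illustration of the one-namespace-per-tenant idea, here is a minimal Python sketch that derives a namespace name from a tenant identifier and builds the matching Helm command. The chart path `./charts/my-app`, the `tenant-` prefix, and the `tenant.id` chart value are assumptions made up for this example; they do not come from the CERN talk.

```python
import re


def tenant_namespace(tenant_id: str) -> str:
    """Derive a namespace name (lowercase letters, digits, dashes) from a tenant id."""
    slug = re.sub(r"[^a-z0-9-]+", "-", tenant_id.lower()).strip("-")
    if not slug:
        raise ValueError(f"cannot derive a namespace from {tenant_id!r}")
    # Namespace names are limited to 63 characters.
    return f"tenant-{slug}"[:63].rstrip("-")


def helm_install_command(tenant_id: str, chart: str = "./charts/my-app") -> list[str]:
    """Build the Helm command that installs (or upgrades) one tenant's copy.

    The chart path and the `tenant.id` value are hypothetical placeholders.
    """
    namespace = tenant_namespace(tenant_id)
    return [
        "helm", "upgrade", "--install", namespace, chart,
        "--namespace", namespace,
        "--create-namespace",
        "--set", f"tenant.id={tenant_id}",
    ]


if __name__ == "__main__":
    for tenant in ["Acme Corp", "globex"]:
        print(" ".join(helm_install_command(tenant)))
```

The command list can be handed to `subprocess.run` as is. Using the namespace name as the Helm release name is just one convention that keeps each tenant easy to find.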
But is this enough? Only barely.
Deployment is only a single part of the overall lifecycle of managing a multi-tenant application on Kubernetes.
There are typically many operations that must now be applied per tenant. For example:
– Managing tenants – when do we add a tenant? Remove it? How?
– Upgrading deployments, schemas, and software – how do we time it for multiple tenants? Control it?
– Connecting to central services, e.g., billing, and the list goes on.
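To make the upgrade-timing question a bit more concrete, here is one possible approach, sketched in Python: upgrade a few canary tenants first, then roll through the remaining tenants in fixed-size batches. The function name, the canary idea, and the batch size are illustrative assumptions, not something the talk prescribes.

```python
def plan_upgrade_waves(tenants: list[str], canaries: list[str], batch_size: int) -> list[list[str]]:
    """Split tenants into upgrade waves: canaries first, then fixed-size batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    canary_set = set(canaries)
    first_wave = [t for t in tenants if t in canary_set]
    remaining = [t for t in tenants if t not in canary_set]

    waves = [first_wave] if first_wave else []
    for start in range(0, len(remaining), batch_size):
        waves.append(remaining[start:start + batch_size])
    return waves


if __name__ == "__main__":
    waves = plan_upgrade_waves(["a", "b", "c", "d", "e"], canaries=["c"], batch_size=2)
    for number, wave in enumerate(waves, start=1):
        print(f"wave {number}: {wave}")
```

Each wave would only start once the previous one is healthy, which is exactly the kind of decision that is tedious for a human and easy to encode.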
Operations? Operator!
At this stage, we notice a classic organizational pattern: an IT system needs a human operator to manage the various operations involved in the lifecycle of an application on Kubernetes.
Time to automate the human operator’s work, and in Kubernetes lingo – time to write an “Operator” 🙂
Unlike a Helm chart, which has a single event in its lifecycle – “apply”, Kubernetes operators live within the system and can continuously compare the desired state (expressed as resource definitions) against the actual system state.
This process is called “reconciliation,” and when the operator automatically reconciles the actual state with the desired state, it also maintains real-time status, emits relevant logs, and frees humans from manual work while providing observability.
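The core of reconciliation can be sketched in a few lines of Python. This is a simplified in-memory model, not a real operator: the desired and actual states are plain dictionaries mapping a tenant name to an application version, whereas a real operator would watch resources through the Kubernetes API. The `Action` type and the function names are assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    verb: str    # "create", "upgrade" or "delete"
    tenant: str


def reconcile(desired: dict[str, str], actual: dict[str, str]) -> list[Action]:
    """Compute the actions that bring the actual state in line with the desired state."""
    actions = []
    for tenant, version in sorted(desired.items()):
        if tenant not in actual:
            actions.append(Action("create", tenant))
        elif actual[tenant] != version:
            actions.append(Action("upgrade", tenant))
    for tenant in sorted(actual):
        if tenant not in desired:
            actions.append(Action("delete", tenant))
    return actions


def apply_actions(actions: list[Action], desired: dict[str, str], actual: dict[str, str]) -> dict[str, str]:
    """Return the new actual state after carrying out the actions (simulated here)."""
    new_state = dict(actual)
    for action in actions:
        if action.verb == "delete":
            new_state.pop(action.tenant)
        else:
            new_state[action.tenant] = desired[action.tenant]
    return new_state


if __name__ == "__main__":
    desired = {"acme": "2.0", "globex": "1.0"}
    actual = {"acme": "1.0", "initech": "1.0"}
    actions = reconcile(desired, actual)
    print(actions)
    actual = apply_actions(actions, desired, actual)
    print(reconcile(desired, actual))  # nothing left to do once the states match
```

The useful property is that the loop is idempotent: running it again after the states match produces no actions, so it is safe to run continuously.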
Of course, writing a custom operator is often a non-trivial task, and it can take anywhere from days to weeks or even months. But for an existing single-tenant, Kubernetes-based system, most of these tasks are often automated already. If that is the case, then writing the glue code as an operator may actually be much simpler.
Is this the only way?
As the old saying goes – there are many ways to crack a nut. Deploying and maintaining multiple copies of the same application into several namespaces can also be achieved using GitOps patterns or automated with other tools.
But this problem fits very nicely into the custom operator pattern, which, for me, is an indication that it is a correct use of that specific technology.
Would you agree? Or would you suggest another alternative?