Many people have asked me how Duplo compares to Kubernetes and Mesos. Duplo originally started as a cloud-agnostic container management system but later evolved into an AWS-specific micro-services platform. In addition to container management, it integrates various AWS services with Docker containers, provides a CI/CD pipeline for GitHub repos, and integrates with third-party logging, billing and monitoring tools like Sumo Logic, Cloudability and SignalFx, among other things. Duplo’s CI/CD component provides a native build system, with the option to use CircleCI instead. So container management, which is the primary function of Kubernetes and Mesos, is only one of Duplo’s many functions. When building Duplo’s container management I started by looking at Kubernetes, but instead chose to build it on top of Microsoft’s Azure Pack framework (** 09/26 - I have added an additional option for using Google OAuth and a Bootstrap-based UI without using Azure Pack). The following are the reasons Kubernetes and Mesos did not fit the requirements:
- Multi-Tenancy: Neither Kubernetes nor Mesos has a notion of a tenant. They have tags, which can be used to map containers to specific hosts, but they have neither identity nor authentication at the tenant level. The agents themselves are tenant-agnostic as well. It is certainly possible to add a layer of tenancy on top of Kubernetes (or Mesos) and orchestrate it. Alternatively, one can create an isolated deployment of Kubernetes (or Mesos) per tenant. Microsoft for now implements the latter with Mesos to achieve a multi-tenant container service in its Azure cloud. Duplo, on the other hand, is based on Azure Pack, which is fundamentally multi-tenant. In Duplo the full stack is multi-tenant, all the way from the tenant login in the UI through the identity provider, API, scheduler and worker pools to the agent. Duplo adopts a very typical public cloud architecture suitable for self-service usage, while Kubernetes evolved from a private cloud model suited to an administrator-controlled system like VMware’s vCenter.
- Cloud Storage and Stateless: Kubernetes, again coming from a private cloud use case, adopts concepts like data replication across controllers and Paxos, ZooKeeper or an equivalent leader election protocol. The servers are stateful and backups need to be managed. I wanted the servers to be stateless, with a cloud-backed storage mechanism like S3 where we can rely on the provider’s reliability. Leader election can be achieved with 10 lines of code against DynamoDB. In the Duplo architecture each controller host is independent and stateless: they all try to acquire a lock in DynamoDB with a lease, and whichever one holds the lock acts as the leader and uses S3 as the data store.
- Networking (ELB, SSL Certs and DNS Integration): Kubernetes has the concept of kube-proxy but is agnostic to any real load balancer in the environment, and it is up to an additional layer of orchestration to map the service to the load balancer. SSL termination and DNS are out of scope and have to be managed by an orchestration layer on top. Duplo natively integrates ELB functionality. Each service is allowed to choose “any” port to be load balanced. Duplo also supports HTTPS termination at the ELB, AWS SSL certificates, and DNS configuration by integrating with Route 53 and UltraDNS. Duplo has a per-tenant network overlay with a flat container address space.
- Third Party Logging, Billing and Monitoring: When new hosts are added to the pool, Duplo automatically injects third-party containers like the Sumo Logic and SignalFx collectors and sets them up to provide an integrated logging, billing and monitoring experience. In Kubernetes this again has to be done by an orchestration layer.
- Administrative Policy Framework: By leveraging the Azure Pack Admin portal and APIs, Duplo provides a policy framework where administrators can control tenant functions, for example by setting limits, public access policy, host volume mapping and other privileges.
- User Interface: Azure Pack provides nice Tenant and Admin portals where Duplo is exposed as a resource provider.
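To make the lease-based leader election from the Cloud Storage and Stateless point concrete, here is a minimal sketch of the idea. It is not Duplo's actual code: `InMemoryLockTable`, `Lease` and `try_acquire` are hypothetical names, and the table is an in-memory stand-in for a DynamoDB table holding a single lock item. Against DynamoDB, the check and the write inside `try_acquire` would be a single conditional write, which is what keeps two controllers from both winning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    owner: str         # controller that currently holds the lock
    expires_at: float  # time (in seconds) after which the lease is void


class InMemoryLockTable:
    """Hypothetical in-memory stand-in for a DynamoDB table with one lock item."""

    def __init__(self) -> None:
        self._lease: Optional[Lease] = None

    def try_acquire(self, owner: str, ttl: float, now: float) -> bool:
        """Take or renew the lease if it is free, expired, or already ours.

        Assumption: in DynamoDB this check-and-write would be one
        conditional write, so it is atomic across controllers. This
        in-memory version is not thread-safe.
        """
        current = self._lease
        if current is None or current.expires_at <= now or current.owner == owner:
            self._lease = Lease(owner=owner, expires_at=now + ttl)
            return True
        return False

    def leader(self, now: float) -> Optional[str]:
        """Return the current leader, or None if the lease has lapsed."""
        current = self._lease
        if current is not None and current.expires_at > now:
            return current.owner
        return None
```

Each stateless controller would call `try_acquire` on a timer with an interval shorter than the lease length. The one that succeeds acts as leader and keeps renewing; if it dies, the lease expires and another controller takes over on its next attempt.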
It seemed like retrofitting Kubernetes or Mesos with all these functions was not the right approach. To achieve the desired functionality, leveraging Azure Pack’s multi-tenancy model and building a native container management system was the better choice. My background in Microsoft Azure compute and network management systems was a big plus.
Duplo is tightly coupled with AWS, and that is the use case it addresses. If you are looking for a cloud-agnostic, single-tenant container management system without any of the functions described above, then Kubernetes and Mesos are better choices. Mesos is a well-established platform for Hadoop workloads.