DuploCloud is a DevOps robot: the subject-matter expert in digital form. It is not a DevOps tool that requires a cloud expert to operate; it is the cloud expert itself. The robot has a resume of over 500 skills, including AWS, Azure, Kubernetes, security, and IoT. It allows users to build applications through a simple declarative interface without having to deal with low-level infrastructure details like VPCs, security groups, IAM, and Jenkins setup. These constructs are still in play, but the DuploCloud software abstracts them away by auto-generating the configuration based on application needs.

DuploCloud is single-tenant software that installs in your cloud account. Users interface with the software via the browser UI and/or API calls, and all data and configuration stay within your cloud account. All configurations created and applied by the software are transparently available to be reviewed and edited in your cloud account. Essentially, the software auto-generates the same automation scripts that a human being would have written manually. Here is an explainer video about the product:

Pre-requisites

  • Sign-in Account: Duplo works off Google and Office 365 (O365) OAuth. To log in to Duplo you need either a Google or an Office 365 account.

  • Docker Knowledge: If you are deploying a Docker-based microservice using Duplo, it is assumed that you are familiar with Docker.

    • You should have a Docker image for your application that has been tested locally on your computer. Make sure it runs in detached mode (i.e., with the docker run -d option).
    • The image should have been pushed to your repository. If the repository is private, you will have to set the Docker Hub credentials in your Duplo account.
  • AWS SDK Familiarity: If you are using AWS services like S3, DynamoDB, SQS, and SNS, you should have a basic understanding of how these services are consumed. The interface to create these services via Duplo is declarative and self-explanatory. You do not need any access keys in your code to access these services. Use the AWS SDK constructor that takes no credentials, only a region, which must be us-west-2 unless specified otherwise by your enterprise administrator or during signup. See the sketch below.
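    For instance, a minimal sketch in Python with boto3 (the library choice is an assumption; any AWS SDK works the same way). The client is constructed with only a region, and the host's instance profile supplies temporary credentials:

      import boto3

      # No access keys in code or config files; the DuploCloud-provisioned host's
      # instance profile supplies temporary credentials automatically.
      sts = boto3.client("sts", region_name="us-west-2")
      print(sts.get_caller_identity()["Arn"])  # shows the tenant-scoped role in use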

Terminologies

  • Infrastructure: An Infrastructure maps 1:1 with a VPC/VNet and can be in any region. Each Infrastructure has a set of subnets spread across multiple availability zones. In AWS there is a NAT gateway for the private subnets.
  • Tenant or Project: A Tenant (or Project) is a sandbox and the unit of deployment. All resources in a given tenant are isolated from other tenants via security groups and IAM policies (optionally a VPC). For applications to be reachable outside the sandbox, a port mapping or ELB must be created.
  • User: An individual with a user ID. Each user can have access to one or more tenants/projects.
  • Host: An EC2 instance or VM. This is where your application runs.
  • Service: A Service is your application code packaged as a single Docker image and running as a set of one or more containers. It is specified as image name, replicas, environment variables, and volume mappings (if any). Duplo also allows running applications that are not packaged as Docker images.
  • ELB: A Service can be exposed outside of the tenant/project via an ELB and DNS name. An ELB is defined as service name + container port + external port + internal-or-internet-facing. Optionally, a wildcard certificate can be chosen for SSL termination. You can choose to make it internal, which exposes it only within your VPC to other applications.
  • DNS Name: By default, when a Service is exposed via an ELB, Duplo creates a friendly DNS name. A user can choose to edit this name. The domain name must have been configured in the system by the admin.
  • Docker Host or Fleet Host: If a host is marked as part of the fleet, DuploCloud will use it to deploy containers. If the user needs a host for development purposes, say a test machine, they would mark it as not part of the pool or fleet.

AWS Guide

Infrastructure

A completely new isolated environment can be created from Duplo. Go to Administrator --> Infrastructure and click the +Add button. A form opens that asks for basic information about the new environment. Once you submit the form, Duplo creates a new VPC with two subnets (private, public) in each availability zone, along with the necessary security groups, NAT gateway, internet gateway, route tables, etc. It also configures VPC peering with the master VPC that was initially configured in Duplo. Infrastructure creation takes around 10 minutes to complete. Once it is complete, a new plan with the same name as the infrastructure is automatically created and populated with the details of the newly created infrastructure. This plan can then be used to create tenants.

Tenant

Create a new Tenant from the already created plan. Give Duplo a minute to create the tenant-specific security group and instance profiles, which act as a security boundary.

Quick Start - A Docker Microservice

Deployment is a three-step process.

  1. Create a Host (VM)
  2. Deploy the Service (application)
  3. Expose the app using an LB

Watch this short video that shows how to package your code in a Docker image and deploy it:

  • Create a Host from the menu DevOps --> Hosts --> EC2 --> +Add button above the table. Choose the desired instance type. The available instance types are set by your administrator in the plan you choose in Duplo. If you are not using this host for hosting containers, set the pool to none. If you want a public IP for the host, select a public subnet in the list of availability zones.
  • Deploy a Service (application) from the menu DevOps --> Containers --> EKS/Native --> +Add button above the table. Give a name for your service (no spaces), the number of replicas, the Docker image, and volumes (if any). The number of replicas must be less than or equal to the number of hosts in the fleet.
  • Create an ELB from the menu DevOps --> Containers --> EKS/Native --> select the desired service from the table and click on the Load Balancer tab for configuration. The URL suffix you specify under Health Check is used by Duplo during a rolling upgrade: when a service image is changed, Duplo takes down one container replica at a time and brings up a new one; once it is running, the Duplo agent on the host calls the URL and expects a 200 OK. If it does not get a 200 OK, the upgrade is paused and the user needs to update the service with an image that fixes the issue.
  • The DNS name for the service will be present in the Services table. It takes about 5 to 10 minutes for the DNS name to start resolving.
  • Update a Service from the menu DevOps --> Containers --> EKS/Native --> click the Edit button under the Actions column of the Services table, or select the desired service from the table and click the Edit option under the Actions menu.

Virtual Machines

You can create an EC2 instance from Duplo. Go to DevOps --> Hosts --> EC2 --> +Add button above the table. Choose the desired instance type. The available instance types are set by your administrator in the plan you choose in Duplo. If you are not using this host for hosting containers, set the pool to none. If you want a public IP for the host, select a public subnet in the list of availability zones.

RDS Database

Create a MySQL, Microsoft SQL Server, PostgreSQL, or Aurora DB from the menu DevOps --> Database --> Relational Databases --> +Add button above the table.

Once the DB is created, which takes about 5 to 10 minutes, you can expand the created DB to fetch the endpoint and credentials. Use this endpoint and these credentials to connect to the DB from your application running in the EC2 instance. Note that the DB is accessible only from inside the EC2 instances (including the containers running in them) in the current tenant. The best way to pass the DB endpoint, DB name, and credentials to your application is through the environment variables of the Service, as in the sketch below.
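A minimal Python sketch of this pattern; the environment variable names and the pymysql driver are illustrative assumptions, not DuploCloud conventions:

    import os
    import pymysql  # assumed driver; use whichever driver matches your DB engine

    # Endpoint, database name and credentials are injected as Service env variables.
    conn = pymysql.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        database=os.environ["DB_NAME"],
    )
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        print(cur.fetchone())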

Elastic Cache

Create a Redis or Memcached instance from DevOps --> Database --> Ecache --> +Add button above the table. The best way to pass the cache endpoint to your application is through the environment variables of the Service, for example:
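A small Python sketch (the env variable name and the redis-py client are assumptions for illustration):

    import os
    import redis  # assumed client library (redis-py)

    # The cache endpoint is injected as a Service environment variable.
    cache = redis.Redis(host=os.environ["REDIS_HOST"], port=6379)
    cache.set("greeting", "hello")
    print(cache.get("greeting"))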

S3 Bucket

Create an S3 bucket from DevOps --> Storage --> S3 --> +Add button above the table. Provide the name and select S3Bucket from the dropdown. The permissions required to access the S3 bucket from VMs, Lambda functions, and containers are provisioned automatically through instance profiles, so no access key is required in the application code. Application code should use the IAM role/instance profile approach to connect to these services, i.e., the AWS SDK constructor that takes only a region, as in the sketch below. Uploading files and changing bucket permissions can be done directly in the AWS console, which can be accessed by clicking on the >_ icon; within this login the user has enough permissions to configure the bucket, but no access- or security-level permissions are granted.
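A hedged Python/boto3 sketch of bucket access from inside the tenant; the bucket and key names are illustrative:

    import boto3

    # Credentials come from the host's instance profile; only the region is supplied.
    s3 = boto3.client("s3", region_name="us-west-2")

    # Write and read back an object (names are placeholders for your bucket/key).
    s3.put_object(Bucket="my-app-assets", Key="hello.txt", Body=b"hello")
    body = s3.get_object(Bucket="my-app-assets", Key="hello.txt")["Body"].read()
    print(body)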

Elastic Search

Create an Elasticsearch domain from DevOps --> Analytics --> Elasticsearch --> +Add button above the table.

Containers

  • Built-In: Any Docker container can be deployed on the VMs. Go to DevOps --> Containers --> EKS/Native --> +Add button above the table. Give a name for your service (no spaces), the number of replicas, the Docker image, and volumes (if any). The number of replicas must be less than or equal to the number of hosts in the fleet. You can use AllocationTags to deploy the container on a specific set of hosts.
  • EKS: DuploCloud supports EKS and AKS out of the box. Kubernetes clusters are created during Infrastructure setup under Administrator --> Infrastructure. The cluster is created in the same VPC as the Infrastructure. Typically it takes around 10 minutes for an Infrastructure with an AKS/EKS cluster to be ready. Next we will walk through deploying an application within a Tenant in Kubernetes. The application will comprise a set of VMs and a deployment (pods) with an application load balancer. Pods can be deployed either through the Duplo UI or through kubectl (Helm charts).

    • Adding Tenants: We map each tenant to a namespace in Kubernetes. For example, if a tenant is called analytics in Duplo, then there will be a namespace called duploservices-analytics in Kubernetes. All components of an application within this tenant are placed in this namespace. Since nodes cannot be part of a namespace in Kubernetes, we set a label called "tenantname" on all the nodes that are launched within the tenant. For example, a node launched in the analytics tenant will have the label "tenantname: duploservices-analytics". Any pods launched through the Duplo UI will have the appropriate nodeSelector to tie the pod to the nodes within the tenant. If you are deploying directly via kubectl, make sure your deployment has the appropriate nodeSelector (see the sample deployment below).
    • Adding Hosts: Once the tenant is created, go to DevOps and switch to the particular tenant. Under the Hosts menu, start by creating new nodes (hosts). Make sure the value of the pool is set to "Eks Linux" (which is the default).
    • Docker registry credentials & Kubernetes secrets: These can be set under the DevOps --> Containers --> EKS/Native tabs. Docker registry credentials are passed to the Kubernetes cluster as kubernetes.io/dockerconfigjson. Other Kubernetes secrets can also be set here and referenced in your deployment later.
    • Adding Services: When we say services we do not mean Kubernetes services; we mean the Duplo term Service (see Terminologies at the top). You can deploy Duplo services by clicking the +Add button under the EKS/Native tab. Implicitly, Duplo converts this into either a Deployment or a StatefulSet: if there are no volume mappings it is a Deployment, otherwise it is a StatefulSet. Most configs, for example image, replicas, and environment variables, are self-explanatory; other advanced configuration can be provided under the Other K8s Config section. The content of this section is a one-to-one mapping of the Kubernetes API. All types of configuration for Deployments and StatefulSets are supported by putting the appropriate JSON under the Other K8s Config section. For example, to reference secrets via a config map, the JSON would look like the following.
      {
      	"Volumes": [
      		{
      			"name": "config-volume",
      			"configMap": {
      				"name": "game-config"
      			}
      		}
      	],
      	"VolumesMounts": [
      		{
      			"name": "config-volume",
      			"mountPath": "/etc/config"
      		}
      	]
      }
    • Kubectl token & config: Duplo provides a JIT token (15 minutes) to access the cluster with kubectl. You can get the temporary token and the Kubernetes cluster URL from the Infrastructure you created: select the Infrastructure --> EKS tab. Once you have the token and URL, run these commands locally:

      kubectl config --kubeconfig=config-demo set-cluster EKS_CLUSTER --server=[EKS_API_URL] --insecure-skip-tls-verify
      kubectl config --kubeconfig=config-demo set-credentials tempadmin --token=[TOKEN]
      kubectl config --kubeconfig=config-demo set-context EKS --cluster=EKS_CLUSTER --user=tempadmin --namespace=duploservices-[TENANTNAME]
      export KUBECONFIG=config-demo
      kubectl config use-context EKS

      With these commands you have configured kubectl to point to and access the Kubernetes cluster. You can now apply your deployment templates, for example: kubectl apply -f nginx.yaml
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: nginx-deployment-g
        labels:
          app: nginx-deployment-g
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: nginx-deployment-g
        template:
          metadata:
            labels:
              app: nginx-deployment-g
          spec:
            nodeSelector:
              tenantname: "duploservices-stgeast1"
            containers:
            - name: nginx
              image: nginx:latest
              ports:
              - containerPort: 80
      If you want a longer-duration token, you can create one on your own and secure it out of band from Duplo.
    • Services display: Once the deployment command has run successfully, go to the Services tab of the Tenant; you can see those deployments there. You can now attach load balancers to the services.
    • Load balancers, WAF and other workflows: These all remain the same. See above for how to create them.
  • ECS Fargate: Documentation TBD. Please contact DuploCloud team for assistance.

Secrets Manager

Create any secrets under DevOps --> App Integration --> Aws Secrets --> +Add button above the table.

DynamoDB

Create a DynamoDB table from DevOps --> Database --> DynamoDB tab --> +Add button above the table. The permissions required to access DynamoDB from VMs, Lambda functions, and containers are provisioned automatically through instance profiles, so no access key is required in the application code. Application code should use the IAM role/instance profile approach to connect to these services, i.e., the AWS SDK constructor that takes only a region (see the sketch below). More detailed configuration of the DynamoDB table can be done directly in the AWS console, which can be accessed by clicking on the Console >_ icon; within this login the user has enough permissions to configure the application-specific details of the table, but no access- or security-level permissions are granted.
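A minimal Python/boto3 sketch; the table name and key attribute are illustrative and assume a table whose partition key is order_id:

    import boto3

    # As with S3, the instance profile supplies credentials; only the region is passed.
    dynamodb = boto3.resource("dynamodb", region_name="us-west-2")
    table = dynamodb.Table("orders")  # illustrative table name

    table.put_item(Item={"order_id": "1001", "status": "created"})
    print(table.get_item(Key={"order_id": "1001"}).get("Item"))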

SQS Queue

Create an SQS queue from DevOps --> App Integration --> SQS tab --> +Add button above the table. Refer to DynamoDB above for details about permissions. A usage sketch follows.
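A hedged Python/boto3 sketch; the queue name is illustrative:

    import boto3

    sqs = boto3.client("sqs", region_name="us-west-2")
    # Look up the queue URL by name (or copy it from the Duplo UI).
    queue_url = sqs.get_queue_url(QueueName="my-queue")["QueueUrl"]

    sqs.send_message(QueueUrl=queue_url, MessageBody="hello from DuploCloud")
    messages = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1)
    print(messages.get("Messages", []))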

SNS Topic

Create an SNS topic from DevOps --> App Integration --> SNS tab --> +Add button above the table. Refer to DynamoDB above for details about permissions.

Kinesis Stream

Create a Kinesis stream from DevOps --> Analytics --> Kinesis tab --> +Add button above the table. Refer to DynamoDB above for details about permissions.

Lambda

Create a Lambda function from DevOps --> Serverless --> Lambda tab --> +Add button. Give a name for the Lambda function and the other values; this creates the Lambda function. Click on AWS Console to go to the AWS console for this function and test it. You can refer to the Serverless Webapp tutorial in this guide or to the AWS documentation.

  • Updating the Lambda function and configuration: To update the code for the Lambda function, create a new package with a different name and upload it to the S3 bucket again. Then select the Lambda function in the table and click Edit under the dropdown menu in the Actions column. Make sure the right S3 bucket is selected and provide the name of the function. To update function configuration such as timeout, memory, etc., use the Edit Configuration button.
  • Integrating with other resources: Duplo enables you to create a classic microservices-based architecture where your Lambda function can integrate with any other resource within your tenant, like S3, DynamoDB, RDS, or other Docker-based microservices. Duplo implicitly enables the Lambda function to communicate with other resources but blocks any communication outside the tenant (except the ELB).
  • Triggers and event sources: To set up a trigger or event source, the resource needs to be created via the Duplo console. Subsequently, one can set up a trigger directly from that resource to the Lambda function in the AWS console menu of the Lambda function. Resources can be S3 buckets, API Gateway, DynamoDB, SNS, etc. An example trigger via API Gateway is described in the Serverless Webapp tutorial.
  • Passing secrets: Passing secrets to a Lambda function can be done in much the same way as passing secrets to your Docker-based service, i.e., using environment variables. For example, you can create a SQL database from the Database --> RDS menu in Duplo, provide a username and password, and then give the same username and password in the Lambda menu. No secrets need to be stored anywhere outside, like a vault or git repo. A minimal handler sketch follows.
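    A minimal Python handler sketch of this pattern; the variable names are illustrative assumptions, not fixed DuploCloud names:

      import os

      # DB_HOST, DB_USER and DB_PASSWORD are illustrative names for environment
      # variables set on the Lambda function through the DuploCloud UI.
      DB_HOST = os.environ["DB_HOST"]
      DB_USER = os.environ["DB_USER"]
      DB_PASSWORD = os.environ["DB_PASSWORD"]

      def lambda_handler(event, context):
          # Connect to the RDS instance using the injected settings (driver omitted
          # here); no secrets live in the code or the deployment package.
          return {"statusCode": 200, "body": f"configured for {DB_HOST}"}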

API Gateway

An AWS API Gateway REST API can be created from the Duplo portal, which takes care of creating the security policies to make the API Gateway accessible to other resources (like Lambda functions) within the tenant. From the Duplo portal only create the REST API; all other configuration for the API (like defining methods and resources and pointing them to Lambda functions) should be done in the AWS console. The console for the API can be reached by selecting the API in the DevOps --> Networking --> Api Gateway table and then clicking on the Console button under the Actions menu. The Serverless Webapp tutorial shows the integration of API Gateway with a Lambda function.

Workspaces

Individual workspaces/devspaces can be created for users in Duplo. These workspaces have access to all the resources in the tenants the user has access to. To create a workspace go to Administrator --> Users --> select the user --> Devspace tab. Once the workspace is created, it can be accessed from the Code menu next to Diagnostics.

Service Discovery(CloudMap)

Documentation TBD. Please contact DuploCloud team for assistance.

IOT

Documentation TBD. Please contact DuploCloud team for assistance.

Cloud Watch

By default, the metrics for a resource are available on the Metrics page in DuploCloud. But some metrics need agents, like the AWS agent, to be installed on the system to collect the information. DuploCloud provides a way to automatically install these agents on all hosts whenever they are provisioned. For more information refer to https://portal.duplocloud.net/compliance/Implementation.html#agent-modules

WAF

Creation of a Web Application Firewall is a one-time process. Create a WAF in the AWS console, fetch the ID/ARN, and update the plan in DuploCloud. Once the plan has the WAF, it can be attached to the LB.

Duplo also provides a WAF dashboard through which you can analyze the traffic that is coming in and the requests that are blocked. The dashboard can be accessed from Security --> WAF.

Using AWS Console

Duplo users get AWS console access for advanced configuration of S3, DynamoDB, SQS, SNS topic, Kinesis stream, and API Gateway resources that were created through Duplo. The ELB and EC2 parts of the console are not supported yet. You can click on the AWS console icon to go directly to that resource in the AWS console, with permissions scoped to the current tenant/project. You do not have permissions to create new resources directly from the AWS console; you need to do that in Duplo. You can do all operations on resources already created via Duplo. For example, you can create an S3 bucket from the Duplo UI and go to the AWS console to add or remove files, set up static web hosting, etc. Similarly, create a DynamoDB table in Duplo and use the AWS console to add or remove entries in the table.

FAQ

  • How do I SSH into the host?
    Under each host you can click on Connection Details under the Actions dropdown, which provides the key file and instructions to SSH.
  • My host is Windows, how do I RDP?
    Under the host, click on Connection Details under the Actions dropdown; it provides the password and instructions to RDP.
  • How do I get into the container where my code is running?
    Under the service's Status tab, find the host where the container is running. Then SSH into the host (see instructions above), run "sudo docker ps", and get the ID of your container; you can tell which one is your container by using the image ID. Then run "sudo docker exec -it containerid bash". Don't forget the sudo in docker commands.
  • I cannot connect to my service URL, how do I debug?
    Make sure the DNS name is resolving by running "ping <dns-name>" on your computer. Then check whether the application is running by testing it from within the container: SSH into the host and connect to your Docker container using the docker exec command (see above). From inside the container, curl the application URL using the IP 127.0.0.1 and the port where the application is running, and confirm that this works. Then curl the same URL using the IP address of the container instead of 127.0.0.1; the IP address can be obtained by running the ifconfig command in the container.
    If the connection from within the container is fine, exit the container into the host. Now curl the same endpoint from the host, i.e., using the container IP and port. If this works, then under the ELB UI in Duplo note down the host port that Duplo created for the given container endpoint; this will be in the 10xxx range or the same as the container port. Now try connecting to the host IP and the DuploMappedHostPort just obtained. If this works as well but the service URL is still failing, contact your enterprise admin or duplolive-support@duplocloud.net.
  • What keys should I use in my application to connect to the AWS resources I have created in Duplo, like S3, DynamoDB, SQS, etc.?
    You don't need any. Use the AWS constructor that takes only the region as an argument (us-west-2). Duplo links your instance profile and the resources, so a host in Duplo already has access to the resources within the same tenant/project. Duplo AWS resources are reachable only from Duplo hosts in the same account. You CANNOT CONNECT TO ANY DUPLO AWS RESOURCE FROM YOUR LAPTOP.
  • What is a rolling upgrade and how do I enable it?
    If you have multiple replicas in your service, i.e., multiple containers, then when you update your service (say, change an image or env variable), Duplo makes the change one container at a time: it first brings down the first container and brings up a new one on that host with the updated config. If the new container fails to start, or the health check URL set by the user does not return 200 OK, then Duplo pauses the upgrade of the remaining containers. A user has to intervene by fixing the issue, typically by updating the service with a newer image that has a fix. If no health check URL is specified, Duplo only checks that the new container is running. To specify a health check, go to the ELB menu, where you will find the health check URL suffix.
  • I want to have multiple replicas of my MVC service; how do I make sure that only one of them runs migration?
    Enable a health check for your service and make sure that the API does not return 200 OK until the migration is done. Since Duplo waits for the health check to pass before upgrading the next replica, it is guaranteed that only one instance runs the migration (see the sketch after this FAQ).
  • One or more of my containers are showing a Pending state, how can I debug?
    If it is Pending when the desired state is Running, then the image may still be downloading, so wait a few minutes. If it has been more than five minutes, check the faults from the button below the table. Check that your image name is correct and does not have spaces. Image names are case sensitive, so they should be all lower case, i.e., the image name in Docker Hub should also be lower case.
    If the current state is Pending when the desired state is Delete, it means this container is the old version of the service. It is still running because the system is in a rolling upgrade and the previous replica has not been successfully upgraded yet. Check the faults in the other containers of this service for which the current state is Pending with a desired state of Running.
  • Some of my containers' status says "pending delete"; what does it mean?
    This means Duplo wants to remove these containers. The most common reason is that another replica of the same service was upgraded but is now not operational, so Duplo has blocked the upgrade. You might see the other replicas in a "Running" state, but it is possible that the health check is failing and that is why the rolling upgrade is blocked. To unblock, fix the service configuration (image, env, etc.) to an error-free state.
  • How do I create a host with a public IP?
    While creating the host, click on Show Advanced and select a public subnet in the list of availability zones.
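The rolling-upgrade and migration answers above can be made concrete with a small sketch. This is a hypothetical Python/Flask health endpoint (the framework, route, and variable are illustration-only assumptions, not part of DuploCloud); it returns a non-200 status until the migration finishes, which keeps Duplo from upgrading the next replica:

    from flask import Flask

    app = Flask(__name__)
    migration_done = False  # flip to True once the migration has finished

    @app.route("/healthz")
    def healthz():
        # Duplo's rolling upgrade waits for a 200 from the health check URL, so
        # returning 503 while migrating blocks the next replica from upgrading.
        if not migration_done:
            return "migrating", 503
        return "ok", 200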

AWS Use Cases

Docker Webapp

In this demo, we will deploy a simple Hello World NodeJS web app. Duplo pulls Docker images from Docker Hub. You can choose a public image or provide credentials to access your private repository. For this demo we will use a ready-made image available in Duplo's repository on Docker Hub.

  • Login to your DuploCloud console.
  • Select DevOps --> Hosts. A Host is the instance in which your Docker container will run. You should choose a host with appropriate processing capacity for your application.
  • Click on the +Add button to choose your host. Fill out the advanced options form if required and click Submit.
  • You should now see your Host present in the table. Please give it a moment to instantiate.
  • Next, we can create a Service. A Service is nothing but a container with a user-specified image and environment variables. Let's go ahead and click the +Add button to create a new service.
  • Name the service "Test-service". For this demo we will use the latest nodejs-hello image from Duplo's public Docker Hub repository. Fill in "duplocloud/nodejs-hello:latest" in the Docker Image field.
  • Enter the desired number of replicas you want in the swarm. Please note that each replica runs on an individual Host, so the number of replicas cannot exceed the number of Hosts. For this demo we will choose 1.
  • Fill in the desired environment variables; this is ideal for credentials or application-specific configuration.
  • Volume mapping is simple: just give the host path and container path as shown. Note that we highly recommend keeping the Hosts stateless and using S3 for static assets. We will keep this field empty for this demo.
  • Hit Create! Please wait a moment for the service to initialize.
  • Almost there. Since the hello-nodejs image serves on port 3000, we need to create a load balancer (LB) configuration to map the external port (LB) to the internal port (container).
  • Select Test-service and click Configure Load Balancer on the Load Balancer tab. Fill in the menu as shown below and click Submit. This also creates a DNS name that we can use.
  • Please wait ~5 minutes, as it can take a while for the DNS route table changes to be reflected.
  • Expand Test-service and copy the URL displayed under DNS. Hit the URL in the browser and you should see the Hello World app serving our welcome message.
  • Congratulations! You have just launched your first web service on Duplo!

S3 Backed Webapp

In the last tutorial we deployed a NodeJS-based web server and accessed it using the DNS name that Duplo created for us. In this tutorial we will take it a step further. We will create a text file with a message and upload it to S3. Next, we will modify our NodeJS application to access this file and display the message to every visitor. The purpose of this tutorial is to familiarize yourself with how easily you can manage resources on AWS using Duplo.

  • The code in the Docker container that we used in Part 1 can be found in this repo. It's a simple NodeJS web server. Clone this repository and make the following changes.
  • The current app.js should look like this.
  • Let's make the following changes.
  • In the above code we added a link variable that will point to the S3 bucket we will create shortly; the message variable holds the message that the text file in the S3 bucket will contain.
  • We can now go ahead and create a text file with the message "My Name is Pranay".
  • Now that we have this file, we'll upload it to S3. Log in to your Duplo console and navigate to DevOps --> Storage --> S3.
  • Click the +Add button to create an S3 bucket, provide the resource name, and click on Create.
  • This will create an S3 bucket.
  • Now click on the AWS Console button under the Actions menu; this takes us directly to the S3 bucket in the AWS console with limited access. Here we can upload the file and also modify permissions. Select Upload and choose the text file that we created earlier. Hit Next until the end and the file should be uploaded.
  • Click on the file name and copy the link for the file in the bucket.
  • Now let's assign this link to the link variable in app.js.
  • Almost there; all we need to do now is create a Dockerfile and push the image to Docker Hub so that Duplo can access it.
  • I have created a simple Dockerfile inspired by this blog from NodeJS.
  • We will now run the docker build command to build an image from this Dockerfile.
  • And then docker push to push it to Docker Hub.
  • All set! We will create a new service as we did in the last tutorial to run this new Docker container.
  • In the services tab, click on the plus sign and fill in the options as specified.
  • It's important to note that this service, along with the previous service, will run on the single host that we created in the previous tutorial! All we need to do is create a new load balancer for this service so that we can access it.
  • Click on the Configure Load Balancer button under the Load Balancer tab of the service and fill out the form.
  • As before, give it a minute and hit the freshly generated link in the browser. Voila!
  • So now we have two services, which we can access via two different load balancers, running on a single host (EC2 instance)! Due to this tight coupling with S3 we can do all sorts of fun things like storing static files as well as JavaScript in S3.

Serverless Webapp

Duplo makes deploying serverless applications a breeze. A step by step tutorial on how to deploy a Lambda application is at https://www.linkedin.com/pulse/deploy-existing-web-applications-serverless-using-aws-thiruvengadam/ Following is a video version of the tutorial:


Deployment is a three step process.

  • Create a Zip file: Generate a zip package of your Lambda code. The Lambda function code should be at the root of the package. If you are using a virtual env, all dependencies should be packaged. Refer to the AWS documentation at http://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html for detailed instructions on how to generate a package. We personally prefer using tools like Zappa and Serverless.
  • Create an S3 bucket and upload the Zip file: DevOps --> Storage --> S3 --> +Add button above the table. Give a name for your bucket or leave it blank and a name will be auto-generated. Then click on the AWS Console button to get into the AWS console for this S3 bucket and upload the zip file we just created.
  • Create the Lambda function: DevOps --> Serverless --> Lambda --> +Add button above the table. Give a name for the Lambda function and the other values; this creates the Lambda function. Click on AWS Console to go to the AWS console for this function and test it. Refer to the Lambda section for more information, or look at the tutorial above or the AWS documentation.

Passing Secrets

Passing secrets to a Docker-based service or Lambda function can be done using environment variables. For example, you can create a SQL database from the DevOps --> Database --> RDS tab in Duplo and provide a username and password, then give the same username and password in the Lambda menu. No secrets need to be stored anywhere outside, like a vault or git repo.

Monitoring

Metrics of the resources created/managed in duplo can be tracked under DevOps --> Metrics.

Faults/Alarms

Faults that happen in the system, be it infrastructure creation, container deployments, application health checks, or any triggered alarms, can be tracked in the DuploCloud portal under the Faults menu. You can look at tenant-specific faults under DevOps --> Faults or all the faults in the system under Administrator --> Faults. You can set AWS alarms for individual metrics: click on the bell icon on any of the metrics, and a form to create the alarm shows up. Provide the necessary information and create the alarm.

Alerting/Notification

In addition to notifying you about the faults, DuploCloud integrates with Sentry, which sends an email alert for each fault and also acts as a single place to look at all the events. To configure Sentry, go to https://sentry.io/ and create a new project under Projects. Then go to Settings --> Projects --> your project --> Client Keys (DSN). Click on Show Deprecated DSN, copy the deprecated DSN, and add it under DevOps --> Faults --> Update Sentry Config.

Central Logging

All activity in DuploCloud is logged and can be used for auditing. All the logs are saved in Elasticsearch and can be visualized in Kibana. The Kibana URL is available under Diagnostics. Elasticsearch and Kibana sit inside the VPC and cannot be accessed from outside; connect to the VPN to access the URL.

Backup & Restore

Backups of resources, be it DB snapshots or VM snapshots/AMIs, can be done directly under the resource in DuploCloud. Once the backups are taken, they will be available to restore on the next instance creation.

Disaster Recovery

Documentation TBD. Please contact DuploCloud team for assistance.

Cost Management

Usage cost for the resources can be viewed in the DuploCloud under Administrator --> Billing. You can explore the cost for any period and for any set of resources.

Bigdata/ETL

  • Use case:
    • Collecting data using various methods/sources
      • Web scraping: Selenium using headless Chrome/Firefox.
      • Web crawling: crawling static websites.
      • API-based data collection: could be a REST or GraphQL API.
      • Private internal customer data collected over various transactions.
      • Private external customer data collected over secured SFTP.
      • Data purchased from third parties.
      • Data from various social networks.
    • Correlating data from various sources
      • Clean up and process data and apply various statistical methods.
      • Correlate terabytes of data from various sources and make sense of the data.
      • Detect anomalies, summarize, bucketize, and perform various aggregations.
      • Attach metadata to enrich the data.
      • Create data for NLP and ML models for prediction of future events.
    • AI/ML pipelines and life-cycle management
      • Make data available to the data science team.
      • Train models and do continuous improvement trials and reinforcement learning.
      • Detect anomalies, bucketize data, summarize, and perform various aggregations.
      • Train NLP and ML models for prediction of future events based on history.
      • Keep a history of models, hyperparameters, and data at various stages.
  • Deploying an Apache Spark cluster

    In this tutorial we will create a Spark cluster with a Jupyter notebook. A typical use case is ETL jobs, for example reading parquet files from S3, processing them, and pushing reports to databases. The aim is to process GBs of data in a faster and more cost-effective way. The high-level steps are: (1) create three VMs, one each for the Spark master, the Spark worker, and the Jupyter notebook; (2) deploy Docker images for each of these on those VMs.

    • Create a host for the Spark master: Under DevOps --> Hosts --> EC2, click on the +Add button to create a host and click Show Advanced. Change the value of the instance type to 'm4.xlarge' and add an allocation tag 'sparkmaster'.
    • Create a host for the Spark worker: Create another host for the worker. Change the value of the instance type to 'm4.4xlarge', add an allocation tag 'sparkworker', and click on Submit. The number of workers depends on how much load you want to process, so add one host for each worker. They should all have the same allocation tag 'sparkworker'. You can add or remove workers and scale the Spark worker service up or down as many times as you want; we will see how in the next steps.
    • Create a host for Jupyter: Create one more host for the Jupyter notebook. Choose the instance type 'm4.4xlarge' and add the allocation tag 'jupyter'.
    • Create the Spark master Docker service: Go to Containers --> EKS/Native --> Services and create a new service. Under name choose 'sparkmaster', image 'duplocloud/anyservice:spark_v6', and add the allocation tag 'sparkmaster'. In the Docker Host config select Host network; by setting this you are making the container networking the same as the VM's, i.e., the container IP is the same as the VM's.
    • Create the Jupyter service: First we need the IP address of the Spark master. Click on the Spark master service, and on the right expand the container details and copy the host IP. Create another service: under name choose 'jupyter', image 'duplocloud/anyservice:spark_notebook_pyspark_scala_v4', add the allocation tag 'jupyter', and select Host network for Docker Host Config. Add the volume mapping "/home/ubuntu/jupyter:/home/jovyan/work" and provide the environment variables {"SPARK_OPTS":" --master spark://<>:7077 --driver-memory 20g --executor-memory 15g --executor-cores 4 "}, where you replace <> with the IP you just copied.
    • Create the Spark workers: Create another service named sparkworker, image 'duplocloud/anyservice:spark_v7', add the allocation tag 'sparkworker', and select Host network for Docker Host Config. Provide the environment variables { "node": "worker", "masterip": "<>"}, where you replace <> with the IP you just copied. Depending on how many worker hosts you have created, use the same number under replicas; that is how you scale up and down. At any time you can add new hosts with the allocation tag 'sparkworker' and then, under services, edit the sparkworker service and update the replicas.
    • Create Docker service shell access: Add/update shell access by clicking on the >_ icon. This gives you easy access to the container shell. You will need to wait about 5 minutes for the shell to be ready. Make sure you are connected to the VPN if you choose to launch the shell as internal only.
    • Open the Jupyter Docker shell: Select the Jupyter service and expand the container. Copy the host IP and then click on the >_ icon.
    • Get the Jupyter URL: Once you are inside the shell, enter 'jupyter notebook list' to get the URL along with the auth token. Replace the IP with the Jupyter IP you copied previously.
    • Open Jupyter: In your browser, open the Jupyter URL and you should see the UI.
    • Now you can use Jupyter to connect to data sources and destinations and run ETL jobs. Sources and destinations can include various SQL and NoSQL databases, S3, and various reporting tools, including big data and GPU-based deep learning.

  • Jupyter with an ETL Use-case

    In this tutorial we will create a Jupyter notebook and show some basic web scraping, preprocessing with Spark, exporting into a schema, running ETLs, joining multiple dataframes (parquets), and exporting reports to MySQL.

    • Scrape data from the Internet: Connect to a website and parse the HTML (using jsoup).
    • Download data: Extract the downloaded zip. This particular file is 8 GB in size and has 9 million records in CSV format.
    • Save to S3: Upload the data to AWS S3.
    • Connect to the Spark cluster: Also configure the session with the settings required to read and write from AWS S3.
    • Load data into the Spark cluster.
    • Define the schema.
    • Do data processing.
    • Spark SQL.
    • Spark SQL joins 20 GB of data from multiple sources.
    • Export reports to RDS for UI consumption: Generate various charts and graphs.

Security & Compliance

Refer to Compliance

Cloud Migration(SMS)

Documentation TBD. Please contact DuploCloud team for assistance.

Remote Working

Refer to Workspaces

Azure Guide

Subscription

The DuploCloud DevOps bot needs Azure subscription details in order to manage cloud resources. Go to Admin --> Credentials and click on the + button. A form opens that asks for the Azure subscription ID and secrets.

Infrastructure

A completely new isolated environment can be created from Duplo. Go to Admin --> Infrastructure and click on the + button. A form opens that asks for basic information about the new environment. Duplo will create a VNET with a default subnet and a default NSG. Creation of the Infrastructure takes around 10 minutes to complete; once it is complete it goes to the Ready state. A plan with the same name is created for this Infrastructure, and this plan can be used for creating tenants.

Key Vault

Documentation TBD. Please contact DuploCloud team for assistance.

Storage Account

Documentation TBD. Please contact DuploCloud team for assistance.

Tenant

Create a new Tenant from the already created plan. Give Duplo a minute to create the NSG for the tenant, which acts as a security boundary.

Quick Start

Switch to the Tenant we have created. Deployment is a three-step process: 1. Create a Host (VM), 2. Deploy a Service, 3. Expose the service using an LB.
  • Create a Host (VM) from the menu DevOps --> Hosts --> Azure --> Azure Hosts --> + sign above the table. Choose the desired instance type. The available instance types are set by your administrator in the plan you choose in Duplo. If you are not using this host for hosting containers, set the pool to none. If you want a public IP for the host, enable the public IP.
  • Deploy a Service (application) from the menu DevOps --> Services --> + sign above the table. Give a name for your service (no spaces), the number of replicas, the Docker image, and volumes (if any). The number of replicas must be less than or equal to the number of hosts in the fleet.
  • Expose using an LB: Create an LB from the menu DevOps --> Services --> + sign above the table for the load balancer configuration. The URL suffix you specify under Health Check is used by Duplo during a rolling upgrade: when a service image is changed, Duplo takes down one container replica at a time and brings up a new one; once it is running, the Duplo agent on the host calls the URL and expects a 200 OK. If it does not get a 200 OK, the upgrade is paused and the user needs to update the service with an image that fixes the issue.
  • The DNS name for the service will be present in the Services table. It takes about 5 to 10 minutes for the DNS name to start resolving.
  • Update a Service from the menu DevOps --> Services --> select the service from the table and click on Edit.

Virtual Machine Scale Sets

Documentation TBD. Please contact DuploCloud team for assistance.

SQL Managed Instances

Documentation TBD. Please contact DuploCloud team for assistance.

SQL Database

Documentation TBD. Please contact DuploCloud team for assistance.

Containers

Webapps

Documentation TBD. Please contact DuploCloud team for assistance.

Functions

Documentation TBD. Please contact DuploCloud team for assistance.

WAF

Documentation TBD. Please contact DuploCloud team for assistance.

Tables

Documentation TBD. Please contact DuploCloud team for assistance.

Azure Use Cases

Docker Web App

In this demo, we will deploy a simple Hello World NodeJS web app. Duplo pulls Docker images from Docker Hub. You can choose a public image or provide credentials to access your private repository. For this demo we will use a ready-made image available in Duplo's repository on Docker Hub.

  • Login to your DuploCloud console.
  • Select "Deployments" from the tab on the top left corner.
  • Select "Hosts" from the tabs. Select "Azure" Cloud on the left-side Section. A Host is the instance in which your Docker container will run. You can should choose a host with appropriate processing capacity for your application.
  • Click on the plus icon to choose your host. Fill out the advanced configuration form if required and click submit.
  • You should now see your Host present in the table. Please give it a moment to instantiate.
  • Next, we can create a Service. A Service is nothing but a container with user specified image and environment variables. Lets go ahead and click the plus icon to create a new service.
  • Name the service "test-service". For this demo we will use the latest, nodejs-hello image from Duplo's public Docker hub repository. Fill in "duplocloud/nodejs-hello:latest" in the Docker Image field.
  • Enter the desired number of replicas you want in the swarm. Please note that each replica runs in an individual Host, so the the number of replicas must equal the number of Hosts. For the sake of this demo we will choose 1.
  • Fill in the desired environment variables, this is ideal for credentials or application specific configurations.
  • Volume mapping is super easy, simply give the host path and container path as shown. Please note that we highly recommend keeping the Hosts stateless and using S3 for static assets. We will keep this field empty for this demo.
  • Hit Submit! Please wait a moment for the service to initialize.
  • Almost there. Since the hello-nodejs image serves on port 3000 we need to create a load balancer (LB) configuration to map external port (LB) to internal port (container).
  • Select the Test-service and click the plus icon on the load balancer configuration table. Fill the menu as shown below and click submit.
  • Please wait for ~5 minutes as it can take a while for the Load Balancer to get provisioned.

Serverless WebApps

Documentation TBD. Please contact DuploCloud team for assistance.

Metrics

Metrics of the resources created/managed in Duplo can be tracked under Deployments --> Metrics. You can also set AWS alarms/Azure alert rules for individual metrics: click on the bell icon on any of the metrics, and a form to create the alarm shows up. Provide the necessary information and create the alarm. If an AWS alarm or Azure alert is raised, Duplo fetches it and shows it in the faults. Refer to the Faults/Alerting section for more information.

Faults/Alerting

Faults that happen in the system, be it infrastructure creation, container deployments, or application health checks, can be tracked in the DuploCloud portal under the Faults menu. You can look at tenant-specific faults under Deployments --> Faults or all the faults in the system under Admin --> Faults. In addition to notifying you about the faults, DuploCloud integrates with Sentry, which sends an email alert for each fault and also acts as a single place to look at all the events. To configure Sentry, go to https://sentry.io/ and create a new project under Projects. Then go to Settings --> Projects --> your project --> Client Keys (DSN). Click on Show Deprecated DSN, copy the deprecated DSN, and add it under Deployments --> Metrics --> Sentry DSN.

Logs

All activity in DuploCloud is logged and can be used for auditing. All the logs are saved in Elasticsearch and can be visualized in Kibana. The Kibana URL is available under Diagnostics. Elasticsearch and Kibana sit inside the VPC and cannot be accessed from outside; connect to the VPN to access the URL.

Backups

Backups of resources, be it DB snapshots or VM snapshots/AMIs, can be done directly under the resource in DuploCloud.

Disaster recovery

Documentation TBD. Please contact DuploCloud team for assistance.

Security

Documentation TBD. Please contact DuploCloud team for assistance.

Cloud Migration

Documentation TBD. Please contact DuploCloud team for assistance.

Remote Working

Documentation TBD. Please contact DuploCloud team for assistance.

Additional Deployment functionalities

Advanced Functions

  • Allocation tags: By default Duplo spreads the container replicas across the hosts. With an allocation tag the user can pin a container to a set of hosts that have a specific tag. Click on the edit icon next to the allocation tag under the host. "AllocationTags" is already pre-populated as the key; configure the value of your choice, for example "highmemory;highcpu" or "serviceA", and click Save on the left. Then, when creating a service, set the allocation tag value to be a substring of the above tag, for example highmemory or serviceA. The allocation algorithm picks a host where the AllocationTag specified in the service is a substring of the AllocationTags of the host, as in the sketch below.

BYOH

If you have a host already running somewhere, in the cloud or on-premises, you can bring it to Duplo using the BYOH functionality and let Duplo manage the host, whether that means running containers on it or installing the compliance agents. To configure BYOH go to Deployments --> Hosts --> BYO-HOSTS. Click on Add and provide the name and IP address; in the Fleet type select Native App (if you don't want Duplo to manage the containers but only the compliance agents on the host), and provide the username and password/private key file. Make sure SSH access to the host is open from Duplo. About 5 minutes after the host is added, you can go to Security --> Agents --> select the agent and see that the agent on the host is in the Active state.

Administrator's Guide

Configuration Scopes

  • Base Configuration: This must be set up by the user outside Duplo and provided as an input to DuploCloud. This is a one-time configuration. It includes one VPC and one subnet with appropriate routes. It is recommended that you set up at least two availability zones, with one public and one private subnet in each availability zone. All resources (EC2 instances, RDS, ElastiCache, etc.) are placed in the private subnets by default. The ELB is placed in a public or private subnet based on the user's (tenant's) choice. DuploCloud provides a default CloudFormation template that can be used if desired.
  • Plan Configuration: Every tenant is part of one and only one plan. Configuration applied at the plan level applies to all tenants in the plan. A plan can be used to denote, say, a dev environment or a class of tenants (private/public facing). The following are the plan configurations. Except for the VPC and subnets, the rest of the parameters can be changed at will. Plan parameters include:
    {
    			 "Name": "devplan", /* Name of the plan */
    			 "Images": [ /* AMIs that are made available for the tenants */
    			  {
    			   "Name": "Duplo-Host-Docker-v1.17", /* User friendly Name of the AMI displayed to the user */
    			   "ImageId": "ami-0861ea68", /* AWS AMI ID */
    			   "OS": "Linux"
    			  },
    			  {
    			   "Name": "ubuntu-dev",
    			   "ImageId": "ami-d83af7b8",
    			   "OS": "Linux"
    			  },
    			  {
    			   "Name": "Windows-Docker-Fleet",
    			   "ImageId": "ami-1977d979",
    			   "OS": "Windows"
    			  },
    			  {
    			   "Name": "Unmanaged Ubuntu-16.04",
    			   "ImageId": "ami-5e63d13e",
    			   "OS": "Linux"
    			  }
    			 ],
    			 "Quotas": [ /* Limit of on the number of resources that be used by each tenant. If none is set for a given resource type, then there is no limit. */
    			  {
    			   "ResourceType": "ec2",
    			   "CumulativeCount": 2,
    			   "InstanceQuotas": [
    			    {
    			     "InstanceType": "t2.medium",
    			     "MetaData": "(2CPU, 4GB)",
    			     "Count": 2
    			    },
    			    {
    			     "InstanceType": "t2.small",
    			     "MetaData": "(1CPU, 2GB)",
    			     "Count": 2
    			    }
    			   ]
    			  },
    			  {
    			   "ResourceType": "rds",
    			   "CumulativeCount": 1,
    			   "InstanceQuotas": [
    			    {
    			     "InstanceType": "db.t2.small",
    			     "MetaData": "(1CPU, 1.7GB)",
    			     "Count": 1
    			    }
    			   ]
    			  },
    			  {
    			   "ResourceType": "ecache",
    			   "CumulativeCount": 1,
    			   "InstanceQuotas": [
    			    {
    			     "InstanceType": "cache.t2.small",
    			     "MetaData": "(1CPU, 1.7GB)",
    			     "Count": 1
    			    }
    			   ]
    			  },
    			  {
    			   "ResourceType": "s3",
    			   "CumulativeCount": 1,
    			   "InstanceQuotas": [
    			    {
    			     "InstanceType": "bucket",
    			     "MetaData": "bucket",
    			     "Count": 1
    			    }
    			   ]
    			  },
    			  {
    			   "ResourceType": "sqs",
    			   "CumulativeCount": 1,
    			   "InstanceQuotas": [
    			    {
    			     "InstanceType": "queue",
    			     "MetaData": "queue",
    			     "Count": 1
    			    }
    			   ]
    			  },
    			  {
    			   "ResourceType": "dynamodb",
    			   "CumulativeCount": 1,
    			   "InstanceQuotas": [
    			    {
    			     "InstanceType": "table",
    			     "MetaData": "table",
    			     "Count": 1
    			    }
    			   ]
    			  },
    			  {
    			   "ResourceType": "elb",
    			   "CumulativeCount": 2,
    			   "InstanceQuotas": [
    			    {
    			     "InstanceType": "elb",
    			     "MetaData": "elb",
    			     "Count": 1
    			    }
    			   ]
    			  },
    			  {
    			   "ResourceType": "sns",
    			   "CumulativeCount": 1,
    			   "InstanceQuotas": [
    			    {
    			     "InstanceType": "topic",
    			     "MetaData": "topic",
    			     "Count": 1
    			    }
    			   ]
    			  }
    			 ],
    			 "AwsConfig": {
    			  "VpcId": "vpc-1eafce79", /* VPC for this tenant */
    			  "AwsHostSg": "sg-c2d899ba", /* list of security groups separated by ;. All hosts will be placed in this security groups. */
    			  "AwsElbSg": "sg-b9d899c1", /* Security group in which the ELB will be placed. This applies to traffic into the extenal (VIP) of the elb. Internal Sg between hosts and ELB will be auo setup.  */
    			  "AwsPublicSubnet": "subnet-0066b449;subnet-24e25943", /* Subnets in which hpublic facing ELBs must be placed. Each subnet corresponds to an AZ. */
    			  "AwsInternalElbSubnet": "subnet-0e66b447;subnet-23e25944", /* Subnets in which internal resources like Ec2 hosts, rds ecaache etc will be placed. If you like hosts to be public facing then put the same public subnets here. */
    			  "AwsElastiCacheSubnetGrp": "duplo-cache-aww6gk5hsy4n",
    			  "AwsRdsSubnetGrp": "duplo-vpc-21-resources-dbsubnetduplo-z9fh3g59362z",
    			  "CommonPolicyARNs": "", /*Set of IAM policy ARNs that will be applied to all tenants. */
    			  "Domain name": "" /* Name of route 53  domain that will be used for tenant custom dns names*/
    			  "CertMgrResourceArns": "" /* List of certificate arns that wil be available to the tenant for SSL termination on the ELB. */
    			 },
    			 "UnrestrictedExtLB": false,
    			 "Capabilities": {
    			  "DisableNativeApps": false,
    			  "DisablePublicEps": false,
    			  "DisablePrivateElb": false,
    			  "AssignInstanceElasticIp": false,
    			  "BlockEbsOptimization": false,
    			  "DisableSumoIntegration": false,
    			  "DisableSignalFxIntegration": false,
    			  "EnableTenantExpiry": false
    			 }
    			}
    			
  • Tenant Configuration: These are the configurations covered earlier in the deployment guide.

Access Control

There are two types of roles: user and administrator. Each user can have access to one or more tenants, and each tenant can be accessed by one or more users. Administrators have access to all tenants plus administrative functions like plan configuration, the system dashboard, system faults, etc.



Accessing Cloud Server

VPN Access

Duplo provisions an OpenVPN server so users can connect to cloud resources like EC2 and RDS instances, Elasticsearch, ElastiCache, etc. The steps below walk through connecting to the VPN server using the OpenVPN client.

  • Set up the OpenVPN client
    Log in to the OpenVPN web portal to download the OpenVPN config and client.
  • Click on Profile --> Settings as shown in the image below.
  • Click the OpenVPN link, which will open a new tab.
  • Log in using the username and password provided.
  • Download the OpenVPN config and client.
  • You are now ready to connect to the VPN using the OpenVPN client (a command-line alternative is sketched below).
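
If you use the command-line client instead of the GUI, a minimal sketch looks like the following; the profile file name client.ovpn is an assumption and should match whatever you downloaded from the portal.

    # Connect to the VPN with the OpenVPN CLI (profile file name is a placeholder)
    sudo openvpn --config client.ovpn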

Sharing Encrypted DB

Sharing an unencrypted DB with other accounts is simple and straightforward, but sharing an encrypted DB is slightly more involved. Below are the steps to follow to share an encrypted DB.

1. Create a new customer managed key in AWS KMS. In the "Define key usage permissions" step, provide the account ID of the other account.
2. Once the key is created, go to RDS -> Snapshots, select the snapshot and click Copy Snapshot. Under encryption, change the master key to the key created above.
3. Once the copied snapshot is ready, share the snapshot with the other account as usual by clicking Share Snapshot and providing the other account ID.
4. Now go to the other AWS account -> RDS -> Shared with me. Select the shared snapshot, click Copy Snapshot again, and change the encryption key to an encryption key in that account.
5. On the copied snapshot, add a tag with Key "Name" and Value "duploservices-{tenantname}", where tenantname is the tenant in which you want to launch an RDS instance from this snapshot.
6. Go to the Duplo portal and select the tenant. Open RDS --> Add new DB (+ icon) --> give a name for the new DB --> in the snapshot field select the new snapshot --> enter the instance type and hit submit. In a few minutes the DB will be created with the data from the snapshot. Use the existing username and password to access the DB.
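
Steps 2 through 5 can also be scripted with the AWS CLI, as sketched below; the snapshot identifiers, KMS key ARN, region and account IDs are placeholders, not values from your environment.

    # Step 2: copy the snapshot, re-encrypting it with the new customer managed key
    aws rds copy-db-snapshot \
        --source-db-snapshot-identifier mydb-snapshot \
        --target-db-snapshot-identifier mydb-snapshot-shared \
        --kms-key-id arn:aws:kms:us-west-2:111111111111:key/REPLACE_ME

    # Step 3: share the copied snapshot with the other account
    aws rds modify-db-snapshot-attribute \
        --db-snapshot-identifier mydb-snapshot-shared \
        --attribute-name restore \
        --values-to-add 222222222222

    # Step 5 (run in the other account): tag the re-encrypted copy for the tenant
    aws rds add-tags-to-resource \
        --resource-name arn:aws:rds:us-west-2:222222222222:snapshot:mydb-snapshot-copy \
        --tags Key=Name,Value=duploservices-{tenantname}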

SSH EC2 Instance

Once you are connected to the VPN, you can use the private key to SSH into the EC2 instance.

  • Download the private key
    Select the Tenant and navigate to Deployments --> Hosts.
  • Click on the unlock icon as shown below.
  • Click on the key icon to download the private key. Change the permissions on the key file as shown and SSH into the EC2 instance (a command-line sketch follows this list).
  • Now you should be connected to the server.
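
A minimal sketch of those last two steps, assuming the downloaded key was saved as mykey.pem; the login user and private IP are placeholders and depend on the AMI and host.

    # Restrict the key file permissions so ssh will accept it
    chmod 400 mykey.pem

    # SSH to the host over the VPN (user name and IP are placeholders)
    ssh -i mykey.pem ec2-user@10.0.1.25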


Katkit - Duplo's CI/CD Component

Duplo provides a CI/CD framework that allows you to build, test and deploy your application from GitHub commits and PRs. We call it Katkit. Katkit is an arbitrary code execution engine that allows the user to run arbitrary code before and after deployment. Katkit follows the same notion of a "Tenant" or environment, thus tying together CI and CD. In other words, tests are run against the application in the same underlying AWS topology where the code runs, rather than in a separate fleet of servers that does not capture the application's interactions with AWS infrastructure like IAM, security groups, ELBs etc.

At a high level, Katkit functions as follows

  • A repository is linked to a Tenant.
  • The user chooses a Git commit to test and deploy.
  • Katkit deploys a service in the same tenant using a docker image provided by DuploCloud, which is essentially like a Jenkins worker and has the Katkit agent in it.
  • The Katkit agent in the CI container checks out the code at that commit inside the container. It then executes ci.sh from the checked-out code. Essentially, each build is a short-lived service that is removed once the ci.sh execution is over.
  • The user can put any arbitrary code in ci.sh.
  • Katkit allows, for a given run of a commit, the user to execute code in "phases", where in each phase Katkit repeats the above steps with different ENV variables set. The code inside ci.sh should read the env variables and perform the actions corresponding to each phase.
  • Katkit has a special phase called "deployment" in which it does not run ci.sh; instead it looks for the servicedescription.js file (details below) and replaces the docker image tag with the git commit sha. It is assumed that the user, before invoking the deployment phase, has gone through a prior phase in which a docker image tagged with the git commit sha was built. The sha is available as an ENV variable in every phase.

First Deployment

Before using CI/CD, the first deployment of the application needs to be done via the Duplo menus described above. Make sure that the application works as expected. Katkit is used only to upgrade container images and to run tests that can be written to run before and after the deployment.

Environments

In Duplo, a standard practice is to have a separate tenant for a given logical application for each deployment environment. For example, an application called taskrunner would be created as three tenants called d-taskrunner, b-taskrunner and p-taskrunner to represent the dev, beta and prod environments. In each tenant one can specify an arbitrary name for the env, say "DEV", in the menu Dashboard --> ENV. Katkit sets this string as an ENV variable when it runs the tests during CI/CD, so your test code can use this value to determine which tests should run in each env, or take any other env-specific action (a small sketch follows).
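
As an illustration, a fragment of ci.sh could branch on that environment string as sketched below; the variable name ENV_NAME and the test script names are placeholders, not the actual names Katkit uses.

    # Run different test suites depending on the environment string
    # (ENV_NAME is a placeholder for the variable Katkit sets)
    case "$ENV_NAME" in
        DEV)  ./run_unit_tests.sh ;;
        BETA) ./run_integration_tests.sh ;;
        PROD) ./run_smoke_tests.sh ;;
    esac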

Export Service Description

Service Description represents the topology of a service. It is a JSON file that is used by Katkit to upgrade the running service. Go to Deployment --> Services --> Export; this will give you a JSON file. Save it as servicedescription.js under a servicedescription folder that must exist at the root of your repository. In this file, search for "DockerImage": and change the image tag to the literal word <hubtag>, for example change "DockerImage": "nginx:latest" to "DockerImage": "nginx:<hubtag>" (see the snippet below). Remove the ExtraConfig and Replicas fields from the file; these contain env variables and replica counts that vary from one environment to another, so during deployment Katkit will retain what is already present in the currently running service.
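
For example, the relevant fragment of the exported file changes roughly as follows (all other exported fields stay as they are):

    /* before, as exported */
    "DockerImage": "nginx:latest",

    /* after, as checked into the repository */
    "DockerImage": "nginx:<hubtag>",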

Link Repository

Once the above steps have been performed, you can link your GitHub repository to your tenant. In addition to the repository name, you also need to specify the "Home Branch", which is the branch whose PRs will be monitored by Katkit for the user to run deployments. The same repository and branch combination can be linked to several tenants. If your repository has several services for different tenants, then each service can be represented by a separate folder at the root; this is the Folder Path field. Katkit looks for the service description file under /servicedescription/servicedescription.js. The same repository but different folders can also be used in different tenants, and the same tenant can also have different repositories.

Phases

Each CI/CD run comprises one or more phases. There are two types of phases: execution and deployment. In an execution phase Katkit invokes the ci.sh file from the repository; the difference between two execution phases is in the ENV variables, based on which the user code in ci.sh can operate differently (see the sketch below). There can be only one deployment phase in a given run. Katkit does not run ci.sh in the deployment phase; instead it looks for the servicedescription.js file (details above) and replaces the docker image tag <hubtag> with the git commit sha. It is assumed that the user, before invoking the deployment phase, has gone through a prior phase in which a docker image tagged with the git commit sha was built. The sha is available as an ENV variable in every phase.
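
A minimal ci.sh sketch along these lines is shown below; the variable names PHASE and COMMIT_SHA are placeholders for whatever Katkit actually exports, and the image repository name is illustrative.

    #!/bin/bash
    # Illustrative ci.sh: act differently per phase.
    # PHASE and COMMIT_SHA are placeholder variable names.
    set -e

    case "$PHASE" in
        build)
            # Build and push an image tagged with the git commit sha so the
            # deployment phase can substitute it for <hubtag>.
            docker build -t myrepo/myapp:"$COMMIT_SHA" .
            docker push myrepo/myapp:"$COMMIT_SHA"
            ;;
        test)
            ./run_tests.sh
            ;;
    esac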

Katkit Config

The above configuration customizations, like phases, ENV variables etc., can be saved in the repository in a config file called katkitconfig.js.

Advanced Functions

  • Bring-your-own-image: By default, all tenant CI/CD runs are executed in a docker image specified by the administrator. This image would typically have the most common packages for your organization, but a user can bring their own builder image and specify it instead. The image should include the Katkit agent, which can be copied from the default builder image.
  • Bring-your-own-fleet or Local Fleet: By default, Katkit runs the builder containers in a separate set of hosts, but the user can also choose to run the build container on the same tenant hosts that are being tested.