Up until now all of the Kubernetes work I've done has been local on my Ubuntu server, charger. But there is only so much compute & memory I can wring from that server, so in this blog post we will look at Azure Kubernetes Service (AKS) and use Azure to deploy both a private container registry and a Kubernetes cluster, with the intent of supporting massively parallel backtesting in Dask.
Getting started
First, if you haven't already, install the Azure CLI. On my Ubuntu 20.04 machine, for example, it's as simple as:
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
then "sudo az login" and follow the prompts.
Deploying to a private registry
Azure lets you host Docker images -- similar to Docker Hub -- via an Azure Container Registry (ACR). We'll create one for use with our Kubernetes cluster:
$ sudo az group create --name cloudwallResources --location eastus
$ sudo az acr create --resource-group cloudwallResources --name cloudwall --sku Basic
Confirm it's created:
$ sudo az acr list --resource-group cloudwallResources --query "[].{acrLoginServer:loginServer}" --output table
AcrLoginServer
---------------------------
cloudwall.azurecr.io
then log in:
$ sudo az acr login --name cloudwall
Login Succeeded
and tag & push:
$ sudo docker tag cloudwallcapital/serenity:2020.10.31-b58 cloudwall.azurecr.io/serenity:2020.10.31-b58
$ sudo docker push cloudwall.azurecr.io/serenity:2020.10.31-b58
We now have deployed our Serenity image into a private container registry, and can reference it going forward when creating Kubernetes resources.
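In a Kubernetes manifest that just means pointing the image field at the ACR login server. A minimal sketch of the relevant container spec fragment (the container name here is illustrative, not taken from serenity's actual dask.yaml):

containers:
- name: serenity
  image: cloudwall.azurecr.io/serenity:2020.10.31-b58

Because we'll attach the registry to the cluster with --attach-acr below, the kubelet gets pull rights on the ACR automatically, so no imagePullSecrets are needed.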
Create Kubernetes cluster
Until now I've relied on Ubuntu's microk8s to run a cluster locally, which works well but is limited by the 128 GB of RAM in the single server sitting next to my desk at home. Can we take the same Kubernetes resources and move them from that internal cloud to Azure? Let's give it a try.
First we create a cluster in the resource group created above and link it to the ACR instance. Note that we're overriding the node VM size to Standard_E2s_v3 -- a third-generation (v3) VM with 16 GB of RAM. If we don't, our Dask scheduler will be stuck pending forever, never getting scheduled onto a node, because it requests 10 GB of memory and the default node size doesn't have that much to offer.
$ sudo az aks create \
    --resource-group cloudwallResources \
    --name serenityCluster \
    --node-count 1 \
    --node-vm-size Standard_E2s_v3 \
    --generate-ssh-keys \
    --attach-acr cloudwall
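A single node is enough to prove the setup out, but the whole point is massively parallel backtesting. When a bigger run comes along, the node pool can be scaled out (and back in afterwards) with one command; a sketch, with an arbitrary count of 4:

$ sudo az aks scale \
    --resource-group cloudwallResources \
    --name serenityCluster \
    --node-count 4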
install kubectl:
$ sudo az aks install-cli
and configure the credentials so kubectl works:
$ sudo az aks get-credentials --resource-group cloudwallResources --name serenityCluster
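Behind the scenes, get-credentials merges a context for the cluster into ~/.kube/config (by default it's named after the cluster, so serenityCluster here). If you're juggling multiple clusters you can check which one kubectl will talk to:

$ sudo kubectl config current-context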
Now we should be able to use kubectl for AKS:
$ sudo kubectl get nodes
NAME                                STATUS   ROLES   AGE     VERSION
aks-nodepool1-42397265-vmss000000   Ready    agent   3m54s   v1.17.11
and microk8s.kubectl for local:
$ sudo microk8s.kubectl get nodes
NAME      STATUS   ROLES    AGE    VERSION
charger   Ready    <none>   237d   v1.19.0-34+1a52fbf0753680
Perfect! Now let's take one of our Kubernetes configs from serenity and deploy it into the cloud -- using the Dask cluster setup from the previous post:
$ sudo kubectl apply -f dev/shadows/serenity/kubernetes/dask.yaml
service/daskd-scheduler created
deployment.apps/daskd-scheduler created
deployment.apps/daskd-worker created
$ sudo kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
daskd-scheduler-65dd64d974-24lcd   1/1     Running   0          61s
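The scheduler is now running in Azure. To drive it from a Dask client outside the cluster we need the scheduler service's address; assuming dask.yaml exposes Dask's standard scheduler port, 8786, the service's IP can be looked up with:

$ sudo kubectl get service daskd-scheduler

and passed to dask.distributed's Client as tcp://<ip>:8786.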
Saving money
High-memory nodes are expensive, so let's set up the start/stop preview feature so we can stop the cluster when it's idle:
$ sudo az extension add --name aks-preview
The installed extension 'aks-preview' is in preview.
$ sudo az feature register --namespace "Microsoft.ContainerService" --name "StartStopPreview"
Check that registration is complete:
$ sudo az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/StartStopPreview')].{Name:name,State:properties.state}"
Name                                         State
-------------------------------------------  -----------
Microsoft.ContainerService/StartStopPreview  Registering
and once the state flips to Registered, finish by refreshing the provider registration:
$ sudo az provider register --namespace Microsoft.ContainerService
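Provider registration can itself take a few minutes; you can poll it until it reports Registered:

$ sudo az provider show -n Microsoft.ContainerService --query registrationState -o tsv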
Now we can stop our cluster until we need it again -- saving about $91/month in this case:
$ sudo az aks stop --name serenityCluster --resource-group cloudwallResources
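When it's time for the next run, the matching start command spins it back up:

$ sudo az aks start --name serenityCluster --resource-group cloudwallResources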