In Research-to-Go with Docker we used Docker and Docker Compose to create a Serenity research environment. Unfortunately, all good things must come to an end: the Docker Compose setup described in that article no longer boots, because Jupyter Lab has moved on to the 2.x series and the Git extension has not kept up. I decided to take the chance to redo it using a custom Docker image that's fully under my control, rather than the stock Jupyter Docker image, and to switch to Kubernetes to manage it at runtime.
Updating the Dockerfile
The first change is to the Dockerfile. The base image changes from the Jupyter Docker stack to the more basic python:3.8-slim-buster image used by the rest of Serenity. Some steps that the stock image used to handle for us also get pushed down into our own Dockerfile, e.g. creation of a "serenity" user and home directory:
# Same slim base image as the rest of Serenity
FROM python:3.8-slim-buster

# Git for the Git extension; Node.js and npm to build the Lab extensions
RUN apt update && apt install --yes git nodejs npm

COPY $PWD/requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt && \
    jupyter labextension install @jupyterlab/git --no-build && \
    jupyter labextension install @jupyterlab/plotly-extension --no-build && \
    jupyter labextension install nbdime-jupyterlab --no-build && \
    jupyter serverextension enable --py jupyterlab_git && \
    jupyter lab build

# Unprivileged user with its own home directory
RUN groupadd -r serenity && useradd --create-home --no-log-init -r -g serenity serenity
USER serenity

# Clone Serenity under the home directory so it matches --notebook-dir below
RUN mkdir -p /home/serenity/dev/shadows
WORKDIR /home/serenity/dev/shadows
RUN git clone https://github.com/cloudwall/serenity.git

CMD ["jupyter", "lab", "--ip=*", "--port=8888", "--no-browser", "--notebook-dir=/home/serenity/dev/shadows/serenity"]
With this done we can build, tag and push:
sudo docker build -t serenity-jupyterlab:2020.04.12-2 .
sudo docker tag serenity-jupyterlab:2020.04.12-2 \
    cloudwallcapital/serenity-jupyterlab:2020.04.12-2
sudo docker push cloudwallcapital/serenity-jupyterlab:2020.04.12-2
This creates a new tag in the cloudwallcapital/serenity-jupyterlab repository.
You can now run it with:
$ sudo docker run cloudwallcapital/serenity-jupyterlab:2020.04.12-2
[I 14:58:26.081 LabApp] Writing notebook server cookie secret to /home/serenity/.local/share/jupyter/runtime/notebook_cookie_secret
[W 14:58:26.273 LabApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[I 14:58:26.827 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.8/site-packages/jupyterlab
[I 14:58:26.827 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab
[I 14:58:27.452 LabApp] Serving notebooks from local directory: /home/serenity/dev/shadows/serenity
[I 14:58:27.452 LabApp] The Jupyter Notebook is running at:
[I 14:58:27.452 LabApp] http://f2c0257c98ba:8888/?token=0a03d9e4153a59a77a76d3ef41b2b1b0d8bb2dbda6556793
[I 14:58:27.452 LabApp] or http://127.0.0.1:8888/?token=0a03d9e4153a59a77a76d3ef41b2b1b0d8bb2dbda6556793
[I 14:58:27.452 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 14:58:27.454 LabApp]
To access the notebook, open this file in a browser:
Or copy and paste one of these URLs:
This confirms that everything is working before we move on to the Kubernetes step.
Long-running Jupyter with Kubernetes Deployment
The Kubernetes piece requires three components:
- a storage mapping for read-only access to the Behemoth tick store
- a Deployment for the container itself
- a NodePort Service to expose the container to the outside world
We start with storage, which, unlike the recorder's, uses the ReadOnlyMany access mode, and create a claim for the serenity-jupyterlab app to use this storage:
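As a rough sketch, a read-only volume and claim along these lines would do the job. The file name, the object names (behemoth-pv-ro, behemoth-claim-ro), the hostPath location, the storage class and the size are all assumptions for illustration, not the actual Serenity configuration:

```shell
# Write a hypothetical PersistentVolume + PersistentVolumeClaim manifest.
# All names, the host path and the size are illustrative assumptions.
cat > behemoth-storage.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: behemoth-pv-ro          # hypothetical name
spec:
  capacity:
    storage: 100Gi              # assumed size
  accessModes:
    - ReadOnlyMany
  storageClassName: manual      # assumed storage class
  hostPath:
    path: /mnt/behemoth         # hypothetical location of the tick store
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: behemoth-claim-ro       # hypothetical name
spec:
  storageClassName: manual
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: 100Gi
EOF
```

The resulting file can then be applied with sudo microk8s.kubectl apply -f behemoth-storage.yaml.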
The Deployment runs a single instance with a specific version of the Labs software to ensure stability, having learned our lessons with the previous Jupyter stack. We also reference the behemoth-volume claim and mount it at /behemoth for use:
- name: serenity-jupyterlab
- containerPort: 8888
- mountPath: /behemoth
- name: behemoth-volume
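Put together, a Deployment of this shape might look roughly like the sketch below. The container name, port, mount path, volume name and image tag come from this article; the file name, the app label and the claim name (behemoth-claim-ro) are assumptions made for illustration:

```shell
# Write a hypothetical Deployment manifest for the Jupyter Lab container.
# The label and claimName are illustrative assumptions.
cat > serenity-jupyterlab-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: serenity-jupyterlab
spec:
  replicas: 1                        # single instance
  selector:
    matchLabels:
      app: serenity-jupyterlab       # assumed label
  template:
    metadata:
      labels:
        app: serenity-jupyterlab
    spec:
      containers:
        - name: serenity-jupyterlab
          # Pinned tag rather than "latest", for stability
          image: cloudwallcapital/serenity-jupyterlab:2020.04.12-2
          ports:
            - containerPort: 8888
          volumeMounts:
            - mountPath: /behemoth
              name: behemoth-volume
              readOnly: true
      volumes:
        - name: behemoth-volume
          persistentVolumeClaim:
            claimName: behemoth-claim-ro   # hypothetical claim name
            readOnly: true
EOF
```

Pinning the image to an explicit tag means a rebuild of the image never changes what is running until the manifest itself is edited and re-applied.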
Finally, we need to expose our Jupyter Lab instance to the world, mapping the Pod's internal port 8888 listener to node port 30888:
- port: 8888
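For reference, a NodePort Service doing this mapping could be sketched as follows. The ports 8888 and 30888 are the ones described above; the file name, the Service name and the app label used in the selector are assumptions, and the selector has to match whatever labels the Deployment's Pods actually carry:

```shell
# Write a hypothetical NodePort Service manifest.
# The Service name and selector label are illustrative assumptions.
cat > serenity-jupyterlab-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: serenity-jupyterlab
spec:
  type: NodePort
  selector:
    app: serenity-jupyterlab   # must match the Pod labels
  ports:
    - port: 8888               # Service port inside the cluster
      targetPort: 8888         # port the Jupyter Lab container listens on
      nodePort: 30888          # port opened on every node
EOF
```

Kubernetes only accepts nodePort values from its configured node port range, which defaults to 30000-32767, so 30888 fits without extra cluster configuration.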
We can use microk8s.kubectl to apply all three of these YAML files, and the net result should look something like this:
$ sudo microk8s.kubectl get pods
NAME                                  READY   STATUS    RESTARTS   AGE
binance-recorder-67954bbdf8-ppjxc     1/1     Running   0          15h
coinbase-recorder-5dc6bc495b-p7khr    1/1     Running   0          15h
phemex-recorder-69b84bc9bc-ww56c      1/1     Running   0          15h
scheduler-69fb554d55-bl46l            1/1     Running   0          15h
serenity-jupyterlab-fdb94c444-vpftc   1/1     Running   0          15h
timescaledb-648d59b7f5-gnd5w          1/1     Running   1          15h
We'll need to get the token, but we can grab that quickly with the logs command:
$ sudo microk8s.kubectl logs serenity-jupyterlab-fdb94c444-vpftc | egrep "127.0.0.1:8888"
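If you only want the token itself, a small helper can pull it out of the log text. This is a sketch that assumes the log format shown earlier, where the token is a hexadecimal string following "token="; the function name extract_token and the node-address placeholder are hypothetical:

```shell
# Print the first Jupyter login token found in log text read from stdin.
# Assumes the token is a hex string following "token=", as in the log above.
extract_token() {
  grep -o 'token=[0-9a-f]*' | head -n 1 | cut -d= -f2
}

# Try it on a log line in the format shown earlier (no cluster required).
sample='[I 14:58:27.452 LabApp]  or http://127.0.0.1:8888/?token=0a03d9e4153a59a77a76d3ef41b2b1b0d8bb2dbda6556793'
token="$(printf '%s\n' "$sample" | extract_token)"

# Build the URL to open from outside the cluster, via the node port.
echo "http://<node-address>:30888/?token=${token}"
```

Against the live Pod, the same function can sit at the end of the pipe: sudo microk8s.kubectl logs serenity-jupyterlab-fdb94c444-vpftc | extract_token.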
And we are back up and running, this time with an always-running Jupyter Lab and a stable deployment mechanism.