I wrote this because I struggled to install ML libraries on an Apple M1 chip, but the setup works on any other CPU and OS. Here you will learn the following:
A basic Dockerfile that helps you set up a development environment for Machine Learning projects very quickly
How to set up a VSCode development environment that runs code in a container
Before reading further, make sure you have installed Docker and VSCode on your machine. If you are using a Mac, Docker Desktop includes docker-compose and you can download it here.
A simple overview of Docker
Docker is free software that helps you deliver software packages without worrying about the underlying hardware. It can also be thought of as a PaaS (Platform as a Service, find more about it here) that uses OS-level virtualization to run a software package as a Docker container. Containers should not be confused with virtual machines: a virtual machine virtualizes the underlying hardware and boots its own kernel on top of it, whereas Docker containers share the host kernel.
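To see the shared kernel for yourself, the small sketch below prints the kernel release reported by the host and by a container. It assumes the public ubuntu:latest image can be pulled, and it skips the container part when Docker is not available. On a Linux host the two values should match; on macOS, Docker Desktop runs containers inside a small Linux VM, so the container reports that VM's kernel rather than the Mac's.

```shell
# Kernel release as seen by the host
host_kernel="$(uname -r)"
echo "host kernel:      $host_kernel"

# Kernel release as seen from inside a throwaway container
if command -v docker >/dev/null 2>&1; then
  container_kernel="$(docker run --rm ubuntu:latest uname -r 2>/dev/null || echo unavailable)"
  echo "container kernel: $container_kernel"
else
  echo "docker not found; skipping the container check"
fi
```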
If I wanted to install common ML libraries without using Conda/Miniconda it would be a pain! So, this Dockerfile will help you set up your development environment easily and quickly. It will do the following:
Use the latest Ubuntu version for development
Create a new user for development and add it to the sudo group
Enable the SSH server
Copy all the code into the Docker image
You should generate a private and public key pair using ssh-keygen and put it in a folder named ssh_keys next to the Dockerfile. This way there is no need to enter a password whenever you log in to the container.
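A minimal sketch of that step, run from the folder that holds the Dockerfile. The file name id_rsa matches the id_rsa.pub that the Dockerfile copies; the 4096-bit RSA key and the empty passphrase are assumptions, chosen so that logging in never prompts.

```shell
# Create the folder the Dockerfile expects next to it
mkdir -p ssh_keys

# Generate the key pair only once; ssh-keygen would otherwise ask before overwriting.
# -N "" sets an empty passphrase (an assumption, so that login is prompt-free).
if [ ! -f ssh_keys/id_rsa ]; then
  ssh-keygen -q -t rsa -b 4096 -N "" -f ssh_keys/id_rsa
fi

# The private key must not be readable by other users, or ssh will refuse it
chmod 600 ssh_keys/id_rsa
ls -l ssh_keys
```

Only ssh_keys/id_rsa.pub ends up in the image; the private key stays on the host.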
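# Base image: the latest Ubuntu release
FROM ubuntu:latest
# Runtime directory required by the SSH daemon
RUN mkdir -p /var/run/sshd
# sudo, plus the SSH server and pip that the steps below rely on
RUN apt-get update && apt-get install -y sudo openssh-server python3-pip
RUN pip3 -q install pip --upgrade
# Development user (password: development) with sudo rights
RUN useradd -rm -d /home/development -s /bin/bash development && \
    echo "development:development" | chpasswd && adduser development sudo
# Authorize the public key from ssh_keys/ for password-less login
RUN mkdir /home/development/.ssh/ && \
    chmod 700 /home/development/.ssh
COPY ssh_keys/id_rsa.pub /home/development/.ssh/authorized_keys
RUN chown development:development -R /home/development/.ssh && \
    chmod 600 /home/development/.ssh/authorized_keys
# Install the Python libraries
COPY requirements.txt /home/development/
RUN chown development:development -R /home/development/
RUN pip3 install -r /home/development/requirements.txt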
The requirements.txt file lists the commonly used ML libraries to install.
Docker-compose makes it easy to run multiple containers at once! Although we don’t strictly need it here, I usually use docker-compose to run a container. The docker-compose file will do the following:
Build the image
Mount a folder shared with the host
Run the SSH daemon and Jupyter Lab
You can mount your code inside the Docker container, so there is no need to copy it while building the Docker image.
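The three steps above can be sketched as a small compose file, written here from the shell so it lands next to the Dockerfile. Host ports 20 and 8800 are the ones used later in this post; the service name ml-dev, the container-side ports 22 and 8888, the mount point /home/development/code and the exact start command are assumptions for illustration. It also assumes jupyterlab is listed in requirements.txt.

```shell
# Write a minimal docker-compose.yml next to the Dockerfile.
# The quoted 'EOF' keeps the shell from expanding anything inside the file.
cat > docker-compose.yml <<'EOF'
services:
  ml-dev:                       # hypothetical service name
    build: .                    # build the image from the Dockerfile in this folder
    ports:
      - "20:22"                 # host port 20 -> sshd (container port 22 is assumed)
      - "8800:8888"             # host port 8800 -> Jupyter Lab (8888 is assumed)
    volumes:
      - ./:/home/development/code   # hypothetical mount point for your code
    # Start sshd (it goes to the background), then keep Jupyter Lab in the foreground
    command: >
      bash -c "/usr/sbin/sshd &&
      jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root"
EOF

echo "wrote $(pwd)/docker-compose.yml"
```

Running docker compose up --build -d from the same folder (or docker-compose up --build -d with the older standalone binary) should then build the image and start the container.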
Sometimes you don’t want to use a web browser, or you want to work directly on the files inside the container, which is why SSH was enabled in the Dockerfile. To connect to the container, add the following to the SSH configuration file on your host:
IdentityFile [Full path to private key]
The config file is usually in /home/[username]/.ssh/; if it does not exist, create it, or add the configuration to /etc/ssh/ssh_config. To connect to the remote environment from inside VSCode, click the green button in the bottom-left corner.
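As a rough sketch, a complete host entry could be appended to the per-user config like this. The user, host and port come from the ssh command used in this post; the alias ml-dev is hypothetical, and the key path assumes you run the snippet from the folder that contains ssh_keys.

```shell
# Make sure the per-user SSH folder exists with the permissions ssh expects
mkdir -p "$HOME/.ssh"
chmod 700 "$HOME/.ssh"

# Append the entry; the unquoted EOF lets $PWD expand to the full key path
cat >> "$HOME/.ssh/config" <<EOF

# "ml-dev" is a hypothetical alias; pick any name you like
Host ml-dev
    HostName localhost
    User development
    Port 20
    IdentityFile "$PWD/ssh_keys/id_rsa"
EOF
chmod 600 "$HOME/.ssh/config"
```

With this entry in place, ssh ml-dev is enough from a terminal, and the same alias should show up when VSCode lists the available SSH hosts.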
To connect to the container from a terminal, use:
ssh development@localhost -p 20
The password for the development user is development. To access Jupyter Lab, open http://localhost:8800/ in your browser. It will ask for a token, which you can get from the container logs with docker logs [Container ID].
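If the logs are long, a small helper can pull the token out for you. This is only a sketch: the function name extract_jupyter_token is made up, and it assumes Jupyter prints its access URL with a token=<hex string> query parameter in the logs.

```shell
# Read log text on stdin and print the first Jupyter token found in it
extract_jupyter_token() {
  grep -o 'token=[0-9a-f]*' | head -n 1 | cut -d= -f2
}

# Usage (replace the placeholder with your own container ID):
#   docker logs [Container ID] 2>&1 | extract_jupyter_token
```

The 2>&1 matters because Jupyter writes its startup messages to stderr.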