Docker for ML on Mac

I wrote this after struggling to install ML libraries on an Apple M1 chip, but the setup can be used on any other CPU and OS. Here you will learn the following:

  • A basic Dockerfile that helps you set up your development environment for Machine Learning projects very fast

  • How to set up a VSCode development environment to run code in a container

Before reading further, make sure you have installed Docker and VSCode on your machine. If you are using a Mac, Docker Desktop includes docker-compose and you can download it from the Docker website.

A simple overview of Docker

Docker is free software that helps you deliver software packages without worrying about the underlying hardware. It can also be thought of as a PaaS (Platform as a Service) product that uses OS-level virtualization to run each software package as a Docker container. Containers should not be confused with virtual machines: a virtual machine emulates its own hardware and boots its own kernel on top of it, whereas Docker containers share the host kernel.

Dockerfile

If I wanted to install common ML libraries without using Conda/Miniconda it would be a pain! So, this Dockerfile will help you set up your development environment easily and quickly. It will do the following:

  • Use the latest Ubuntu image for development

  • Create a new user for development and add it to the sudo group

  • Enable the SSH server

  • Copy requirements.txt into the Docker image and install the libraries it lists

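You should generate a private and public key pair using ssh-keygen (for example, ssh-keygen -t rsa -f ssh_keys/id_rsa) and put both files inside a folder named ssh_keys next to the Dockerfile. The Dockerfile copies ssh_keys/id_rsa.pub into the image, so the key must have that file name. This way there is no need to enter a password whenever you want to log in to the container.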

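FROM ubuntu:latest

# Avoid interactive prompts from apt during the build
ARG DEBIAN_FRONTEND=noninteractive

# sshd needs this directory to exist when it starts
RUN mkdir -p /var/run/sshd

RUN apt-get update && apt-get install -y \
    sudo \
    nano \
    build-essential \
    openssh-server \
    python3 \
    python3-pip \
    python3-dev \
    git

# Note: newer Ubuntu releases mark the system Python as externally managed (PEP 668),
# so system-wide pip installs may fail there; if this step errors, pin an older tag
# such as ubuntu:22.04 or install into a virtual environment instead.
RUN pip3 -q install pip --upgrade

# Create the "development" user (password "development") and add it to sudo
RUN useradd -rm -d /home/development -s /bin/bash development && \
    echo "development:development" | chpasswd && adduser development sudo

# Install the public key so SSH logins do not ask for a password
RUN mkdir /home/development/.ssh/ && \
    chmod 700 /home/development/.ssh
COPY ssh_keys/id_rsa.pub /home/development/.ssh/authorized_keys
RUN chown development:development -R /home/development/.ssh && \
    chmod 600 /home/development/.ssh/authorized_keys

# Install the ML libraries listed in requirements.txt
WORKDIR /home/development
COPY requirements.txt /home/development
RUN chown development:development -R /home/development/
RUN pip3 install -r requirements.txt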

The requirements.txt file contains the following commonly used ML libraries:

pandas
numpy
scikit-learn
scipy
matplotlib
dtale
jupyter
jupyterlab
python-dateutil
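
Once the image is built, it can be handy to confirm that everything from requirements.txt really ended up in the container. The sketch below is one possible way to do that, not part of the original setup: it looks up each distribution name with Python's standard importlib.metadata module and reports the installed version. The file name check_env.py and the helper installed_versions are names I made up for illustration.

```python
from importlib import metadata

# Distribution names exactly as they appear in requirements.txt
REQUIRED = [
    "pandas", "numpy", "scikit-learn", "scipy", "matplotlib",
    "dtale", "jupyter", "jupyterlab", "python-dateutil",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print one line per library so a missing install is easy to spot
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:<16} {version or 'MISSING'}")
```

Saved as check_env.py in the mounted folder, it could be run inside the container with python3 check_env.py. Looking up distribution names rather than importing modules avoids having to know that, for example, scikit-learn is imported as sklearn.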

Docker-compose

Docker-compose makes it easy to run multiple containers at once! Although we don’t strictly need it here, I usually use docker-compose even to run a single container. The docker-compose file will do the following:

  • Build the image

  • Bind ports

  • Mount a folder shared with the host

  • Run SSH daemon and jupyter lab

You can mount your code inside the Docker container, so there is no need to copy it while building the image.

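version: "3.3"
services:
  stockmarket:
    image: "development:v1"
    build: .
    ports:
      # format is host:container
      - "20:22"        # SSH
      - "40000:40000"  # D-Tale (its default port)
      - "8800:8888"    # JupyterLab
    volumes:
      # ../share_files on the host appears as /mount in the container
      - "../share_files:/mount"
    command: >
      bash -c "/usr/sbin/sshd
      && jupyter-lab --ip=0.0.0.0 --no-browser  --allow-root"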
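
After starting the container, a quick way to see whether the port bindings took effect is to try a TCP connection to each host port. The following is only a rough sketch under my own assumptions: port_is_open is a hypothetical helper, and the port numbers are the host-side values from the compose file above. Keep in mind that an open port only shows that something accepts connections on the host side; the service inside the container may still be starting.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False


if __name__ == "__main__":
    # Host-side ports from the compose file: SSH, 40000 and JupyterLab
    for port in (20, 40000, 8800):
        state = "open" if port_is_open("localhost", port) else "closed"
        print(f"localhost:{port} is {state}")
```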

Setup VSCode development

Sometimes you don’t want to use a web browser, or you want to work directly on the files inside the container. That is why the SSH server was enabled in the Dockerfile. To connect to the container, add the following to the SSH configuration file on your host:

Host development
  User development
  Hostname localhost
  IdentityFile [Full path to private key]
  port 20

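The config file is usually ~/.ssh/config (/home/[username]/.ssh/config on Linux, /Users/[username]/.ssh/config on macOS). If it does not exist, create it, or add the configuration to /etc/ssh/ssh_config. To connect to the remote environment inside VSCode, click the green button in the bottom-left corner of the window (this needs the Remote - SSH extension).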

Fig. 1: SSH bottom left of the VSCode window

To connect to the container from a terminal, use:

ssh development@localhost -p 20

If key-based login is not set up, the password is development. To access JupyterLab, open http://localhost:8800/ in your browser. It will ask for a token, which you can get from the container logs with docker logs [Container ID].
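
Scrolling through the logs for the token gets tedious, so here is a small sketch that pulls it out for you. It assumes the log contains a URL with a token= query parameter, which is what Jupyter typically prints at startup, and it rewrites the URL to use host port 8800 from the compose file. The script name find_token.py, the log file name jupyter.log and the function find_tokens are all my own hypothetical choices.

```python
import re
import sys

# Matches the token value in URLs such as .../lab?token=<value>
TOKEN_PATTERN = re.compile(r"[?&]token=([0-9a-zA-Z]+)")


def find_tokens(log_text):
    """Return the distinct token values in log_text, in order of first appearance."""
    seen = []
    for token in TOKEN_PATTERN.findall(log_text):
        if token not in seen:
            seen.append(token)
    return seen


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 find_token.py jupyter.log
    with open(sys.argv[1], encoding="utf-8") as log_file:
        for token in find_tokens(log_file.read()):
            # Print a ready-to-open URL using the host port from the compose file
            print(f"http://localhost:8800/lab?token={token}")
```

To try it, save the logs first with docker logs [Container ID] > jupyter.log 2>&1 (the 2>&1 matters because Jupyter writes its startup messages to stderr), then run python3 find_token.py jupyter.log.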

You can find all files in my git repo.

Don’t forget to give a star and share it.

Enjoy and PEACE!
