Python packaging and distribution is still not perfect. Packaging and distributing Django projects is even less so, and a bit painful. There are quite a few hurdles to overcome to package Django projects for distribution. If you are interested, I’ve given a few talks on the topic, like this one at PyCon Sei.
So, what is Docker? It is a way to isolate a program at the operating system level. Unlike virtualization, where the entire hardware running the program is isolated from the host computer via emulation, with containers only the operating system running the program is isolated from the host computer, using a separate “context”: a technique that allows running an operating system inside an operating system. While technically not the same, the end result “feels” to the end user like a sort of “lightweight virtualization”. Hardware capable of hosting 10 virtual machines will be able to host 100 or more containers.
Aside from operating system isolation, Docker (and containers in general) also works as a packaging solution. This is because you are storing your program and all the requirements needed to execute it in a repeatable manner in the same distribution medium, all the way down to the operating system. What this means is that a Docker packaged program will run exactly the same regardless of the host operating system. In that sense, it offers many of the same benefits as virtualization without the execution overhead.
“Dockerizing” Django projects is an art. There are no official guidelines and the entire community is still learning what works and what doesn’t. If you are interested in learning more about the state of the technologies and techniques used to package a Django project as a Docker image, you can watch my PyCon Sette presentation on the topic.
Hopefully, from that introduction, you will have a better idea why creating a Docker image for Mayan is a good idea. You get a distributable image that reduces installation steps from a few dozen to just two, and you end up with an execution unit isolated from your system that can be controlled at will.
The packaging philosophy on Docker is that each container should perform just one function (or as few as possible). This is called “separation of concerns” and offers many advantages, like being able to scale specific parts of your Docker deployment. Following this philosophy, the Mayan EDMS image includes the bare minimum to provide a running instance. That is, the Mayan EDMS image contains Mayan EDMS, a web server (NGINX), a broker to move messages for the background tasks and store their results (Redis for both), and a process manager to keep everything chugging along once deployed. Since Python has native support for SQLite, it is used by default. This setup will give you a fairly decent and well-performing deployment up to several tens of thousands of documents and a few concurrent users. If you want to go beyond that, then you need to scale up your Mayan EDMS Docker deployment, starting with the database, hence the purpose of this post.
To get a Mayan EDMS Docker installation using MySQL we will be launching two containers: one for MySQL and one for Mayan. Normally programs are configured via configuration or “ini” files, but the philosophy for dockerized programs is to configure containers via environment variables. Here are the steps to follow.
Docker containers are isolated by design; that means execution, file access, and network access isolation. To get our two Docker containers “talking”, we create a network just for them. For that we use the command:
docker network create mayan -d bridge
This creates a bridge network, a simple network type used to connect hosts without routing. We’ll call our bridge network mayan. We will deploy our containers using this network so that they have network access to each other, as if they were the only two computers on a local area network (LAN). Docker recently added support for dynamic domain names, which means we can reference the containers by name and not just by IP address.
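To confirm the network was created, and later to see which containers are attached to it, you can list and inspect it. A quick sketch (docker commands assume a working Docker daemon):

```shell
# Verify that the bridge network exists
docker network ls --filter name=mayan

# Inspect it; once containers are attached, they appear in the output
docker network inspect mayan
```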
As mentioned above, we will configure the containers using environment variables that are passed to the container when it is created. Since we will be passing a few variables, the command line to launch the containers would be long and prone to data entry mistakes. For these situations, Docker allows us to define those environment variables in a file and pass the filename when launching the containers. Let’s do that now and create a file named envfile with the content:
# MySQL container
MYSQL_ROOT_PASSWORD=mysql_root_password
MYSQL_PASSWORD=mayan_password
MYSQL_DATABASE=mayan_db
MYSQL_USER=mayan_user

# Mayan container
MAYAN_DATABASE_DRIVER=django.db.backends.mysql
MAYAN_DATABASE_NAME=mayan_db
MAYAN_DATABASE_USER=mayan_user
MAYAN_DATABASE_PASSWORD=mayan_password
MAYAN_DATABASE_HOST=mayan-mysql
MAYAN_DATABASE_PORT=3306
The first set of variables configures the MySQL container to create a database when it launches, create a user, and grant that user all permissions on the database. The second set of variables configures the Mayan container to use the specified credentials to access the database container, which from Mayan’s point of view is just another host on the network, in this case a host named mayan-mysql. The MAYAN_DATABASE_DRIVER line tells Mayan which Django database driver to use when accessing the database.
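One caveat worth noting: Docker’s --env-file format is plain KEY=VALUE lines, with no shell quoting and no spaces around the equals sign. A quick sanity check for malformed lines (a simple grep sketch, assuming the file is named envfile as above):

```shell
# Print any line that is not a comment, a blank line, or a KEY=VALUE pair
grep -vE '^(#|$|[A-Za-z_][A-Za-z0-9_]*=)' envfile || echo "envfile looks well-formed"
```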
Now we proceed to create and launch the first container, the MySQL container, using the command line:
docker run -d --name mayan-mysql --restart=always --env-file envfile -v mayan_mysql:/var/lib/mysql --net=mayan mysql:latest
This tells Docker to create and run a container named mayan-mysql that will restart every time it stops for whatever reason (hangs or the host restarts), that will use the file envfile for configuration, will store its data from /var/lib/mysql into persistent Docker storage (called a volume) by the name of mayan_mysql, will use the mayan network, and will use the latest official MySQL Docker image.
You can watch the log files of the container as it initializes with the command:
docker logs mayan-mysql
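Once the logs show that MySQL is ready to accept connections, you can optionally verify that the database and user from the env file were created. A sketch, assuming the credentials from envfile above (adjust if you changed them):

```shell
# Run a test query inside the MySQL container using the Mayan credentials
docker exec mayan-mysql \
    mysql -u mayan_user -pmayan_password mayan_db -e "SELECT 1;"
```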
Finally we launch the Mayan container with the command:
docker run -d --name mayan-edms --restart=always --env-file envfile -v mayan_data:/var/lib/mayan --net=mayan -p 80:80 mayanedms/mayanedms:2.6.4-3
This tells Docker to create and run a container named mayan-edms that will restart every time it stops for whatever reason, that will use the file envfile for configuration, will store its data from the directory /var/lib/mayan into a persistent Docker volume called mayan_data, will expose its internal port 80 (HTTP) as port 80 to the outside world, will use the mayan network, and will use version 2.6.4-3 of the official Mayan EDMS Docker image.
Inspect the logs of the container using:
docker logs mayan-edms
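If you prefer to watch the initialization live, you can follow the log stream, and optionally confirm that the database settings were picked up from envfile (this assumes the container name and variable names used above):

```shell
# Follow the logs until initialization completes (Ctrl+C to stop)
docker logs -f mayan-edms

# Confirm the database environment variables were passed into the container
docker exec mayan-edms env | grep MAYAN_DATABASE
```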
and you should see the container creating the database tables and performing all the required initialization. After a few minutes, you will be able to browse to localhost (or 127.0.0.1) on port 80 on the machine running the containers and use Mayan normally. Since the containers were launched with the --restart=always option, you don’t need to do anything to start them up the next time you boot the host computer.
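As a final sanity check, you can confirm that both containers are running on the mayan network and that their restart policy is set. A sketch using the container names chosen above:

```shell
# List the containers attached to our bridge network
docker ps --filter network=mayan

# Confirm the restart policy on each container
docker inspect --format '{{.Name}}: {{.HostConfig.RestartPolicy.Name}}' \
    mayan-mysql mayan-edms
```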
Compare those steps to the number of steps required for a “bare metal” production deployment of Mayan EDMS (or any Django project) and you will see why Docker is becoming such a successful medium not just to run code, but also to distribute it.