Previously, I presented a simple web application distributed across several docker containers. In this article, I will introduce the CoreOS platform as the backend for clustering a CherryPy application.
CoreOS quick overview
CoreOS is a Linux distribution designed to support distributed/clustering scenarios. I will not spend too much time explaining it here as their documentation already provides lots of information. More specifically, review their architecture use cases for a good overview of how CoreOS is articulated.
What matters to us in this article is that we can use CoreOS to manage a cluster of nodes that will host our application as docker containers. To achieve this, CoreOS relies on technologies such as systemd, etcd and fleet at its core.
Each CoreOS instance within the cluster runs a Linux kernel which executes systemd to manage processes on that instance. etcd is a distributed key/value store used across the cluster to enable service discovery and configuration synchronization. Fleet is used to manage services executed within your cluster. Those services are described in files called unit files.
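To get a feel for etcd, you can log into a node and store a value with the bundled etcdctl client; any other node in the cluster would read the same value back (the /sandbox/message key below is just an arbitrary example):

$ etcdctl set /sandbox/message "hello"
$ etcdctl get /sandbox/message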
Roughly speaking, you use a unit file to describe your service and specify which docker container to execute. Using fleet, you submit and load that service to the cluster before starting/stopping it at will. CoreOS will determine which host it will deploy it on (you can set up constraints that CoreOS will follow). Once loaded onto a node, the node’s systemd takes over to manage the service locally and you can use fleet to query the status of that service from outside.
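To make this concrete, here is a minimal sketch of what such a unit file can look like. It is modelled on the webapp_db.service unit used later in this article, with the docker commands taken from the status output shown further down; the actual file in the repository may differ slightly:

[Unit]
Description=Notes database

[Service]
# Clean up any previous container with the same name; the leading "-" tells
# systemd to ignore failures (e.g. when no such container exists yet)
ExecStartPre=-/usr/bin/docker kill notesdb
ExecStartPre=-/usr/bin/docker rm notesdb
# Fetch the image, then run the container in the foreground so systemd can supervise it
ExecStartPre=/usr/bin/docker pull lawouach/webapp_db
ExecStart=/usr/bin/docker run --name notesdb -e POSTGRES_PASSWORD=test -e POSTGRES_USER=test -t lawouach/webapp_db:latest
ExecStop=/usr/bin/docker stop notesdb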
Set up your environment with Vagrant
Vagrant is a nifty tool to orchestrate small deployments on your development machine. For instance, here is how you would create a node running Ubuntu:
$ vagrant init ubuntu/trusty64
$ vagrant up --provider virtualbox
Vagrant has a fairly rich command line you can script to generate a final image. However, Vagrant usually provisions virtual machines by following a description found within a simple text file (well, actually a Ruby script) called a Vagrantfile. This is the path we will be following in this article.
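To give an idea of its shape, here is a heavily trimmed sketch of the kind of Vagrantfile used here. The box name is an assumption (the repository relies on the official CoreOS Vagrant setup); the static IP and port mapping match what the rest of the article relies on:

# Sketch of a Vagrantfile for a single CoreOS node (not the repository's exact file)
Vagrant.configure("2") do |config|
  config.vm.box = "coreos-stable"

  config.vm.define "core-01" do |node|
    # Static IP that fleetctl will later report for this machine
    node.vm.network "private_network", ip: "172.17.8.101"
    # Expose the load balancer (port 8091 in the guest) as http://localhost:7070/
    node.vm.network "forwarded_port", guest: 8091, host: 7070
  end
end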
Let’s get the code:
$ hg clone https://bitbucket.org/Lawouach/cherrypy-recipes
$ cd cherrypy-recipes/deployment/container/vagrant_webapp_with_load_balancing
From there you can create the cluster as follows:
$ eval `ssh-agent -s`
$ export FLEETCTL_TUNNEL=127.0.0.1:2222
$ ./cluster create
I am not using vagrant directly to create the cluster because a couple of other operations must be carried out so that fleet can talk to the CoreOS node properly. Namely, the cluster script will (a simplified sketch follows the list):
- Generate a new cluster id (via https://discovery.etcd.io/new)
- Start an SSH agent to handle the node’s SSH identities so we can connect from the outside
- Indicate where to locate the node’s ssh service (through a port mapped by Vagrant)
- Create the cluster (this calls vagrant up internally)
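The real script lives next to the Vagrantfile in the repository; the following bash sketch only illustrates the create/destroy flow described above. The token file name and the SSH key path are assumptions for the sake of the example, not the actual implementation:

#!/bin/bash
# Simplified sketch of what "./cluster create" and "./cluster destroy" do

create() {
    # Ask the etcd discovery service for a fresh cluster token and store it
    # where the Vagrantfile/cloud-config can read it (file name is hypothetical)
    curl -s https://discovery.etcd.io/new > etcd_cluster_id.txt

    # Let fleetctl reach the node through the SSH agent started earlier
    ssh-add ~/.vagrant.d/insecure_private_key

    # Boot the CoreOS node(s) described in the Vagrantfile
    vagrant up --provider virtualbox
}

destroy() {
    vagrant destroy -f
    rm -f etcd_cluster_id.txt
}

"$@"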
Once completed, you should have a running CoreOS node that you can log into:
$ vagrant ssh core-01
To destroy the cluster and terminate the node:
$ ./cluster destroy
This also takes care of wiping out local resources that we don’t need any longer.
Before moving on, you will need to install the fleet tools.
$ wget https://github.com/coreos/fleet/releases/download/v0.9.0/fleet-v0.9.0-linux-amd64.tar.gz
$ tar zxvf fleet-v0.9.0-linux-amd64.tar.gz
$ export PATH=$PATH:`pwd`/fleet-v0.9.0-linux-amd64
Run your CherryPy application on the cluster
If you have destroyed the cluster, re-create it and make sure you can speak to it through fleet as follows:
$ fleetctl list-machines
MACHINE         IP              METADATA
50f6819c...     172.17.8.101    -
Bingo! This is the public address we statically set in the Vagrantfile associated with the node.
Let’s ensure we have no registered units yet:
$ fleetctl list-unit-files
UNIT    HASH    DSTATE  STATE   TARGET
$ fleetctl list-units
UNIT    MACHINE ACTIVE  SUB
Okay, all is good. Now, let’s push each of our units to the cluster:
$ fleetctl submit units/webapp_db.service
$ fleetctl submit units/webapp_app@.service
$ fleetctl submit units/webapp_load_balancer.service
$ fleetctl list-unit-files
UNIT                            HASH     DSTATE    STATE     TARGET
webapp_app@.service             02c0c64  inactive  inactive  -
webapp_db.service               127e44a  inactive  inactive  -
webapp_load_balancer.service    e1cfee6  inactive  inactive  -
$ fleetctl list-units
UNIT    MACHINE ACTIVE  SUB
As you can see, the unit files have been registered but they are not loaded onto the cluster yet.
Notice the naming convention used for webapp_app@.service: it is not considered a service description in itself but a template for named service instances. We will see this in a minute. Refer to this extensive DigitalOcean article for more details regarding unit files.
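For reference, here is a sketch of the webapp_app@.service template, consistent with the status output displayed later. The %i placeholder is replaced by whatever identifier you put after the @ when instantiating the template, which is how the containers end up being named notes1, notes2, and so on. The dependency section is an assumption; check the repository for the real file:

[Unit]
Description=App service
# Assumed dependency: the application container links against the database container
After=webapp_db.service
Requires=webapp_db.service

[Service]
# %i expands to the instance identifier (e.g. "1" for webapp_app@1.service)
ExecStartPre=-/usr/bin/docker kill notes%i
ExecStartPre=-/usr/bin/docker rm notes%i
ExecStartPre=/usr/bin/docker pull lawouach/webapp_app
ExecStart=/usr/bin/docker run --link notesdb:postgres --name notes%i -P -t lawouach/webapp_app:latest
ExecStop=/usr/bin/docker stop notes%i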
Let’s now load each unit onto the cluster:
$ fleetctl load units/webapp_db.service
Unit webapp_db.service loaded on 50f6819c.../172.17.8.101
$ fleetctl list-units
UNIT                MACHINE                     ACTIVE    SUB
webapp_db.service   50f6819c.../172.17.8.101    inactive  dead
Here, we asked fleet to load the service onto an available node. Considering there is a single node, it wasn’t a difficult decision to make.
At that stage, your service is not started. It is simply attached to a node, as its empty journal shows:
$ fleetctl journal webapp_db.service
-- Logs begin at Tue 2015-02-17 19:26:07 UTC, end at Tue 2015-02-17 19:40:49 UTC. --
It is not compulsory to explicitly load a service before starting it. However, it gives you the opportunity to unload the service if a specific condition occurs (the service needs to be amended, the chosen host isn’t valid any longer…).
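For instance, if you change your mind after loading webapp_db.service but before starting it, you can detach it from its node while keeping the unit file registered in the cluster:

$ fleetctl unload webapp_db.service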
Now we can finally start it:
$ fleetctl start units/webapp_db.service
Unit webapp_db.service launched on 50f6819c.../172.17.8.101
You can see what’s happening:
$ fleetctl journal webapp_db.service
-- Logs begin at Tue 2015-02-17 19:26:07 UTC, end at Tue 2015-02-17 19:56:28 UTC. --
Feb 17 19:56:19 core-01 docker[1561]: dc55e5f30ff9: Pulling fs layer
Feb 17 19:56:21 core-01 docker[1561]: dc55e5f30ff9: Download complete
Feb 17 19:56:21 core-01 docker[1561]: 835f524d1d7e: Pulling metadata
Feb 17 19:56:22 core-01 docker[1561]: 835f524d1d7e: Pulling fs layer
Feb 17 19:56:24 core-01 docker[1561]: 835f524d1d7e: Download complete
Feb 17 19:56:24 core-01 docker[1561]: cb0503cedddb: Pulling metadata
Feb 17 19:56:25 core-01 docker[1561]: cb0503cedddb: Pulling fs layer
Feb 17 19:56:27 core-01 docker[1561]: cb0503cedddb: Download complete
Feb 17 19:56:27 core-01 docker[1561]: cdd30fd0c6f3: Pulling metadata
Feb 17 19:56:27 core-01 docker[1561]: cdd30fd0c6f3: Pulling fs layer
Alternatively, you can request the service’s status:
$ fleetctl status units/webapp_db.service
● webapp_db.service - Notes database
   Loaded: loaded (/run/fleet/units/webapp_db.service; linked-runtime; vendor preset: disabled)
   Active: activating (start-pre) since Tue 2015-02-17 19:55:33 UTC; 1min 25s ago
  Process: 1552 ExecStartPre=/usr/bin/docker rm notesdb (code=exited, status=1/FAILURE)
  Process: 1478 ExecStartPre=/usr/bin/docker kill notesdb (code=exited, status=1/FAILURE)
  Control: 1561 (docker)
   CGroup: /system.slice/webapp_db.service
           └─control
             └─1561 /usr/bin/docker pull lawouach/webapp_db

Feb 17 19:56:31 core-01 docker[1561]: c1eac5e31754: Pulling fs layer
Feb 17 19:56:33 core-01 docker[1561]: c1eac5e31754: Download complete
Feb 17 19:56:33 core-01 docker[1561]: 672ef5050bb9: Pulling metadata
Feb 17 19:56:35 core-01 docker[1561]: 672ef5050bb9: Pulling fs layer
Feb 17 19:56:36 core-01 docker[1561]: 672ef5050bb9: Download complete
Feb 17 19:56:36 core-01 docker[1561]: 7ebc912be04a: Pulling metadata
Feb 17 19:56:37 core-01 docker[1561]: 7ebc912be04a: Pulling fs layer
Feb 17 19:56:52 core-01 docker[1561]: 7ebc912be04a: Download complete
Feb 17 19:56:52 core-01 docker[1561]: 22f2bfe64e7f: Pulling metadata
Feb 17 19:56:52 core-01 docker[1561]: 22f2bfe64e7f: Pulling fs layer
Once the service is ready:
$ fleetctl status units/webapp_db.service
● webapp_db.service - Notes database
   Loaded: loaded (/run/fleet/units/webapp_db.service; linked-runtime; vendor preset: disabled)
   Active: active (running) since Tue 2015-02-17 19:57:24 UTC; 2min 46s ago
  Process: 1561 ExecStartPre=/usr/bin/docker pull lawouach/webapp_db (code=exited, status=0/SUCCESS)
  Process: 1552 ExecStartPre=/usr/bin/docker rm notesdb (code=exited, status=1/FAILURE)
  Process: 1478 ExecStartPre=/usr/bin/docker kill notesdb (code=exited, status=1/FAILURE)
 Main PID: 1831 (docker)
   CGroup: /system.slice/webapp_db.service
           └─1831 /usr/bin/docker run --name notesdb -e POSTGRES_PASSWORD=test -e POSTGRES_USER=test -t lawouach/webapp_db:latest

Feb 17 19:57:28 core-01 docker[1831]: backend>
Feb 17 19:57:28 core-01 docker[1831]: PostgreSQL stand-alone backend 9.4.0
Feb 17 19:57:28 core-01 docker[1831]: backend> statement: CREATE USER "test" WITH SUPERUSER PASSWORD 'test' ;
Feb 17 19:57:28 core-01 docker[1831]: backend>
Feb 17 19:57:28 core-01 docker[1831]: ******CREATING NOTES DATABASE******
Feb 17 19:57:28 core-01 docker[1831]: PostgreSQL stand-alone backend 9.4.0
Feb 17 19:57:28 core-01 docker[1831]: backend> backend> backend> ******DOCKER NOTES CREATED******
Feb 17 19:57:28 core-01 docker[1831]: LOG:  database system was shut down at 2015-02-17 19:57:28 UTC
Feb 17 19:57:28 core-01 docker[1831]: LOG:  database system is ready to accept connections
Feb 17 19:57:28 core-01 docker[1831]: LOG:  autovacuum launcher started
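You can also double-check on the node itself that docker is indeed running the container (the notesdb name matches the unit’s ExecStart shown above):

$ vagrant ssh core-01 -c 'docker ps'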
Starting a service from a unit template works the same way, except that you provide an identifier for the instance:
$ fleetctl load units/webapp_app@1.service
$ fleetctl start units/webapp_app@1.service
$ fleetctl status units/webapp_app@1.service
● webapp_app@1.service - App service
   Loaded: loaded (/run/fleet/units/webapp_app@1.service; linked-runtime; vendor preset: disabled)
   Active: active (running) since Tue 2015-02-17 20:06:40 UTC; 2min 56s ago
  Process: 2031 ExecStartPre=/usr/bin/docker pull lawouach/webapp_app (code=exited, status=0/SUCCESS)
  Process: 2019 ExecStartPre=/usr/bin/docker rm notes%i (code=exited, status=1/FAILURE)
  Process: 2012 ExecStartPre=/usr/bin/docker kill notes%i (code=exited, status=1/FAILURE)
 Main PID: 2170 (docker)
   CGroup: /system.slice/system-webapp_app.slice/webapp_app@1.service
           └─2170 /usr/bin/docker run --link notesdb:postgres --name notes1 -P -t lawouach/webapp_app:latest

Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Listening for SIGHUP.
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Listening for SIGTERM.
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Listening for SIGUSR1.
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Bus STARTING
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Starting up DB access
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Setting up Mako resources
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Started monitor thread 'Autoreloader'.
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Started monitor thread '_TimeoutMonitor'.
Feb 17 20:06:42 core-01 docker[2170]: [17/Feb/2015:20:06:42] ENGINE Serving on http://0.0.0.0:8080
Feb 17 20:06:42 core-01 docker[2170]: [17/Feb/2015:20:06:42] ENGINE Bus STARTED
The reason I chose 1 as the identifier is so that the container’s name becomes notes1, which is what the load-balancer container expects when linking to the application’s container, as described in the previous article.
Start a second instance of that unit template:
$ fleetctl load units/webapp_app@2.service
$ fleetctl start units/webapp_app@2.service
That second instance starts immediately because the docker image has already been pulled onto the node.
Finally, once both services are marked as “active”, you can start the load-balancer service as well:
$ fleetctl start units/webapp_load_balancer.service
$ fleetctl status units/webapp_load_balancer.service
● webapp_load_balancer.service - Load Balancer service
   Loaded: loaded (/run/fleet/units/webapp_load_balancer.service; linked-runtime; vendor preset: disabled)
   Active: active (running) since Tue 2015-02-17 20:10:21 UTC; 1min 51s ago
  Process: 2418 ExecStartPre=/usr/bin/docker pull lawouach/webapp_load_balancer (code=exited, status=0/SUCCESS)
  Process: 2410 ExecStartPre=/usr/bin/docker rm notes_loadbalancer (code=exited, status=1/FAILURE)
  Process: 2403 ExecStartPre=/usr/bin/docker kill notes_loadbalancer (code=exited, status=1/FAILURE)
 Main PID: 2500 (docker)
   CGroup: /system.slice/webapp_load_balancer.service
           └─2500 /usr/bin/docker run --link notes1:n1 --link notes2:n2 --name notes_loadbalancer -p 8090:8090 -p 8091:8091 -t lawouach/webapp_load_balancer:latest

Feb 17 20:10:14 core-01 docker[2418]: 9284a1282362: Download complete
Feb 17 20:10:14 core-01 docker[2418]: d53024a13d34: Pulling metadata
Feb 17 20:10:15 core-01 docker[2418]: d53024a13d34: Pulling fs layer
Feb 17 20:10:17 core-01 docker[2418]: d53024a13d34: Download complete
Feb 17 20:10:17 core-01 docker[2418]: 45e1cf959053: Pulling metadata
Feb 17 20:10:18 core-01 docker[2418]: 45e1cf959053: Pulling fs layer
Feb 17 20:10:21 core-01 docker[2418]: 45e1cf959053: Download complete
Feb 17 20:10:21 core-01 docker[2418]: 45e1cf959053: Download complete
Feb 17 20:10:21 core-01 docker[2418]: Status: Downloaded newer image for lawouach/webapp_load_balancer:latest
Feb 17 20:10:21 core-01 systemd[1]: Started Load Balancer service.
At that stage, the complete application is up and running and you can go to http://localhost:7070/ to use it. Port 7070 is mapped to port 8091 by vagrant within our Vagrantfile.
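A quick way to check from your host that the load balancer responds, without opening a browser, is a plain HTTP request:

$ curl -I http://localhost:7070/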
No such thing as a free lunch
As I said earlier, we created a cluster of one node on purpose. Indeed, the way all our containers dynamically locate each other is through docker's linking mechanism. Though this works very well in simple scenarios like this one, it has a fundamental limit: you cannot link across different hosts. If we had multiple nodes, fleet would try to distribute our services across all of them (unless we constrained this within the unit files), which would obviously break the links between them. This is why, in this particular example, we create a single-node cluster.
Docker documents a pattern known as the ambassador to work around this restriction, but we will not review it here. Instead, we will take advantage of the flat sub-network topology provided by weave, as it follows a more traditional path than docker’s linking approach. This will be the subject of my next article.