Saturday, January 21, 2017

More fun with the Docker Developer Beginner Linux Containers course

This time in lesson 3. Problems bringing up the docker swarm.

docker-compose up -d

Only redis stayed up, according to docker ps -a. The other four containers exited prematurely.

First action: run docker-compose up again without the -d option. This produces more output from each container. Summary of the output:

vote_1: python can't open file 'app.py': [Errno 13] Permission denied
worker_1: System.AggregateException: One or more errors occurred. (No such device or address)
db: exited with code 1

Well, after spending about a day debugging this, I found some comments on GitHub saying that Fedora ships a fork of Docker, and that its version may be out of date. So I uninstalled Docker:

sudo dnf remove docker-engine docker-compose docker

Reinstalled from the official repos using instructions from:

https://docs.docker.com/engine/installation/linux/fedora/
https://docs.docker.com/compose/install/

And rebooted the machine because systemctl start docker-engine complained "docker-engine.service not found". After all this, the docker-compose tutorial worked as expected.

Here are some debugging notes, even if the correct solution ultimately turned out to be "reinstall from official sources."

As always, focus on one problem at a time. I did this by commenting out all but one member of the swarm in docker-compose.yml, and running "docker-compose up" to debug. The "command:" line in the yml file can be changed to try different things in the container. For example, replacing "command: python app.py" with "command: ls -l /" allowed me to see that the ownership of the app directory (1000:1000) did not match the user (root), which explains the permission denied error.
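A possible alternative to editing the yml file, sketched here on the assumption that the service is called "vote" as in the example app: docker-compose run starts a one-off container for a single service and replaces its command. The inspect_service name is made up for this note.

```shell
# Hypothetical helper: run a one-off command in one service's container.
# --rm removes the container afterwards; --no-deps skips linked services.
inspect_service() {
    service="$1"
    shift
    docker-compose run --rm --no-deps "$service" "$@"
}
```

Something like "inspect_service vote ls -ln /" should then list the container's root directory with numeric owner ids, with no need to comment out the other services.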

Learn how to clean up and run things manually.

docker ps # show running containers
docker ps -a # show all containers, including exited ones
docker rm <container id> # remove a stopped container listed in docker ps -a
docker rm -f <container id> # force-remove a container even if it is still running
docker images # list images
docker rmi <image id> # remove an image

A neat shortcut: you only need to pass enough leading characters of a container or image ID to identify it uniquely. The first three are usually enough.
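After a few failed runs the exited containers pile up, so here is a rough sketch of clearing them in one go instead of one id at a time. It relies on the status filter of docker ps; the function name is invented for this note.

```shell
# Hypothetical helper: remove every container whose status is "exited".
remove_exited_containers() {
    # -a: all containers, -q: print ids only, --filter: keep exited ones
    ids=$(docker ps -aq --filter status=exited)
    if [ -n "$ids" ]; then
        # $ids is left unquoted on purpose so each id becomes its own argument
        docker rm $ids
    else
        echo "no exited containers"
    fi
}
```

Running containers are left alone, since docker rm without -f refuses to remove them anyway.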

Note that running "docker run -it examplevotingapp_vote /bin/sh" produced different results (i.e. /app was owned by root instead of 1000). My guess is it's because the docker-compose.yml file has this line:

volumes:
  - ./vote:/app

This tells docker-compose to mount the "vote" directory in the cwd on the host as "/app". Of course, the vote directory is owned by me (uid 1000) instead of root, so that probably explains the ownership weirdness.
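To confirm the ownership numbers from the host side, a small check like the following might help. It assumes GNU stat (which Fedora ships), and both function names are made up. It only compares numbers; it does not prove that ownership caused the error.

```shell
# Print the numeric owner uid of a path (GNU stat syntax).
owner_uid() {
    stat -c '%u' "$1"
}

# Hypothetical helper: compare a directory's owner with an expected uid.
check_owner() {
    dir="$1"
    want_uid="$2"
    have_uid=$(owner_uid "$dir")
    if [ "$have_uid" = "$want_uid" ]; then
        echo "match: $dir is owned by uid $have_uid"
    else
        echo "mismatch: $dir is owned by uid $have_uid, expected uid $want_uid"
    fi
}
```

For example, "check_owner ./vote 0" asks whether the host directory belongs to root, the user the container runs as.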

The difference between running the container manually and using docker-compose ultimately led me to realize that docker-compose starts up a container named examplevotingapp_vote_1. My current theory is that docker-compose takes the base _vote image and adds another layer to it according to the configuration in the yml file (e.g. mounts the /app volume, connects to the swarm networks, etc.). Using layers is consistent with the Docker architecture from what I've seen so far, so let's run with that for now.
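One way to test this theory, sketched with docker inspect: ask the running container which image it was created from and what is mounted into it. The container name comes from the docker ps output above; the function name is invented.

```shell
# Hypothetical helper: show the image and mounts behind a container.
describe_container() {
    name="$1"
    # .Config.Image is the image the container was created from
    docker inspect --format 'image: {{.Config.Image}}' "$name"
    # .Mounts lists bind mounts and volumes attached to the container
    docker inspect --format 'mounts: {{json .Mounts}}' "$name"
}
```

Comparing "describe_container examplevotingapp_vote_1" against "docker history examplevotingapp_vote" (which lists the image's layers) should show whether the /app mount is part of the image or only of the container.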

One final note about the build-push-pull steps at the end of the lesson. The steps do allow you to access (pull) your voting app containers from another machine. But they fail to mention what else you'll need to bring up the swarm. Certainly, docker-compose up failed to do anything until I cloned the example app from GitHub. And running docker-compose up now, it seems to be rebuilding the swarm instead of using the containers I already pulled down. And yes, docker images confirms there are now images beginning with examplevotingapp* as well as bobthesquirrel/votingapp*. Needs further investigation.
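As a starting point for that investigation, here is a sketch that lists both sets of images side by side with their ids. The two name prefixes are the ones seen above; the function name is made up.

```shell
# Hypothetical helper: list locally built and pulled voting app images.
list_app_images() {
    # Print "repository:tag id" per image, then keep only the two prefixes
    docker images --format '{{.Repository}}:{{.Tag}} {{.ID}}' |
        grep -E '^(examplevotingapp|bobthesquirrel/)'
}
```

If the same service shows different ids under the two names, docker-compose built a fresh image rather than reusing the pulled one.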
