This is a bit of a RANT, and a bit of a HOW-TO. I probably do a 70% job of addressing either side, but that’s partially due to my frustration with the whole process that got me to this point.
I’ve been very frustrated with my initial forays into docker. I wanted to learn docker as a way to deploy small web servers and other services for use in my home lab and on my home DMZ.
There is a LOT of information for developers on how to use docker at AWS or in a small desktop VM to develop web apps, but little information on deploying it in production and some of the gotchas — for example, the default internal NAT network, which can conflict with the networks you already have.
Seems like the folks that developed docker are not really network folks. Nor do they really think outside their simple development environments.
This is bad. If you build something that isn’t portable, and you hand it to someone else, and some percentage of those installs break due to your bad assumptions, you’ve let people down.
The core issue I ran into by chance: Docker, in most versions and implementations, sets up a “docker0” internal network for its containers, and uses NAT or similar to route the application onto the customer network. In theory, this internal network is invisible to the user network. It appears that docker simply looks at the IP address(es) on the host, and if they don’t conflict with 172.17.0.0/16, it uses that CIDR block for its own internal use.
That works fine unless you have a network that uses 172.17.0.0/16 (or any overlapping and conflicting network). My network uses 172.16.0.0/16 for DMZ, 172.17.0.0/16 for LAN, and 172.18.0.0/16 for VPN.
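The conflict is easy to demonstrate. Here is a quick sketch (my own illustration, not anything Docker ships) using Python’s ipaddress module and the three networks from my setup above, checking each one against Docker’s default bridge subnet:

```python
# Check whether Docker's default bridge subnet collides with networks
# you already route. The network names/subnets below are my example setup.
import ipaddress

docker_default = ipaddress.ip_network("172.17.0.0/16")  # docker0's usual pick

my_networks = {
    "DMZ": ipaddress.ip_network("172.16.0.0/16"),
    "LAN": ipaddress.ip_network("172.17.0.0/16"),
    "VPN": ipaddress.ip_network("172.18.0.0/16"),
}

for name, net in my_networks.items():
    if docker_default.overlaps(net):
        # Traffic for this network and for containers becomes ambiguous.
        print(f"CONFLICT: {name} ({net}) overlaps docker0's {docker_default}")
```

Run against my networks, this flags the LAN — exactly the situation that bit me when I moved the host from the LAN to the DMZ.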
So I tried Docker on my LAN and it “worked”. Moved it to my DMZ and I could no longer get to the services. It took me some time to figure out that the internal IP range was in conflict with my LAN addressing. I searched and reached out on Twitter, and eventually figured out how to use DOCKER_OPTS to adjust the internal network and bridge. Except that it didn’t work.
Apparently where DOCKER_OPTS gets set depends on which version of Linux you are using and whether it uses systemd or upstart or whatever for starting daemons. And almost NOWHERE could I find how to set it correctly for CoreOS (used in AeroFS, a product I am trying out at home). When I did find an example for CoreOS, it wasn’t applicable, because AeroFS doesn’t launch docker the way the CoreOS people say you are supposed to launch it.
Takeaway #1: The world of Docker is in flux, everyone does it differently, and you will spend a lot of cycles if you are using other people’s appliances and trying to modify them for your environment.
Takeaway #2: If you make an appliance, it shouldn’t require root access plus CoreOS and Docker experience to get it working on a 172.16.0.0/16 network. Your code should offer a way to move the network. People who aren’t Docker experts (like me) shouldn’t be the ones giving your support staff the answer.
Here is the quick answer for AeroFS. It should work on any CoreOS variant, although there may be better ways to do it if you are running a clean implementation of CoreOS.
CoreOS puts the startup service info for docker in /usr/lib64/systemd/system/docker.service. This file should not be modified directly, but it points at an EnvironmentFile — you can check whether that file exists and whether it sets any DOCKER_OPTS. Since you should not modify the unit file itself, you can use a systemd drop-in at /etc/systemd/system/docker.service.d/docker_opts.conf to override it. So add the following to /etc/systemd/system/docker.service.d/docker_opts.conf:
(note the hyphen before the file path)
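The drop-in described above would look something like this (the hyphen after the equals sign is the one being pointed out — it tells systemd not to fail if the file doesn’t exist):

```ini
[Service]
EnvironmentFile=-/etc/default/docker
```

This assumes, as on the CoreOS builds I looked at, that the unit’s ExecStart line actually references $DOCKER_OPTS. After adding the drop-in, run `sudo systemctl daemon-reload` and restart the docker service so it takes effect.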
The contents of /etc/default/docker should include all your DOCKER_OPTS. Here is my file:
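A minimal version of that file would look like the sketch below. The --bip flag (documented on the Docker networking page linked further down) sets docker0’s bridge IP and CIDR; the specific subnet here is just an example — pick one that doesn’t overlap anything you actually route:

```shell
# /etc/default/docker
# --bip moves docker0 off the default 172.17.0.0/16;
# the subnet below is an example, substitute one free in YOUR environment.
DOCKER_OPTS="--bip=192.168.200.1/24"
```

After restarting docker, you can confirm the move with `ip addr show docker0`.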
DOCKER_OPTS is documented on the docker site at https://docs.docker.com/articles/networking/ but they make an assumption, based on whatever Linux flavor-of-the-day is in vogue, that /etc/default/docker is actually read all the time, and it is NOT. My workaround above forces CoreOS to read the same file and apply DOCKER_OPTS.