Category Archives: docker

When ten commandments are not enough.

There are good reasons why people love a good set of rules about how to go about their jobs.  Here’s a new one:  12 Factor Micro-Services.  It’s part of the enthusiasm for containerizing everything, and it seems to live off the energy produced by the eternal tension between development and operations.

  • Codebase: One codebase tracked in revision control, many deploys
  • Dependencies: Explicitly declare and isolate dependencies
  • Config: Store config in the environment
  • Backing Services: Treat backing services as attached resources
  • Build, release, run: Strictly separate build and run stages
  • Processes: Execute the app as one or more stateless processes
  • Port binding: Export services via port binding
  • Concurrency: Scale out via the process model
  • Disposability: Maximize robustness with fast startup and graceful shutdown
  • Dev/prod parity: Keep development, staging, and production as similar as possible
  • Logs: Treat logs as event streams
  • Admin processes: Run admin/management tasks as one-off processes

I find this an odd list.  I don’t, per se, have much issue with it, but are these really the top twelve things to keep in mind when designing the modern cloud-based monster?

Take, for example, that point about the codebase.  We all build systems out of many, many 3rd-party components, which means there are numerous codebases.  Why would that modularity be appropriate only at the project or firm boundary?

Which brings us to the question of how you take on fresh releases of third-party components, and more generally how you manage changes to the design of the data stores, the logging architecture, or the backup strategy.  All of these are pretty central, but they aren’t really on this list.

Which may be why this is a list about “micro-services.”  Which, again, I’m totally on board with.  And yet, it seems to me like a fetishization of modularity.  Modularity is hard, it’s not cheap, and damn it, sometimes it just is not cost effective.  This is not about the dialectic between dev and ops, this is about the dialectic between doing and planning, or something.

It is like the way people assume there is an optimal size for a function, when it is the outliers that are full of interest.

Generally, when designing, it is the outliers that deserve the attention.


Meanwhile, who says I can’t have teasers after my blog postings:

Docker #5 – benchmark? not really…

Given a Docker image you can spin up a container in lots of places.  For example, on my Mac under Boot2Docker, at Orchard, or on Digital Ocean.  I don’t have any bare metal at hand, so these all involve the slight tax of a virtual machine.

I ran the same experiment on all three.  The experiment launches 18 containers, serially.  The jobs varied, but none of them are very large.

180 seconds Boot2Docker
189 seconds Orchard
149 seconds Digital Ocean
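
The serial run amounts to something like this hedged sketch; “bhyde/example-job” stands in for the actual job images, which I haven’t shown:

# time 18 container runs, one after another
time for i in $(seq 1 18); do
  docker run --rm bhyde/example-job > /dev/null
done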

These numbers are almost certainly meaningless!  I don’t even know what might be slowing things down: CPU, I/O, Swap, etc.

Interestingly, if I launch all 18 containers in parallel I get similar results, +/-10%.  The numbers vary only a few percent if I run these experiments repeatedly.  I warmed up the machines a bit by running a few jobs first.

Yeah.  Adding Google App Engine and EC2 would be interesting.

While Orchard charges $10/month vs. Digital Ocean’s $5, its billing granularity is better.  You purchase 10 minutes, and then a minute at a time, vs. Digital Ocean, which bills an hour at a time.  Orchard is also a little more convenient to use than Digital Ocean, though a bit o’ scripting could fix that.

I’m using this for batch jobs.  Hence I have an itch: a batch queue manager for container runs.  That would, presumably, assure that machines are spun up and down to balance cost and throughput.

Queue of Containers

Docker #4: Accessing a remote docker daemon using socat&ssh

Well!  This unix tool is pretty amazing.  Socat lets you connect two things together, where the two things are pretty much anything that might behave like a stream.  There is a nice overview article written in 2009 over here.  You can do crazy things like make a device on machine A available on machine B.  Running this command on machine B gives it a local pipe hooked up to machine A’s /dev/urandom:

socat \
  PIPE:/tmp/machine_a_urandom  \
  SYSTEM:"ssh machine_a socat - /dev/urandom"

What brought this up, you ask.

I have been spinning up machines at Digital Ocean to run Docker containers on, for short periods of time, to run batch jobs.  Docker’s daemon listens on a unix socket, /var/run/docker.sock, for your instructions.  I develop on my Mac, so I need to transmit my instructions to the VM at Digital Ocean.  Let’s call him mr-doh.

One option is to reconfigure mr-doh’s Docker daemon to listen on a localhost TCP port.  Having done that you can have ssh forward that port back to your Mac; then you set/export the DOCKER_HOST environment variable and you’re good to go.
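
A hedged sketch of that option, assuming you have already told mr-doh’s daemon to listen on 127.0.0.1:2375 (the host name here is made up):

# forward the daemon's TCP port back to the Mac, then point the client at it
ssh -N -L 2375:127.0.0.1:2375 root@mr-doh.example.com &
export DOCKER_HOST=tcp://localhost:2375
docker ps   # now talks to mr-doh's daemon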

The problem with that is it adds to the work of spinning up mr-doh, which, if you’re only going to have him running for a short period of time, adds to the tedium.

Well, as we can see in the /dev/urandom example above, you can use socat to forward things.  That might look like this:

socat \
    "UNIX-LISTEN:/tmp/mr-doh-docker.sock,reuseaddr,fork" \
    "EXEC:'ssh -kTax root@mr-doh.example.com socat STDIO UNIX-CONNECT\:/var/run/docker.sock'" &

That forks a socat that manages /tmp/mr-doh-docker.sock.  We can then teach the docker client on the Mac to use it by doing:

export DOCKER_HOST=unix:///tmp/mr-doh-docker.sock

When the client uses it, socat will fire up ssh and connect to the docker daemon's socket on mr-doh.  Of course, for this to work you'll want to have your ssh key installed in root@mr-doh's authorized_keys, etc.

For your enjoyment, here is a somewhat raw script which arranges to bring /var/run/docker.sock from mr-doh back home.  get-docker-socket-from-remote prints the export command you’ll need.

It’s cool that docker supports TLS.  You have to set up and manage keys, etc.  So, that’s another approach.
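
For reference, a hedged sketch of what the TLS route looks like on the client side, assuming you have already generated ca.pem, cert.pem, and key.pem and configured the daemon to verify against them (host and port are examples):

docker --tlsverify \
  --tlscacert=ca.pem --tlscert=cert.pem --tlskey=key.pem \
  -H tcp://mr-doh.example.com:2376 ps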

Docker, part 3

Docker containers don’t support multicast, at least not easily.   I find that a bummer.

It’s unclear why not.  Well, the most immediate reason is that the networking interfaces Docker creates for the containers don’t have the necessary flag to enable multicast.  That, at least according to the issue, is because that’s how Linux defaults these interfaces.  Why did they do that?
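
If you want to poke at the flag yourself, here’s a hedged sketch; it assumes a container started with --privileged (or otherwise granted CAP_NET_ADMIN), which is itself a workaround:

# inside the container: turn the multicast flag on and check it took
ip link set eth0 multicast on
ip link show eth0    # look for MULTICAST in the interface flags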

This means that any number of P2P (masterless) solutions don’t work.  For example, zeroconf/mDNS is out.  I guess this explains the handful of custom service discovery tools.  Reinventing the wheel.

In other news… Once you have Boot2Docker set up you need to tell the docker command where the docker daemon is listening for instructions.  You do that with the -H switch to docker, or via the DOCKER_HOST environment variable.  Typically you’d do:

export DOCKER_HOST=tcp://192.168.59.103:2375

But if you’re feeling fastidious you might want to ask boot2docker for the IP and port.

export "DOCKER_HOST=tcp://$(boot2docker ip 2> /dev/null):$(boot2docker info | sed 's/^.*DockerPort.:\([0-9]*\).*$/\1/')"

establish-routing-to-boot2docker-container-network

Boot2Docker lets you run Docker containers on your Mac by using VirtualBox to create a stripped-down Linux box (call that DH) where the Docker daemon can run.  DH and your Mac each have a networking interface on a software-defined network (named vboxnet) created by VirtualBox.  The containers and DH have networking interfaces on a software-defined network created by the Docker daemon.  Call it SDN-D, since they didn’t name it.

The authors of boot2docker did not set things up so your Mac can connect directly to the containers on SDN-D.  Presumably they didn’t think it wise to adjust the Mac’s routing tables.  But you can, and it is very convenient.  It lets you avoid most of the elegant, but tedious, --publish or --publish-all (aka -p, -P) switches when running a container; those hand-craft special plumbing for individual ports.  It is also nice because DH is very stripped down, which makes it painful to work on.

So, I give you this little shell script: establish-routing-to-boot2docker-container-network.  It adds routing on the Mac to SDN-D via DH on vboxnet.  This is risky if SDN-D happens to overlap a network that the Mac is already routing to, and the script does not guard against that.  See below for how to deal with that problem.
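
The heart of it is a single route command.  A hedged sketch, assuming the default Docker bridge network of 172.17.0.0/16 and whatever IP boot2docker reports for DH on vboxnet:

sudo route -n add -net 172.17.0.0/16 "$(boot2docker ip 2>/dev/null)"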

If your containers have ssh listeners then you can put this in your ~/.ssh/config to avoid the pain around host keys.  But notice how it hardwires the numbers for SDN-D.

Host 172.17.0.*
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null
  User root

The numbers of SDN-D are bound when the Docker daemon launches on DH.  The --bip switch, passed when the docker daemon launches, can adjust that.  You set it in /var/lib/boot2docker/profile on DH via EXTRA_ARGS.  Do that if you have the overlap problem mentioned above.  I do it because I want SDN-D to be small; that lets nmap scan it quickly.
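
A hedged sketch of what that profile line might look like; the subnet is only an example, pick one that doesn’t collide with networks your Mac already routes to:

# /var/lib/boot2docker/profile on DH
EXTRA_ARGS="--bip=172.17.42.1/28"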

If you’ve not used ~/.ssh/config before, well, you should!  But in that case you may find it useful to know that ssh uses the first setting it finds, so that Host block should appear before your global defaults.

 

Docker, part 2

The San Francisco Hook

I played with Docker some more.  It’s still in beta so, unsurprisingly, I ran into some problems.  It’s cool, nonetheless.

I made a repository for running OpenMCL, aka ccl, inside a container.  I set this up so the Lisp process expects to be managed using slime/swank, so it exposes the port where swank listens for clients to connect.  When you run it you publish that port, i.e. “-p 1234:4005” in the example below.

Docker shines at making it easy to try things like this.  Fire it up: “docker run --name=my_ccl -i -d -p 1234:4005 bhyde/crate-of-ccl”.  Docker will spontaneously fetch everything you need.  Then you M-x slime-connect to :1234 and you are all set.  Well, almost; the hard part is …

I have run this in two ways: on my Mac, and on Digital Ocean.  On the Mac you need to have a virtual machine running Linux that will hold your containers; the usual way to do that is the boot2docker package.  On Digital Ocean you can either run a Linux droplet and then install Docker, or you can use the application image which bundles that for you.

I ran into lots of challenges getting access to the exported port.  In the end I settled on using good old ssh LocalForward statements in my ~/.ssh/config to bring the exported port back to my workstation.  Something like “LocalForward 91234 172.17.42.1:1234”, where that IP address is that of an interface (docker0, for example) on the machine where the container is running.  Lots of other things looked like they would work, but didn’t.
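
A hedged sketch of the relevant ~/.ssh/config stanza; the host name and local port are examples:

Host droplet.example.com
  User root
  LocalForward 91234 172.17.42.1:1234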

Docker consists of a client and a server (i.e. daemon).  Both are implemented in the same executable.  The client chats with the server using HTTP (approximately).  This usually happens over a Unix socket, but you can ask the daemon to listen on a TCP port, and if you LocalForward that back to your workstation you can manage everything from there.  This is nice since you can avoid cluttering your container-hosting machine with source files.  I have bash functions like this one, “dfc () { docker -H tcp://localhost:2376 $@ ; }”, which provides a shorthand for chatting with the docker daemon on my Digital Ocean machine.
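
Spelled out, with the arguments quoted (which the inline version skips), that function and a sample use look like this; the port is whatever you forwarded back:

dfc () { docker -H tcp://localhost:2376 "$@" ; }
dfc ps        # list containers on the remote daemon
dfc images    # list the images cached there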

OpenMCL/ccl doesn’t really like to be run as a server.  People work around that by running it under something like screen (or tmux, detachtty, etc.).  Docker bundles this functionality; that’s what the -i switch (for interactive) requests in that docker run command.  Having done that you can then use “docker logs my_ccl” or “docker attach my_ccl” to dump the output or open a connection to the Lisp process’ REPL.  You exit a docker attach session using Control-C.  That can be difficult if you are inside of an Emacs comint session, in which case M-x comint-kill-subjob is sometimes helpful.
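
In other words, the day-to-day loop looks roughly like this (container name as above):

docker logs my_ccl      # dump the Lisp process' output so far
docker attach my_ccl    # connect to its REPL; exit the attach with Control-C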

For reasons beyond my ken, doing “echo ‘(print :hi)’ | docker attach my_ccl” gets slightly different results on Digital Ocean vs. boot2docker.  Still, you can use that to do assorted simple things.  UIOP is included in the image along with Quicklisp, so you can do uiop:run-program calls … for example to apt-get, etc.

Of course, if you really want to do apt-get, install a bundle of Lisp code, etc., you ought to create a new image built on this one.  That kind of layering is another place where Docker shines.
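
A hedged sketch of that layering, not part of the original repository; the added package is just an example:

cat > Dockerfile <<'EOF'
FROM bhyde/crate-of-ccl
RUN apt-get update && apt-get install -y graphviz
EOF
docker build -t my-ccl-plus .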

So far I haven’t puzzled out how to run one-liners.  Something like “docker run --rm bhyde/crate-of-ccl ccl -e ‘(print :hi)'” doesn’t work out as I’d expect.  It appears that argument pass-through, argument quoting, and the plumbing of standard IO et al. are full of personality I haven’t comprehended.  Or maybe there are bugs.

That’s frustrating; it undermines my desire to do sterile testing.

 

Docker is interesting

Somebody mentioned Docker during a recent phone interview, so I went off to have a look.  It’s interesting.

We all love sandboxing.  Sandboxing is the idea that you could run your computations inside of a box.  The box would then protect us from whatever vile thing the computation might do.  Vice versa, it might protect the computation from whatever attacks the outside world might inflict upon it.  There are many ways to build a sandbox.  Operating systems devote lots of calories to this problem.  I recall a setting in an old Univac operating system that set a limit on how many pages a user could print on the line printer.  Caja tries to wrap a box around arbitrary JavaScript so it can’t snoop on the rest of the web page.  My favorite framework for thinking about this kind of thing is capabilities.  Probably because I was exposed to them back at CMU in the 1970s.

Docker is yet another scheme for running stuff in a sandbox.  They call these containers, like a standardized shipping container.  I wonder if they actually took the time to read “The Box: …“, since it’s an amazing book.

Docker is also the usual hybrid open-source/commercial/vc-funded kind of thing.  Of course it has an online hub/repository.  Sort of like the package managers do; but in this case run by the firm.  Sort of like github.  The business model is interesting, but that’s – maybe – for another post.

Docker stands on a huge amount of work done over the last decades by operating system folks on the sandboxing problem.  It’s really, really hard to retrofit sandboxing into an existing operating system, and the retrofitted thing is likely to have a lot of rough edges.  So, on the one hand, Docker is a system to reduce the rough edges, letting mere mortals play with sandboxes, finally.  But it is also trying to build a single unified API across the diversity of operating systems.  In theory that would let me make a container which I can then run “everywhere.”  “Run everywhere” is perennial, eh?

Most people describe Docker as an alternative to virtual hosting (EC2, VMware, etc.).  And that’s true.  But it’s also an alternative to package managers (yum, apt, homebrew, etc.).  For example, say I want to try out “Tiny Tiny RSS,” which is a web app for reading RSS feeds.  I “just” do this:

docker run --name=my_db -d nornagon/postgres
docker run -d --link my_db:db -p 80:80 clue/ttrss
open http://localhost/

Those three lines create two containers, one for the database and one for the RSS reader.  The 2nd line links the database into the RSS container, and exposes the RSS reader’s HTTP service on the localhost.  The database and RSS reader containers are filled in with images that are downloaded from the central repository and cached.  Disposing of these applications is simple.

That all works on a sufficiently modern Linux, since that’s where the sandboxing support Docker depends on is found.  If you are on Windows or the Mac then you can install a virtual machine and run inside of that.  The installers will set everything up for you.