Graphical Programming and yEd

Graphical programming languages are like red sports cars.  They have lots of curb appeal, but they are rarely safe and reliable.

I long worked for a company whose product featured a very rich graphical programming language. It allowed an extremely effective sales process.  The salesman would visit the customer, who would sketch a picture of his problem on the whiteboard, and the salesman would enquire about how bad things would get if the problem didn’t get solved.

Meanwhile in the corner the sales engineer would copy the drawing into his notebook.  That night he would create an app in our product whose front page looked as much like that drawing as possible.  It didn’t really matter if it did anything, but it usually did a little simulation and some icons would animate and some charts would scroll.  The customers would be very excited by these little demos.

I consider those last two paragraphs a delightful bit of sardonic humor.  But such products do sell well.   Customers like how pretty they look.  Sales likes them.  Engineering gets to have mixed feelings.  The maintenance contracts can be lucrative.  That helps with business model volatility.  So yeah, there is plenty of value in graphical programming.

So one of the lightning talks at ILC 2014 caught my attention.  The speaker, Paul Tarvydas, mentioned in passing that he had a little hack based on a free drawing application called yEd.  That evening I wrote a similar little hack.

Using yEd you can make illustrations, like this one showing the software release process for most startups.

My few lines of code will extract the topology from the drawing, at which point you can build whatever strikes your fancy: code, ontologies, data structures.  (Have I mentioned how much fun it is to use Optima to digest a glob of XML?  Why yes I have.)

I was also provoked by Fare Rideau’s talk.  Fare is evangelizing the idea that we ought to start using Lisp for scripting.   He has a package, cl-launch, intended to support this.  Here’s an example script.   Let’s dump the edges in that drawing:

bash-3.2$ ./topology.sh abc.graphml
Alpha -> Beta
Beta -> Cancel
Beta -> Beta
Beta -> Beta
bash-3.2$

I’ve noticed, dear Reader, that you are very observant.  It’s one of the things I admire about you.  So you are wondering: “Yeah Ben, you found too many edges!”   Well, I warned you that these sports cars are rarely safe.  Didn’t I?
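For readers without a Lisp at hand, the edge dump above can be roughly approximated in shell.  This is only a sketch, not the Lisp/Optima script described in this post: the function name graphml_edges is made up for illustration, it assumes each edge start tag sits on one line with source= written before target=, and it prints yEd's internal node ids (n0, n1, ...) rather than the labels shown in the drawing.

```shell
#!/usr/bin/env bash
# graphml_edges (hypothetical helper): read GraphML on stdin and print
# one "source -> target" line per <edge> element.
# Assumption: each <edge ...> start tag is on a single line and lists
# source= before target=.
graphml_edges() {
    # Pull out just the edge start tags, then keep the two attributes.
    grep -o '<edge [^>]*>' |
        sed -E 's/.*source="([^"]*)".*target="([^"]*)".*/\1 -> \2/'
}

# Usage:
#   graphml_edges < abc.graphml
```

Like the original, it reports every edge element in the file, so stray or duplicated edges in the drawing show up here too.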

Precariat watch

This long profile of a reasonably successful member of the precariat in the Times is worth reading, if you are curious about this trend.

“If you did the calculations, many of these people would be earning less than minimum wage,” says Dean Baker, an economist who is the co-director of the Center for Economic and Policy Research in Washington. “You are getting people to self-exploit in ways we have regulations in place to prevent.”

“These are not jobs, jobs that have any future, jobs that have the possibility of upgrading; this is contingent, arbitrary work,” says Stanley Aronowitz, director of the Center for the Study of Culture, Technology and Work at the Graduate Center of the City University of New York. “It might as well be called wage slavery in which all the cards are held, mediated by technology, by the employer, whether it is the intermediary company or the customer.”

fake bots and standards

I read this morning about bots that pretend to be Google.   I’m surprised to realise that I’m unaware of any standard scheme for a bot (or other HTTP client) to assert its identity in a secure way.  This seems like a kind of authentication, i.e. some sites would prefer to know they are being crawled by the authentic bot vs. an imposter.

There is a list of the standard authentication schemes.  But none of them handle this use case.

This doesn’t look too difficult.  You need a way for agents to sign their requests.  So, you make another auth scheme.  Authentication headers using this scheme would include a few fields.  A URL to denote who is signing, and presumably the document associated with that URL has the public key for the agent.  A field that allows the server to infer exactly what the agent signed.  That would need to include enough stuff to frustrate various replay attempts (the requested URL and the time might be sufficient).
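To make that concrete, here is a minimal sketch of the signing and checking steps using the openssl command line tool.  Nothing here is a standard: the function names, the choice of RSA with SHA-256, and the "METHOD URL TIMESTAMP" string being signed are all assumptions made for illustration.  Fetching the agent's public key from the URL it names is left out.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: sign and verify "METHOD URL TIMESTAMP" with openssl.

# sign_request KEYFILE METHOD URL TIMESTAMP
# Prints a base64 signature the agent would send along with its request.
sign_request() {
    local key="$1" method="$2" url="$3" when="$4"
    printf '%s %s %s' "$method" "$url" "$when" |
        openssl dgst -sha256 -sign "$key" | openssl base64 -A
}

# verify_request PUBKEY METHOD URL TIMESTAMP SIGNATURE
# Exit status 0 if the signature matches what the server saw requested.
verify_request() {
    local pub="$1" method="$2" url="$3" when="$4" sig="$5"
    local sigfile
    sigfile="$(mktemp)" || return 1
    # openssl wants the raw signature in a file.
    printf '%s' "$sig" | openssl base64 -d -A > "$sigfile"
    printf '%s %s %s' "$method" "$url" "$when" |
        openssl dgst -sha256 -verify "$pub" -signature "$sigfile" > /dev/null 2>&1
    local status=$?
    rm -f "$sigfile"
    return "$status"
}
```

A server using something like this would also want to reject timestamps that are too old, otherwise a captured signature for a given URL could be replayed forever.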

More interesting, at least to me, is why we do not appear to have such a standard already.  We can play the cost benefit game.   There are at least four players in that calculation.

The standards gauntlet is a PIA.  For an individual this would be a pretty small feather in one’s cap.  And a long haul.  So the cost/benefit for the technologist is weak.  And this isn’t just a matter of technology; you also have to convince the other players to play the game.

What about the sites the bot is visiting?  They pay a tax for serving the bad bots.  The size of that tax is the benefit they might capture after running the gauntlet.  But meanwhile they have alternatives. They aren’t very good alternatives, but they are probably sufficient.  For example, they can whitelist the IP address ranges they think the good bots are using.  That’s tedious, but it’s in their comfort zone vs. the standards gauntlet.

What about the operators of the big “high quality” bots?  It might be argued that the fraudulent bots are doing them some sort of reputation damage, but I find that hard to believe.  A slightly disconcerting thought is that they might pay some people in their standards office to run the gauntlet because this would create a little barrier to entry for other spidering businesses.

The fourth constituency that might care is the internet architecture crowd.  I wonder if they are actually somewhat opposed to, or at least ambivalent about, this kind of authentication, since it has the smell of an attempt to undermine anonymity.

ssh-keyscan and waiting for servers to come online

Here’s a little trick.  Ssh-keyscan is useful for asking ssh daemons for their host keys.  People use it to provision their known_hosts file.

You can also use it to poll an ssh daemon, which is very useful when waiting for a new server to boot up, including cloud servers.  That might look like this:

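# LOG is assumed to be a logging helper defined elsewhere in the script.
LOG waiting for sshd and get hostkey
while echo ".." ; do
    # An empty result means sshd is not answering yet.
    HOSTKEY="$( ssh-keyscan -T 20 "$IP" 2> /dev/null )"
    if [[ -n "$HOSTKEY" ]] ; then
        echo "$HOSTKEY"
        break
    fi
done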

LOG Add $NAME to known_hosts
cp ~/.ssh/known_hosts ~/.ssh/known_hosts.old
sed "/^$IP /d" ~/.ssh/known_hosts.old > ~/.ssh/known_hosts
echo "$HOSTKEY" >> ~/.ssh/known_hosts

That code is too optimistic: it assumes that the server will eventually start, and loops forever if it never does.

And also: there are scenarios where ssh’s timeout parameters don’t work right.   So you can hang, in spite of that -T timeout.  Fixing that requires getting fresher versions of sshd.
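One way to take the optimism out of the loop is to cap the number of attempts.  The following is a sketch only: wait_until and probe_sshd are names made up here, and the retry count and delay in the usage comment are arbitrary.  It does nothing about the case where ssh-keyscan itself hangs.

```shell
#!/usr/bin/env bash
# wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND until it succeeds, giving up after MAX_TRIES attempts.
wait_until() {
    local max_tries="$1" delay="$2" try=1
    shift 2
    while [ "$try" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        try=$((try + 1))
        sleep "$delay"
    done
    return 1
}

# probe_sshd HOST   (hypothetical helper)
# Succeeds once ssh-keyscan returns a host key, leaving it in HOSTKEY.
probe_sshd() {
    HOSTKEY="$(ssh-keyscan -T 20 "$1" 2> /dev/null)"
    [ -n "$HOSTKEY" ]
}

# Usage:
#   if wait_until 30 10 probe_sshd "$IP"; then
#       echo "$HOSTKEY"
#   else
#       echo "gave up waiting for sshd on $IP" >&2
#   fi
```

Because wait_until runs the command in the current shell rather than a subshell, the HOSTKEY set by probe_sshd is still available afterwards for the known_hosts step.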

auth-source: getting my secrets out of my emacs init file

I do not reveal my emacs init file publicly because it has secrets in it: passwords (particularly for various APIs) and decryption keys.

But, the other day I discovered auth-source.   I used it in this example to launch my IRC setup:

(defun start-erc ()
  "Wrapper for ERC that gets the password via auth-source."
  (interactive)
  (let* ((server (erc-compute-server))
         (port (erc-compute-port))
         (credentials (auth-source-search :host server 
                                          :port (format "%s" port)
                                          :max 1)))
    (cond
     (credentials
      (erc :password (funcall (plist-get (car credentials) :secret))))
     (t
      (message "auth-source-search failed to find necessary credentials for irc server")))))

Auth-source-search will find my credentials in ~/.authinfo.gpg.  A line there looks like this: “machine irc.example.org port 12345 login luser password aPasWurd“.

Curious about how hard it would be to fold that directly into M-x erc, I read enough code to discover that it calls through to a function which does in fact call auth-source-search; so you can revise my function like so:

(defun start-erc ()
  "Start erc computing all the default connection details, which might get the password via auth-source."
  (interactive)
  (let ((password? nil))
    (erc-open (erc-compute-server)
              (erc-compute-port)
              (erc-compute-nick)
              (erc-compute-full-name)
              t  ;; connect
              password?)))

I'm delighted.  But it looks like this facility isn't used as much as I'd expect.

I found it because the helm-delicious package advised using it for my delicious password.

I was making good progress getting all my secrets out of the init file by having a function that loads all the secrets on demand from an encrypted elisp file (load "secrets.el.gpg"). That works nicely too.

Maybe I should go read up on the secret storage scheme of the free desktop.

Docker #5 – benchmark? not really…

Given a Docker image you can spin up a container in lots of places.  For example on my Mac under Boot2Docker, at Orchard, or on Digital Ocean.   I don’t have any bare metal at hand, so these all involve the slight tax of a virtual machine.

I ran the same experiment on these three.  The experiment launches 18 containers, serially.  The jobs varied, but none of them are very large.

180 seconds Boot2Docker
189 seconds Orchard
149 seconds Digital Ocean

These numbers are almost certainly meaningless!  I don’t even know what might be slowing things down: CPU, I/O, Swap, etc.

Interestingly if I launch all 18 containers in parallel I get similar results +/-10%.  The numbers vary only a few percent if I run these experiments repeatedly.  I warmed up the machines a bit by running a few jobs first.

Yeah.  Adding Google App Engine and EC2 would be interesting.

Orchard charges $10/month vs. Digital Ocean’s $5, but their billing granularity is better.  You purchase 10 minutes, and then a minute at a time, vs. Digital Ocean, which bills an hour at a time.  Orchard is a little more convenient to use than Digital Ocean.  A bit-o-scripting could fix that.

I’m using this for batch jobs.  Hence I have an itch: a batch queue manager for container runs.  That would presumably ensure that machines are spun up and down to balance cost and throughput.

Queue of Containers

Docker #4: Accessing a remote docker daemon using socat&ssh

Well!  This unix tool is pretty amazing.  Socat lets you connect two things together, where the two things are pretty much anything that might behave like a stream.  There is a nice overview article written in 2009 over here.  You can do crazy things like make a device on machine A available on machine B.  Running this command on B will bring machine A’s /dev/urandom over to B:

socat \
  PIPE:/tmp/machine_a_urandom  \
  SYSTEM:"ssh machine_a socat - /dev/urandom"

What brought this up, you ask?

I have been spinning up machines to run Docker containers on, at Digital Ocean, for short periods of time to run batch jobs.  Docker’s daemon listens on a unix socket, /var/run/docker.sock, for your instructions.  I develop on my Mac, so I need to transmit my instructions to the VM at Digital Ocean.  Let’s call him mr-doh.

One option is to reconfigure mr-doh’s Docker daemon to listen on a localhost TCP port.   Having done that you can have ssh forward that back to your Mac, and then you set/export the DOCKER_HOST environment variable and you’re good to go.

The problem with that is it adds to the work of spinning up mr-doh, which, if you’re only going to have him running for a short period of time, adds to the tedium.

Well, as we can see in the /dev/urandom example above you can use socat to forward things. That might look like this:

socat \
    "UNIX-LISTEN:/tmp/mr-doh-docker.sock,reuseaddr,fork" \
    "EXEC:'ssh -kTax root@mr-doh.example.com socat STDIO UNIX-CONNECT\:/var/run/docker.sock'" &

Which will fork a socat to manage /tmp/mr-doh-docker.sock.  We can then teach the docker client on the Mac to use that by doing:

export DOCKER_HOST=unix:///tmp/mr-doh-docker.sock

When the client uses it socat will fire up ssh and connect to the docker daemon's socket on mr-doh.  Of course for this to work you'll want to have your ssh key installed in root@mr-doh's authorized_keys etc.

For your enjoyment, here is a somewhat raw script which will arrange to bring the /var/run/docker.sock from mr-doh back home. get-docker-socket-from-remote prints the export command you’ll need.

It’s cool that docker supports TLS.  You have to set up and manage keys etc.  So, that’s another approach.

Docker, part 3

Docker containers don’t support multicast, at least not easily.   I find that a bummer.

It’s unclear why not.  Well, the most immediate reason is that the networking interfaces they create for the containers don’t have the necessary flag to enable multicast.  That, at least according to the issue, is because that’s how Linux defaulted these interfaces.    Why did they do that?

This means that any number of P2P or (masterless) solutions don’t work.  For example zeroconf/mdns is out.  I guess this explains the handful of custom service discovery tools.  Reinventing the wheel.

In other news… Once you have Boot2Docker set up you need to tell the docker command where the docker daemon is listening for instructions. You do that with the -H switch to docker, or via the DOCKER_HOST environment variable.   Typically you’d do:

export DOCKER_HOST=tcp://192.168.59.103:2375

But if you’re feeling fastidious you might want to ask boot2docker for the IP and port.

export "DOCKER_HOST=tcp://$(boot2docker ip 2> /dev/null):$(boot2docker info | sed 's/^.*DockerPort.:\([0-9]*\).*$/\1/')"

establish-routing-to-boot2docker-container-network

Boot2Docker lets you run Docker Containers on your Mac by using VirtualBox to create a stripped down Linux Box (call that DH) where the Docker daemon can run.   DH and your Mac have a networking interface on a software defined network (named vboxnet) created by Virtual Box.  The containers and DH have networking interfaces on a software defined network created by the Docker daemon.  Call it SDN-D, since they didn’t name it.

The authors of boot2docker did not set things up so that your Mac can connect directly to the containers on SDN-D.  Presumably they didn’t think it wise to adjust the Mac routing tables.  But you can.  This is very convenient.  It lets you avoid most of the elegant, but tedious, --publish or --publish-all (aka -p, -P) switches when running a container.  They hand-craft special plumbing for ports when running with containers.  It’s also nice because DH is very stripped down, making it painful to work on.

So, I give you this little shell script: establish-routing-to-boot2docker-container-network.   It adds routing on the Mac to SDN-D via DH on vboxnet.  This is risky if SDN-D happens to overlap a network that the Mac is already routing to, and the script does not guard against that.  See below for how to deal if you have that problem.
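If you wanted to add a guard against that overlap, the core of it is a check of whether two CIDR blocks share addresses.  Here is a sketch of that check; it is not part of the script above, the function names are made up, and collecting the networks the Mac already routes to is left out.

```shell
#!/usr/bin/env bash
# ip_to_int A.B.C.D   (hypothetical helper)
# Prints the dotted quad as a single integer.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# cidrs_overlap NET1/LEN1 NET2/LEN2   (hypothetical helper)
# Exit status 0 if the two blocks share any address.
cidrs_overlap() {
    local ip1="${1%/*}" len1="${1#*/}" ip2="${2%/*}" len2="${2#*/}"
    # Two CIDR blocks overlap exactly when they agree on the shorter prefix.
    local len="$len1"
    if [ "$len2" -lt "$len" ]; then
        len="$len2"
    fi
    local mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
    [ $(( $(ip_to_int "$ip1") & mask )) -eq $(( $(ip_to_int "$ip2") & mask )) ]
}

# Usage:
#   if cidrs_overlap 172.17.0.0/16 "$existing_network"; then
#       echo "SDN-D overlaps $existing_network, not adding a route" >&2
#   fi
```

The 172.17.0.0/16 in the usage comment is only an example value for SDN-D; use whatever range your Docker daemon actually picked.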

If your containers have ssh listeners then you can put this in your ~/.ssh/config to avoid the PIA around host keys.  But notice how it hardwires the numbers for SDN-D.

Host 172.17.0.*
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null
  User root

The numbers of SDN-D are bound when the Docker daemon launches on DH.  The --bip switch, used when the docker daemon launches, can adjust that.  You set it in /var/lib/boot2docker/profile on DH via EXTRA_ARGS.    Do that if you have the overlap problem mentioned above.  I do it because I want SDN-D to be small.  That lets nmap scan it quickly.

If you’ve not used ~/.ssh/config before, well you should!  But in that case you may find it useful to know that ssh uses the first setting it finds, so that Host block should appear before your global defaults.