Category Archives: common-lisp

What’s new in Quicklisp

A new Quicklisp release is out.  “New projects: cl-arff-parser, cl-bayesnet, cl-libpuzzle, cl-one-time-passwords, cl-rrt, cl-secure-read, function-cache, gendl, sha3, trivial-raw-io, yaclanapht. … Updated projects…”

I spent a few minutes looking at the new ones, so here’s a bit more info…

cl-arff-parser – https://github.com/pieterw/cl-arff-parser#readme (BSD?)  A reader for ARFF (Attribute-Relation File Format) files: ASCII files that describe a list of instances sharing a set of attributes.

cl-bayesnet – https://github.com/lhope/cl-bayesnet#readme (LLGPL)
A tool for the compilation and probability calculation of discrete, probabilistic Bayesian networks.

cl-libpuzzle – https://github.com/pocket7878/cl-libpuzzle#readme (LLGPL)
A foreign function bridge to libpuzzle, a library for finding similar pictures.
see also: http://linux.die.net/man/3/libpuzzle

cl-one-time-passwords – https://github.com/bhyde/cl-one-time-passwords#readme (Apache 2)
An implementation of the HOTP and TOTP standards, as used in Google Authenticator and others for two-factor authentication.

cl-rrt – https://github.com/guicho271828/cl-rrt#readme (LLGPL)
a … multidimensional path-planning algorithm … use[d] in robotics … car driving …

cl-secure-read – https://github.com/mabragor/cl-secure-read#readme (GPLv3)
Based on the “Let Over Lambda” secure reader.

function-cache – https://github.com/AccelerationNet/function-cache#readme (BSD)
An expanded form of memoization.

gendl – https://github.com/genworks/gendl#readme (AGPL)
A big Generative Programming and Knowledge-Based Engineering framework, previously known as genworks-gdl.

sha3 – https://github.com/pmai/sha3#readme (MIT/X11)
An implementation of the Secure Hash Algorithm 3 (SHA-3), also known as Keccak.

trivial-raw-io – https://github.com/redline6561/trivial-raw-io#readme (BSD)
… export three simple symbols: with-raw-io, read-char, and read-line

yaclanapht – https://github.com/mabragor/anaphora#readme (GPL3)
An improvement/fork of Nikodemus Siivola’s ANAPHORA, with a license change.

My Common Lisp Buildpack for Heroku

Meanwhile, I reworked an existing buildpack for my needs.  It’s very easy to try.  Assuming you’ve signed up for Heroku and followed their quickstart instructions, you just:

curl https://gist.github.com/bhyde/5383182/raw/gistfile1.txt | bash

That will create a directory on your machine with the example application’s sources; over on Heroku it will build and launch that application; and finally it will open your web browser on the application’s home page.


This is all free.

This application is written in Common Lisp.  There are lots of nice open-source Common Lisp compilers; in this case it’s using Clozure Common Lisp.  The sources of the app amount to 16 lines of code.  Another few lines implement the hook used by the buildpack to compile the application.  Tiny applications like this are made possible through the excellent build and library support in the modern Lisp community, so most of the meat is defining how to use them (i.e. the system definition).

It’s interesting that none of the above installs Common Lisp on your own machine, nor does it check out the buildpack I built.  In total it adds ~200K to your local machine, most of which is the git repository.  The actual sources are about 14K, of which the image is 13K.

To undo the above you need only delete the directory it creates and destroy the application on Heroku: heroku apps:destroy <name>.

Heroku Buildpacks

[Chart: stars vs. forks for Heroku buildpacks]

Heroku is a cloud computing platform, i.e. a place where you can run your applications.  When you author an application on Heroku it is split into two parts.  One part, the buildpack, is responsible for building your application; while the other part is the actual application’s sources.

I found this interesting.  Modularity always serves to separate concerns.  What concerns, and whose concerns, are the natural next questions.  Buildpacks make it easier to get started with Heroku; so they address one of Heroku’s concerns – i.e. “How can we get people to use our product?” – by lowering the barrier to entry.

There are many buildpacks.  Heroku provides a half dozen, and other people have built a few hundred more.  Each buildpack makes it easier to get an app based on some programming framework started.  For example there are ones for: Ruby, node.js, Erlang, WordPress, Common Lisp, Go, etc.

Of course how exactly any given application gets built tends to quickly become unique.  I suspect that most serious app developers customize their buildpacks.  Heroku makes extensive use of git to move things around, so naturally a lot of the buildpacks are on GitHub.  I got to thinking it would be interesting to look at some statistics about them.

The chart above has one dot for each of N buildpacks (in some cases multiple buildpacks have landed on the same dot).  Notice that the chart is log-log.  The vertical axis indicates how many +1 stars the buildpack has received – a proxy for how popular it is.  The horizontal axis shows how often the buildpack has been forked.

In one’s fantasy the perfect buildpack for a given programming platform would fulfill all the needs of its app developers.  In that case it would never need to be forked.  But three things work against that fantasy.  First off, the buildpacks aren’t perfect; in fact they tend to be simplistic, because their short-term goal is to make it trivial to get started.  Secondly – I think – the building of applications tends to sport all kinds of unique special cases.  And finally the usual third reason: it’s hard work to push patches around, so even if you improve a given buildpack the chances your enhancements flow back to the master buildpack are – well – iffy.

Anyhow, I wanted to share the picture. (oh, and zero was mapped to .1)

Coder’s Asceticism

In poetry class I learned that performing inside a straitjacket can, surprisingly, work out pretty well.  Drawing with a peculiar pen, working in an unusual medium or venue … all these can work out surprisingly well.  You can go too far, though.  Knowing that an abundance of choice does not make us happy is not an argument for eliminating all choice.

“Moderation in all things” is how Patrick Stein sums up a post on trying on a few programming straitjackets.  No function over five lines.  No package (we are talking Common Lisp here) spread over multiple files.  Unusual coding conventions for your package namespaces.  He is having fun.

Seems to me that a rule on the maximum size of a function is silly.  Surely the question is what the right distribution of function sizes is – I doubt it’s normally distributed.  I’d think a good rule might be that a function’s size should signal something about its complexity.  Isn’t function complexity mostly independent of program modularity?  I.e., modularity alone can cause functions to fragment.  It is a common fetish: “small is beautiful.”  And that helps to explain why I’ve never heard people advocating against small functions.  I’ve often encountered code that seems scattered into a thousand tiny pieces; it becomes incomprehensible.  That is certainly the wrong thing to do, except when it isn’t.

Can’t Think

I’m struck by how visceral my reaction to this is:

For my part, I don’t see any reason why the Lisp community should constrain itself to having exactly one defsystem facility.

It misses the heart of the problem so completely that it unintentionally makes it worse.

cl-ppcre as a tokenizer

CL-PPCRE is a Lisp clone of the Perl regular expression subsystem (I gather it’s even faster).  I love it because lots of things I used to do in Perl or AWK I can now do in Lisp.  Each language has its coding tricks, and there are a lot for Perl’s regular expressions.  The Perl regex manpage is 30 pages, and the FAQ is 20 pages.  That’s a lot of clever, and double or triple that once you recall how concise Perl is!  I often use all this to build lexical analyzers, using regular expressions to nibble tokens off a string.  (One variation on that is outlined in the Perl FAQ.)

Both CL-PPCRE and Perl have two notations for their regular expressions: the concise one you’ve certainly used, and a much more verbose one.  For example you might write "alpha|beta" to match either of the two strings in the concise form, while in CL-PPCRE’s long form you’d write (:alternation "alpha" "beta").  In Lisp you might write something like (setf (parse-tree-synonym :tokens) '(:sequence :start-anchor (:alternation :begin :end :if …))) and then nibble off tokens by doing (scan :tokens program-text :start start).

The only problem with that is you’ve lost track of which token was recognized.  My trick for that is to define the individual tokens so that when they are matched they set a variable in the dynamic extent established for tokenizing.  Along these lines: (setf (parse-tree-synonym :if) '(:sequence "if" (:filter set-token-kind-to-if))).  The :filter form is used to do call-outs in the midst of the pattern matching.  It is the analog of Perl’s (?{ code }) construct.

That of course all gets wrapped up in macros so it reads nicely, e.g. (deftoken :constant ("\\d+") (parse-integer (token-text))).
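Spelled out, the trick might look something like this sketch – note-kind, next-token, and the tiny token set are my illustrative names, not from any library; it assumes CL-PPCRE’s (:filter …) call-out construct:

```lisp
(ql:quickload :cl-ppcre)

(defvar *token-kind* nil
  "Set as a side effect when a token pattern matches.")

(defun note-kind (kind)
  "Return a zero-width filter function that records KIND."
  (lambda (pos) (setf *token-kind* kind) pos))

;; Each token is its literal text followed by a zero-width :filter call-out.
(setf (cl-ppcre:parse-tree-synonym :if)
      `(:sequence "if" (:filter ,(note-kind :if) 0)))
(setf (cl-ppcre:parse-tree-synonym :then)
      `(:sequence "then" (:filter ,(note-kind :then) 0)))

(setf (cl-ppcre:parse-tree-synonym :tokens)
      '(:alternation :if :then))

(defun next-token (text start)
  "Nibble one token off TEXT at START; return its kind and end position."
  (let ((*token-kind* nil))
    (multiple-value-bind (s e) (cl-ppcre:scan :tokens text :start start)
      (if (eql s start)
          (values *token-kind* e)
          (values nil nil)))))
```

A real deftoken macro would just generate the parse-tree-synonym and filter boilerplate shown here.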

You can juice this up by having parse-tree synonyms that keep track of the line number in the file you’re parsing.

If you get the details right this can be very efficient, both in time to code up and at runtime.  When combined with CL-YACC you can whip up a parser for yet another language in a few hours.  Yesterday I built one for XDR.  Here is a simplified example.

Waters’ series, using producing

Posting for the memory book.  Appendix A in the Common Lisp manual is a package Richard Waters created, known as Series.  A series is a kind of hybrid between sequences (think lists) and streams.  They are cool because the resulting code is damn fast, usually compiling down to raw loops.  In any case I had more trouble writing the following than I expected.  I kept writing setf rather than setq, which creates some bizarre errors – errors printing results, actually.  It would also be good if I’d actually read the doc for producing, rather than just skimming it: for any given FOO or BAR the (next-in FOO ...) and (next-out BAR ...) forms should appear once and only once.  You get extra points if they appear at the start or tail of the tagbody, respectively.

In this example we want to inhale two series, nibbling off one or the other as appropriate.  Series are a pain to use when doing this kind of mingling, mostly because they don’t have a way to seek forward more rapidly than pulling items off the inputs one at a time.  What also makes them painful to use at this level is that you have to write at a level that is enjoyable only because it invokes a nostalgia for assembly language.  No doubt if one were writing a lot of this kind of code you’d make a micro language that compiles into this pseudo assembler.  The series sources do something along those lines, making them wonderfully hard to read.

I’m really not fluent in using this package, so critiques from more expert users would be welcome.


(defun union-integer-series (s1 s2)
  "Given two ascending series of integers return a series of those integers
which appear in both."
  (declare (optimizable-series-function)
           (series s1 s2))
  (producing (items) ((g1 s1) (g2 s2) i1 (i1-ok nil) i2 (i2-ok nil))
    (loop
      (tagbody
       --TOP--
         (if i1-ok (go --I1-OK--))
         (setq i1 (next-in g1 (terminate-producing)))
         (setq i1-ok t)
       --I1-OK--
         (if i2-ok (go --I2-OK--))
         (setq i2 (next-in g2 (terminate-producing)))
         (setq i2-ok t)
       --I2-OK--
         (unless (= i1 i2)
           (cond ((< i1 i2) (setq i1-ok nil))
                 (t (setq i2-ok nil)))
           (go --TOP--))
         (setq i1-ok nil)
         (setq i2-ok nil)
         (next-out items i1)))))

By example:

> (setf *print-length* 7)
7
> (scan-range :from 5)
#Z(5 6 7 8 9 10 11 ...)
> (scan-range :upto 9)
#Z(0 1 2 3 4 5 6 ...)
> (union-integer-series * **)
#Z(5 6 7 8 9)
> 

Conditions in Common Lisp

Dan Weinreb wrote a long blog post recently outlining what Common Lisp’s “Conditions (exceptions) are really about.”  He argues, and I won’t disagree, that they provide a clean, even elegant, way to handle all the long tail of situations that arise when you’re trying to be fastidious about the contract for your functions.  One of the things that used to be unique about Lisp was its elegant support for unwinding the stack when odd things forced you to abandon whatever you were doing.  It was a lot easier in Lisp than in Fortran or C to see to it that the printer handle was relinquished, the file closed, the gate shut.  These days most languages are better about admitting that these things happen and that something should be in the language to help.

Interestingly, though, the Common Lisp condition system remains a much more elegant thing than what you find in other languages.  I don’t really understand why.  The condition system doesn’t just provide a way to unwind the stack, passing along an object instance to describe what ‘exceptional’ condition arose.  It also allows the handler for these conditions to resolve them.  In effect this provides an elegant way to implement callbacks, i.e. that situation where a low-level body of code needs to ask the higher, or later-written, code how to resolve a problem.  For example you can use the condition system to implement the classic callback scenario where a GUI widget wants to provide its users with the ability to do additional checking on user input.
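That callback pattern can be sketched with standard conditions and restarts – a minimal sketch, with invalid-input, read-field, and friends being my illustrative names:

```lisp
;; Low-level code signals a condition; higher-level code resolves it
;; via a restart, without the stack unwinding past the signaler.
(define-condition invalid-input (error)
  ((text :initarg :text :reader invalid-input-text)))

(defun read-field (text)
  "Low-level validation: complain upward, offering a way to supply a value."
  (if (every #'digit-char-p text)
      (parse-integer text)
      (restart-case (error 'invalid-input :text text)
        (use-value (v) v))))

(defun read-field-with-default (text default)
  "Higher-level policy: resolve the condition by invoking the restart."
  (handler-bind ((invalid-input
                   (lambda (c)
                     (declare (ignore c))
                     (invoke-restart 'use-value default))))
    (read-field text)))

;; (read-field-with-default "42" 0)  => 42
;; (read-field-with-default "abc" 0) => 0
```

The point is that the handler runs while read-field is still on the stack, so it can choose a resolution rather than merely catch a failure.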

There’s more, but what I really wanted to get off my chest (and that’s what a blog’s for, eh?) is that I sometimes like to muse about how raising a condition is yet another example of how complex function calling really is.  Back in the day we used to be able to figure out at link time what function would implement a call site.  Later we got used to having dynamic linking do some of that, and it was common in memory-constrained systems to load and flush code on the fly, under the covers, during the function call.  I’ve seen systems where you switched interpreters in the function call for performance and space trade-off reasons.  These days people are used to having a complex method dispatch that puzzles out what method will take care of the request.  The method dispatch is a messy beast when you allow new classes to be loaded.  It was with some pleasure that I first read the Mythical Man-Month’s description of how some ornate linker came to be obsoleted, since at the time I was struggling with how loading compiled Lisp files was an amazingly disruptive event; it upset all your method tables.

Generally method dispatch is done by searching upstream in the directed acyclic graph of types.  I.e. the behavior of the function is inherited from the class hierarchy.  But the class hierarchy is not the only hierarchy in most programs.  For example there is almost always a containment hierarchy: the OK button is inside this dialog; or this file write is taking place inside the task of writing the log, inside the task of handling this web request.

At one point in my life I was quite enthusiastic about the idea of language designs that might allow these other hierarchies more respect.  During that period I built a few class systems where behavior was inherited first from the containment hierarchy and only secondarily from the class hierarchy.

What I like about conditions is how, when a condition is raised and the system goes off seeking a handler for that condition, it’s just another variant of method dispatch – and the call hierarchy is where we search for the handler.  For me the fact that these calls often don’t return, or return in such unusual ways, is part of what’s delightful about the condition system.

Echonest

Wrote a small Common Lisp library for accessing the echonest music analysis API.  All the other APIs are jealous.
Writing this was fun because I got to use Relax NG for the first time in actual code. By inspecting the XML that comes back from echonest I guessed at what the schema was and wrote it up in the compact form of RNG (see here). Isn’t that easy to read!

Inserting cxml-rng, the library for validating your XML against a Relax NG schema, into the SAX pipeline took less than a half dozen tokens (see here).  My reward is that the handler that loads the XML into a useful data structure doesn’t need to be as paranoid.  How relaxing!