One project I’ve been doing while unemployeed is to review how the Common Lisp landscape has changed since I was last paying attention. And the big change is the emergance of a rapidly growing pool of libraries (900+) all distributed via Quicklisp.
My current favorite is a library called Optima. Optima adds constructs for doing rich pattern matching. It can match lists, arrays, objects, structs, strings; and if you load fare-quasiquote you can write patterns using backquote.
> (ql:quickload '("optima" "fare-quasiquote" "parse-js" "parenscript")) ... > (defpackage #:js2ps (:use #:cl #:optima #:parenscript #:parse-js)) > (in-package #:js2ps) > (pprint (parse-js "a += f(x,y)"))) (:toplevel ((:stat (:assign :+ (:sub (:name "a") (:num 3)) (:call (:name "f") ((:name "x") (:name "y")))))))
Then we can write a recursive function using Optima to translate that into parenscript like so:
(defun js2ps (txt) (labels ((r (x) (match x (`(:toplevel ,stms) `(progn ,@(mapcar #'r stms))) (`(:stat ,s) `(progn ,(r s))) (`(:assign :+ ,left ,right) `(incf ,(r left) ,(r right))) (`(:call ,func ,args) `(,(r func) ,@(mapcar #'r args))) (`(:name ,name) (js2ps-name name)) (`(:num ,n) n) (`(:sub ,array ,index) `(aref ,(r array) ,(r index))) (eck (error "TBD ~S" eck))))) (r (parse-js txt))))
Let’s try it, and then using
js2ps> (js2ps "a += f(x,y)") (progn (progn (incf (aref a 3) (f x y)))) js2ps> (ps* *) "a += f(x, y);"
You have already noticed that the patterns look just like the code for building the result. Pretty eh?
A complete translator is easy. The fiddly bits arise around handling variable bindings and around what in Parenscript are called @ and chain. There code here that does much of that.
I’ve also used Optima is to scrap web pages. It’s trivial to load and parse an page into an s-expression, just quickload “drakma” and “closure-html”, and then do (parse (http-request url) (closure-html:make-lhtml-builder)). Then let’s say you wanted to know the prices that macbook airs have sold for on ebay.
(defun snarf-prices (page-sxpr) ;; for example: (snarf-prices (fetch-price-page "macbook air 13" 3)) (labels ((recure (x) (match x (`(:div ((:class "g-b bidsold") (:itemprop "price")) ,(ppcre "\\$([\\d.]+)" price)) (push (parse-number price) result)) (`(,(satisfies keywordp) _ ,@children) (map nil #'recure children))))) (recure page-sxpr) (nreverse result))))
The second pattern match
`(,(satisfies keywordp) _ ,@children) uses two new tricks. The
statisfies shows how we can put arbitrary predicates into your patterns, and the
_ is a wildcard.
The first pattern shows how to match strings to regular expressions. The pattern (ppcre “\\w*\\$([\\d.]+)” price) would match the string
” $647.10″ and binds price to the string “647.10”.
The code for an earlier version of this page scraping example can be seen here.
One problem with Optima is that it’s a bit addictive. Once you start using it you stop using lots of other techniques. For example I hardly ever use cl-ppcre directly anymore. For example if I want to pluck the host out of some URLs I’ll write:
(match url ((ppcre "//([-_.\\w]+)/" host) host)). It also lets me write code in the style of awk or perl, i.e. with a set of pattern matches.
And I should have mentioned, it expands into pretty good raw code. I say pretty good rather than great because it doesn’t currently factor out common subexpressions.
So try it out, you’ll like it.
Great stuff! I started using Optima recently, too, after having been exposed to pattern matching in other languages (Shen, Haskell). One of my favorite things about Common Lisp is how it often (not always) is capable of absorbing great ideas from other languages in a usable way.
Shameless plug. CL-UNIFICATION can match more. Classes and defstructs for example; CL-PPCRE integration and lambda-list matching of multidimensional arrays.
As I mention, but don’t illustrate, it’s possible to match objects (and structs) in Optima.
(match x ((class-name slot-a (slot-b `(,first-born ,@other-kids)) ...) ...).
That said, CL-UNIFICATION looks interesting.
Great article… now I may have to consider using fare-quasiquote to my optima use.
One nit from above: the \\w* in a regex matches zero or more “word” characters. For whitespace, you want \\s* or \\W* for “non-word”, right?
Patrick – right you are.
(ppcre "\\w*\\$([\\d.]+)" price)is silly. I’ll fix it in a moment. It works only because the \\w* matches zero characters, so “\\$([\\d.]+)” would work just as well.
Hey, you should check out this old thread on the old META parser.
This looks similar to that.
Very cool. I tried to follow along with your js2ps example but hit a few snags:
– You need
:use #:parse-jsfor your first snippet to work.
– In js2ps you wrote
(`(:name ,name) (js2ps-name name))but you never introduced js2ps-name! Did you not mean
Meanwhile, thanks for catching the error in the defpackage. Hard to imagine how that slipped thru. I’ll fix it in a moment.
Pingback: tidy up the output of lisp macros | Ascription is an Anathema to any Enthusiasm