Diagnostic Typing

One of the dialectics in computer science is between dynamic and static typing. Dialectics are like professional wrestling. Cheap fun. But, they leads to category blindness. So let me blather a bit about a “third way” that I call diagnostic typing.

At one point in my career I spent a few years deeply committed to the static typing camp. The height of that experience was an amazing day when we successfully linked, for the first time, a huge complex program. It consisted of hundreds of thousands of parts written by numerous authors and code generators. It worked! First time! Getting to that point had required tremendous labor, since the static type checking demanded by the language we were using had forced us to fix lots of stuff that might have been left for latter. At that moment it seemed worth while that we had deferred so much immediate gratification for so long.

Late in the project we had some really amazing bugs. Bugs that took weeks and teams to fix. Fun bugs with long interesting stories to tell about each of them.

During that later period I found my self writing what I came to call diagnostic typing code. This code would work to prove a complex declaration about the nature of a data structure. These declarations were put forward by the team members. For example somebody might say “All the records of type A are in the dictionary D.” and then somebody else would say “Ah, I thought the core set of those aren’t in the dictionary.” At that point I’d go off and write some code to to check if these declarations were true. It was fun because the truth was almost always much more complex than anybody thought. The bugs were all around the edges of these.

So the dialectic between dynamic and static typing is actually a kind of layered thing; with at least with three layers. Static, Dynamic, and Diagnostic. Static type checking is done at compile time. Dynamic type checking and dispatching is done at runtime. Diagnostic type checking is done intermittently; usually in response to a demand or a fear. It was extremely valuable to become explicit about some of the declarations that had been implicit.

Diagnostic type checking can be very very expensive. That makes it a lot of fun! It lets you can write all kinds of assertions about your program that would never be practical to check or enforce at the compile or runtime. You can get out the big guns: graph theory, statistics, coverage, grammars, budgets, spelling correctors. For example: all the window components form a strongly connected component via their child/parent arcs. For example: the elements of this hash table are uniformly distributed. (As an aside I don’t think I have every found a hash table that was well behaved in the wild.)

One of my favorite examples. This isn’t just for data structures you can do this on program traces too. Back in the 1970s somebody at CMU wrote a paper about using the ideas from language grammars to declare the patterns over calls on class instances. Things like: x:foo<-create(x); {open(x); {update(x)}+; close(x)}+; destroy(x)}*. In a later life I would sometimes write code to diagnostically check statements like that by using the tracing facilities in Common Lisp.

I wrote a lot of this diagnostic typing code for the persistent store using prolog. I would dump the entire persistent store into a suite of prolog assertions and then write the diagnostic typing declaration as small prolog programs who’s execution would prove or disprove these declarations. While that found a lot of very very subtle bugs I found it more fascinating how it raised raised the level of discussion about the work.

This kind of approach will, I suspect, become more common real soon now. So much of the data sloshing around on the net is full of surprises that diagnostic typing declarations would reveal. Moving the data across organizational boundaries creates a demand for tools that can frame the discussion between the parties.

One of the reasons I’m suffering a fit of enthusiasm for RDF is how it appears to offer a normal form, much as prolog assertions did for me in my previous experience, for just this kind of problem solving.

This trio: static/dynamic/diagnostic typing are all about shifting around the work, the trust, and the gratification. Don’t overlook the gratification. There is a lot of fun to be had in diagnostic typing approaches. I doubt you can write down all the declarations about the data before it starts flowing. Why defer the fun of flow?

3 thoughts on “Diagnostic Typing

  1. Ben Hyde

    I’ve never seen, but then I’ve never looked, for such a thing in the marketplace. I would be very surprised if there isn’t an industry of vendors that sell tools like this for helping firms to tame their relational databases. That said my experience is that most firms don’t want to know how tangled the real rules are for their data. The dude that brings the facts on the ground “Did you know that 12% of the mailing addresses have zip codes that make no sense given the state code?” to their attention is rarely welcomed with open arms. Fixing bugs makes the buyer happy, but they rarely want to know what you discovered down there undermining foundations. Maybe a business doing this could be framed as quality certification or something?

  2. Pingback: Tasty Languages | Ascription is an Anathema to any Enthusiasm

Leave a Reply

Your email address will not be published.