I have been reading with sympathy Patrick Logan’s blog, wherein he is fighting the good fight for more dynamic languages against the righteous forces of less dynamic languages. It’s a fight that many in my generation fought, and lost. I’ve never buried the hatchet. It’s in the basement someplace. The outcome of the last round wasn’t in anybody’s best interests. It’s good to see that people are still spending rhetorical calories attempting to address the problem.
At ApacheCon I had a short but high bandwidth conversation about RDF with one of my friends. First we quickly toured the reasons we like RDF and then we had an equally fun tour of things that frustrate us about it. One of the frustrating things has been rattling around in my head ever since.
RDF is strangely type-less. This is good and bad. For example, let’s say you’re collecting metadata about URIs. You collect a lump of data about each URI: size, format, time of last update, etc. Good fun is had. Time passes and you discover another source of data about URIs; his data model is different. RDF is very helpful at this point. His lumps of data and your lumps of data can mix and match casually in a slurry of assertions about the URIs. All kinds of conflicts between the data models are acceptable. Both of you may have a bit of data about last update; neither of you has full knowledge about the last update time. You only have approximations, you only have partial data, and in some cases his last update time and yours are different. You used different time formats, and while you were very fastidious about describing the characteristics of your clock, he never thought about that, so you really don’t know whether 11:13AM in his data is +/- 30 seconds or +/- a day.
RDF allows your program to take both sets of information and pour them into your system and get back to work. It’s very very tolerant of the kind of models that appear in real world problem solving.
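Here is a minimal sketch of that pouring-together, in Python with plain tuples standing in for a real RDF store. Everything in it is made up for illustration: the URI, the `my:` and `his:` predicate names, the sample values, and the `merge` and `objects` helpers. It is not how any particular RDF library does it, just the shape of the idea.

```python
# Hypothetical data: each assertion is a (subject, predicate, object) triple.
# The URI and all predicate names below are invented for this sketch.
URI = "http://example.org/page"

mine = {
    (URI, "my:size", 10240),
    (URI, "my:format", "text/html"),
    (URI, "my:lastUpdate", "2005-12-10T11:13:00Z"),
    (URI, "my:clockErrorSeconds", 30),  # I was fastidious about my clock
}

his = {
    (URI, "his:modified", "11:13AM"),  # different format, no clock details
    (URI, "his:language", "en"),
}


def merge(*graphs):
    """Merging is just set union; the two models never have to agree."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def objects(graph, subject, predicate):
    """Every value asserted for subject/predicate: zero, one, or many."""
    return sorted(
        (o for s, p, o in graph if s == subject and p == predicate),
        key=str,
    )


graph = merge(mine, his)
print(objects(graph, URI, "my:lastUpdate"))
print(objects(graph, URI, "his:modified"))
print(objects(graph, URI, "his:clockErrorSeconds"))  # he never recorded this
```

Nothing complains when the two lumps are mixed. The price shows up later: the question about his clock error comes back empty, and whatever code asked it has to decide what an empty answer means.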
This is good, because it’s real.
But it drives people crazy! The flexible nature of these models makes them very subtle to reason about. Many of the exceptions in the data model propagate into exceptions in the code that chews on that data. The programmer, and even his clients, need to learn a set of practices that on the one hand leverage the power of a model that is so deeply informed by the nature of real world problems and on the other hand are respectful of the soft and porous nature of the ground he’s building his system on.
One of the unspoken threads that runs thru the dynamic/static typing debates has always been the way that the two camps are suspicious that the other camp is insane. The dynamic typing crowd suspects that the static typing crowd is delusional about the real nature of the data they are working on; real data just isn’t particularly strongly typed. The static typing crowd thinks that the dynamic typing crowd is trying to build their house on a foundation of marshmallows.