Friday, October 14, 2005

Paremetric Diversions Revisited

OK, I admit it. After writing about H.S. Lahman’s talk on invariants and parametric polymorphism, I wanted to see what the rest of the world thinks parametric polymorphism is (and isn’t).

The Wikipedia entry for polymorphism states,
“polymorphism is the idea of allowing the same definitions to be used with different types of data (specifically, different classes of objects), resulting in more general and abstract implementations.”

Hold that parenthetical thought—specifically different classes of objects—resulting in more general implementations.

Well in the billing example I defined a single generic algorithm that worked on a single class of data (a RateSpec). The clever encoding of the RateSpec threshold and rate tables allowed a single algorithm to cover a wide range of data values: some customers could only be charged a single base rate, others could have different thresholds. The RateSpec object provided a uniform view of this data from the algorithm’s perspective. I wasn’t technically using different classes to represent different encodings. I was being economical by only defining a single class that could encapsulate many different rate and threshold encodings. Is this really parametric polymorphism?

The Wikipedia entry goes on to say:
“Using parametric polymorphism, a function or datatype can be written generically so that it can deal equally well with objects of various types. For example, a function append that joins two lists can be constructed so that it does not depend on one particular type of list: it can append lists of integers, lists of real numbers, lists of strings, and so on… Parametric polymorphism was the first type of polymorphism developed, first identified by Christopher Strachey in 1967. It was also the first type of polymorphism to appear in an actual programming language, ML in 1976. It exists today in Standard ML, O'Caml, Haskell, and others. Some argue that templates should be considered an example of parametric polymorphism, though instead of actually reusing generic code they rely on macros to generate specific code (which can result in code bloat).”

There's a subtle distinction between modern implementation techniques that are labeled parametric polymorphism and what H.M. was getting at. Before hearing H.M.’s talk, I’d thought of parametric polymorphism as just being another word for parameterized types and/or template classes. That does seem to be a fairly common modern definition. But H.S. Lahman put a slight twist on things. His main point was that careful design of data along with a generic algorithm can accommodate wide variations without lots of classes, case statements or conditional logic. In fact, the key is to design a single, uniform view of seemingly disparate data. In a complex system (the Shreveport LA water rates aren’t complicated enough), different RateSpec implementations would likely be needed. And I suspect that there isn't one algorithm to calculate rates. But in my simple example, they weren't. So while I technically might not have been using parametric polymorphism, I did achieve uniformity by encapsulating what varies in a single RateSpec class whose instances would be composed of different rate and threshold table attributes. And that what makes this design simple and flexible.