Some categories really do exist. For example, there really is something called an elephant. You can decide if X is an elephant. The elephant category exists because there is some continuity in the genetic thread that runs through members of the species.

Of course, even the edges of a category like that are rough. In the very, very long term the tree of evolution makes them rough. In the very short term the life experience of each individual gives it a unique personality.

I think a lot of people presume that a similar continuity runs through other systems. They project what they know about categories in species onto populations where no force as powerful as the genetic thread exists. When this doesn’t work they draw analogies to the roughness that arises in nature. But other systems that appear to have continuity are forever undergoing Frankenstein-like hybridization. It’s all much more confusing and fuzzy than the world of species.

Lots of systems don’t have any useful sharp boundaries. For example, there just isn’t a useful sharp boundary between a blog, a newspaper, and a firm’s updates to its product catalog.

One of the curious aspects of the power-law discussion is how the elite in a network creates the impression of a category. For example, are the top 5000 firms on the planet a good sample for informing your model of what a firm is? Would a sample of the bottom 5000 or the middle 5000 firms be better? Clearly a study of the top 5000 isn’t going to inform, in any useful way, your model of the bottom or middle 5000. The word “firm” is, well, a mess.

This problem infects every discussion of a population with a power-law distribution. The population appears to exist as a category, but as you attempt to describe it there is a tendency to sample one sub-population and project what you learn from that onto the whole class. For example, if you study the A-list blogs you learn almost nothing about what is going on in what you might name the micro-blogging community around teenagers. You study open source and discover that most projects at SourceForge are tiny; do you declare them to be irrelevant?
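
To make the sampling trap concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the population is synthetic, "size" is drawn from a Pareto distribution with an arbitrarily chosen shape of 1.2, and the function names are made up. It is not data about real firms, blogs, or projects. It simply compares the average member of the top, middle, and bottom 5000 of the same population.

```python
import random
import statistics


def sample_population(n=100_000, alpha=1.2, seed=42):
    """Draw n synthetic 'sizes' from a Pareto distribution, largest first.

    alpha is an assumed shape parameter, not a measured one.
    """
    rng = random.Random(seed)
    return sorted((rng.paretovariate(alpha) for _ in range(n)), reverse=True)


def slice_means(sizes, k=5000):
    """Mean size of the top, middle, and bottom k members."""
    mid = (len(sizes) - k) // 2
    return {
        "top": statistics.fmean(sizes[:k]),
        "middle": statistics.fmean(sizes[mid:mid + k]),
        "bottom": statistics.fmean(sizes[-k:]),
    }


sizes = sample_population()
means = slice_means(sizes)
for name, value in means.items():
    print(f"{name:>6} 5000: mean size {value:,.2f}")
```

Under these assumptions the three averages come out far apart, which is the point: whatever you learn from the top slice is a poor description of the other two, even though all three carry the same label.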

It’s an interesting puzzle: words, it would appear, just break down when you attempt to think about these populations.

  David

    Even the category of elephant is vague, a shorthand, a piece of fiction. It seems well-defined just because none of the edge cases have hove into view.

    Nature doesn’t know from elephants. She doesn’t even know from *things*. All there is, is the Wave Function. All else is the work of Man.

