Date::Tiny 0.01 - Expanding my ::Tiny empire (and evolution)

Alias on 2006-08-28T02:17:55

I've just uploaded Date::Tiny 0.01 to the CPAN. It implements a very small date object, in as little code as possible, for use in log file parsing and other light duties where you won't need to manipulate the date. If you do need to do anything serious with the date object, it can be automatically inflated into a full DateTime object as needed via a ->DateTime method.
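As a minimal sketch of the intended usage (the constructor fields below are my assumption for illustration; the only method promised above is the ->DateTime inflation):

    use Date::Tiny;

    # A tiny date value holding nothing but year/month/day (field names assumed)
    my $date = Date::Tiny->new(
        year  => 2006,
        month => 8,
        day   => 28,
    );

    # Inflate to a full DateTime object only when real date maths
    # is needed (assumes DateTime is installed)
    my $dt = $date->DateTime;
    print $dt->day_name, "\n";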

This is the fourth module in the ::Tiny series, and I plan to release the Time::Tiny and DateTime::Tiny companions to this latest module shortly.

When I wrote Config::Tiny, the first of the ::Tiny modules, it was something of a rebellion. I was simply annoyed at the size to which modules to do apparently simple tasks had grown.

The uptake and popularity of Config::Tiny has continued to astonish me.

Obviously there is something very attractive about a small, concise, zero-dependency module that is fast and takes up almost no memory, even if it is a bit hacky and not quite as "pure" as it could be. Config::Tiny is implemented as little more than a hash of hashes, blessed as a convenience, so it's not strictly OO.
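As a minimal sketch of what that looks like in use (the file name and keys below are made up for illustration):

    use Config::Tiny;

    # Read an .ini-style file; the result is essentially a
    # blessed hash of hashes, one inner hash per [section]
    my $config = Config::Tiny->read( 'myapp.conf' )
        or die Config::Tiny->errstr;

    # Values are plain hash lookups
    my $host = $config->{database}{host};
    my $port = $config->{database}{port};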

It's worth noting I get more positive feedback, both emails and in person, for Config::Tiny than I do for anything else I've written. It even ended up in Perl Best Practices as the Config module of choice for basic config file functionality.

Obviously the concept has struck a chord with the general userbase.

After realising the data structures were almost identical (largely just a HoH), I cloned it and made CSS::Tiny. And although it's needed less often than a module for .ini config files, I get the odd email about CSS::Tiny as well. Certainly more than I would expect for a less-used area like CSS, compared to other modules.

Surprisingly, early and speculative concepts from the world of non-biological evolution may back up my tactic of shrinking down larger modules in this way. Scientists in this area are trying to answer one very large question.

"What are the general characteristics of evolution, regardless of form, and how do we measure them?"

The holy grail for this area would be one equation to express how "evolved" something (anything) is. The same equation should be able to demonstrate why a human is more evolved than bacteria, why a pre-nova star is more evolved than a new star, and how evolved a human is relative to a star.

One of the early versions of this equation (provided as an example for demonstration purposes, but by no means considered rigorous) would be something like this.

e = i / s / m^3

e - Evolutionary level
i - Unit of effective information processing
s - second
m - metre

That is, where the general evolutionary level of something is based on the amount of information that is processed by that thing per second, per cubic metre.

Since information processing in all forms requires an expenditure of energy, some of which is invariably lost in the form of heat, this also means (loosely of course) that you can use per-volume heat output as a metric to compare two entities for their evolutionary state.

For example, by observing that the human brain generates more heat per cubic centimetre than the Sun (it really does, there's just a lot MORE of the Sun) we can state that the human brain is more evolved than the Sun. Humans are also more evolved than a rock or a stapler.
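As a rough back-of-envelope check of that claim (the figures here are approximate and my own, purely for illustration):

    # Approximate heat output per unit volume.
    # Sun: ~3.8e26 W radiated from a volume of ~1.4e27 m^3.
    # Brain: ~20 W dissipated in ~0.0014 m^3.
    my $sun_per_m3   = 3.8e26 / 1.4e27;   # roughly 0.3 W/m^3
    my $brain_per_m3 = 20     / 0.0014;   # roughly 14,000 W/m^3

    printf "Sun:   %.2f W/m^3\n", $sun_per_m3;
    printf "Brain: %.0f W/m^3\n", $brain_per_m3;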

Whether we are more evolved than a computer is trickier, due to the different natures of the information processing done by our brain and a computer. Things are also complicated by volume constraints. Our brain can't get much bigger without running into heat issues, and chips have similar issues of heat management.

But based on its higher capacity for heat dispersal, the physical medium of the computer chip certainly looks dangerous as a candidate to ultimately out-evolve biological life in general, if the processing efficiency can be raised far enough. As long as our brains continue to process information more efficiently per cubic metre, however, our future should be safe.

Now I'm doing an awful lot of hand-waving here, but like I said, it's early days for the field of non-biological evolution.

But if we take this concept to software, it allows us to make some interesting predictions about the long term evolution of software.

Software has gotten something of a free ride due to the ever-increasing power of hardware, and there has been very little short term pressure to evolve along the long-term direction of efficiency.

But looking at our e = i / s / m^3 equation, we can fit software into it quite nicely. Assuming equivalent hardware, our per-second becomes per-clock-tick computational efficiency (speed) and our volume becomes memory size.
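Restated informally for the software case (my own rewording of the mapping above, not a formula from anywhere else):

    e = useful operations / clock tick / byte of resident memory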

Thus for any software problem we could well observe that the long-term trend is towards software (in compiled and running form) that is fast and uses very little memory. Which is exactly what the mandate of ::Tiny is (although it adds the caveat that it only implements a subset of functionality).

Doesn't look good for Perl though, with its large memory usage.

But of course, these evolutionary trends typically take place over very, very long time scales, so this has little relevance to us in the here and now, except perhaps at the level of decades.

The second part of the trend noted above is also interesting, however: one of the indications of more evolved software is its heat output per volume. And we can actually translate this as well.

If we assume the software is programmed into an FPGA, then heat output per volume can be taken (at least in part) as the execution density of the code.

That is, how often the average command within a program is run per second, the higher the better. So the first of the two equivalent code blocks below is more evolved, because each command fires more often on average than in the second.

    foreach ( 0 .. 10 ) {
        print "Hello World!\n";
    }

    foreach ( 0 .. 10 ) {
        print "Hello ";
        print "World!\n";
    }

Looking at the sort of code we write on a daily basis, this strongly encourages the use of CPAN modules. This is particularly the case where you can get two completely different pieces of code using the same utility module, since those commands will fire twice as often as they would if written separately, increasing your average execution density.

For a large module or framework, where you are less likely to use it twice in two different places, the trend is greatly reduced.

If I were to go out on a very long limb, I'd ask if this might help explain why utility and often-reused components like URI trend towards a single dominant module, while frameworks tend not to survive as long and eventually get replaced.

It's food for thought at least...


I'll think about it while eating lunch

slanning on 2006-08-28T10:21:37

Okay, you're trying to explain the success of modules on CPAN; that is, why do certain modules become more popular.

What isn't clear to me is: since hardware is increasingly powerful, why is the size in memory of a module important? It's not as if different "species" of a module are competing for limited memory in a literal sense. Something like DBI is quite large, but it's undeniably more successful than database-specific modules were (you're not going to use an XS wrapper around libpq, even if it was tiny, unless you have a very limited environment like maybe an embedded device).

I don't doubt that there is some way to describe the popularity of modules on CPAN in terms of "natural selection", but I can't see what the relevant "resources" being competed for are. Maybe the memory size isn't what's important, but rather the simplicity/elegance of the module's API (for example). Or maybe people are already "adapted" (exapted?) to other ways of programming, and creating modules with similar APIs would cause them to be more popular.

(I'm trying to figure out how to relate what you're saying to the "island rule" that's recently been in the news. When animal species become trapped on islands with limited resources, large animals tend to become dwarfed (dwarf mammoths) and small animals become huge (rats). But what would be the "island" for modules, if any?)

Re:I'll think about it while eating lunch

Alias on 2006-08-29T05:21:41

The reason we've gotten away with using a lot of memory for so long is exactly because hardware is getting more powerful. In times of plenty, waste or resource usage is not a selection factor.

But the time frames I'm talking about here are quite long. I'm looking at evolution of module usage over 3-10 year periods.

Memory does eventually become important, if only for a subset of people. (Think mobile phones)

I have one monstrous private application that uses 80-90 meg of RAM just to load, before doing any work or allocating any memory for data. Until very recently that's been a real problem, and the startup time is still longer than I'd like.

Every time I can cut 3 meg off that number (which is on average what a ::Tiny module saves) it's a good thing.
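For what it's worth, one rough way to check figures like that (my own sketch, assuming a system with ps; RSS reporting varies by platform) is to compare the resident size of a bare perl against one with the module loaded:

    # Resident memory (in KB) of a bare perl, then with DateTime loaded;
    # the difference is a rough estimate of the module's load cost.
    perl -e 'print qx(ps -o rss= -p $$)'
    perl -MDateTime -e 'print qx(ps -o rss= -p $$)'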

Wetware issues (APIs, familiarity, etc) are absolutely important. But they tend to be important in the short term (months or years).

Over longer periods, if someone eventually creates something equally usable, and smaller and faster, it will gradually win.

I think the island rule is more complex, since there are different factors at play.

For creatures without natural predators, such as the dwarf elephants, resource limits would seem to be the more obvious direct cause.

For rats, perhaps it was something different, such as a lack of the predators that were formerly killing off the larger and juicier rats.

With regards to size, the place I see the most striking example is in the C libs. For particularly complex tasks that are strongly encapsulatable, we see a tendency towards small, tight, fast C library integration. The fact that this means the memory gets shared with other non-Perl applications adds to the effect.

Look at examples such as libxml, libgif, libsyck and so on.

Because the smallest and fastest way to implement these strongly encapsulated functions is in C (if done right; note expat), you might suggest that over time there will be selective pressure towards the use of sufficiently robust C libs.

For the DBI case, I would suggest a different evolutionary factor would be more dominant here, that being flexibility.

While the C libs are small and fast, they are resistant to change. Which is why you'll note the areas where they are used the most are for long-standardised functionality that isn't changing often (xml, image processing etc).

The example of expat is a good one, it may be that it was just implemented too early, while XML was still something of a moving target.

Because different databases come and go, it is far more important that DBI be flexible. You first have to achieve continuity to survive. Once the problem of continuity is solved, THEN the more long-term issues like compactness start to become an issue.

Part of what excites me about Haskell conceptually (I still haven't had time to do more than skim the surface) is that it implements the fast/small concepts rigorously and with mathematical precision.

I'm somewhat curious how big Haskell code is. If it's fairly small and memory efficient, then we may well be looking at a long-term successor to C.

This is one of the other interesting things about Parrot and multiple languages.

If a module in one language compiles to Parrot smaller and faster than its equivalent in another, will precompiled versions of those modules become popular with users of the other languages? I can't wait to see :)

A metric for evolution

masak on 2006-09-01T17:33:36

The holy grail for this area would be one equation to express how "evolved" something (anything) is. The same equation should be able to demonstrate why a human is more evolved than bacteria, why a pre-nova star is more evolved than a new star, and how evolved a human is relative to a star.

I think you will have a hard time finding such an equation, because I don't think there is a sense in which a human is more evolved than bacteria.

I'm not just saying this to be cute. I'm a biology student, and my understanding of evolution is that no living creature of today can be said to be "more evolved" than another, in any real sense. We're all descended from the same ur-creature, and we've all had the same time to evolve and adapt. It's just that we've adapted to fill different niches and do different things.

Re:A metric for evolution

Alias on 2006-09-04T05:14:16

It may well be that if you can come up with an equation to compare the "evolvedness" (or some more sophisticated concept), a bacterium and a human are simply too close, and the difference (while existing) may just be too small to be noticeable.

But what if you tried to compare the "evolvedness" of a bacterium and a rock?

Or instead of a rock, how about a simple non-living self-replicating molecule?

If some metric can be found, then perhaps it can be applied to the difference between bacteria and humans. And maybe we find that humans are more evolved than bacteria, but only 1.28% more evolved.

Most of the problem here is in the definition... since evolution is a process, creating an "evolvedness" concept is hard. Most likely the metric will be something far less naive.

Re:A metric for evolution

masak on 2006-09-04T05:37:14

It may well be that if you can come up with an equation to compare the "evolvedness" (or some more sophisticated concept), a bacterium and a human are simply too close, and the difference (while existing) may just be too small to be noticeable.

Your reply to my "you won't find any, since it's not there" seems to be "maybe it will be too small to notice".

FWIW, I believe we won't find a metric or a scale along which a human would come out more evolved than bacteria; not because the differences will be too small along the relevant axes, but because we won't find any relevant axes.

The term "evolution" sometimes misleads people subtly into believing that progress is being made from species like bacteria towards species like Homo sapiens. There is no such progress, there is no "towards" in it. It's just fiddling with parameters, keeping in synch with an ever changing environment (or a few steps behind, as it were).

Re:A metric for evolution

Alias on 2006-09-20T00:33:59

I certainly agree that evolution is a process of adjustment, rather than a progression.

But if it can be established that there are consistent long-term trends in evolution, where we progress (on the large scale) from state A to state B, then perhaps that change can be expressed as a metric.

Let's not call that evolution; how about we call it futureification. :)