Data::Table

osfameron on 2005-01-27T16:23:20

This is a useful module but I'm getting frustrated with it. The API is sometimes counter-intuitive, and I'm annoyed that you can't pass in a custom sorting subroutine - it's character/number, ascending/descending and that's your lot. OK, I get now that there are reasons for that ($a and $b are magic to the package in which they were compiled, frustratingly) but surely custom sorting is something that you'd want from a table?
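To make the $a/$b point concrete, here is a minimal sketch. It assumes a made-up package, Table::Sketch, standing in for any module that runs sort on your behalf; it is not Data::Table's code. The comparator is compiled in main, so it reads $main::a and $main::b, while the sort inside the other package sets that package's own $a and $b.

```perl
use strict;
use warnings;

package Table::Sketch;    # hypothetical module that does the sorting for you

sub sort_with {
    my ($cmp, @list) = @_;
    # sort is compiled here, so it sets $Table::Sketch::a and $Table::Sketch::b
    my @sorted = sort $cmp @list;
    return @sorted;
}

package main;

my @seen;    # records whether the comparator ever saw a defined $a

# Compiled in package main, so $a and $b here mean $main::a and $main::b
my $cmp = sub {
    push @seen, defined($a) ? 1 : 0;
    no warnings 'uninitialized';    # both operands are undef in this demo
    return $a <=> $b;
};

my @result = Table::Sketch::sort_with($cmp, 10, 9, 2, 33);
print "comparator calls that saw a defined \$a: ",
    scalar(grep { $_ } @seen), "\n";
print "result: @result\n";
```

Since every comparison is undef against undef, the result should not be expected to come back in numeric order.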

I've occasionally wondered about the dual implementation (by columns / by rows) advertised by Data::Table.

Row-based implementation is better for sorting and pattern matching, while column-based one is better for adding/deleting/swapping columns.
Data::Table swaps between them internally so you don't have to worry about it. But out of curiosity I did some benchmarking, copying the rotate() routine from D::T, and I find that adding a column is anywhere from 4-10 times slower, and deleting a column 6-12 times slower, if you rotate the implementation than if you just take the naive approach of transforming every row in turn!

Maybe I've missed something (I haven't tested with an actual D::T object, just a stripped-down structure), but if that's true you'd need to do more than that number of additions/deletions in a row to make it worth rotating the structure. (Because once you've rotated, the other insertion/deletion operations are free.)
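For anyone wanting to repeat the comparison, a rough sketch of the kind of benchmark described might look like this. The rotate() here is written from scratch for illustration, not copied from Data::Table, and the table size (1000 rows by 10 columns) and sub names are assumptions. Note that this version rotates there and back for a single insertion, which may be more work than the original test measured.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Transpose an array-of-arrays: rows become columns and vice versa.
# (Illustrative only; not Data::Table's rotate().)
sub rotate {
    my ($data) = @_;
    my @out;
    for my $i (0 .. $#$data) {
        for my $j (0 .. $#{ $data->[$i] }) {
            $out[$j][$i] = $data->[$i][$j];
        }
    }
    return \@out;
}

# Naive approach: stay row-based and splice the new value into every row.
sub add_col_naive {
    my ($rows, $col, $pos) = @_;
    my @new = map { [ @$_ ] } @$rows;    # copy so the input is untouched
    splice @{ $new[$_] }, $pos, 0, $col->[$_] for 0 .. $#new;
    return \@new;
}

# Rotating approach: go column-based, insert one column, rotate back.
sub add_col_rotate {
    my ($rows, $col, $pos) = @_;
    my $cols = rotate($rows);
    splice @$cols, $pos, 0, [ @$col ];
    return rotate($cols);
}

# Made-up test data: 1000 rows of 10 numeric cells, plus one new column.
my $rows = [ map { my $r = $_; [ map { $r * 10 + $_ } 0 .. 9 ] } 0 .. 999 ];
my $col  = [ map { "x$_" } 0 .. 999 ];

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    naive  => sub { add_col_naive($rows, $col, 5) },
    rotate => sub { add_col_rotate($rows, $col, 5) },
});
```

The ratio will depend on table shape and Perl version, so treat any numbers it prints as indicative only.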


custom sorting

link on 2005-01-27T22:07:28

By custom sorting do you mean passing your own sorting method into the package method? Is this some restriction with the module itself or just a question of speed? perldoc -f sort says that if you give your sort sub a ($$) prototype, the elements are passed in @_ as for a normal sub. That's slower, but depending on your data size it might not be much of an issue for you.
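# main script
use strict;
use warnings;
use A;

# The ($$) prototype makes sort pass the two elements in @_
# instead of setting $a and $b.
sub strange($$)
{
        return -1 if $_[0] == 3;
        return 1 if $_[1] == 3;
        return $_[0] <=> $_[1];
}

A::try(\&strange);    # prints 3,1,2,4,5

# A.pm
package A;
use strict;
use warnings;

sub try
{
        my ( $method ) = @_;
        my @A = sort $method ( 1, 2, 3, 4, 5 );
        print join(",",@A) . "\n";
}
1;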

Re:custom sorting

osfameron on 2005-01-28T08:35:47

Thanks - I came up with using @_ arguments last night (though I initially did {$func->($a,$b)} until I read the perlfunc entry about prototyping ($$)).

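A benchmark of a simple sort on 1000 rows done over 10 million iterations shows a small (<10%) variation, bizarrely favouring the @_ version despite the extra level of indirection... I think that deciding you can't support custom sorting because the overhead of using @_ rather than $a,$b is too great is probably a premature optimization!

Of course once you start like this, and realize you can create sort libraries, you end up creating complex sorts (like: "sort first by Jan/Feb/Mar order in one column, THEN by numeric order in another") with function composition, which does start to add some overhead. (But it is the right thing to do in that it's readable, flexible, and reusable.)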

(oops) Re:custom sorting

osfameron on 2005-01-28T08:39:52

Hmm, can't seem to edit that comment. Once more, with preview!

Thanks - I came up with using @_ arguments last night (though I initially did {$func->($a,$b)} until I read the perlfunc entry about prototyping ($$)).

A benchmark of a simple sort on 1000 rows done over 10 million iterations shows a small (<10%) variation, bizarrely favouring the @_ version despite the extra level of indirection... I think that deciding you can't support custom sorting because the overhead of using @_ rather than $a,$b is too great is probably a premature optimization!
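A sketch of how such a comparison could be set up with the core Benchmark module follows. The data (1000 single-column rows of random integers) and the variable names are assumptions, not the original test, and the timings will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up data: 1000 rows, each an arrayref holding one random integer.
my @rows = map { [ int rand 1_000_000 ] } 1 .. 1000;

# Classic comparator using the package globals $a and $b.
my $by_globals = sub { $a->[0] <=> $b->[0] };

# Prototyped comparator: sort passes the two elements in @_.
my $by_args = sub ($$) { $_[0][0] <=> $_[1][0] };

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    globals => sub { my @s = sort $by_globals @rows },
    args    => sub { my @s = sort $by_args @rows },
    # The first attempt mentioned above: wrap the call in a sort block.
    wrapped => sub { my @s = sort { $by_args->($a, $b) } @rows },
});
```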

Of course once you start like this, and realize you can create sort libraries, you end up creating complex sorts (like: "sort first by Jan/Feb/Mar order in one column, THEN by numeric order in another") with function composition, which does start to add some overhead. (But it is the right thing to do in that it's readable, flexible, and reusable.)
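As a rough illustration of that kind of composition, here is one way it might look. The helper names (by_month, by_number, then_by) and the sample rows are invented for this sketch; nothing here is part of Data::Table.

```perl
use strict;
use warnings;

# Rank table for month names, so Jan < Feb < Mar ...
my %month_rank;
@month_rank{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 1 .. 12;

# Each builder returns a ($$)-prototyped comparator over row arrayrefs.
sub by_month {
    my ($col) = @_;
    return sub ($$) {
        $month_rank{ $_[0][$col] } <=> $month_rank{ $_[1][$col] };
    };
}

sub by_number {
    my ($col) = @_;
    return sub ($$) { $_[0][$col] <=> $_[1][$col] };
}

# Compose comparators: the first one that returns non-zero decides.
sub then_by {
    my @cmps = @_;
    return sub ($$) {
        for my $cmp (@cmps) {
            my $result = $cmp->($_[0], $_[1]);
            return $result if $result;
        }
        return 0;
    };
}

my @rows = ( ['Mar', 2], ['Jan', 10], ['Mar', 1], ['Feb', 5], ['Jan', 3] );

my $cmp    = then_by( by_month(0), by_number(1) );
my @sorted = sort $cmp @rows;

print join(" ", map { "$_->[0]/$_->[1]" } @sorted), "\n";
# prints: Jan/3 Jan/10 Feb/5 Mar/1 Mar/2
```

Each extra layer is another sub call per comparison, which is where the overhead mentioned above comes from.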