I just put the database chapter to bed, and found two interesting things while working on it. First was that there doesn't seem to be a generic module for paging through results. I ended up writing code to do this for the book (you know, displaying a subset of a table with "Viewing Results 26-50" and the appropriate back/forward buttons) and while it was web-specific, I kept thinking "it wouldn't be too hard to generalize this". Before I do so, has anyone else invented this particular wheel first?
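For what it's worth, the arithmetic behind a "Viewing Results 26-50" display can be sketched in a few lines. This is only an illustration, not code from the book or from any CPAN module; the function name `page_window` and its parameters are made up for the example.

```python
def page_window(total_rows, page, per_page=25):
    """Return (first, last, has_prev, has_next) for a 1-based page number.

    Hypothetical helper: the names and the clamping behaviour are
    assumptions made for this sketch.
    """
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    # Ceiling division; an empty result set still has one (empty) page.
    last_page = max(1, -(-total_rows // per_page))
    # Clamp out-of-range page requests instead of failing.
    page = min(max(1, page), last_page)
    offset = (page - 1) * per_page
    first = offset + 1 if total_rows else 0
    last = min(offset + per_page, total_rows)
    return first, last, page > 1, page < last_page


first, last, has_prev, has_next = page_window(total_rows=60, page=2)
print(f"Viewing Results {first}-{last}")
```

The offset computed here is the value you would hand to a LIMIT/OFFSET clause on databases that support one. The empty-table and past-the-end cases are the ones that usually bite.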
And I learned a cute trick for searching hierarchies. You know how to store a tree in a table, right: you give each node an id and store the parent id of the node as well:
CREATE TABLE node ( id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, parent INT, payload TEXT )
So now you can find the children of node 5 easily: SELECT * FROM node WHERE parent=5. But it's hard to select all the descendants of node 5. That requires a tree traversal, which involves lots of database queries, which gets ugly fast.
The cute trick is to build another table containing the path of each node (".1.5.12.19." means that this is node 19 whose parent is node 12 whose parent is node 5 whose parent is node 1). Then finding node 5 and all of its descendants is as simple as:
SELECT id,path FROM paths WHERE path LIKE "%.5.%"
Suuuper sneaky! I'm really beginning to appreciate how different it is to program in SQL...
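Here is a rough, self-contained sketch of the path-table trick, using Python's built-in sqlite3 module rather than MySQL. The sample tree, the `tree` dict and the `path_of` helper are invented for the illustration. Node 55 is there to show why the dots on both sides of each id matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, parent INT, payload TEXT);
    CREATE TABLE paths (id INTEGER PRIMARY KEY, path TEXT);
""")

# Hypothetical sample tree as {id: parent}: 1 -> 5 -> 12 -> 19, plus 1 -> 6 and 1 -> 55.
tree = {1: None, 5: 1, 6: 1, 55: 1, 12: 5, 19: 12}
conn.executemany("INSERT INTO node (id, parent) VALUES (?, ?)", tree.items())


def path_of(node_id):
    """Build the dotted path (e.g. '.1.5.12.19.') by walking up to the root."""
    parts = []
    while node_id is not None:
        parts.append(str(node_id))
        node_id = tree[node_id]
    return "." + ".".join(reversed(parts)) + "."


conn.executemany("INSERT INTO paths (id, path) VALUES (?, ?)",
                 [(n, path_of(n)) for n in tree])

# One query returns node 5 and everything beneath it.
subtree = [row[0] for row in conn.execute(
    "SELECT id FROM paths WHERE path LIKE ? ORDER BY id", ("%.5.%",))]
print(subtree)
```

Because every id is delimited on both sides, the pattern "%.5.%" does not pick up node 55. Note that a pattern with a leading "%" generally cannot use an ordinary index on the path column.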
--Nat
I think HTML::Pager fits the bill (at least somewhat). Can't say I've ever used it, though. Looks like it depends on HTML::Template for its output (although it claims it can still be used without it). HTH.
You probably want to investigate Data::Page and its cousin Data::Pageset.
I wrote a paging module before. I wouldn't wish the edge cases on anybody.
-Dom
Re:Data::Page
Dom2 on 2003-03-31T12:55:34
Oooh, I forgot to mention Class::DBI::Pager, which is wizzy if you're using Class::DBI.
-Dom
Re:Data::Page
gnat on 2003-04-01T00:01:09
Well bollocks. I could have sworn I googled without success for "page database results in perl". So much for being done with Chapter 14! Thanks,
--Nat
Re:Data::Page
brev on 2003-04-09T09:21:35
This is ancient history by now but Tim Bunce supposedly did a review of this, at OSCON in 1999 or 2000?
http://www.carumba.com/talk/perl/multiview.shtml
IMO paging through RDBMS results is always going to be ugly since SQL is a set-oriented language. You *have* to break the model to do that.
> The cute trick is to build another table containing the path of each node (".1.5.12.19." means that this is node 19 whose parent is node 12 whose parent is node 5 whose parent is node 1).
Beware that this is a premature optimization.
While this technique works fine with a static tree, it makes tree transformations hideously difficult. Imagine the pain that comes from moving node 12 to be a sibling of node 5, or removing node 5 and consequently promoting its children up a level.
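To make the cost concrete, here is a small sketch of the first transformation mentioned above, moving node 12 up to be a sibling of node 5. It uses Python's sqlite3 module; the `move_subtree` helper and the sample rows are assumptions made for the illustration. The point is that every row underneath the moved node has to be rewritten, not just the node itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE paths (id INTEGER PRIMARY KEY, path TEXT)")
conn.executemany("INSERT INTO paths VALUES (?, ?)", [
    (1, ".1."), (5, ".1.5."), (12, ".1.5.12."), (19, ".1.5.12.19."),
])


def move_subtree(conn, old_prefix, new_prefix):
    """Rewrite the stored path of a node and of every node beneath it.

    Hypothetical helper: swaps old_prefix for new_prefix at the start of
    each matching path and returns the number of rows touched.
    """
    cur = conn.execute(
        "UPDATE paths SET path = ? || substr(path, ?) WHERE path LIKE ?",
        (new_prefix, len(old_prefix) + 1, old_prefix + "%"),
    )
    return cur.rowcount


# Move node 12 (and its child 19) from under node 5 to directly under node 1.
changed = move_subtree(conn, ".1.5.12.", ".1.12.")
rows = dict(conn.execute("SELECT id, path FROM paths"))
print(changed, rows)
```

In a real schema the parent column in the node table would need the matching update too, ideally in the same transaction, so the two representations cannot drift apart.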
The other technique of using a single integer column to store parent_id is the most flexible, even if the SQL is slightly hairier.
Alternatively, you can select the entire subtree as a two-pass operation. First, (recursively) find all of the descendants of node 5. Next, build a UNION query of all parent nodes that are part of the subtree rooted at node 5. If you build your union query properly, the nodes also come back in order.
Re:Cute Tricks and Premature Optimizations
Dom2 on 2003-03-31T15:57:55
Heh, if you wanted to be really ugly, you could keep the parent id in the same table, but have a trigger to rebuild the secondary table with the complete path-like listings. It'd be heavy on the database, but might be affordable depending upon how many updates you're doing.
-Dom
Re:Cute Tricks and Premature Optimizations
gnat on 2003-04-01T00:12:43
It seems a bit premature to call this premature:-) As with every problem, the right data structure depends on your data and how it's accessed. Just as an alphabetized flat file is quick to binary search but slow to insert, whereas an unordered flat file is quick to insert (append) but slow to search, you choose your solution based on what you know about the data. If you were going to be doing a lot of hoists or reparenting operations, then I guess you'd have to test it in the field. This doesn't change the cuteness of the hack, though. How do you recursively find all the children of node 5 without doing a ton of SELECTs? That's the problem that a path table gets around.
--Nat
Re:Cute Tricks and Premature Optimizations
ziggy on 2003-04-01T00:50:20
> How do you recursively find all the children of node 5 without doing a ton of SELECTs? That's the problem that a path table gets around.
I personally don't think that the "ton of SELECTs" is all that horrible. That's probably my personal style though.
It starts out with something like this:
SELECT id FROM nodes WHERE parent IN (5)
...which returns a list of IDs. That list of IDs then goes into a new SQL statement. The IDs are also pushed into a list that contains all IDs returned from your per-level queries:
SELECT id FROM nodes WHERE parent IN (6, 7, 8, 9)
That process is very amenable to a simple while loop:
my @ids = ();
my @new_ids = (5); ## starting condition
while (@new_ids) {
    push(@ids, @new_ids);
    my $set = join(", ", @new_ids);
    ## fetch the next level down: match on parent, as in the queries above
    @new_ids = do_query("SELECT.... WHERE parent IN ($set)");
}
## @ids contains node 5 plus all nodes in the subtree rooted at node 5
At the end of that, you've executed N queries to find all nodes that are 1..N levels beneath your root. From there, it's only one more query to obtain all nodes. You could even rewrite the queries to return all the children at each level over N queries instead of doing the N+1 query. That N+1st query can help to automatically order the nodes though.
With this property, I don't know that the work to "compile" the tree structure into a text field is all that much of a benefit. But that's just my gut feeling, and of course I could be wrong about this.
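For anyone who wants to try the level-by-level loop without a database server, here is a rough equivalent using Python's sqlite3 module. The sample rows and the `subtree_ids` name are made up for the sketch. It uses bound placeholders instead of interpolating the ids into the SQL string, and it assumes the parent links contain no cycles (a cycle would make the loop run forever).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent INT)")
# Hypothetical sample tree: 5 has children 6 and 7, 6 has child 8, 8 has child 9.
conn.executemany("INSERT INTO nodes VALUES (?, ?)", [
    (1, None), (5, 1), (6, 5), (7, 5), (8, 6), (9, 8), (20, 1),
])


def subtree_ids(conn, root):
    """Collect root and all of its descendants, one query per tree level."""
    ids, frontier, queries = [], [root], 0
    while frontier:
        ids.extend(frontier)
        # One "?" per id in the current level, e.g. "?, ?" for two ids.
        marks = ", ".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT id FROM nodes WHERE parent IN ({marks}) ORDER BY id",
            frontier,
        )
        frontier = [row[0] for row in rows]
        queries += 1
    return ids, queries


ids, queries = subtree_ids(conn, 5)
print(ids, queries)
```

The query count is the depth of the subtree plus one, since the last query is the one that comes back empty and ends the loop.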
Re:Cute Tricks and Premature Optimizations
darobin on 2003-04-01T09:50:22
I've written code like the one you describe dozens of times and I must say that it certainly was "fast enough". I'd be worried if I had very deep trees though, as that would be when it could become costly.
But then why store a tree in an RDB? Wouldn't XPath be a *lot* more pleasant to access random things in a tree?
;)
Re:You'd be better off converting hierarchies to s
jnoble on 2003-03-31T20:24:55
Crap, my post got its < and > mangled, even though I told the submission widget it was plain text. Here's the decoder guide -- not actual SQL.
My Management Chain: Select where left LESS THAN my_left AND right GREATER THAN my_right
My Vast Empire: Select where left GREATER THAN OR EQUAL TO my_left AND right LESS THAN OR EQUAL TO my_right
Joel Noble
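The decoder guide above translates into real SQL fairly directly. The following is a sketch only, run through Python's sqlite3 module: the `staff` table, the sample numbers, and the column names `lft` and `rgt` are assumptions (LEFT and RIGHT are SQL keywords, so they make awkward column names).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT PRIMARY KEY, lft INT, rgt INT)")
# Hypothetical org chart: ceo > (vp > (dev1, dev2), ops), numbered by a
# depth-first walk that assigns lft on the way down and rgt on the way up.
conn.executemany("INSERT INTO staff VALUES (?, ?, ?)", [
    ("ceo", 1, 10), ("vp", 2, 7), ("dev1", 3, 4), ("dev2", 5, 6), ("ops", 8, 9),
])


def management_chain(conn, name):
    """Everyone whose interval strictly encloses mine, top boss first."""
    return [row[0] for row in conn.execute(
        "SELECT boss.name FROM staff AS me, staff AS boss "
        "WHERE me.name = ? AND boss.lft < me.lft AND boss.rgt > me.rgt "
        "ORDER BY boss.lft", (name,))]


def empire(conn, name):
    """Me and everyone whose interval falls inside mine."""
    return [row[0] for row in conn.execute(
        "SELECT sub.name FROM staff AS me, staff AS sub "
        "WHERE me.name = ? AND sub.lft >= me.lft AND sub.rgt <= me.rgt "
        "ORDER BY sub.lft", (name,))]


print(management_chain(conn, "dev2"))
print(empire(conn, "vp"))
```

Each lookup is a single query regardless of how deep the hierarchy goes.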
SELECT COUNT(*) FROM tree
or
SELECT MAX(id) FROM tree
and the entire subtree under a node can be selected by
SELECT * FROM tree WHERE id >= ? AND id < ?
where the values you plug in are the "first id in subtree" value cached in the parent node and the node id of the parent.
This scheme works wonderfully when you have static data, or data that you can update (and re-walk) on a scheduled basis, but fails miserably if the table needs to be updated live.
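The start of this comment did not survive, so the following is a guess at the scheme rather than a reconstruction of it. The range query only makes sense if every node has a higher id than all of its descendants, which is what you get by numbering the tree in post-order. A sketch under that assumption, using Python's sqlite3 module, with the `first_id` column name, the sample tree and both helpers invented for the illustration:

```python
import sqlite3

# Hypothetical tree as nested (payload, children) tuples.
TREE = ("root", [("a", [("a1", []), ("a2", [])]), ("b", [("b1", [])])])

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tree (id INTEGER PRIMARY KEY, first_id INT, payload TEXT)")


def number(node, next_id=1):
    """Post-order walk: number the children first, then the node itself.

    Caches the first id handed out inside the subtree on the node's row
    and returns the next unused id.
    """
    payload, children = node
    first_id = next_id
    for child in children:
        next_id = number(child, next_id)
    conn.execute("INSERT INTO tree VALUES (?, ?, ?)",
                 (next_id, first_id, payload))
    return next_id + 1


number(TREE)


def descendants(payload):
    """Select everything under a node with the single range query."""
    node_id, first_id = conn.execute(
        "SELECT id, first_id FROM tree WHERE payload = ?",
        (payload,)).fetchone()
    return [row[0] for row in conn.execute(
        "SELECT payload FROM tree WHERE id >= ? AND id < ? ORDER BY id",
        (first_id, node_id))]


print(descendants("a"))
print(descendants("root"))
```

Inserting a node in the middle of the tree shifts the ids that follow it, which fits the remark above about needing to re-walk the tree after updates.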
Re:Finding subtrees
gnat on 2003-04-01T01:42:40
You're the second person to recommend a Celko book. Hi ho, hi ho, it's off to Amazon I go... :-) Thanks!
--Nat
Nobody has mentioned Oracle's CONNECT BY. Hierarchical queries in one statement!
I doubt Ziggy is aware of this, but the tree hack you describe is roughly how ASPN works. Ugly and inflexible, yes, but it gave us the performance we wanted, without having to shell out for Oracle.
It may have been a premature optimization. I personally was horrified by the idea during the design meetings. But it works very well in practice and it's easy to understand.