God bless Matt Sergeant! (and "XML::XPath")
I wanted to get at the list of RSS URLs (URIs, URnID10T, whatever) from the NetNewsWire ".plist" file under Mac OS X. It's XML so I figured it'd be easy to parse.
Well, "sort of". Instead of being hierarchical or using attributes or any other sort of sane structure, Apple's "plist" files are designed to be very generic so they'll work for any application:
Note that the Property List format is designed to have a very generic schema suitable for any type of application, so it stores keys and their corresponding values as character data inside of elements all at the same level, rather than forcing (or, if you're a "glass-half-empty" type of person, "allowing") each application to define its own structure.
    autoRefresh  2
    flOpenURLsInBrowserInBackground
    Subscriptions
        home  http://www.perl.com/
        name  Perl.com News
        rss   http://www.perl.com/pace/perlnews.rdf

        home  http://www.slashdot.org/
        name  Slashdot
        rss   http://slashdot.org/slashdot.rss
Unfortunately, that makes extracting the value of a given key a bit trickier -- now you need to find, say:
    Slashdot http://slashdot.org/slashdot.rss
"The first '<string>' element following a '<key>' which contains the string 'rss', all within the first '<array>' after a '<key>' containing the word 'Subscriptions'."
I thought about setting up a long chain of SAX event handlers to track what keys have or have not been seen yet, maintaining the state of the previous key(s) as I went, but I figured all that code and complexity would make a program more susceptible to bugs.
Enter XML::XPath. By carefully crafting just the right XPath query, I was able to squeeze all that logic down into the holy grail of Perl scripts -- a one-liner:
perl -MXML::XPath -e 'foreach $node (XML::XPath->new(filename =>
"$ENV{HOME}/Library/Preferences/com.ranchero.NetNewsWire.plist")->find(q{//key[contains(string(),"Subscriptions")]/following-sibling::array[1]/dict/key[contains(string(),"rss")]/following-sibling::string[1]})->get_nodelist) { print $node->string_value, "\n" }'
Awesome! When I saw how well that worked, I must have done the Happy Dance for at least 15 minutes. *THAT'S* what programming is all about! :-)
Of course, There's More Than One Way To Do It, and I would be reluctant to deploy that one-liner in a production environment -- woe betide the poor programmer who had to maintain it after me! -- but as for neat hacks, this one reigns supreme!
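For whoever does inherit it, here is a rough sketch of how the same query might look spelled out as a small script. The XPath expression is the one from the one-liner, just split into one location step per line. Everything else is an assumption made for illustration: the inline sample plist is a cut-down stand-in shaped like the structure described above (not the real NetNewsWire file), and "rss_urls" is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::XPath;

# ASSUMPTION: a cut-down stand-in for the real preferences file, containing
# only the Subscriptions array (key/string pairs inside dicts).
my $plist = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>Subscriptions</key>
  <array>
    <dict>
      <key>home</key>
      <string>http://www.perl.com/</string>
      <key>name</key>
      <string>Perl.com News</string>
      <key>rss</key>
      <string>http://www.perl.com/pace/perlnews.rdf</string>
    </dict>
    <dict>
      <key>home</key>
      <string>http://www.slashdot.org/</string>
      <key>name</key>
      <string>Slashdot</string>
      <key>rss</key>
      <string>http://slashdot.org/slashdot.rss</string>
    </dict>
  </array>
</dict>
</plist>
XML

# The query from the one-liner, one location step per line.
my $query = join '',
    '//key[contains(string(),"Subscriptions")]',  # the <key> naming the list
    '/following-sibling::array[1]',               # first <array> after that key
    '/dict',                                      # each subscription entry
    '/key[contains(string(),"rss")]',             # the entry's "rss" <key>
    '/following-sibling::string[1]';              # the <string> right after it

# Hypothetical helper: return the text of every node the query matches.
sub rss_urls {
    my ($xp) = @_;
    return map { $_->string_value } $xp->find($query)->get_nodelist;
}

my $xp = XML::XPath->new(xml => $plist);
print "$_\n" for rss_urls($xp);
```

Run against the sample above, this should print the two rss URLs, one per line. To point it at the real file, the constructor call would presumably go back to XML::XPath->new(filename => ...) with the path used in the one-liner.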