Yahoo! has just announced Research Project: Pig, which is a huge pig. Oh no wait, that's just the logo. Pig itself is "infrastructure to support ad-hoc analysis of very large data sets": a query language that runs on top of Hadoop, rather like Google's Sawzall. It's interesting that purpose-built minilanguages seem to be becoming the way to do this kind of large-scale data analysis. See some example code and analysis over at Geeking With Greg.