My talks at YAPC::Europe 2008

andy.sh on 2008-06-16T20:12:06

I am going to give two talks at the forthcoming YAPC::Europe: one full-length talk and one lightning talk.

Coincidentally, both talks are about using Perl in linguistic applications.

How to make Google Books at home

Several years ago I built an online search for two books published by Art. Lebedev Studio. The main idea was to take the best from Google and make a better service.

After I left the Studio I completely rewrote the code because, as always, real understanding comes only after you have built a prototype of the system. The current engine returns search results as a graphical preview of the page with the search words underlined, like this: http://deeptext.net/booksearch/selected.gif

I am going to cover several elements:

* Working with the book layout and converting it into a suitable format.
* Extracting paragraphs, phrases and words from the layout.
* Understanding the importance of separate words.
* Thinking of how to restore the word order if the source has damaged it.
* Restoring words split with hyphens.
* Indexing the text of a book.
* Which is better for the index: a dictionary or a morphology engine?
* Building a cloud of popular words.
* Generating previews and thumbnails.
* Highlighting the words that are found.
* Caching search results.
* Adding hot word lists to search results.
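
As a rough illustration of two of these items (restoring words split with hyphens and indexing the text), here is a minimal Perl sketch. It is not the code of the real engine: the sample %pages data and the join_hyphenated, build_index and search names are made up for this example, and the index is a plain in-memory hash.

```perl
use strict;
use warnings;

# Hypothetical input: text lines extracted from a book layout, keyed by page number.
my %pages = (
    1 => [ 'Perl is a lan-', 'guage for text pro-', 'cessing.' ],
    2 => [ 'Text search needs an index.' ],
);

# Join a word split by a hyphen at the end of a line with its continuation.
sub join_hyphenated {
    my @lines = @_;
    my $text = join "\n", @lines;
    $text =~ s/(\w)-\n(\w)/$1$2/g;   # "lan-\nguage" becomes "language"
    $text =~ s/\n/ /g;               # remaining line breaks become spaces
    return $text;
}

# Build an inverted index: lowercased word => { page number => occurrence count }.
sub build_index {
    my ($pages) = @_;
    my %index;
    for my $page (sort { $a <=> $b } keys %$pages) {
        my $text = join_hyphenated(@{ $pages->{$page} });
        $index{ lc($_) }{$page}++ for $text =~ /(\w+)/g;
    }
    return \%index;
}

# Return the sorted list of pages that contain the given word.
sub search {
    my ($index, $word) = @_;
    return sort { $a <=> $b } keys %{ $index->{ lc($word) } || {} };
}

my $index = build_index(\%pages);
print join(', ', search($index, 'text')), "\n";   # pages containing "text"
```

Note that this naive rule also glues together genuine compounds that happen to break at a hyphen (for example "well-" followed by "known"), so a real implementation would need some extra check before joining.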

To demonstrate how it all works, I will make a brochure of all my posts on use.perl.org and build an online search over it.

Translating human language with computer grammar

A couple of months before the German Perl Workshop in 2007 I started to learn German. My lightning talk there was about parsing URIs with Parse::RecDescent grammars. I am so excited about both German and the parsing module that I decided to create a very simple machine translator that should be able to translate basic phrases from my German textbook. In Perl, no doubt.

Although I am not going to dig too deeply into the theory of grammars and human languages, I bought a 600-page book, "Grammar of the Text", today :-)