learning GTK+, parsing

statico on 2005-05-21T05:01:40

In using use Perl; as a blogging mechanism, I've felt that a few features could improve the posting tool. First, there's no spell checking unless I write the entries somewhere other than the input field. Second, I hate writing HTML, and I wish there were some other format in which I could write the entries. Today I solved both problems.

I've admired my friend Russell's custom blogging tool, and since seeing it I've always wanted to play around with Glade and GTK+ a little. I apt-getted the Glade interface designer and Gtk2::GladeXML modules, bookmarked a few links, and began to read the examples. I found the Gtk2 module examples and API docs to be the most useful, if not the only, resources while learning.

At first, I thought it'd be easy to create an application with all the simple trimmings: file management, multiple editor windows, copy & paste. After some fooling around and some brief chats on #gtk+, I realized that unless I wanted to spend all weekend implementing the frills it'd be easiest to strip out as much unnecessary stuff as possible. Most notably, implementing a polished Edit menu (i.e., greying out items when nothing is selected, determining where the focus is and adding things to the clipboard) is much harder than I originally thought.

Other things, like setting the font of one of the fields, were pleasantly simple:

$output->modify_font(
   Gtk2::Pango::FontDescription->from_string('Monospace 7') );

Solving my first problem was just as simple, thanks to the Gtk2::Spell module:

my $input  = $gladexml->get_widget('input');
...
my $spell = Gtk2::Spell->new_attach($input);
$spell->set_language('en');

Next, I created a spec for a minimal markup language to write the journal entries in:

This would be a paragraph with _some emphasized_ text
and some *bold* text. Like wiki text, indenting things would
become preformatted blocks, or you could write [code inline].

This would be a new paragraph, and so on...

I thought it'd be fun to write a cute little grammar with Parse::RecDescent, but this turned out to be entirely the wrong approach. I don't know all that much about parsing, but I'm pretty sure that you can't write a parser for the above with Parse::RecDescent without being completely pedantic and specifying what makes up a word and a paragraph.

I think Ingy had it right with the structure of Kwiki::Formatter, which has two interesting properties: First, the formatter is lax -- if it can't parse something, the result is simply unformatted text. The parser never fails. Second, the formatter is defined by specifying what makes up what. My gut is telling me that Kwiki::Formatter's way and the Parse::RecDescent way are closely related.
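To make the "lax" idea concrete, here is a minimal sketch of a formatter for the little markup above. It is not Kwiki::Formatter's or Text::KwikiFormatish's actual code, just plain regexes in core Perl, and the names escape_html, format_inline and format_entry are made up for illustration. The point is only that text which matches no rule falls through unchanged, so there is no way for the formatter to fail.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Inline rules. Text that matches no rule is left untouched.
sub format_inline {
    my ($text) = @_;
    $text = escape_html($text);
    # [code inline]
    $text =~ s{\[([^\]\n]+)\]}{<code>$1</code>}g;
    # _emphasized_ : the markers must hug non-space text on both sides
    $text =~ s{(?<!\w)_([^\s_](?:[^_\n]*[^\s_])?)_(?!\w)}{<em>$1</em>}g;
    # *bold*
    $text =~ s{(?<!\w)\*([^\s*](?:[^*\n]*[^\s*])?)\*(?!\w)}{<b>$1</b>}g;
    return $text;
}

# Block rules: blank lines separate blocks, a block whose lines are all
# indented becomes <pre>, and everything else becomes a paragraph.
sub format_entry {
    my ($source) = @_;
    $source =~ s/\A(?:[ \t]*\n)+//;    # drop leading blank lines
    my @html;
    for my $block (split /\n(?:[ \t]*\n)+/, $source) {
        $block =~ s/\s+\z//;           # drop trailing whitespace
        next unless length $block;
        my @lines = split /\n/, $block;
        my $unindented = grep { $_ !~ /^[ \t]/ } @lines;
        if ($unindented) {
            push @html, '<p>' . format_inline($block) . '</p>';
        }
        else {
            push @html, '<pre>' . escape_html($block) . '</pre>';
        }
    }
    return join "\n\n", @html;
}

print format_entry(<<'END'), "\n";
This would be a paragraph with _some emphasized_ text
and some *bold* text, or you could write [code inline].

    my $x = 1 < 2;

A stray * or _ is simply left alone.
END
```

Note that nothing here says what a word or a paragraph is. A paragraph is just whatever is left over after the more specific rules have had their turn, which is the opposite of what a strict grammar asks for.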

(Corrections? Is the answer Parse::RecDescent's lack of backtracking? Hrm.)

Anyway, I ended up using a subclass of Text::KwikiFormatish to format the journal entries, which are then posted to use Perl; using WWW::UsePerl::Journal. My application lets me view the processed HTML in a separate pane, mostly because I had originally planned to copy the text and paste it into the web form instead of having the application post it for me. No, I'm not sure why.