I wrote a Perl parser last night

rafael on 2002-07-25T07:26:15

Yes, a Perl parser: something that takes Perl 5 source code and produces an abstract syntax tree. And it works.

Of course, this is based on an Evil Hack. The Evil Hack is to parse the output of perl -c -DTp. (You'll need a perl compiled with -DDEBUGGING for this to work.) These debug flags make perl output a trace of every tokenizer and parser action.

So, based on the traces, I can reconstruct the workings of an LALR(1) parser that "shadows" perl's parser (you know: shifts, reduces, and reads of a new token symbol).
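
A minimal sketch of the shadowing idea, written in Python only to keep it short. The trace lines below are a made-up, simplified format (`read`, `shift`, `reduce NAME COUNT`), not real perl -DTp output, and `build_tree` is a hypothetical name. The point is just that replaying shifts and reduces on a stack is enough to rebuild a tree:

```python
# Hypothetical, simplified trace: NOT the real perl -DTp output format.
# "read KIND TEXT"     : the tokenizer handed a new token to the parser
# "shift"              : the parser pushed the current token
# "reduce NAME COUNT"  : the parser replaced COUNT stack items by rule NAME
TRACE = """\
read NUM 1
shift
reduce term 1
read ADDOP +
shift
read NUM 2
shift
reduce term 1
reduce expr 3
"""


def build_tree(trace):
    """Replay shift/reduce events on a stack and return what is left on it."""
    stack = []
    lookahead = None
    for line in trace.splitlines():
        fields = line.split()
        if not fields:
            continue
        action = fields[0]
        if action == "read":
            # Remember the token until the parser decides to shift it.
            lookahead = (fields[1], fields[2])
        elif action == "shift":
            stack.append(lookahead)
            lookahead = None
        elif action == "reduce":
            name, count = fields[1], int(fields[2])
            # Pop the last `count` items (possibly zero) and wrap them in a node.
            start = len(stack) - count
            children = stack[start:]
            del stack[start:]
            stack.append((name, children))
    return stack


tree = build_tree(TRACE)
print(tree)
```

With this toy trace the stack ends up holding a single `expr` node whose children are a `term`, the `ADDOP` token, and another `term`. A real shadow parser would also have to cope with the fake and permuted tokens that perl's tokenizer produces.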

Drawbacks:

  • You need -DDEBUGGING.
  • Perl's tokenizer is very clever. It can produce fake (zero-length) tokens or permute some tokens in the input stream. You don't want to know how it tokenizes "abc$def".
  • For the moment I can't always get the part of the input source that's associated with some tokens. (If I find out that some information is missing, I'll patch the debug traces in the core!)
  • If your Perl script outputs something like "yydebug: after reduction, shifting from state 23 to state 79" to stderr during execution of a BEGIN block, this will confuse my parser.

Now I have to design an API for it. Basically I can trigger any callback on shifts, reduces and reads. Those sets of callbacks are conveniently packaged as, well, packages. So I was thinking about something like:
  • a Perl::ShadowParser that implements the parser
  • Perl::ShadowParser::* backend plugins that provide the callbacks
  • a little program for your convenience that runs the parser on any script with any callback(s) you've provided:
    perlshadowparser -b backend1 -b backend2=option1,option2 perlscript
Lots of tests will be needed, too.
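
To make the callback idea concrete, here is a rough sketch, again in Python rather than Perl and with every name invented for illustration (`parse_backend_spec`, `CountingBackend`, the `on_read` / `on_shift` / `on_reduce` method names). It is not the actual Perl::ShadowParser API, which doesn't exist yet. It shows one way a `-b backend2=option1,option2` argument could be split and how parser events could be fanned out to several backends:

```python
def parse_backend_spec(spec):
    """Split 'name=opt1,opt2' into ('name', ['opt1', 'opt2'])."""
    name, _, opts = spec.partition("=")
    return name, [o for o in opts.split(",") if o]


class CountingBackend:
    """Hypothetical backend: counts each kind of parser event."""

    def __init__(self, options=()):
        self.options = list(options)
        self.counts = {"read": 0, "shift": 0, "reduce": 0}

    def on_read(self, token):
        self.counts["read"] += 1

    def on_shift(self, token):
        self.counts["shift"] += 1

    def on_reduce(self, rule, children):
        self.counts["reduce"] += 1


class ShadowParser:
    """Hypothetical driver: forwards each event to every backend."""

    def __init__(self, backends):
        self.backends = list(backends)

    def emit(self, event, *args):
        for backend in self.backends:
            # A backend only needs to define the callbacks it cares about.
            handler = getattr(backend, "on_" + event, None)
            if handler is not None:
                handler(*args)


name, options = parse_backend_spec("backend2=option1,option2")
counter = CountingBackend(options)
parser = ShadowParser([counter])
parser.emit("read", "NUM")
parser.emit("shift", "NUM")
parser.emit("reduce", "term", ["NUM"])
print(name, options, counter.counts)
```

Looking callbacks up by name means a backend that only cares about reduces can ignore reads and shifts entirely.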

If you have any ideas of something cool to do with it (ideas for backends...), I'm listening.


Well the obvious one...

james on 2002-07-25T08:13:04

...is to produce IMCC output so you can target Parrot.

Not sure how well that would work yet, but cool nonetheless!

Re:Well the obvious one...

rafael on 2002-07-25T08:29:01

Another obvious backend is to produce Perl 6 code.

There are many ways to implement a Perl 5 to Perl 6 translator. For example, a B module similar to B::Deparse could do a good job. But it won't work on all Perl 5 sources. (See the BUGS section in the B::Deparse manpage.) Another way is to build a completely standalone Perl 5 parser. Very difficult (it's a task for Damians). A third solution is to put a hook into Perl 5.10's parser. The fourth solution is my Evil Hack. Of course, those solutions are not mutually exclusive.

I think that producing a working Perl52Perl6 converter is very important. We'll see what Hugo has to say about it.