Funky Regex Problem

Ovid on 2005-05-06T00:51:14

To make a long story short, I can't figure out why this is failing (I'm trying to force the regex to fail if it matches a single dot):

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

use Regexp::Common;

use constant SUCCEED => qr{(?=)};
use constant FAIL    => qr{(?!)};

my $QUOTED = $RE{quoted};
my $NUM    = $RE{num}{real};

my $VALUE  = do {
    use re 'eval';
    qr/(?:$QUOTED|$NUM)(??{'.' eq $+ ? FAIL : SUCCEED})/;
};

my $text = 'name => "foo", fav.num => 3';
my @text = split /($VALUE)/ => $text;
print Dumper \@text;

That prints:

$VAR1 = [
  'name => ',
  '"foo"',
  ', fav',
  '.',
  'num => ',
  '3'
];

What I want it to print is:

$VAR1 = [
  'name => ',
  '"foo"',
  ', fav.num => ',
  '3'
];

Any ideas?


Try this instead:

Damian on 2005-05-06T04:44:23

my $VALUE  = do {
    use re 'eval';
    qr/(?>($QUOTED|$NUM))(??{'.' eq $^N ? FAIL : SUCCEED})/;
};

my $text = 'name => "foo", fav.num => 3';
my @text = split /$VALUE/ => $text;
print Dumper \@text;
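
A hedged reading of what changed, not Damian's own explanation: the original pattern wraps the alternation in a non-capturing (?:...), and the Regexp::Common patterns are used without -keep, so (assuming they capture nothing in that mode) the code block has no capture from the current match to inspect. Here the alternation is captured, the code block tests $^N (the text of the most recently closed group), and (?>...) makes the group atomic so the engine cannot backtrack into it for a different match. Because the capture now lives inside $VALUE, split no longer needs its own parentheses. A minimal sketch of those two features in isolation, with made-up sample strings and hypothetical sub names:

```perl
use strict;
use warnings;

# Inside a (?{ ... }) block, $^N holds the text of the group that closed
# most recently in the match currently in progress.
sub last_closed_groups {
    my ($text) = @_;
    my ($first, $second);
    $text =~ /the (\S+)(?{ $first = $^N }) (\S+)(?{ $second = $^N })/i;
    return ($first, $second);
}

# (?>...) is atomic: once a+ has taken every 'a', the engine may not give
# one back, so the trailing 'ab' can never match.
sub atomic_blocks_backtracking {
    my ($text) = @_;
    my $plain  = $text =~ /a+ab/     ? 1 : 0;
    my $atomic = $text =~ /(?>a+)ab/ ? 1 : 0;
    return ($plain, $atomic);
}

my ($first, $second) = last_closed_groups('The brown fox jumps');
print "first=$first second=$second\n";    # first=brown second=fox

my ($plain, $atomic) = atomic_blocks_backtracking('aaab');
print "plain=$plain atomic=$atomic\n";    # plain=1 atomic=0
```
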

Re:Try this instead:

Ovid on 2005-05-06T05:00:56

Very interesting. This passes my test case wonderfully, but it fails miserably when I use it in the Lexer.pm example from HOP, and I suspect that will take me a while to debug. My test case clearly doesn't represent the problem as well as I thought, since others are having this problem on the Perlmonks site.

A more natural way?

Aristotle on 2005-05-06T09:20:19

Do you really need a deferred pattern there? I’d write it like so:

qr/(?:$QUOTED|$NUM)(?(?{ '.' eq $+ })$FAIL)/;

(This of course implies that FAIL is a variable, $FAIL, rather than a constant.)

Now given that, you get a zero-length match:

$VAR1 = [
          'name => ',
          '"foo"',
          ', fav',
          '',
          '.num => ',
          '3'
        ];

So obviously a match against the lone dot was prevented, but $QUOTED succeeds in matching nothing.

Okay then.

qr/($QUOTED|$NUM)(?(?{ '.' eq $+ or not length( $+ ) })$FAIL)/;

Result:

$VAR1 = [
          'name => ',
          '"foo"',
          ', fav.num => ',
          '3'
        ];

Well, I never. That’s what we wanted!
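
To try this approach without Regexp::Common installed, here is a self-contained sketch. The $QUOTED and $NUM patterns below are deliberately simplified stand-ins (assumptions, not the module's real patterns), split_values is a hypothetical helper name, and the sketch tests $^N inside an atomic group where the pattern above uses $+ and $FAIL. $NUM is written loosely on purpose so that it can match a lone dot or nothing at all, which is the behaviour being guarded against.

```perl
use strict;
use warnings;
use re 'eval';    # harmless on modern perls; older ones require it when
                  # a pattern mixes interpolation with code blocks

# Simplified stand-ins, NOT the Regexp::Common patterns (assumption).
my $QUOTED = qr/"[^"]*"/;
my $NUM    = qr/[0-9]*\.?[0-9]*/;    # loose: also matches '.' and ''

# Capture a candidate atomically, then force failure with (?!) when the
# capture is a lone dot or an empty string.
my $VALUE = qr/(?>($QUOTED|$NUM))(?(?{ $^N eq '.' or length($^N) == 0 })(?!))/;

# Hypothetical helper: split text on values, keeping the values themselves
# (split returns the captured group along with the surrounding fields).
sub split_values {
    my ($text) = @_;
    return split /$VALUE/, $text;
}

print join('|', split_values('name => "foo", fav.num => 3')), "\n";
```

On the sample string from this thread, this should give the same four fields as the desired output, with "fav.num" left intact.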