This morning, with a little insight from Sheriff and TorgoX on #perl, I came up with the following subroutine:
use Carp;    # provides croak

sub captures {
    local $_ = shift;
    croak "$_ is not a compiled regexp" unless ref eq 'Regexp';
    my $n = 0;
    while ( /\G(?=.)/gcs ) {
        /\G[^\\(]*/gc;       # skip anything that is neither a backslash nor a (
        /\G(?:\\.)*/gc;      # skip backslash-escaped characters
        /\G\(\?/gc;          # skip the start of a special (?...) construct
        /\G\(/gc && $n++;    # a capturing ( !
    }
    $n;
}
The trick is to use compiled regular expressions. This way, I do not need to try to balance parentheses or look for the closing ones, because I know the regular expression I'm working on is correct.
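As a rough illustration of why compiled regexps help, the sketch below shows qr// rejecting an unbalanced pattern before the counter ever sees it, then counts captures in two valid patterns. It repeats the counting loop, with one assumption of my own: the regexp is stringified explicitly before scanning. The variable names are only for illustration.

```perl
use strict;
use warnings;
use Carp;

# Same counting loop as in the subroutine above, except that the regexp is
# stringified explicitly before scanning (an assumption made for this sketch).
sub captures {
    my $re = shift;
    croak "$re is not a compiled regexp" unless ref($re) eq 'Regexp';
    local $_ = "$re";
    my $n = 0;
    while ( /\G(?=.)/gcs ) {
        /\G[^\\(]*/gc;       # skip anything that is neither a backslash nor a (
        /\G(?:\\.)*/gc;      # skip backslash-escaped characters
        /\G\(\?/gc;          # skip the start of a special (?...) construct
        /\G\(/gc && $n++;    # count a capturing (
    }
    return $n;
}

# qr// dies on an unbalanced pattern, so such a pattern never reaches captures().
my $broken = '(unclosed';
my $re = eval { qr/$broken/ };
print "rejected at compile time\n" unless defined $re;

print captures(qr/\((\d+)\)/), "\n";    # prints 1: the escaped parens are skipped
print captures(qr/(?:a|b)+/), "\n";     # prints 0: (?: does not capture
```

The wrapper that qr// adds around the pattern when it is stringified also begins with (?, so it goes through the same "special construct" branch and is not counted.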
I'll be using it in combination with Parse::Yapp (which I'm learning) to detect inconsistencies in statements such as:
    regexp /session (closed|opened) for user (\w+)(?: by (.*))?/
    captures status, user, by;
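The check I have in mind could look roughly like the sketch below. It does not involve Parse::Yapp at all: check_statement is a hypothetical helper that only compares the number of capturing groups with the number of names given, and captures is repeated here (stringifying the regexp explicitly, which is my assumption for this sketch) so the example runs on its own.

```perl
use strict;
use warnings;
use Carp;

# Counting loop from the subroutine above, with explicit stringification.
sub captures {
    my $re = shift;
    croak "$re is not a compiled regexp" unless ref($re) eq 'Regexp';
    local $_ = "$re";
    my $n = 0;
    while ( /\G(?=.)/gcs ) {
        /\G[^\\(]*/gc;       # skip anything that is neither a backslash nor a (
        /\G(?:\\.)*/gc;      # skip backslash-escaped characters
        /\G\(\?/gc;          # skip the start of a special (?...) construct
        /\G\(/gc && $n++;    # count a capturing (
    }
    return $n;
}

# Hypothetical helper: returns '' when the statement is consistent,
# or an error message when the counts differ.
sub check_statement {
    my ( $re, @names ) = @_;
    my $n = captures($re);
    return '' if $n == @names;
    return sprintf "regexp has %d capture(s) but %d name(s) were given",
        $n, scalar @names;
}

my $re = qr/session (closed|opened) for user (\w+)(?: by (.*))?/;

for my $names ( [qw(status user by)], [qw(status user)] ) {
    my $error = check_statement( $re, @$names );
    my $line  = $error eq '' ? "ok: @$names" : "error: $error";
    print "$line\n";
}
```

With the statement above, the first name list is accepted and the second is reported as an error, because the pattern has three capturing groups.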
I'm so proud of myself that I've posted it on perlmonks as well...
Update 1: Hugo pointed at some shortcomings on perlmonks. Now I have to correct them, and keep this version and the one on perlmonks up-to-date.
Update 2: This is a more complicated problem than I thought, and I've moved the discussion here.