This morning, with a little insight from Sheriff and TorgoX on #perl, I came up with the following subroutine:
    use Carp;

    sub captures {
        local $_ = shift;
        croak "$_ is not a compiled regexp" unless ref($_) eq 'Regexp';
        $_ = "$_";                  # scan the stringified pattern
        my $n = 0;
        while ( /\G(?=.)/gcs ) {
            /\G[^\\(]*/gc;          # ignore uninteresting stuff
            /\G(?:\\.)*/gcs;        # ignore backslashed stuff (even a backslashed newline)
            /\G\(\?/gc;             # ignore special regex
            /\G\(/gc && $n++;       # a capturing ( !
        }
        $n;
    }
The trick is to use compiled regular expressions. This way, I do not need to try to balance parentheses or look for the closing ones, because I know the regular expression I'm working on is correct.
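To illustrate why a compiled regexp saves the balancing work, here is a minimal sketch. The helper name try_compile is made up for this example and is not part of the subroutine above. It only shows that Perl itself refuses a pattern with an unbalanced parenthesis, so anything that comes back as a Regexp reference has already passed that check.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns a compiled
# regexp, or undef when Perl rejects the pattern source.
sub try_compile {
    my ($src) = @_;
    my $re = eval { qr/$src/ };   # qr// dies on a syntax error; eval traps it
    return $re;
}

for my $src ('a(b)c', 'a(b') {
    my $re = try_compile($src);
    print defined $re ? "accepted: $src\n" : "rejected: $src\n";
}
```

The first pattern is accepted and the second is rejected, so a scanner that only ever sees Regexp references never has to deal with a dangling parenthesis.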
I'll be using it in combination with Parse::Yapp (which I'm learning) to detect inconsistencies in statements such as:
regexp /session (closed|opened) for user (\w+)(?: by (.*))?/ captures status, user, by;
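The kind of check I have in mind could look roughly like the sketch below. Everything in it is an assumption made for illustration: check_statement and count_groups are invented names, and count_groups is a stand-in that does NOT use the captures subroutine above. Instead it relies on Perl's own bookkeeping: after a successful match, $#+ holds the number of capture groups in the pattern.

```perl
use strict;
use warnings;

# Stand-in counter (an assumption, NOT the captures() sub from this post).
# Matching the empty string against "nothing OR the pattern" always
# succeeds, and $#+ then gives the number of capture groups in the pattern.
sub count_groups {
    my ($re) = @_;
    '' =~ /|$re/;
    return $#+;
}

# Hypothetical consistency check: does the number of names given after
# "captures" agree with the number of capturing parentheses?
sub check_statement {
    my ($re, @names) = @_;
    my $n = count_groups($re);
    return $n == @names
        ? "ok"
        : sprintf "mismatch: %d capture(s), %d name(s)", $n, scalar @names;
}

my $re = qr/session (closed|opened) for user (\w+)(?: by (.*))?/;
print check_statement($re, qw(status user by)), "\n";   # prints "ok"
print check_statement($re, qw(status user)), "\n";      # prints "mismatch: 3 capture(s), 2 name(s)"
```

In the real grammar the counter would be whichever capture-counting routine ends up being reliable; the comparison against the list of names stays the same.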
I'm so proud of myself that I've put it on perlmonks as well...
Update 1: Hugo pointed at some shortcomings on perlmonks. Now I have to correct them, and keep this version and the one on perlmonks up-to-date.
Update 2: This is a more complicated problem than I thought, and I've moved the discussion here.