This is my revised TAP grammar. If you're not familiar with reading this style of grammar, here are a few comments:
I'm using POSIX character classes to represent digits and printable characters. If you want to manually encode all of the Unicode characters for [:print:], be my guest :)
The following means "all printable characters except the newline".
([:print:] - "\n")
The following means "a digit followed by zero or more digits".
digit {digit}
A question mark after an atom means it's optional.
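For readers more at home with regular expressions, the notation above can be sketched roughly in Python. This is only an illustration, not part of the grammar. The names NON_NEGATIVE_INTEGER, OPTIONAL_NOT, is_character and matches are made up for this example, [0-9] is assumed to stand in for [:digit:] (Python's re module has no POSIX classes), and str.isprintable() is assumed to be a close enough stand-in for [:print:].

```python
import re

# digit {digit}: one digit followed by zero or more digits.
# Assumption: [0-9] approximates the POSIX [:digit:] class.
NON_NEGATIVE_INTEGER = re.compile(r"[0-9][0-9]*")

# 'not '? 'ok ': the question mark makes the preceding atom optional.
OPTIONAL_NOT = re.compile(r"(?:not )?ok ")


def is_character(ch):
    """Approximate ([:print:] - "\\n"): printable, and not a newline.

    Assumption: str.isprintable() is used in place of [:print:]; it already
    rejects "\\n", but the explicit check mirrors the grammar's subtraction.
    """
    return len(ch) == 1 and ch.isprintable() and ch != "\n"


def matches(pattern, text):
    """Return True only if the whole of text matches the pattern."""
    return pattern.fullmatch(text) is not None
```

With these definitions, matches(NON_NEGATIVE_INTEGER, "007") is true, while an empty string does not match because at least one digit is required.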
I'm not particularly gifted with grammars, so corrections welcome.
The corrected TAP grammar:
digit ::= [:digit:]
character ::= ([:print:] - "\n")
positiveInteger ::= ( digit - '0' ) {digit}
nonNegativeInteger ::= digit {digit}
tap ::= plan tests | tests plan {comment}
plan ::= '1..' nonNegativeInteger "\n"
lines ::= line {lines}
line ::= (comment | test) "\n"
tests ::= test {test}
test ::= status positiveInteger? description? directive?
status ::= 'not '? 'ok '
description ::= (character - (digit '#')) {character - '#'}
directive ::= '#' ( 'TODO' | 'SKIP' ) ' ' {character}
comment ::= '#' {character}
For purposes of forward compatibility, all lines that do not match the grammar are labeled as TAPx::Parser::Result::Unknown and are not considered parse errors.
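The "unknown, not an error" rule can be sketched as a small line classifier. This is a minimal Python illustration, not how TAPx::Parser is implemented. The function classify and its string labels are hypothetical, the regular expressions follow the plan, test and comment rules as written above (so a directive has no space after the '#'), and [^\n] is assumed as a loose stand-in for the printable-character class.

```python
import re

# plan ::= '1..' nonNegativeInteger
PLAN_RE = re.compile(r"1\.\.[0-9]+")

# comment ::= '#' {character}
COMMENT_RE = re.compile(r"#[^\n]*")

# test ::= status positiveInteger? description? directive?
TEST_RE = re.compile(
    r"(?:not )?ok "                 # status
    r"(?:[1-9][0-9]*)?"             # positiveInteger?
    r"(?:[^0-9#\n][^#\n]*)?"        # description?
    r"(?:#(?:TODO|SKIP) [^\n]*)?"   # directive?
)


def classify(line):
    """Label one line (without its trailing newline).

    Anything that matches none of the rules is labeled "unknown"
    instead of raising a parse error.
    """
    if PLAN_RE.fullmatch(line):
        return "plan"
    if TEST_RE.fullmatch(line):
        return "test"
    if COMMENT_RE.fullmatch(line):
        return "comment"
    return "unknown"


if __name__ == "__main__":
    for sample in ["1..3", "ok 1 first test", "not ok 2", "# note", "Bail out!"]:
        print(sample, "->", classify(sample))
```

The point of the design is the final return: a line such as "Bail out!" is passed through with its own label, so output from a newer version of the protocol does not break an older parser.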
Re:Where is 'lines' used?
Ovid on 2006-09-15T08:20:15
Yup. Definitely a bug in the grammar. I'm working on improving it now.