So, I was looking in the Camel, wondering if $/ can take a regex (it can't, dammit - neither can Ruby's), when I came across this interesting little trick:
$/ = \004; # read four bytes of data at a time.
Is there any advantage to using this over, say, sysread(4)? Also, is there any chance of getting $/ to accept regexen in the future?
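For anyone who wants to try the trick, here is a minimal sketch comparing the two approaches. It assumes an in-memory filehandle and made-up sample data, and it uses read() rather than sysread() for the comparison, since sysread() bypasses Perl's buffered I/O and shouldn't be mixed with readline on the same handle.

```perl
use strict;
use warnings;

# In-memory filehandle standing in for a real file (sample data is made up).
my $data = "abcdefghij";

# Record-style read: $/ holds a reference to an integer.
open my $fh, '<', \$data or die "open: $!";
my @records;
{
    local $/ = \4;    # same value as \004: a reference to the number 4
    while (defined(my $rec = <$fh>)) {
        push @records, $rec;
    }
}
close $fh;

# The same loop written with read(), which also uses buffered I/O.
open $fh, '<', \$data or die "open: $!";
my @chunks;
while (read($fh, my $buf, 4)) {
    push @chunks, $buf;
}
close $fh;

print join('|', @records), "\n";    # abcd|efgh|ij
print join('|', @chunks), "\n";     # abcd|efgh|ij
```

Both loops hand back the same chunks here, so the main draw of the $/ form seems to be that existing while (<$fh>) code keeps working unchanged.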
This is a hard problem. Consider backtracking (stuff it back in a buffer) and greediness (how much do you read?). The trivial cases are pretty easy, but they rapidly get complex.
Re:$/ Regex
nicholas on 2003-05-24T10:15:45
There have been several threads about this on p5p, and I agree with what you say. However, I think that it may be possible to make it work reasonably for non-greedy regexps, and my hunch is that most of the time non-greedy regexps would be the correct way to express what most people would want for a line ending.
Here's the middle of one p5p thread on $/ regexp. If people are searching, I think that some of the other threads have had qr// in the subject. (Mmm. I'm linking to a message by me. Blatant self-promotion.)
Maybe we could do a use.perl poll on "features from other languages that I miss most in perl", as $/ as a regexp is probably not the only one.
Re:$/ Regex
nicholas on 2003-05-24T11:28:22
Hmm. I could see one way that this could go...
"features from other languages that I miss most in perl"
- $/ as a regexp
<damian> Perl 6 will have this
- seamless integration between script and C++/Java/etc objects
<damian> Perl 6 will have this
- compiles to bytecode which I can ship independently of the source
<damian> Perl 6 will have this
- compiler can check for typos in member names
<damian> Perl 6 will have this
- 2 dimensional playfield
<damian> Perl 6 will have this
- All programs should be expressible as combinations of Ook. Ook? and Ook!
<damian> Perl 6 will have this
- Given SMU (Symmetric Multi-Universe) hardware parallel programs should run in constant time
<damian> Perl 6 will have this
Re:$/ Regex
Damian on 2003-05-25T02:07:39
Actually...
- seamless integration between script and C++/Java/etc objects
<damian> Parrot will have this
- compiles to bytecode which I can ship independently of the source
<damian> Parrot will have this
- compiler can check for typos in member names
<damian> Perl 6 will have this
- 2 dimensional playfield
<damian> Perl 6 will have multidimensional arrays and vectorizable functions and operations.
- All programs should be expressible as combinations of Ook. Ook? and Ook!
<damian> Perl 5 already has this. Though I do like the idea of distinguishing the three constructs by font!
- Given SMU (Symmetric Multi-Universe) hardware parallel programs should run in constant time
<damian> Perl 6 will have junctions (the successor to quantum superpositions). Whether Parrot is ported to the necessary hardware, and whether the Perl 6 compiler is able to emit Parrot code that can take advantage of it are both, of course, entirely up to Dan's team.
;-)
Re:$/ Regex
chromatic on 2003-05-24T16:59:51
Yes, handling non-greedy regexes (simple alternations, bounded ranges) is possible. I could see disallowing unbounded ranges or the /m and /s flags. I wonder if that'd confuse people who don't know the implementation and the reason for the implementation, though.
Greedy regexes are troublesome for the same reason: in the edge cases you potentially have to scan an entire input to determine where to stop reading.
Nevertheless, despite those problems, I'm confident we'll be able to permit Perl 6 input streams to use a regex as an input record separator (they'll be per-filehandle, not global, in Perl 6).
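To make the buffering problem concrete, here is a rough sketch of how a bounded separator could be emulated in Perl 5 today. Everything in it is hypothetical: record_reader is not a core function, and the $max_sep_len parameter is an assumption that the caller knows the longest string the separator can match. That bound is what lets the reader decide when a match is safe to accept without scanning the rest of the input. It won't work for unbounded patterns, and the separator must not match the empty string.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that yields one record per call.
# $sep         - qr// separator, assumed never to match the empty string
# $max_sep_len - assumed upper bound on the length of any separator match
# $chunk_size  - how many characters to pull from the handle per refill
sub record_reader {
    my ($fh, $sep, $max_sep_len, $chunk_size) = @_;
    $chunk_size ||= 4096;
    my $buf = '';
    my $eof = 0;

    return sub {
        while (1) {
            # Accept a match only once enough data follows its start that
            # more input could not change where the separator begins or ends.
            if ($buf =~ $sep
                && ($eof || $-[0] + $max_sep_len <= length($buf))) {
                die "separator matched the empty string" if $+[0] == $-[0];
                my $rec = substr($buf, 0, $+[0]);
                substr($buf, 0, $+[0]) = '';
                return $rec;
            }
            if ($eof) {
                return unless length $buf;    # nothing left
                my $rec = $buf;               # final record, no separator
                $buf = '';
                return $rec;
            }
            # Refill: append the next chunk to whatever is still unread.
            my $n = read($fh, $buf, $chunk_size, length($buf));
            die "read: $!" unless defined $n;
            $eof = 1 if $n == 0;
        }
    };
}

# Usage with made-up data and a tiny chunk size to force several refills.
my $input = "one\r\ntwo\nthree\r\nfour";
open my $fh, '<', \$input or die "open: $!";
my $next = record_reader($fh, qr/\r\n|\n/, 2, 3);
my @recs;
while (defined(my $rec = $next->())) {
    push @recs, $rec;
}
close $fh;
printf "%d records\n", scalar @recs;
```

Like readline, the sketch leaves the separator on the end of each record. Drop the length bound and you're back to the edge case above, where the reader may have to slurp everything before it can return the first record.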