$/ features

djberg96 on 2003-05-23T20:14:24

So, I was looking in the Camel, wondering if $/ can take a regex (it can't, dammit - neither can Ruby's), when I came across this interesting little trick:

$/ = \004; # read four bytes of data at a time.

Is there any advantage to using this over, say, sysread(4)? Also, is there any chance of getting $/ to accept regexen in the future?
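For anyone who wants to try the trick, here is a minimal sketch of record-mode reading. The helper name read_records and the in-memory filehandle are assumptions made for illustration, not something from the Camel. One difference worth knowing: readline with $/ set to a reference goes through Perl's buffered I/O layer, as read() does, whereas sysread bypasses that buffer and should not be mixed with buffered reads on the same handle.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into fixed-size records by
# reading it through an in-memory filehandle in record mode.
sub read_records {
    my ($data, $size) = @_;
    open my $fh, '<', \$data or die "cannot open in-memory handle: $!";

    # A reference to an integer tells readline to return records of
    # at most that many bytes instead of lines.
    local $/ = \$size;

    my @records = <$fh>;
    close $fh;
    return @records;
}

my @chunks = read_records("abcdefghij", 4);
print join('|', @chunks), "\n";    # prints abcd|efgh|ij
```

The last record is shorter when the input length is not a multiple of the record size, so callers that need exact-size records have to check the length themselves.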


$/ Regex

chromatic on 2003-05-23T21:31:08

This is a hard problem. Consider backtracking (stuff it back in a buffer) and greediness (how much do you read?). The trivial cases are pretty easy, but they rapidly get complex.
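To make the buffering concern concrete, here is a rough sketch of emulating a regex separator in user code. The name make_record_reader, the chunk size, and the "only trust a match that ends before the end of the buffer" rule are all assumptions for illustration. That rule is a heuristic that copes with simple separators only: a pattern with lookahead, or one whose leftmost match can move once more input arrives, will still fool it, which is exactly where the trivial cases stop.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an iterator that yields records from $fh,
# split on the regex $sep_re. Separators are dropped from the records.
sub make_record_reader {
    my ($fh, $sep_re, $chunk_size) = @_;
    $chunk_size ||= 4096;
    my $buffer = '';
    my $eof    = 0;

    return sub {
        while (1) {
            if ($buffer =~ $sep_re) {
                my ($start, $end) = ($-[0], $+[0]);

                # Trust a non-empty match only if it stops short of the
                # end of the buffer, or if no more input can arrive.
                # Otherwise the separator might continue in the next chunk.
                if ($end > $start && ($eof || $end < length $buffer)) {
                    my $record = substr($buffer, 0, $start);
                    substr($buffer, 0, $end) = '';
                    return $record;
                }
            }

            if ($eof) {
                return unless length $buffer;    # nothing left
                my $record = $buffer;            # trailing record
                $buffer = '';
                return $record;
            }

            # Append the next chunk to whatever is already buffered.
            my $got = read($fh, $buffer, $chunk_size, length $buffer);
            die "read failed: $!" unless defined $got;
            $eof = 1 if $got == 0;
        }
    };
}

my $data = "one\r\ntwo\n\nthree";
open my $in, '<', \$data or die "cannot open in-memory handle: $!";
my $next = make_record_reader($in, qr/\r?\n+/, 4);
while (defined(my $record = $next->())) {
    print "[$record]\n";
}
```

With a separator that never matches, the buffer grows until end of input, so the whole stream ends up in memory.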

Re:$/ Regex

nicholas on 2003-05-24T10:15:45

There have been several threads about this on p5p, and I agree with what you say. However, I think that it may be possible to make it work reasonably for non-greedy regexps, and my hunch is that most of the time non-greedy regexps would be the correct way to express what most people would want for a line ending.

Here's the middle of one p5p thread on $/ regexp. If people are searching, I think that some of the other threads have had qr// in the subject. (Mmm. I'm linking to a message by me. Blatant self-promotion.)

Maybe we could do a use.perl poll on "features from other languages that I miss most in perl", as $/ as a regexp is probably not the only one.

Re:$/ Regex

nicholas on 2003-05-24T11:28:22

Hmm. I could see one way that this could go...

"features from other languages that I miss most in perl"

  • $/ as a regexp

<damian> Perl 6 will have this

  • seamless integration between script and C++/Java/etc objects

<damian> Perl 6 will have this

  • compiles to bytecode which I can ship independently of the source

<damian> Perl 6 will have this

  • compiler can check for typos in member names

<damian> Perl 6 will have this

  • 2 dimensional playfield

<damian> Perl 6 will have this

  • All programs should be expressible as combinations of Ook. Ook? and Ook!

<damian> Perl 6 will have this

  • Given SMU (Symmetric Multi-Universe) hardware parallel programs should run in constant time

<damian> Perl 6 will have this

Re:$/ Regex

Damian on 2003-05-25T02:07:39

Actually...

  • seamless integration between script and C++/Java/etc objects

<damian> Parrot will have this

  • compiles to bytecode which I can ship independently of the source

<damian> Parrot will have this

  • compiler can check for typos in member names

<damian> Perl 6 will have this

  • 2 dimensional playfield

<damian> Perl 6 will have multidimensional arrays and vectorizable functions and operations.

  • All programs should be expressible as combinations of Ook. Ook? and Ook!

<damian> Perl 5 already has this. Though I do like the idea of distinguishing the three constructs by font!

  • Given SMU (Symmetric Multi-Universe) hardware parallel programs should run in constant time

<damian> Perl 6 will have junctions (the successor to quantum superpositions). Whether Parrot is ported to the necessary hardware, and whether the Perl 6 compiler is able to emit Parrot code that can take advantage of it are both, of course, entirely up to Dan's team. ;-)

Re:$/ Regex

chromatic on 2003-05-24T16:59:51

Yes, handling non-greedy regexes (simple alternations, bounded ranges) is possible. I could see disallowing unbounded ranges or the /m and /s flags. I wonder if that'd confuse people who don't know the implementation and the reason for the implementation, though.

Perl 6 may have what you want

Damian on 2003-05-24T03:07:13

As chromatic points out, matching regexes against an input stream complicates stream buffering considerably. But it's certainly not impossible, if you're prepared to put up with potentially having to buffer an entire input stream in the pathological cases.

Greedy regexes are troublesome for the same reason: in the edge cases you potentially have to scan an entire input to determine where to stop reading.
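A small sketch of that edge case, assuming a made-up helper named separator_span: the same greedy pattern reports a different separator length depending on how much of the stream has been buffered so far, so a reader cannot commit to a match that touches the end of its buffer.

```perl
use strict;
use warnings;

# Hypothetical helper: return the (start, end) offsets of the first
# separator match in $buffer, or an empty list if there is none.
sub separator_span {
    my ($buffer, $sep_re) = @_;
    return unless $buffer =~ $sep_re;
    return ($-[0], $+[0]);
}

my $stream = "a\n\n\nb";
my $sep    = qr/\n+/;      # greedy: takes as many newlines as it can see

for my $seen (2, 3, length $stream) {
    my ($start, $end) = separator_span(substr($stream, 0, $seen), $sep);
    print "$seen bytes buffered: separator spans $start..$end\n";
}
# prints:
# 2 bytes buffered: separator spans 1..2
# 3 bytes buffered: separator spans 1..3
# 5 bytes buffered: separator spans 1..4
```

Only the last answer is the one a whole-stream match would give, and a reader only learns that once it has seen a byte that cannot extend the match.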

Nevertheless, despite those problems, I'm confident we'll be able to permit Perl 6 input streams to use a regex as an input record separator (they'll be per-filehandle, not global, in Perl 6).