Problem Learning Perl

xenchu on 2004-01-23T16:07:54

One of my problems with learning Perl is -- I don't know enough Perl. A common but frustrating problem. Let me explain my learning situation.

Here is an example from one of my Perl books (not Camel or Llama, by the way):

@numerics = map {/^\d + /?$_:()} <>;

Now, since the book tells me what this construct is doing (it finds the lines that start with numbers and builds a list of those lines), most of the operation is clear enough. Map reads a line from input, checks to find out if the line starts with a digit and then does -- what? I see $_ which is the line input but what does the ? mean? Is it part of the regex or does it operate on $_? What is the purpose of :()? Is :() one operation or two? And why '+ ' instead of \b?

I checked the index and found nothing I thought applied. I doubt that means the answer is not in the book, just that if I saw the answer I didn't recognize it. And if I have come across what I want to know in earlier reading I have either forgotten it or I wasn't paying sufficient attention. Either option has a 50% chance of being true.

Nothing as concise as map is present in any other language I program in. The ease with which it can be used is fascinating. I know I will find out what I want to know eventually. I am sure all of you who read this see my problem (he's an idiot) and understand what I am trying to say. The more you have of a thing the easier it is to get more. It works for Perl as with everything else.

The solution is obvious; keep studying Perl. Most things that frustrate me seem easy when I look back on them knowing the answer. This will probably not be an exception. And as they say, 'a good learning experience'. As I gain knowledge of Perl the pieces will start to come together.

Hello to chaoticset who has signed on as a Fan. Astounding, just astounding. And hello to merlyn and Louis_Wu as well. I am certain that all three know exactly what the example above means. Knowing that others have learned before me inspires me to keep trying to catch up with them.

And a public thanks to Corion who answered a Perl Monk question I asked in an earlier journal entry.


@numerics = map {/^\d + /?$_:()} ;

clscott on 2004-01-23T16:56:28

@numerics = map {/^\d + /?$_:()} <>;

The ? and : are working together here.

For the full details look up the perlop man page and search for "Conditional Operator". It is also known as the ternary operator "?:".

The expression /^\d + / ? $_ : ()

is equivalent to:

if (/^\d + /){
      $_
} else {
      ()
}

Translated to English:

"If $_ starts with a digit give me $_ otherwise give me the empty list ()"
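To see the "give me $_ or give me the empty list" behaviour in isolation, here is a minimal sketch. It assumes a hard-coded @lines array in place of <> and a simplified pattern, /^\d/, so that only the ?: part is on display; both are stand-ins for illustration, not the book's code.

```perl
use strict;
use warnings;

# Hypothetical sample input; the original reads its lines from <> instead.
my @lines = ("1 apple\n", "pear\n", "2 plum\n", "x 3\n");

# For each line the block yields either the line itself ($_) or the
# empty list (), which adds nothing to the result.
my @numerics = map { /^\d/ ? $_ : () } @lines;

print @numerics;   # only the lines that start with a digit
```

Because () contributes no elements, the result can be shorter than the input list, which is what lets map act as a filter here.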

Re:@numerics = map {/^\d + /?$_:()} ;

xenchu on 2004-01-23T17:48:41

Thank you. Another thing I am not used to yet is that Perl allows operations to be jammed together. I would have expected /^\d + / ? $_ : (); In the languages I am familiar with that spacing makes a difference. It would be an error to write like that!

Actually...

bart on 2004-01-23T17:17:51

What you have is an emulated grep. I wonder why the author didn't just use that. Your snippet is equivalent to
@numerics = grep { /^\d + / } LIST;
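A quick way to convince yourself of the equivalence is to run both forms over the same data. This is only a sketch: @lines is a made-up stand-in for LIST, and the pattern is simplified to /^\d/.

```perl
use strict;
use warnings;

# Hypothetical stand-in for LIST.
my @lines = ("7 up\n", "seven\n", "42\n");

# map with ?: returns the element or nothing at all ...
my @via_map  = map  { /^\d/ ? $_ : () } @lines;

# ... while grep keeps each element for which the block is true.
my @via_grep = grep { /^\d/ } @lines;

print "map and grep agree\n" if "@via_map" eq "@via_grep";
```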

Re:Actually...

xenchu on 2004-01-23T17:53:41

Actually he did list the grep you give as another way it would work along with a 'while' example. The reason I listed the map example is I didn't understand how it worked. I was displaying my ignorance, not the author's.

why +

jmm on 2004-01-23T18:32:50

Your main question has already been answered, but you also asked why the + was there and why it wasn't \b.

The + is actually irrelevant; it could be removed. /^\d+/ looks at the beginning of the string (^) for one digit (\d) or more digits (+). If the plus were changed to \b, the pattern would require exactly one digit and reject the line if that digit was immediately followed by another word character (a letter, digit, or underscore). If the + were removed, it would look for one digit at the start of a line (and if a line starts with one or more digits, it starts with one, so that still matches the exact same set of lines). The + would be useful in other cases, for example if only the leading numbers were being captured instead of entire lines that start with a digit.

/^(\d+)/ && push @list, $1 for <IN>;
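Here is a small sketch of both halves of that argument, using a made-up @lines array in place of a filehandle: the + makes no difference when selecting whole lines, but it does change what gets captured.

```perl
use strict;
use warnings;

# Hypothetical input in place of a filehandle.
my @lines = ("12 drummers\n", "3 hens\n", "a partridge\n");

# Selecting whole lines: with or without the +, the same lines qualify.
my @one  = grep { /^\d/  } @lines;
my @many = grep { /^\d+/ } @lines;

# Capturing the leading number: here the + matters.
my (@first_digit, @full_number);
/^(\d)/  && push(@first_digit, $1) for @lines;   # one digit only
/^(\d+)/ && push(@full_number, $1) for @lines;   # the whole run of digits

print "selected ", scalar(@one), " and ", scalar(@many), " lines\n";
print "captured (@first_digit) versus (@full_number)\n";
```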

(Note that along with learning Perl, you also demonstrated a need to learn HTML as it applies to posting here. To post the text <IN> you cannot just write it literally, since the HTML will decide that your IN HTML directive should be ignored. Instead you have to write the IO reader as &lt;IN&gt;. The preview button and a lot of experimentation are your friends...)

Re:why +

xenchu on 2004-01-23T19:23:35

Another visual dysfunction. Usually I see that written as \d+ and not '\d +'. In the languages I usually program in you have to be exact. Somehow I equated that with \b. No good reason why.

I am a little puzzled by your comment on <IN>. I don't recall using the term here. Was it in another journal entry? Other special characters appear in Preview as they are. But looking at it in Preview I see the angle characters vanish. Now I see. Thanks for the tip.

Re:why +

jmm on 2004-01-23T19:58:24

Oops, I didn't see the space in your message. That means the pattern is requiring exactly one digit, followed by one or more spaces. (The + is still pointless in this case.) If you use the x modifier on a regular expression, the space would be for formatting purposes and not matching, so /^\d+/ and /^ \d +/x are the same, but when the x is not there, spaces and other whitespace characters match themselves.
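A short sketch of the difference the x modifier makes, with two made-up test strings. The names $plain and $extended are just labels for this example.

```perl
use strict;
use warnings;

my $plain    = qr/^\d +/;      # one digit, then one or more literal spaces
my $extended = qr/^ \d + /x;   # whitespace ignored: the same as /^\d+/

# Hypothetical test strings.
for my $line ("7   samurai", "42 things") {
    printf "%-12s plain: %-8s extended: %s\n", $line,
        ($line =~ $plain    ? "match" : "no match"),
        ($line =~ $extended ? "match" : "no match");
}
```

"42 things" fails the plain pattern because the character after the first digit is another digit, not a space.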

When it is <SPACE>+, it is still not the same as \b. The \b pattern matches a "word boundary", i.e. a zero-length position which has a word character on one side and does not have a word character on the other. (With the text "The 3 goats." \b would match the position before the T, the position before and after each blank, and the position before the period.) The differences between "\b" and " +" are twofold. First, \b matches strings that the other would not (e.g. the string "1=2" would be successfully matched by /^\d\b/ but not by /^\d +/). Second, if the pattern is being used for a substitution or if the matched text is referenced with $1, $&, or suchlike, the trailing blanks would be part of the match with /^\d +/ but they would not with /^\d\b/.
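Both differences can be checked with a few lines. This is a sketch only: the string "3  goats" is a made-up sample, and the parentheses are added to the patterns so the matched text lands in a variable.

```perl
use strict;
use warnings;

# "1=2": a word boundary follows the digit, but no space does.
my $b_matches     = "1=2" =~ /^\d\b/ ? 1 : 0;
my $space_matches = "1=2" =~ /^\d +/ ? 1 : 0;

# Hypothetical sample with two spaces after the digit: both patterns
# match, but the matched text differs.
my $text = "3  goats";
my ($with_b)      = $text =~ /^(\d\b)/;   # just the digit
my ($with_spaces) = $text =~ /^(\d +)/;   # the digit plus the trailing spaces

print "1=2: \\b -> $b_matches, space+ -> $space_matches\n";
print "captured [$with_b] and [$with_spaces]\n";
```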

Ecode is your friend

gav on 2004-01-24T21:09:52

Just wrap your code examples with <ecode></ecode> and all the escaping is taken care of :)

The ?:

chaoticset on 2004-01-23T19:10:40

That threw me too, don't feel bad. I had never seen it in C before I started Perling.

Re:The ?:

xenchu on 2004-01-23T19:25:37

Thanks. I begin to think I am Perl visually-impaired. Another learning problem.

I don't know enough Perl

brian_d_foy on 2004-01-23T20:30:04

It just gets worse, and worse, and then more worse. :)

Re:I don't know enough Perl

xenchu on 2004-01-23T21:56:17

The lyf so short, the craft so long to lerne.

--Geoffrey Chaucer