Perl Grep Question

xenchu on 2004-02-04T04:03:25

This is a line I pulled out of a program on PerlMonks to find the largest 10 files in a directory. The program is clear enough, I suppose, but I don't understand the following line:

     @sizes = grep {length $_ -> [1]} @sizes;

Specifically, what is

{length $_ -> [1]}
doing? It seems to be sorting files by length but I don't understand the mechanism. Anyone care to elucidate?

I saw Windtalkers on DVD tonight. Lots of action, lots of explosions; I think Joe Bob Briggs would give it a thumbs up. I liked it and believe it is worth your time.

Thanks to merlyn and ybiC for their replies to my last entry. As a matter of fact I have a copy of Learning Perl. Unfortunately it is packed in one of dozens of boxes as part of our coming move. I am loath to buy another copy of a book I already have. However, since I don't yet have a copy of Elements of Programming with Perl I plan to buy that. Scrooge McDuck ain't in it with me when it comes to pinching pennies.

I am going to learn Perl. As long as it takes, whatever it takes, but learn it I will. A short attention span, dog-laziness, carelessness past comprehension and a thick head will not stop me.


grep-foo

triv on 2004-02-04T04:49:12

@sizes = grep {length $_ -> [1]} @sizes; Specifically, what is {length $_ -> [1]} doing?


From the docs:

Evaluates the BLOCK or EXPR for each element of LIST (locally setting $_ to each element) and returns the list value consisting of those elements for which the expression evaluated to true. In scalar context, returns the number of times the expression was true.


The most basic example is something like:
perl -e '@foo = grep { $_ } (1,0,undef); print "@foo\n"'
This goes through the list (1,0,undef) and keeps only the elements that are true, in this case just 1.
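
A small sketch of why the line in question tests length rather than plain truth, and of the scalar-context behaviour the docs mention. The list contents here are made up for illustration:

```perl
use strict;
use warnings;

# Made-up sample list: note the string "0" and the empty string.
my @items = ("a", "0", "", "b");

# Truth test: "0" and "" are both false in Perl, so both are dropped.
my @true = grep { $_ } @items;

# Length test: only the empty string has length 0, so "0" is kept.
my @nonempty = grep { length $_ } @items;

# In scalar context grep returns the number of matches, not the list.
my $count = grep { length $_ } @items;

print "true: @true\n";
print "nonempty: @nonempty\n";
print "count: $count\n";
```

So a test on length would still keep a file that happened to be named "0", where a bare truth test would throw it away.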

Back to your question:
@sizes = grep {length $_ -> [1]} @sizes;
This goes through @sizes (which is a list of array references) and keeps just the elements where the second element (index 1) of the referenced array has a length greater than zero. This is a quick way to write:
my @new_sizes;
foreach my $size (@sizes) {
    if (length $size->[1]) {
        push(@new_sizes, $size);
    }
}
@sizes = @new_sizes;
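
To see it run, here is a minimal sketch with invented data. The file names and sizes are hypothetical; I am only assuming each element is a [size, name] pair with an empty name marking an unused slot:

```perl
use strict;
use warnings;

# Hypothetical data: each element is a reference to a two-element
# array [size, name]. The [-1, ""] pairs stand for unused slots.
my @sizes = (
    [2048, "/tmp/big.log"],
    [ 512, "/tmp/small.txt"],
    [  -1, ""],
    [  -1, ""],
);

# $_->[1] dereferences the current array ref and fetches element 1 (the name).
# length "" is 0, which is false, so the pairs with empty names are dropped.
@sizes = grep { length $_->[1] } @sizes;

# @$_ expands each surviving pair into the two printf arguments.
printf "%8d: %s\n", @$_ for @sizes;
```

Only the two pairs with non-empty names are printed.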

Re:grep-foo

xenchu on 2004-02-04T14:46:44

Oh. Ow. Ouch! @sizes is not an array of file names; each element is a record holding the file size and the file name. I might plead that I am a Windows programmer but geez, even so, I should have realized that. My stumbling block was [1]. I couldn't see the relationship even knowing what -> was. Density^3. Thank you.

where's the rest of the code?

cog on 2004-02-04T12:40:43

I'm curious about the rest of the code. What happened to @sizes before that? Where can I see that?

Re:where's the rest of the code?

xenchu on 2004-02-04T14:14:00

Certainly. This was written by Abigail-II in answer to the question Finding Top 10 Largest Files on PerlMonks. Since Abigail-II wrote it I have no doubt as to the correctness of the program:

#!/usr/bin/perl

use strict;
use warnings;
no warnings qw /syntax/;

open my $fh => "find / -type f |" or die;

my @sizes = map {[-1 => ""]} 1 .. 10;

while (<$fh>) {
    chomp;
    my $size = -s;
    next if $size < $sizes [-1] [0];
    foreach my $i (0 .. $#sizes) {
        if ($size >= $sizes [$i] [0]) {
            splice @sizes => $i, 0 => [$size => $_];
            pop @sizes;
            last;
        }
    }
}

@sizes = grep {length $_ -> [1]} @sizes;
printf "%8d: %s\n" => @$_ for @sizes;

__END__

Re:where's the rest of the code?

bart on 2004-02-04T14:24:36

The idea is to remove the empty entries, in case there are fewer than 10 real entries.
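
A rough sketch of that case, using the same insertion loop as the program above but fed from a hypothetical in-memory list of three files instead of find:

```perl
use strict;
use warnings;

# Hypothetical input: only three [size, name] pairs, fewer than ten.
my @found = ([300, "c.dat"], [100, "a.dat"], [200, "b.dat"]);

# Ten placeholder slots, as in the original program.
my @sizes = map { [-1 => ""] } 1 .. 10;

for my $file (@found) {
    my ($size, $name) = @$file;
    next if $size < $sizes[-1][0];
    for my $i (0 .. $#sizes) {
        if ($size >= $sizes[$i][0]) {
            splice @sizes, $i, 0, [$size, $name];  # insert in sorted position
            pop @sizes;                            # keep exactly ten slots
            last;
        }
    }
}

my $before = @sizes;    # real entries plus leftover placeholders
@sizes = grep { length $_->[1] } @sizes;
my $after = @sizes;     # real entries only

print "before: $before, after: $after\n";
printf "%8d: %s\n", @$_ for @sizes;
```

Without the grep, the printf would also print the seven leftover placeholders, each with size -1 and an empty name.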