I'm trying to find files in one directory which also exist in another directory. The following works, but I assume there's an easier way?
for file in `ls aggtests/pips/api/v1/xml/*.t | cut -d/ -f6`; do find aggtests/pips/api/builder/ -name $file; done
Update: Rewrote to fix a tiny grep bug.
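One alternative to the loop above, sketched here on the assumption that the filenames contain no newlines and that the shell is bash (for `<(...)` process substitution): list the basenames from each directory, sort them, and let `comm -12` print only the names common to both inputs.

```sh
# Basenames present in both directories (paths from the question).
# comm -12 suppresses lines unique to either sorted input, leaving
# only the common ones. Assumes bash and newline-free filenames.
comm -12 \
  <(cd aggtests/pips/api/v1/xml && printf '%s\n' *.t | sort) \
  <(cd aggtests/pips/api/builder && printf '%s\n' *.t | sort)
```

This prints each shared name once, which also sidesteps the question of which directory's copy to report.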
A homegrown version can be found here (http://www.perlmonks.org/?node_id=703798); it uses File::Find::Duplicate.
There are lots of hash-based solutions that can do this, and some of them are easily written in Perl...
The fdupes page on Wikipedia (http://en.wikipedia.org/wiki/Fdupes) also lists alternatives (including one of mine!).
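The core idea behind those hash-based tools can be sketched in one pipeline, assuming GNU `md5sum` and the GNU `uniq` extensions `-D` (print all repeated lines) and `-w` (compare only a prefix). Real duplicate finders compare file sizes first and byte-compare on hash collision; this sketch skips that.

```sh
# Hash-based duplicate-content finder (GNU md5sum/uniq assumed).
# md5sum emits a 32-hex-char digest per file; after sorting, uniq
# -D -w32 prints every line whose first 32 characters repeat,
# i.e. every file whose content hash matches another file's.
find . -type f -exec md5sum {} + | sort | uniq -D -w32
```

Note this finds files with identical *content*, whereas the question is about identical *names*; the two problems are cousins, not the same.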
(ls aggtests/pips/api/v1/xml/*.t; ls aggtests/pips/api/builder/*.t) | sort | uniq -d
Re:uniq
merlyn on 2009-10-09T00:00:11
"ls aggtests/pips/api/v1/xml/*.t": an "ls ... glob" FAIL. Please don't do that.
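The gist of that warning: never parse the output of ls; iterate over the glob itself and quote your variables. A sketch of the original loop rewritten that way, using the same directories from the question:

```sh
# Iterate the glob directly; no ls, no word-splitting of its output.
# Quoting "$path" keeps filenames with spaces intact.
for path in aggtests/pips/api/v1/xml/*.t; do
    find aggtests/pips/api/builder/ -name "$(basename "$path")"
done
```

This still assumes the basenames contain no glob metacharacters (find's -name treats its argument as a pattern), which holds for ordinary *.t test files.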
Re:uniq
merlyn on 2009-10-09T00:03:27
I just realized that message is probably insufficient. Here's the "dangerous use of ls" message, spelled out a bit better: http://groups.google.com/group/comp.unix.shell/msg/5d19dadaf9329f87
Re:uniq
mauzo on 2009-10-09T00:33:08
OK, so had I thought a bit more I might have written
echo .../*.t .../*.t | sort | uniq -d
The other point, that filenames can contain special characters, I was aware of, but I tend to assume that 'my' files won't (unless I know that they do). If I were working on some arbitrary set of files I would have done the job in Perl. (I was going to say find -print0 and {sort,uniq} -z would work, but apparently (my) uniq doesn't have a -z option. Weird.) Thanks for the correction, though, since it's important to be aware of in general.
A more important bug I was also ignoring is that the length of the list of files may exceed ARG_MAX. Since this is one of Ovid's test directories, I presume that's not actually that unlikely. :)
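The usual dodge for the ARG_MAX problem is to let find generate the names rather than having the shell expand a glob onto one command line; xargs then batches its arguments to stay under the limit itself. A sketch, using the directories from the question and still assuming newline-free filenames:

```sh
# find emits each path NUL-terminated (-print0); xargs -0 reads them
# back safely and runs basename once per file (-n1). Since a name
# cannot repeat within a single directory, uniq -d leaves exactly
# the names present in both.
find aggtests/pips/api/v1/xml aggtests/pips/api/builder \
     -name '*.t' -print0 |
  xargs -0 -n1 basename |
  sort | uniq -d
```

Neither the glob expansion nor the pipeline itself ever puts the whole file list on one command line, so ARG_MAX drops out of the picture.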
Re:uniq
Aristotle on 2009-10-09T14:07:28
echo .../*.t .../*.t | sort | uniq -d
That won’t do what you wanted, because echo will output the whole shebang on a single line. What you want instead is
printf '%s\n' .../*.t .../*.t | sort | uniq -d
But then that still won’t do what you wanted, because you aren’t chopping the base path off the file names, so no two lines will have the same content anyway. You need to do something like this:
printf '%s\n' .../*.t .../*.t | cut -d/ -f2- | sort | uniq -d
Of course, as mentioned, that doesn’t account for the possibility of newlines in file names. And trying to do so is awkward, since not all Unix utilities have switches to enable null- rather than newline-terminated records, uniq and cut among them.
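For what it's worth, newer GNU coreutils (8.25 and later, if I recall correctly) did grow a -z / --zero-terminated switch on sort, uniq, and cut, so a fully NUL-safe version of the pipeline in the comment above is possible there. A sketch, with hypothetical single-level directories dir1 and dir2 (the cut field count would change for deeper paths); BSD/macOS userlands generally lack these switches:

```sh
# NUL-safe variant; assumes GNU coreutils >= 8.25 for -z on cut,
# sort, and uniq. printf '%s\0' NUL-terminates each expanded name,
# cut strips the leading directory component, and uniq -zd keeps
# names seen in both directories. tr is for human-readable output
# only; drop it to feed the result to xargs -0.
printf '%s\0' dir1/*.t dir2/*.t |
  cut -z -d/ -f2- |
  sort -z |
  uniq -zd |
  tr '\0' '\n'
```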