I needed to access iPhoto's AlbumData.xml file in order to read my iPhoto database.
Mac::iPhoto did not work for me; it always failed while loading the XML file (now I know why).
Mac::PropertyList would work after massaging the XML data before handing it off to parse_plist. However, the solution was painfully slow: my 600KB AlbumData.xml file took more than 30 seconds to load and parse on a Mac Mini.
So I looked into other ways to load the data from AlbumData.xml. Using PerlObjCBridge, I implemented some code based on NSPropertyListSerialization. I'm happy to report that the 30 seconds have turned into 3 seconds. That's better ...
Note that the data returned from plistToHash and loadiPhotoDB are different! Some data types are still missing from the plistTraverse() sub: it only handles dictionaries, arrays, strings, numbers and booleans.
--------------------------------------------------
use strict;
use Foundation;
use Mac::PropertyList;
use Time::HiRes qw{gettimeofday tv_interval};
use constant XML => qq{$ENV{HOME}/Pictures/iPhoto Library/AlbumData.xml};
my $t0 = [gettimeofday];
my $hash=plistToHash(XML);
my $elapsed = tv_interval($t0);
print "using plistToHash = $elapsed\n";
$t0 = [gettimeofday];
$hash=loadiPhotoDB(XML);
$elapsed = tv_interval($t0);
print "using Mac::PropertyList = $elapsed\n";
sub plistToHash {
my($filename)=@_;
my $data=NSData->dataWithContentsOfFile_($filename);
return undef unless($data);
my $plist=NSPropertyListSerialization->propertyListFromData_mutabilityOption_format_errorDescription_($data,0,undef,undef);
return undef unless($plist);
my %dict=();
return plistTraverse(\%dict,$plist,'dict',0);
}
sub plistTraverse {
my($dest,$src,$type,$depth)=@_;
my $e=($type eq 'dict')?$src->keyEnumerator():$src->objectEnumerator;
while(my $next = $e->nextObject()) {
last unless($$next);
my $obj=($type eq 'dict')?$src->objectForKey_($next):$next;
my $class=$obj->className->cString();
my $keyString=($type eq 'dict')?$next->cString:"";
if($class =~ /dictionary$/i){
my %dict=();
my $sub=plistTraverse(\%dict,$obj,'dict',$depth+1);
if($type eq 'dict') {
$dest->{$keyString}=$sub;
}else{
push(@$dest,$sub);
}
}elsif($class =~ /array$/i){
my @array=();
my $sub=plistTraverse(\@array,$obj,'array',$depth+1);
if($type eq 'dict') {
$dest->{$keyString}=$sub;
}else{
push(@$dest,$sub);
}
}elsif($class =~ /string$/i){
if($type eq 'dict') {
$dest->{$keyString}=$obj->cString;
} else {
push(@$dest,$obj->cString);
}
}elsif($class =~ /number$/i){
if($type eq 'dict') {
$dest->{$keyString}=$obj->doubleValue;
} else {
push(@$dest,$obj->doubleValue);
}
}elsif($class =~ /boolean$/i){
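# boolValue may come back as a number (1/0) rather than the string 'YES',
# so accept any true value except the literal 'NO'
my $bool=$obj->boolValue;
$bool=($bool && $bool ne 'NO')?1:0;
if($type eq 'dict') {
$dest->{$keyString}=$bool;
} else {
push(@$dest,$bool);
}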
} else {
print STDERR "**** unhandled class: $class\n";
}
}
return $dest;
}
sub loadiPhotoDB {
my($catalogPath)=@_;
my $xml;
open(CATALOG, $catalogPath) || return undef;
{local $/=undef;$xml=<CATALOG>;}
# Mac::PropertyList is pretty strict about what it expects to
# see in the XML file. We are trimming the file before handing
# it off to parse_plist
$xml =~ s{^.*
my $dict=Mac::PropertyList::parse_plist($xml);
return $dict;
}
Re:Seems odd, partly...
thoellri on 2005-06-14T17:35:39
Brian - I just ran t/time.t and I see this:
t/time.........Elapsed time is 0.021996
t/time.........ok
Pretty consistently at that value.
I see you released 1.23 a few days ago - is it worth testing it with the newer release?
I also ran the sample through Devel::DProf and here's what I see:
macbox:~/tmp thoellri$ perl -d:DProf plist2.pl
using Mac::PropertyList = 32.427314
macbox:~/tmp thoellri$ dprofpp
Total Elapsed Time = 31.58248 Seconds
User+System Time = 31.01248 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
61.6 19.12 31.044 11456 0.0017 0.0027 Mac::PropertyList::read_next
34.3 10.65 31.044 708 0.0151 0.0438 Mac::PropertyList::read_dict
2.20 0.682 0.883 10729 0.0001 0.0001 Mac::PropertyList::Scalar::new
0.65 0.201 0.201 10729 0.0000 0.0000 Mac::PropertyList::Item::new
0.64 0.198 0.761 6510 0.0000 0.0001 Mac::PropertyList::read_string
0.40 0.123 8.925 13 0.0095 0.6866 Mac::PropertyList::read_array
0.32 0.100 31.154 1 0.1000 31.153 main::loadiPhotoDB
0.13 0.039 0.233 2797 0.0000 0.0001 Mac::PropertyList::read_real
0.08 0.024 0.151 1422 0.0000 0.0001 Mac::PropertyList::read_integer
0.03 0.010 31.054 1 0.0100 31.053 Mac::PropertyList::parse_plist
0.03 0.010 0.010 2 0.0050 0.0050 DynaLoader::BEGIN
0.03 0.010 0.010 3 0.0033 0.0033 vars::BEGIN
0.03 0.010 0.029 4 0.0024 0.0072 main::BEGIN
0.00 - -0.000 1 - - DynaLoader::dl_install_xsub
0.00 - -0.000 1 - - Time::HiRes::bootstrap
macbox:~/tmp thoellri$ perl -MMac::PropertyList -e 'print $Mac::PropertyList::VERSION,qq{\n};'
1.21
Looking at this code in Mac::PropertyList::parse_plist
...
# we can handle either 0.9 or 1.0
$text =~ s|^<\?xml.*?>\s*<!DOC.*>\s*<plist.*?>\s*||;
$text =~ s|\s*</plist>\s*$||;
...
you can see that the parser will fail to remove the plist wrapper when there is no "DOCTYPE" declaration. Well, my AlbumData.xml file (written by iPhoto) does not have a "DOCTYPE" declaration, so the first substitution never matches and the first "read_next" fails because it does not see what it expects to see.
By removing the wrapper myself before calling parse_plist I can avoid that problem.
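To make the failure concrete, here is a minimal sketch in plain Perl (no Mac::PropertyList needed). strip_strict copies the two substitutions quoted above; strip_tolerant is a hypothetical variant that treats the DOCTYPE declaration as optional. Both sub names and the two sample documents are made up for illustration, and this is not necessarily how the module should or will be fixed.

```perl
use strict;
use warnings;

# Wrapper stripping as quoted from parse_plist: it insists on a <!DOC...> line.
sub strip_strict {
    my ($text) = @_;
    $text =~ s|^<\?xml.*?>\s*<!DOC.*>\s*<plist.*?>\s*||;
    $text =~ s|\s*</plist>\s*$||;
    return $text;
}

# Hypothetical tolerant variant (an assumption, not the module's code):
# the DOCTYPE declaration may be present or absent.
sub strip_tolerant {
    my ($text) = @_;
    $text =~ s|^\s*<\?xml.*?\?>\s*(?:<!DOCTYPE[^>]*>\s*)?<plist[^>]*>\s*||;
    $text =~ s|\s*</plist>\s*$||;
    return $text;
}

# Two made-up sample documents that differ only in the DOCTYPE line.
my $body = "<dict>\n<key>Major Version</key>\n<integer>1</integer>\n</dict>";
my $xml_decl = qq{<?xml version="1.0" encoding="UTF-8"?>\n};
my $doctype  = qq{<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" }
             . qq{"http://www.apple.com/DTDs/PropertyList-1.0.dtd">\n};
my $plist    = qq{<plist version="1.0">\n$body\n</plist>\n};

my $with_doctype    = $xml_decl . $doctype . $plist;
my $without_doctype = $xml_decl . $plist;

# Report whether each stripper leaves exactly the bare <dict> body behind.
for my $case ([ 'with DOCTYPE', $with_doctype ], [ 'without DOCTYPE', $without_doctype ]) {
    my ($label, $xml) = @$case;
    printf "%-16s strict: %-12s tolerant: %s\n", $label,
        (strip_strict($xml)   eq $body ? 'stripped' : 'NOT stripped'),
        (strip_tolerant($xml) eq $body ? 'stripped' : 'NOT stripped');
}
```

With the tolerant pattern both samples reduce to the bare dict body, which is presumably what the first read_next wants to see.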
Feel free to steal as much code as you want - that's why I posted it here ;-)
Re:Seems odd, partly...
brian_d_foy on 2005-06-14T18:30:12
Okay, good to know. I'll fix up the parser.
The newest version is a fix by Mike Ciul that made things a little bit faster for very large files. It might help.
What I really need to do is fix up Mike's enhancement so it can deal with files without reading them all in at once. That should be easy, but it's in line after all the other easy things. :)
After that, I need to add the Foundation stuff (or something similar) so the Mac users don't have to suffer the portability penalty.
Thanks again :)