Everyone I talk with says that mucking with the XML exported by Informatica's PowerCenter is not recommended, or that they tried and failed...so I had no choice but to ignore that advice when we were upgrading from version 7 to 8. I think the advice must be from people who don't quite know what they're doing, and/or "parse" with grep/awk/sed.
I've been messing with more XML in the last few weeks than I have in the last few years. I have my own (shared) opinion about Informatica/ETL (and I've taken up Aristotle's call to action), but at least it provides an opportunity to practice some XML-fu. Most transformations were simple changing of some attributes, but there was one issue where after importing into v8, if you delete a group from a Union transformation, the GUI crashes. So I created a Union transformation from scratch, exported it, and compared it to what I was importing, and hey, there was some stuff missing! So I wrote the following:
# Fix Union transformations my $union_cnt; for my $trans ( $root->findnodes(q[ //TRANSFORMATION[@TYPE="Custom Transformation" and @TEMPLATENAME="Union Transformation"] ]) ) { $union_cnt++; my $name = $trans->getAttribute('NAME'); my $parent = $trans->parentNode(); print "X: Fixing Union transformation $name\n"; my @output; for my $field ( $trans->findnodes(q[ TRANSFORMFIELD[@GROUP="OUTPUT"]/@NAME ]) ) { push @output, $field->value(); } my %dep; my $dep_cnt; for my $field ( $trans->findnodes(q[ TRANSFORMFIELD[@PORTTYPE="INPUT"] ]) ) { my $name = $field->getAttribute('NAME'); my $group = $field->getAttribute('GROUP'); my $dep_group = $output[$dep{$group}++]; my $new = $trans->addNewChild( '', 'FIELDDEPENDENCY' ); $new->setAttribute( 'INPUTFIELD', $name ); $new->setAttribute( 'OUTPUTFIELD', $dep_group ); } } $_->unbindNode() for $root->findnodes('//text()');
Warning: this code is not endorsed or guaranteed by anyone for anything!
The removing of all text nodes was so that the result would stay pretty-printed after output (is there a better/easier way?):eval { $doc->toFile($file, 1) } or die "Could not write to $file: $@";And there are no text nodes with anything but whitespace anyway. XML::LibXML seemed hard to use at first, but once you get used to how the docs are arranged (and learn some XPath), it's quite easy.