Forgotten Solutions: Blocking out the pain.

jk2addict on 2007-11-01T19:06:31

Ever solve a problem that was so painful you just blocked it out of your memory? :-)

I had one of those moments today. One of my co-workers was having problems with an XML file on win32 that he was getting from an HP system. Then the painful memories kicked in, and now that I've revisited the solution I came up with 2 years ago, I can't believe I ever figured it out [with the help of some Googlers as well].

The issue then was that the web took in latin encoded posts, and we were sending that buffer over the network to a main system that lived and breathed HPRoman8. We were on .NET 1.1 at the time and there was no native way to convert, so I put together a character map of utf->hproman8 and an Encoding class for .NET so I could map data on the way out.

All because of some people with diacritic marks in their names. :-)

Now that I know WHY the XML file is broken, hopefully we can convince MSXML to do our bidding.

I've said it before, and I'll say it again. There are three things that I loathe in this world of programming: codepages/encodings, time/zone conversions, and USPS/ISO country code issues.


Dan Sugalski says…

Aristotle on 2007-11-01T22:36:21

“I swear, text will be the death of me.”

EZ decoding in perl

runrig on 2007-11-01T22:43:31

I went through this when we were receiving a latin1 encoded XML file, and I needed to compare it to what got loaded into the database, which when fetched, was hp-roman8...so, you probably already know this...but I did something like:

use Encode qw(decode);
...
  open(my $from_db_h, ">:encoding(latin1)", $file) or die "Err opening $file: $^E";
...
    print $from_db_h decode( 'hp-roman8', join("|", @row{@columns})."\n");

Re:EZ decoding in perl

jk2addict on 2007-11-01T23:21:11

Thank god for iconv. MSXML doesn't like hp roman8, so a quick iconv on the file before loading it as utf does the trick for our xml input file problem.
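
For anyone without iconv handy, the same re-encoding step can be sketched in a few lines of Python. This is only an illustration, not what was actually run here: the function name roman8_to_utf8 is made up for the example, and it assumes the whole input really is HP Roman-8. If the XML prolog declares an encoding, that declaration may also need updating to match, which this sketch does not do.

```python
def roman8_to_utf8(data: bytes) -> bytes:
    """Re-encode HP Roman-8 bytes as UTF-8 (hypothetical helper)."""
    # "hp_roman8" is the codec name in Python's standard library.
    text = data.decode("hp_roman8")
    return text.encode("utf-8")


if __name__ == "__main__":
    # Simulate a name with a diacritic as the HP system would send it.
    sample = "José".encode("hp_roman8")
    print(roman8_to_utf8(sample).decode("utf-8"))
```

For a real file, read it in binary mode, pass the bytes through the function, and write the result back out in binary mode so nothing else touches the encoding along the way.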

Re:EZ decoding in perl

jhi on 2007-11-10T02:16:43

You know of piconv, right?