this is my receipt for your receipt

nicholas on 2009-07-30T15:10:44

So, this morning Elizabeth Sophia Pavlović was officially registered at the Cambridge Registry Office. "'Elizabeth' is a good name", said the registrar, but when she signed the certificate it became obvious that maybe there was a reason why she was biased. :-)

Curiously, we didn't have to do that much data entry as the details had already been sent electronically from The Rosie. The standards for "place of birth" are not quite consistent. She had to look mine up in a ring binder of typewritten pages ("Where is St Pancras now?" - answer, "St Pancras, Camden"), whereas Andrea was simply "Austria", even though we'd said Vienna.

I was quite surprised that they can do accents. Even accents not in ISO-8859-1. Let's see how the rest of the UK's IT infrastructure can cope. However, the registrar explained that accents were important. They've had more than one Polish name registered, but without the accent. The documentation is then sent off to get a Polish passport for the baby, and is rejected, because the name doesn't match - accented letters are not the same as unaccented letters. Austria seems to be more forgiving, but likely in this case it's because it's actually a Croatian surname, ć not being in the German alphabet.

Also, it turns out that "are you married?" is an important question. You might have thought reading out legal disclaimers came in with the F.S.A., but this one dates from 1973. However, she did joke "do you want to get married? I can do you a double." Unlike registering a birth, I don't think that that's free, though.

Next up, a passport.


Accents

rafael on 2009-07-30T17:26:29

I've been surprised at how consistently my name turns up on official documents here in France. The spelling Rafaël is completely abnormal, of course, and no-one ever spells it correctly (even I can't be bothered to spell it correctly most of the time), but on my passport it's right.

I remember when I registered François last year, I got asked quite precise questions about the spelling: a dash or no dash between Garcia and Suarez? They care about that kind of stuff.

Re:Accents

rafael on 2009-07-30T17:28:34

Bah apparently the use.perl comment boxes are not friends with my browser :/

Those were, in order:

00EB LATIN SMALL LETTER E WITH DIAERESIS

00E7 LATIN SMALL LETTER C WITH CEDILLA

Re:Accents

Aristotle on 2009-07-30T22:05:32

To spell Rafaël and François properly you need to entity-encode the, uh, extravagant characters: use.perl is a Latin-1 Only Zone. Quelle bêtise…

Re:Accents

srezic on 2009-07-31T06:12:58

Hehe, I would say ASCII-only. Rafael's accents perfectly fit in the latin-1 charset.
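A quick way to check which of the characters in this thread fit in which charset is to try encoding them. This is only a sketch in Python; the helper name `fits` is made up for illustration.

```python
def fits(ch: str, charset: str) -> bool:
    """Return True if ch can be encoded in the given charset."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

# e with diaeresis, c with cedilla, c with acute
for ch in "\u00eb\u00e7\u0107":
    print(f"U+{ord(ch):04X}",
          "latin-1:", fits(ch, "iso-8859-1"),
          "ascii:", fits(ch, "ascii"))
```

The first two fit in Latin-1 but not in ASCII; the ć (U+0107) fits in neither.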

Re:Accents

nicholas on 2009-07-31T08:25:20

I'm suspecting browser character set headers on the form submission, because I can paste a literal ć in no problem. It looks like his browser sent UTF-8, but either described it as ISO-8859-1, or didn't say, resulting in the far end treating it as ISO-8859-1.
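If that guess is right, the effect can be reproduced by encoding text as UTF-8 and decoding the same bytes as ISO-8859-1. A minimal sketch in Python, assuming that is what happened; the function name `misread_as_latin1` is invented for the example.

```python
def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the same bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")

# "Rafael" with U+00EB (e with diaeresis): one character becomes two
garbled = misread_as_latin1("Rafa\u00ebl")
print(len("Rafa\u00ebl"), len(garbled))
print([f"U+{ord(c):04X}" for c in garbled])
```

Each accented letter takes two bytes in UTF-8, so the far end sees two Latin-1 characters where one was typed.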

Ho ho ho. When that ć comes back to me on preview, the HTML source has turned into &#263;.

Which reminds me. Currently, does pod2text use man as an intermediate step when generating its output?

Re:Accents

srezic on 2009-07-31T19:44:09

The initial problem is that the use.perl.org pages declare iso-8859-1
as their charset. So form data also has to be sent as iso-8859-1. Maybe
a browser shouldn't accept any non-latin1 characters when entering or
pasting data into form fields, but at least gecko-based browsers
don't do this. To do something with non-latin1 characters,
gecko-based browsers on Unix systems seem to use this heuristic:

* codepoints below 256 are fine

* if there are codepoints in the 0x80-0x9f range of win1252, then they
    are sent like this (try LATIN CAPITAL LETTER S WITH CARON for a test)

* every other codepoint is sent as a numerical HTML entity
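The heuristic above can be sketched roughly like this in Python. It is a guess at the observed behaviour, not actual browser code: the function name `encode_for_latin1_form` is made up, and it assumes that "sent like this" means the single win1252 byte is sent.

```python
def encode_for_latin1_form(text: str) -> bytes:
    """Rough sketch of the observed heuristic (not real browser code)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 256:
            # codepoints below 256 are sent as a single byte
            out.append(cp)
            continue
        try:
            b = ch.encode("cp1252")
        except UnicodeEncodeError:
            b = b""
        if len(b) == 1 and 0x80 <= b[0] <= 0x9F:
            # assumption: characters in the 0x80-0x9f range of win1252
            # are sent as that win1252 byte
            out += b
        else:
            # everything else becomes a numeric HTML entity
            out += f"&#{cp};".encode("ascii")
    return bytes(out)

print(encode_for_latin1_form("\u00eb"))  # e with diaeresis
print(encode_for_latin1_form("\u0160"))  # S with caron
print(encode_for_latin1_form("\u0107"))  # c with acute
```

Under these assumptions the ć (U+0107) is the only one of the three that comes out as an entity.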

About pod2text: no, *pod2text* does not use man, but *perldoc* uses
pod2man by default. The plan was to fix pod2text encoding issues (there
are still some, but they are fixable, in contrast to pod2man) and then
to use something like Pod::Text::Overstrike or Pod::Text::Termcap
instead of Pod::Man.

I have just created and uploaded
Pod-Perldoc-ToTextTermcap-0.00_50.tar.gz to CPAN. Just install it and
set

    export PERLDOC=-MPod::Perldoc::ToTextTermcap

or

    export PERLDOC=-MPod::Perldoc::ToTextOverstrike

and perldoc will use the new renderer. It looks somewhat different
from man output, but at least bold and underline are done (unlike with
stock Pod::Perldoc::ToText).