The third part of Tim Bray's four-part "trilogy" on text processing concerns programming languages and text. For a while now Java has boasted Unicode... so it's interesting to hear about the wrinkles. These criticisms are relevant to Perl's Unicode handling as well:
print length("\N{LATIN CAPITAL LETTER A}\N{COMBINING ACUTE ACCENT}"), "\n";
prints 2, not 1: length counts code points, and this single visible character (Á) is built from two of them.
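For comparison, here is a minimal sketch of two ways to get the answer a reader would probably expect. It assumes a reasonably recent Perl 5; the variable names are just for illustration. The \X regex escape matches one extended grapheme cluster, and NFC from Unicode::Normalize (a core module) composes the pair into a single precomposed code point where one exists.

```perl
use strict;
use warnings;
use charnames ':full';            # needed for \N{...} names on older perls
use Unicode::Normalize qw(NFC);   # core module

my $s = "\N{LATIN CAPITAL LETTER A}\N{COMBINING ACUTE ACCENT}";

# length() counts code points: the base letter plus the combining mark.
my $code_points = length $s;

# Count grapheme clusters by counting \X matches in list context.
my $graphemes = () = $s =~ /\X/g;

# NFC composes A + COMBINING ACUTE ACCENT into U+00C1.
my $composed = length NFC($s);

print "$code_points $graphemes $composed\n";   # prints "2 1 1"
```

Normalizing only helps when a precomposed form exists; counting \X matches is the more general approach.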
Unicode is cool, but the issues are huge, especially for a programming language that is so text-oriented. I'm still amazed that the Perl 5 folks were able to graft Unicode onto Perl 5 without having to wait until Perl 6.