Content-Type

davorg on 2004-03-30T12:58:57

This week I am mostly hating email programs that send out emails with the wrong charset listed in the Content-Type header.

The most common error I see is

Content-Type: text/plain; charset=iso-8859-1

When actually it should be

Content-Type: text/plain; charset=Windows-1252

If an email has the right charset then mutt will display it properly, but if it's wrong then I get a nasty mess and the mail is far harder to read.

It's therefore ironic that the worst offender is a regular email I get from a company that advises on small company PR :)


Sounds like a job for The Demoroniser

ajt on 2004-03-30T15:37:35

Sounds like another job for The Demoroniser... Or an equivalent tool.

Re:Sounds like a job for The Demoroniser

davorg on 2004-03-30T15:51:23

Actually in this case, the email isn't HTML (I don't read HTML email), it's plain text.

It's probably a programmer somewhere who thinks that iso-8859-1 and Windows-1252 are the same thing.
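To make the difference concrete, here is a minimal Python sketch (the sample bytes are made up for illustration, not taken from the offending email). The two charsets agree everywhere except bytes 0x80-0x9F, which Windows-1252 uses for curly quotes, dashes and the trademark sign, and which iso-8859-1 treats as invisible C1 control codes.

```python
# Hypothetical sample: bytes as a Windows mailer might emit them.
# 0x92 = right single quote, 0x93/0x94 = curly double quotes,
# 0x96 = en dash, 0x99 = trademark sign (all Windows-1252 only).
raw = b"It\x92s \x93quoted\x94 \x96 Acme\x99"

as_cp1252 = raw.decode("cp1252")      # what the sender meant
as_latin1 = raw.decode("iso-8859-1")  # what the header claims

print(as_cp1252)        # readable punctuation
print(repr(as_latin1))  # same positions hold C1 control characters

# Outside 0x80-0x9F the two charsets decode identically.
high = bytes(range(0xA0, 0x100))
same_above_a0 = high.decode("cp1252") == high.decode("iso-8859-1")
```

A terminal that trusts the header has nothing sensible to draw for those control codes, which is presumably where the mess comes from.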

Re:Sounds like a job for The Demoroniser

ajt on 2004-03-31T07:46:01

Fair point. You can probably assume that the main problem is when Windows sends out a Windows encoding, say cp-1252, but lies and calls it iso-8859-1.

One possible solution is to scan any iso-8859-1 files, looking for any diagnostic control-code points (Demoroniser has some suggestions for that), and then ask a recoding program to convert this to a civilised scheme, or correctly set the Content-Type.
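A rough sketch of that scan in Python, assuming the message body is already available as raw bytes. The function names `guess_charset` and `to_utf8` are made up for this example, and the heuristic is only that: bytes 0x80-0x9F are C1 control codes in iso-8859-1 and should almost never turn up in genuine text, so finding one suggests the data is really Windows-1252.

```python
import re

# Bytes 0x80-0x9F: control codes in iso-8859-1, mostly printable
# characters (quotes, dashes, trademark sign) in Windows-1252.
C1_BYTES = re.compile(rb"[\x80-\x9f]")

LATIN1_NAMES = ("iso-8859-1", "latin-1", "latin1")


def guess_charset(body: bytes, declared: str = "iso-8859-1") -> str:
    """Return the declared charset, unless it says iso-8859-1 and the
    body contains bytes that only make sense as Windows-1252."""
    if declared.lower() in LATIN1_NAMES and C1_BYTES.search(body):
        return "windows-1252"
    return declared


def to_utf8(body: bytes, declared: str = "iso-8859-1") -> bytes:
    """Recode the body to UTF-8 using the guessed charset."""
    charset = guess_charset(body, declared)
    # errors="replace" covers the few byte values that Windows-1252
    # leaves undefined (for example 0x81), which would otherwise raise.
    return body.decode(charset, errors="replace").encode("utf-8")
```

For example, `guess_charset(b"Acme\x99")` returns `"windows-1252"`, while a body containing only ordinary accented letters such as `b"caf\xe9"` keeps its `"iso-8859-1"` label. Whether to rewrite the Content-Type header or recode the body is a separate choice; this sketch only does the latter.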

I have to deal with this all the time: users think that Windows is "correct", then they output a document, incorrectly tagged, and when they don't see a trademark symbol (™) or micron symbol (µ) on the web site it's my fault! Yesterday several hours were wasted because of the lies that Windows tells...

Re:Sounds like a job for The Demoroniser

pudge on 2004-03-31T17:10:06

It's probably a programmer somewhere who thinks that iso-8859-1 and Windows-1252 are the same thing.

It's probably some programmer somewhere who has no idea what charsets are, or didn't bother caring enough to check the output, and just used someone else's code (either an existing library or copy/paste).

Re:Sounds like a job for The Demoroniser

davorg on 2004-03-31T17:57:24

You're probably right. And if they did check the output then it was probably in something like Outlook which makes the same broken assumptions.