Catalyst, MySQL, SQLite, H::FH, UTF-8 and more

xsawyerx on 2009-12-22T14:28:58

This was originally posted on my new blogs.perl.org journal, which can be found here (that link is also its RSS feed).

I tried to update a live website with some changes. Generally, I run both a production and a testing environment. Recently, however, I moved the code from SQLite to MySQL and did not create a testing DB, so changes that involve changing the text on the site are done in production. Not good? I know!

The website is built with Catalyst. It originally used SQLite and was then migrated to MySQL (which had to be done manually). It uses HTML::FormHandler to display the forms, with a generic CRUD layer I added.

When trying to load the form, I got weird characters in parts of the page. From what I gathered, the data in MySQL is stored as latin1 rather than UTF-8, while we declare the page as UTF-8. The form wasn't being displayed in UTF-8 either (I changed that with "use utf8;" in the form's .pm file, or with Encode::Guess, which yielded a better result). David Wheeler has a really interesting article on UTF-8 in Perl here.
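For readers wondering what "weird characters" usually means in this kind of setup, here is a minimal sketch of one common way it happens: UTF-8 bytes get misread as latin1 and are then encoded a second time for a UTF-8 page. This is not necessarily what went wrong on my site; the sample string and the hex_bytes helper are made up for illustration, and only the core Encode module is used.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: show a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

# A character string with one non-ASCII character (U+00E9, e with acute).
my $text = "caf\x{e9}";

# Encoded as UTF-8, that one character becomes two bytes.
my $utf8_bytes = encode('UTF-8', $text);

# Misreading those bytes as latin1 turns each byte into its own character.
my $misread = decode('ISO-8859-1', $utf8_bytes);

# Encoding the misread string for a UTF-8 page doubles the damage.
my $double_encoded = encode('UTF-8', $misread);

print "correct: ", hex_bytes($utf8_bytes), "\n";
print "doubled: ", hex_bytes($double_encoded), "\n";
```

The second line of output has more bytes than the first, and a browser reading it as UTF-8 shows two junk characters where there should be one accented letter.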

When that didn't work, I decided to convert the entire database to UTF-8. I read the data using pure DBI and then, using DBIC with explicit utf8 column declarations, inserted each table into a newly created database whose tables use the UTF-8 charset with a UTF-8 collation. That didn't change anything. Apparently the latin1 was fine.

More investigation revealed that HTML::FormHandler::Render::Simple was decoding some of the data, which was screwing up a lot. Once that was fixed (AKA patched up), more things remained unclear. It was as if H::FH wasn't able to read the correct record from the database. Fetching it with $c->model('DB::Table')->search( { id => $id } ) worked just fine. Apparently the API in HTML::FormHandler changed, or it has a critical bug.

I should read the manual. There is the manual, tutorial and an intro for it. They are all very very long and complex. They give me a headache. Honestly, I'm more comfortable reading the Perl XS tutorial rather than the current synopsis or ANY documentation of HTML::FormHandler.

Alas, I'll have to get rid of HTML::FormHandler. It's become so cumbersome that I can't even be bothered to patch it any more. I don't want to reinvent the form wheel again (because I probably won't put a lot into it). Maybe I'll add a layer on top of FormFu.

I remember asking mst about forms in #catalyst a long while ago. He said that "forms suck.. some people find FormFu helps make them suck less." I think I understand it better now.


Starting points

ddick on 2009-12-22T22:19:09

It's difficult for me to tell what's happening, but suggestions follow:

  • decide that UTF-8 is going to be your encoding
  • read up on the my $decoded = Encode::decode('UTF-8', $encoded) and my $encoded = Encode::encode('UTF-8', $decoded) functions
  • encode strings just before they leave your application (just before print, DBI, whatever)
  • decode strings as soon as possible (just after read, my $encoded = $cgi->param($name), $sth->fetchrow, whatever)
  • make sure your MySQL databases are created with "CREATE DATABASE `$database_name` CHARACTER SET utf8";
  • make sure all of your elements contain 'accept-charset="UTF-8"'
  • write a test script to prove it all works
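As a rough illustration of the decode-early, encode-late advice above, here is a minimal sketch using only the core Encode module. The from_outside and to_outside helpers and the sample input are hypothetical names invented for this example; in a real application the bytes would come from a form parameter or a database fetch rather than a literal.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: decode bytes as soon as they enter the application.
# FB_CROAK makes malformed input die instead of being silently replaced.
sub from_outside {
    my ($bytes) = @_;
    return decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
}

# Hypothetical helper: encode characters just before they leave.
sub to_outside {
    my ($chars) = @_;
    return encode('UTF-8', $chars);
}

# Simulated input: the UTF-8 bytes for a five-letter word whose third
# letter is U+00EF (i with diaeresis), which takes two bytes.
my $raw_input = "na\xC3\xAFve";

my $name   = from_outside($raw_input);  # character string
my $shout  = uc $name;                  # string operations see characters
my $output = to_outside($shout);        # bytes again, ready to print

print "bytes in:   ", length($raw_input), "\n";
print "characters: ", length($name), "\n";
print "bytes out:  ", length($output), "\n";
```

Inside the application everything works on characters, so functions like length and uc behave as expected; only the two boundary helpers ever deal with bytes.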

Re:Starting points

ddick on 2009-12-22T22:21:21

Dammit, I meant to say

  • make sure all of your <form> elements contain 'accept-charset="UTF-8"'

That's not what bugs me...

xsawyerx on 2009-12-23T17:04:17

Apparently that didn't cause all the havoc. The havoc was caused by a breakage in the API.

What bugs me isn't even that. It's that it's so incredibly complex that it would take me a month just to read the manual, and that I have no way of writing really fast CRUDs, which are the core of my web apps.

I still appreciate the comments, and I'll definitely take them under advisement. Thanks!