Lingua::Identify v.0.15

cog on 2005-08-03T13:51:30

Months pass without anything happening with a module, and then, on the same day:

* I add some more features to respond to a friend's request, and then...
* I get an email from someone wanting to help, and then...
* I see another module with the same purpose being uploaded.

All in the same day :-)

Lingua::Identify is now on version 0.15 (on its way to CPAN), with support for big files working properly (I hope), 7 new languages (now up to 33, thanks to George Wilson) and a dummy-mode :-)

The dummy-mode helps me a lot while developing. You tell the module "but do this in dummy mode" and it prepares everything but, instead of analyzing the text, returns you what it prepared, so you can see what's under the hood. Great for debugging :-)

I also told the author of the other module about mine. I hope he finds it interesting :-)


Interesting ... how about Jèrriais ?

n1vux on 2005-08-04T23:23:29

This could be quite useful, especially for web-spiders. Those machine translation websites could stop asking *me* to guess what the source language was to begin with.

If you want to add obscure European languages, samples of Jersey-Norman French, aka Jèrriais, the native language of Jersey, queen of the Channel Isles, are available at ("site for language and literature in Jèrriais, the traditional language of Jersey, Channel Islands;" includes "poems, texts, Jèrriais-English vocabulary lists, pages of traditional sayings and proverbs"). Jèrriais does not have an ISO language code, but they do have JE as a ccTLD for the island, since they are a semi-sovereign Crown territory (Bailiwick), and that is an ISO 3166-1 "UPU Reserve" code for Jersey. (State Heritage). Jèrriais is a blend of Norse and French, with less modern French influence than found in Normandy in recent times. Someone should complain to the ISO 639 maintainers about adding JE=Jèrriais to their list ... I guess the Société Jersiaise could do that.

Re:Interesting ... how about Jèrriais ?

cog on 2005-08-05T15:22:04

This could be quite useful

This *is* quite useful :-)

And, moreover, it's one of the best tools out there for the job: most of its competitors aren't free, most don't have a nice range of languages, and almost all of them are rather boring when it comes to flexibility. They don't allow you to select methods, they simply implement one! Most don't allow you to teach new languages to the tool, most will only work on a specific platform, and some of them won't even allow you to select which languages are active and which are not.

The market for this kind of tool is not that big, apparently, but from time to time I get an email from somebody using it, and they've only said good things about it so far :-)