While I was working on the module Win32::SAPI5, I noticed that the voices by Cepstral and Fluency all returned 'US English' as their language code. Of course I sent these vendors an email telling them that they should change that.
Arthur Dirksen of Fluency let me know that he knew about this, but he just couldn't find anything in the SAPI5 SDK about what exactly he should put in there, since the 'ordinary' MS language codes didn't work. I knew I had read something about it in the documentation... but where??
Eventually, under the chapter 'SAPI 5 XML Tags' (of all places!) I found this:
In the Windows Registry, the language attribute for the Microsoft SAPI 5 English voices is labeled as '409;9'. The '409' attribute information indicates the voice is specifically US English, and '9' refers to the English language. This language labeling convention for voices may not be followed by all engine manufacturers. For example, the LH voices may use '409' to indicate an English voice, while Microsoft uses '409;9' to specify the voice is specifically US English.
For example:
409;9 = US English
809;9 = British English
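
To make the format a bit more concrete, here is a rough sketch in Python (not part of Win32::SAPI5 or the SDK) of how such a value could be picked apart. It assumes the parts are hexadecimal Windows language identifiers separated by semicolons, which the quoted text implies but does not spell out. The function names are made up for illustration.

```python
def parse_language_attribute(value):
    """Split a value such as '409;9' into a list of integers.

    Assumption: each part is a hexadecimal Windows language identifier.
    """
    return [int(part, 16) for part in value.split(";") if part.strip()]


def primary_language(langid):
    """Return the primary language ID (the low 10 bits of a language identifier)."""
    return langid & 0x3FF


def sublanguage(langid):
    """Return the sublanguage ID (the bits above the primary language ID)."""
    return langid >> 10


for attribute in ("409;9", "809;9"):
    for langid in parse_language_attribute(attribute):
        print(attribute, hex(langid), primary_language(langid), sublanguage(langid))
```

Under that assumption, '409' and '809' share the primary language 9 (English) and differ only in the sublanguage, which would explain why the bare '9' is listed after them as the generic English entry.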
Not only is it rather illogical to document this in that chapter alone, it's also strange to use such an unusual format for a language code.
Anyway, he now knows how to correct it. And oh, pVoice now works with all SAPI4 and SAPI5 compliant voices (which is about 95% of all speech engines out there, I guess).