PHP Scripts and Tools

Encoding Conversion Functions

These days I keep two text files for each language: one encoded in UTF-8 and one in the traditional character encoding of the text. In the past I kept a single, traditionally encoded language file and converted it to UTF-8 on the fly, but that turned out to be too much of a hassle to maintain. Nevertheless, the recoding functions, which I translated into PHP from the original Perl module NexTrieve::UTF8, turned out to be useful in their own right.

PHP does have functions that convert between arbitrary character encodings with ease, such as iconv() and mb_convert_encoding(). However, because they are new and/or require extensions that must be compiled in, they aren't always available to everyone. The functions below rely instead on basic, low-level, byte-by-byte conversion.
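
To show what byte-by-byte conversion involves, here is a minimal sketch of the core step: turning one Unicode code point into its UTF-8 byte sequence using only chr() and bit operations. The function name codepoint_to_utf8 is my own for illustration and is not taken from the scripts or from NexTrieve::UTF8. It assumes code points in the Basic Multilingual Plane, which covers all three encodings listed here.

```php
<?php
// Hypothetical helper (not from the original scripts): encode one Unicode
// code point from the Basic Multilingual Plane as a UTF-8 byte string.
function codepoint_to_utf8($cp)
{
    if ($cp < 0x80) {
        // 0xxxxxxx: ASCII passes through as a single byte.
        return chr($cp);
    }
    if ($cp < 0x800) {
        // 110xxxxx 10xxxxxx: two bytes.
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    // 1110xxxx 10xxxxxx 10xxxxxx: three bytes.
    return chr(0xE0 | ($cp >> 12))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo bin2hex(codepoint_to_utf8(0x41)), "\n";   // 41
echo bin2hex(codepoint_to_utf8(0xE9)), "\n";   // c3a9
echo bin2hex(codepoint_to_utf8(0x0E01)), "\n"; // e0b881
```

Converting a single-byte encoding then comes down to looking up the code point for each input byte and appending the bytes this helper returns.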

ISO-8859-9 to UTF-8
This simple function recodes an ISO-8859-9 (Turkish) encoded string into UTF-8. I used this function for the Turkish language file distributed with the Orca Ringmaker.
ISO-8859-2 to UTF-8
This simple function recodes an ISO-8859-2 (Polish) encoded string into UTF-8. I used this function for the Polish language file distributed with the Orca Knowledgebase.
Win-874 to UTF-8
This simple function recodes a Windows-874 (Thai) encoded string into UTF-8. I used this function for the Thai language file distributed with the Orca Forum.
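
For readers who want a feel for how such a function can look, the following is a rough sketch of an ISO-8859-9 to UTF-8 converter. It is not the distributed function. The name iso8859_9_to_utf8 and the table layout are my own assumptions. It relies on the fact that ISO-8859-9 is identical to ISO-8859-1 except for six Turkish letters, and that ISO-8859-1 byte values equal their Unicode code points.

```php
<?php
// Hypothetical sketch, not the distributed function: recode an ISO-8859-9
// string to UTF-8 one byte at a time.
function iso8859_9_to_utf8($text)
{
    // ISO-8859-9 matches ISO-8859-1 except at these six byte values.
    $turkish = array(
        0xD0 => 0x011E, // Ğ
        0xDD => 0x0130, // İ
        0xDE => 0x015E, // Ş
        0xF0 => 0x011F, // ğ
        0xFD => 0x0131, // ı
        0xFE => 0x015F, // ş
    );

    $out = '';
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        $byte = ord($text[$i]);
        if ($byte < 0x80) {
            // ASCII bytes are already valid UTF-8.
            $out .= $text[$i];
            continue;
        }
        // Use the Turkish override if there is one, else the Latin-1 value.
        $cp = isset($turkish[$byte]) ? $turkish[$byte] : $byte;
        // Every remaining code point is below U+0800, so two bytes suffice.
        $out .= chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    return $out;
}

// "\xDD" is the dotted capital I in ISO-8859-9.
echo iso8859_9_to_utf8("\xDDstanbul"), "\n";
```

The same pattern would apply to the other two encodings with a larger lookup table. Thai code points lie above U+0800, so a Windows-874 version would need three-byte sequences rather than two.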