 Setting a site to work with UTF-8 
Description: Setting phpBB up to work with UTF-8 requires several changes besides the charset setting.
Author: pichirichi
Date: Fri Jun 17, 2005 10:29 pm
Type: Tutorial
Keywords: utf-8 multi byte multilingual
Category: Installing/upgrading/converting
I've noticed many questions about UTF-8, and several not-so-accurate tips, on the support forum. I've gathered several points that need to be taken into consideration when making the switch.

My interest in UTF-8 began when my hosting provider "forced" me to upgrade my site: the MySQL database was upgraded and all the data was moved to a new charset, UTF-8.

Here are some tips regarding UTF-8/Unicode:

  1. Edit all the lang files and save them as UTF-8 (you can use UltraEdit's convert option for that).
  2. Edit all the e-mail templates located in the lang directory, replace the charset line, and save them as UTF-8 (UltraEdit works here as well).
    Code:
    Charset: utf-8

    There is an issue with the subject line (some characters disappear; I'm still looking for a resolution).
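
For steps 1 and 2, the conversion itself can also be scripted. The following is a minimal sketch rather than a tested tool: it assumes the files are currently ISO-8859-1 (check the ENCODING value in your own language pack), it needs the mbstring extension, and the function name to_utf8() is made up for this example.

```php
<?php
/**
 * Return $bytes as UTF-8.
 * Assumption: anything that is not already valid UTF-8 is in $source_encoding.
 */
function to_utf8($bytes, $source_encoding = 'ISO-8859-1')
{
    // Leave text alone if it is already valid UTF-8, so a file is never converted twice.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    return mb_convert_encoding($bytes, 'UTF-8', $source_encoding);
}

// Usage on a file (the path is only an example):
// $file = 'language/lang_english/lang_main.php';
// file_put_contents($file, to_utf8(file_get_contents($file)));
```

Checking for valid UTF-8 first means the script can be run again over a directory that is already partly converted.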
  3. You need to convert the database data to UTF-8.
    This can be done in a few ways:

    1. Lock the forum in the ACP so nothing is updated, export the data, convert the exported file to UTF-8 (using an editor or a conversion program), then load the converted data back in.
    2. MySQL commands - version 4.1.x comes with several new features for character set handling.
      A CHARACTER SET and a COLLATE should be defined for each text column.
      Code:
      ALTER TABLE `test table` CHANGE `a` `a` VARCHAR( 10 ) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL

      This command converts the data stored in the column. There is also a batch command that can be used for this process.
    3. You can find a few UTF-8 converters in the "Convertors" forum.
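
The batch route mentioned in option 2 can be sketched as below. This assumes MySQL 4.1 or later, where ALTER TABLE ... CONVERT TO CHARACTER SET converts every text column of a table in one statement, and it assumes the existing data is stored correctly under the columns' current character set. The helper name is made up, and the table names are only examples (the phpbb_ prefix may differ on your board). Back up the database first.

```php
<?php
/**
 * Build one conversion statement per table.
 */
function build_convert_statements($tables, $charset = 'utf8', $collation = 'utf8_unicode_ci')
{
    $statements = array();
    foreach ($tables as $table) {
        // Double any backtick so the table name stays a single quoted identifier.
        $quoted = '`' . str_replace('`', '``', $table) . '`';
        $statements[] = "ALTER TABLE $quoted CONVERT TO CHARACTER SET $charset COLLATE $collation";
    }
    return $statements;
}

foreach (build_convert_statements(array('phpbb_posts_text', 'phpbb_users')) as $sql) {
    echo $sql, ";\n"; // review the output, then run it in phpMyAdmin or the mysql client
}
```

Printing the statements instead of executing them keeps the step reviewable before anything touches the data.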

  4. You have to use a version of phpMyAdmin that supports UTF-8.
  5. After the command that connects to the database, add the following code:
    Code:

    //
    // Set character set parameters according to the MySQL version.
    //
    $result = mysql_query('SELECT VERSION()') or die('Query failed: ' . mysql_error());
    $mysql_version = mysql_fetch_array($result, MYSQL_ASSOC);
    // explode() replaces the deprecated split(); the casts drop suffixes such as "-log".
    list($mysql_version_major, $mysql_version_minor) = explode('.', $mysql_version['VERSION()']);
    $mysql_version_major = (int) $mysql_version_major;
    $mysql_version_minor = (int) $mysql_version_minor;
    // Accept 4.1 and later. Testing major >= 4 and minor >= 1 together would wrongly reject 5.0.
    if (($mysql_version_major > 4) || (($mysql_version_major == 4) && ($mysql_version_minor >= 1)))
    {
       $result = mysql_query('SET character_set_client = utf8') or die('Query failed: ' . mysql_error());
       $result = mysql_query('SET character_set_results = utf8') or die('Query failed: ' . mysql_error());
       $result = mysql_query('SET character_set_connection = utf8') or die('Query failed: ' . mysql_error());
    }
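
The version test is the easiest part of this snippet to get wrong, because a check of the form "major >= 4 and minor >= 1" rejects MySQL 5.0. As a sketch, the decision can be moved into a small function that needs no database connection and so is easy to try out. The name charset_statements() is made up for this example.

```php
<?php
/**
 * Return the SET statements to run for a given server version string,
 * or an empty array for servers older than 4.1.
 */
function charset_statements($version)
{
    $parts = explode('.', $version);
    $major = (int) $parts[0];
    $minor = isset($parts[1]) ? (int) $parts[1] : 0;

    // 4.1 and later, including every 5.x release.
    if ($major > 4 || ($major == 4 && $minor >= 1)) {
        return array(
            'SET character_set_client = utf8',
            'SET character_set_results = utf8',
            'SET character_set_connection = utf8',
        );
    }
    return array();
}
```

On 4.1 and later, the single statement SET NAMES utf8 sets the same three variables and could be used instead.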

  6. Special characters: some characters used by MOD developers turn into gibberish when using UTF-8.
    E.g. "»", which is commonly used, won't be displayed correctly under UTF-8. This sign should be replaced in tpl files or PHP code with its HTML entity:
    Code:
    &raquo;

  7. String manipulation functions such as substr() won't work correctly with multi-byte/UTF-8 text.
    They cut by bytes, so the result depends on how many bytes each character takes; read the comment area on the PHP site about this issue. You should define an internal encoding based on the encoding you want to use. The best way would be to use the value defined in $lang['ENCODING'], but that value is only set at a later stage. So in the extension.inc file define:
    Code:
    mb_internal_encoding('UTF-8');
    Note: The mb library functions differ from the standard ones in some return values.
    E.g. mb_substr() returns an empty _string_ in a case where substr() returns _boolean_ false.
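
To make the byte/character difference concrete, here is a small illustration. It assumes the mbstring extension is loaded and that mbstring.func_overload is off. utf8_truncate() is a hypothetical helper name, and the strings use byte escapes so the example does not depend on how the file itself is saved.

```php
<?php
mb_internal_encoding('UTF-8');

$word = "h\xC3\xA9llo"; // "héllo": the é takes two bytes in UTF-8

$byte_cut = substr($word, 0, 2);    // counts bytes: "h" plus half of the é
$char_cut = mb_substr($word, 0, 2); // counts characters: "hé"

/**
 * Shorten $text to at most $max_chars characters without splitting a character.
 */
function utf8_truncate($text, $max_chars)
{
    return mb_substr($text, 0, $max_chars, 'UTF-8');
}

echo strlen($word), ' bytes, ', mb_strlen($word), " characters\n";
```

The byte-based cut leaves a broken sequence at the end of the string, which is what shows up as a garbage character in truncated topic titles and similar places.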

  8. MB lib - there is a server parameter that automatically overrides all the relevant string functions with their multi-byte equivalents. Read about it on php.net.
    If you set the mbstring.func_overload parameter in php.ini to 7, you won't need to find and replace all the string functions. If you don't have permission to update php.ini, you can override the value in the .htaccess file with this line:
    Code:
    php_value mbstring.func_overload 7
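
The value 7 is a sum of flags: per the PHP manual, 1 overloads mail(), 2 the string functions and 4 the regular expression functions. A short sketch for checking what a given value turns on follows. overloaded_groups() is an illustrative name. Note that this setting was deprecated in PHP 7.2 and removed in PHP 8.0, so it only applies to older installations.

```php
<?php
/**
 * List the function groups that a mbstring.func_overload value switches on.
 */
function overloaded_groups($value)
{
    $flags = array(1 => 'mail', 2 => 'string', 4 => 'regex');
    $groups = array();
    foreach ($flags as $bit => $name) {
        if ($value & $bit) {
            $groups[] = $name;
        }
    }
    return $groups;
}

// ini_get() returns false where the setting does not exist; the cast turns that into 0.
print_r(overloaded_groups((int) ini_get('mbstring.func_overload')));
```

Running this on the server is a quick way to confirm whether the .htaccess line actually took effect.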


  9. Highlighting on search does not work with UTF-8; I'm still looking into this issue.

 © Copyright 2002 The phpBB Group.