eZtip: Character Encoding in Templates
Recently I was working on a multilingual site and had to add some static text in a number of languages. I added the supplied text and then viewed the resulting page. Having previously lost many hours of my life to character encoding issues, I momentarily regretted getting out of bed when confronted with the mess before me.
Quickly regaining my composure, I realised that eZ Publish must be doing the transformation. With a bit of digging I discovered that, by default, the system treats templates as iso-8859-1 and converts them to utf-8 for display.
In my case eZ Publish was doing what it was told: converting characters that were already utf-8 into utf-8 a second time.
There are two options for telling eZ Publish the character encoding of a template. Firstly, you can tell eZ Publish that all templates are utf-8 by editing template.ini.append.php, either in the override settings for the entire site or in your specific siteaccess to limit the effect.
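To make the double conversion concrete, here is a minimal Python sketch of the effect (not eZ Publish's own code, which is PHP). The function name double_encode is purely illustrative; it mimics what happens when utf-8 bytes are read as iso-8859-1 and then re-encoded as utf-8 for output.

```python
def double_encode(text: str) -> str:
    """Return what a reader sees when UTF-8 text is misread as ISO-8859-1.

    Illustrative only: this mimics the effect described in the post,
    it is not how eZ Publish is implemented.
    """
    # The template file on disk really contains UTF-8 bytes.
    utf8_bytes = text.encode("utf-8")
    # The system assumes ISO-8859-1, so every byte becomes one character.
    misread = utf8_bytes.decode("iso-8859-1")
    # Those characters are then written out as UTF-8, and the browser
    # faithfully displays them: one garbled pair per accented letter.
    return misread.encode("utf-8").decode("utf-8")


if __name__ == "__main__":
    print(double_encode("caf\u00e9"))  # the accented e comes out as two characters
```

Plain ASCII text passes through unchanged, which is why the problem only shows up once accented or non-Latin text is added.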
[CharsetSettings]
# The charset to use if no charset is specified in the template
DefaultTemplateCharset=utf-8
The other option is to mark an individual template as utf-8 by adding the following line to the top of the template.
{*?template charset=utf-8?*}
In the end I used the latter method. As I was making a small change to a large site that I was not familiar with, the per-template method limited the change to the affected template only.
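If you take the per-template route on a larger site, a small script can help find which templates need the line. The following Python sketch is an assumption on my part rather than anything shipped with eZ Publish; needs_charset_line is a hypothetical helper that flags sources which are valid utf-8, contain non-ASCII characters, and do not already start with the charset line.

```python
# The per-template charset line quoted in the post.
CHARSET_LINE = "{*?template charset=utf-8?*}"


def needs_charset_line(template_source: bytes) -> bool:
    """Hypothetical check: should this template get the utf-8 charset line?"""
    try:
        # utf-8-sig also accepts (and drops) a leading byte order mark.
        text = template_source.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Not valid UTF-8, so marking it as utf-8 would be wrong.
        return False
    # Pure ASCII is identical in iso-8859-1 and utf-8; nothing to fix.
    has_non_ascii = any(ord(ch) > 127 for ch in text)
    already_marked = text.lstrip().startswith(CHARSET_LINE)
    return has_non_ascii and not already_marked
```

Read each .tpl file in binary mode and pass its bytes to the function; anything it flags is a candidate for the line above.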
I must admit I was surprised that these defaults are in place, given that eZ moved to a Unicode default some time ago. From the comments on this issue it would appear that concerns about backward compatibility are the reason that the default remains iso-8859-1 while the output is set to utf-8. I do wonder what effect this has on performance though.
Are you sure that the correct charset is utf8 (without a hyphen)?
I have this setting set to
DefaultTemplateCharset=utf-8
and it works (at least I think so)
Hi Andrey
I was basing this on the info @ http://issues.ez.no/IssueView.php?Id=13835&ProjectId=3
I'll double check and update the post if required.
Cheers
Bruce
Hi Andrey
It would appear that both will work. eZCharsetInfo::aliasTable provides a lookup of character set aliases, and utf8 is an alias for utf-8.
See: http://pubsvn.ez.no/doxygen/trunk/html/ezcharsetinfo_8php-source.html#l00055
utf-8 is the "correct" value and I've updated the post to reflect this.
Cheers
Bruce
What version of eZ was this in? I'm finding that in 4.0.1 [siteaccess]/template.ini.append.php already has the character encoding set to utf-8.
Maybe it depends on the design package installed?