| THE WEB STANDARDS GROUP RECOMMENDS that UCSB Web pages always specify the character encoding, either through the HTTP header or the <meta> tag. The character encoding tells the browser how to interpret the bytes of the page as characters so that the text renders correctly. Unicode Transformation Format (UTF) encodings such as UTF-8 are supported by the XHTML standard; UTF-8 can represent almost all characters currently in use worldwide and, being byte-oriented, can be streamed over a network without byte-order concerns. |
| The preferred method of indicating the encoding is the charset parameter of the Content-Type HTTP header. For example, to specify that an HTML document uses UTF-8, a server would send the following header (note that UTF-8 is also the default encoding for XML documents, including XHTML): |
Content-Type: text/html; charset=utf-8
|
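As a minimal sketch of how a server might send this header, the following uses Python's standard http.server module. The names PAGE, content_type_header, PageHandler, and serve, as well as the port number, are illustrative assumptions and not part of any UCSB configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content containing non-ASCII characters.
PAGE = "<html><head><title>Example</title></head><body><p>Grüße, мир</p></body></html>"


def content_type_header(charset: str = "utf-8") -> str:
    """Build a Content-Type value that declares the character encoding."""
    return f"text/html; charset={charset}"


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Encode the body with the same encoding that the header declares.
        body = PAGE.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", content_type_header("utf-8"))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve the page locally until interrupted (port is an assumption)."""
    HTTPServer(("127.0.0.1", port), PageHandler).serve_forever()
```

Calling serve() and requesting http://127.0.0.1:8000/ should return the page with the charset declared in the response header. The point to note is that the bytes written and the declared charset must agree.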
| It is also possible to specify the Content-Type using a <meta> tag within the document itself: |
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-5">
|
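To illustrate why the declared charset matters, here is a small sketch that reads the charset from a Content-Type value like the one above and decodes the same bytes with the declared encoding and with a wrong one. The helper name charset_from_content_type and the sample text are assumptions made for illustration; the parsing relies on Python's standard email.message module.

```python
from email.message import Message


def charset_from_content_type(value: str, default: str = "utf-8") -> str:
    """Extract the charset parameter from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset() or default


declared = charset_from_content_type("text/html; charset=iso-8859-5")

text = "Привет"                     # sample Russian text
raw = text.encode(declared)         # the bytes a server would send

decoded_ok = raw.decode(declared)           # decoded with the declared encoding
decoded_wrong = raw.decode("iso-8859-1")    # decoded with a mismatched encoding

print(declared)
print(decoded_ok == text)       # round trip preserves the text
print(decoded_wrong == text)    # mismatch yields different characters
```

The mismatched decode does not raise an error; it silently produces different characters, which is the garbled text a reader sees when a page declares the wrong encoding or none at all.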
| Sometimes a Web page is written in a language that uses a different character set. For example, if the content is in Russian and encoded as ISO-8859-5 (Cyrillic), as in the <meta> tag example above, then the declared character encoding must match. The Internet Assigned Numbers Authority (IANA) maintains the registry of character set names used on the Internet: http://www.iana.org/assignments/character-sets |
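Before publishing a page, it can help to check that a charset name is one that software will recognize. The sketch below uses Python's codecs registry as a stand-in; this registry overlaps with the IANA list but is not identical to it, and the function names are illustrative assumptions.

```python
import codecs


def is_known_encoding(name: str) -> bool:
    """Return True if Python's codec registry recognizes the charset name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


def same_encoding(a: str, b: str) -> bool:
    """Return True if two names resolve to the same codec (e.g. aliases)."""
    return codecs.lookup(a).name == codecs.lookup(b).name


print(is_known_encoding("ISO-8859-5"))
print(is_known_encoding("no-such-charset"))
print(same_encoding("UTF-8", "utf8"))
```

A name that passes this check is not guaranteed to be a registered IANA name, so the registry linked above remains the authority for what to put in the charset parameter.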
References:
http://www.w3.org/International/tutorials/tutorial-char-enc/
http://www.htmlhelp.org/tools/validator/charset.html |