Because the data is coming from JSON, it should be encoded in a Unicode character set, the default being UTF-8 [Sources: Douglas Crockford, RFC4627].

This means that in order to store a non-ASCII character in your database, you will either need to convert the encoding of the incoming data to the character set of your database, or (preferably) use a Unicode character set for your database. The most common Unicode character set - and the one I'd recommend you use for this purpose - is UTF-8.
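
To illustrate the first option, here is a minimal sketch of converting incoming UTF-8 text to a legacy character set. The helper name `to_database_charset` is hypothetical, ISO-8859-1 is only an assumed target, and the sketch assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper: convert UTF-8 text to the database's legacy character set.
// Assumes the mbstring extension is loaded; ISO-8859-1 is an assumed default.
function to_database_charset(string $utf8Text, string $dbCharset = 'ISO-8859-1'): string
{
    return mb_convert_encoding($utf8Text, $dbCharset, 'UTF-8');
}

// json_decode() returns strings encoded as UTF-8.
$incoming = json_decode('"caf\u00e9"');
$converted = to_database_charset($incoming);

echo bin2hex($incoming), "\n";  // 636166c3a9 (the e-acute takes two bytes in UTF-8)
echo bin2hex($converted), "\n"; // 636166e9   (one byte in ISO-8859-1)
```

Characters that have no equivalent in the target character set cannot survive this conversion, which is one reason why storing the data as UTF-8 is the preferable route.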

It is likely that your database is set up with one of the Latin character sets (ISO-8859-*), in which case you will most likely just need to change the character set used for your table. This won't break any of your existing data, assuming that you currently have no records that use any characters outside the lower 128. Based on your comments above, you should be able to make this change using phpMyAdmin. You will need to explicitly change each existing column you wish to alter: changing the character set of a table/database only affects new columns/tables that are created without specifying a character set.
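
If you would rather run the change as SQL than click through phpMyAdmin, the statement for a single column could be built roughly like this. This is a sketch that assumes a MySQL database (not confirmed by the question), where the full UTF-8 character set is called `utf8mb4`. The helper name, table name, and column name are all hypothetical.

```php
<?php
// Hypothetical helper: builds a MySQL ALTER TABLE statement that changes the
// character set of one existing column. Assumes MySQL and its utf8mb4 charset.
// MODIFY restates the whole column definition, so the current type and
// attributes (NOT NULL, DEFAULT, ...) must be passed in again.
function build_column_charset_sql(
    string $table,
    string $column,
    string $type,
    string $attributes = ''
): string {
    // Quote identifiers with backticks, doubling any embedded backtick.
    $quote = fn(string $name): string => '`' . str_replace('`', '``', $name) . '`';

    return rtrim(sprintf(
        'ALTER TABLE %s MODIFY %s %s CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci %s',
        $quote($table),
        $quote($column),
        $type,
        $attributes
    ));
}

// Table and column names are made up for illustration.
echo build_column_charset_sql('articles', 'title', 'VARCHAR(255)', 'NOT NULL'), "\n";
```

Back up the table before running a statement like this against real data.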

When you are outputting data to the client, you will also need to tell it that you are outputting UTF-8 so it knows how to display the characters correctly. You do this by ensuring you append ; charset=utf-8 to the Content-Type: header you send along with text-based content.

For example, at the top of a PHP script that produces HTML that is encoded with UTF-8, you would add this line:

header('Content-Type: text/html; charset=utf-8');

It is also recommended that you declare the character set of the document within the document itself. This declaration must appear before any non-ASCII characters that exist within the document - as a result, it is recommended that you place the following <meta> tag as the first child of the <head>:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

If you are producing XHTML with an XML declaration at the top, the character set may be declared there, instead of using a <meta> tag:

<?xml version="1.0" encoding="UTF-8" ?>

Remember, the use of a character set definition in the Content-Type: header is not limited to text/html - it makes sense in the context of any text/* family MIME type.
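
As a rough sketch of the same idea for a non-HTML response, the header can be built for any text/* type. The helper name `content_type_with_charset` is hypothetical, and the script file itself is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: builds a Content-Type header line with an explicit charset.
function content_type_with_charset(string $mimeType, string $charset = 'utf-8'): string
{
    return sprintf('Content-Type: %s; charset=%s', $mimeType, $charset);
}

// Send the header before any output, exactly as with text/html.
header(content_type_with_charset('text/plain'));

// This file is assumed to be saved as UTF-8, so the bytes match the declaration.
echo "Plain text with a non-ASCII character: café\n";
```

The same call with 'text/css' or 'text/csv' covers stylesheets and CSV exports.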

Further reading: What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text

Also, make sure you validate your markup.

answered 2012-08-30T09:19:24.347