Because the data is coming from JSON, it should be encoded in a Unicode character set, the default being UTF-8 [Sources: Douglas Crockford, RFC4627].
This means that in order to store a non-ASCII character in your database, you will either need to convert the encoding of the incoming data to the character set of your database, or (preferably) use a Unicode character set for your database. The most common Unicode character set - and the one I'd recommend you use for this purpose - is UTF-8.
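To illustrate why storing UTF-8 directly is preferable to converting, here is a minimal sketch. It assumes the mbstring extension is available and that the database's legacy character set is ISO-8859-1; the payload and variable names are made up for the example.

```php
<?php
// Hypothetical incoming payload. json_decode() returns its strings as UTF-8.
$json = '{"name": "Zo\u00eb", "city": "\u0141\u00f3d\u017a"}';
$data = json_decode($json, true);

// Option 1 (preferred): keep the UTF-8 bytes as they are and store them in a UTF-8 column.
$nameUtf8 = $data['name'];

// Option 2: convert to the database's character set (assumed here to be ISO-8859-1).
$nameLatin1 = mb_convert_encoding($data['name'], 'ISO-8859-1', 'UTF-8');
$cityLatin1 = mb_convert_encoding($data['city'], 'ISO-8859-1', 'UTF-8');

// "Zoë" only uses characters that exist in ISO-8859-1, so it survives a round trip.
$nameSurvives = mb_convert_encoding($nameLatin1, 'UTF-8', 'ISO-8859-1') === $data['name'];

// "Łódź" contains characters that ISO-8859-1 cannot represent, so the conversion is lossy.
$citySurvives = mb_convert_encoding($cityLatin1, 'UTF-8', 'ISO-8859-1') === $data['city'];

var_dump($nameSurvives, $citySurvives);
```

The second result is the reason to prefer a UTF-8 column: once a character has been replaced during conversion, the original value cannot be recovered.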
It is likely that your database is set up with one of the Latin character sets (ISO-8859-*), in which case you will most likely simply need to change the character set used for your table, and it won't break any of your existing data - assuming that you currently have no records that use any characters outside the lower 128 (the ASCII range). Based on your comments above, you should be able to make this change using phpMyAdmin. You will need to change each existing column you wish to alter explicitly, because changing the character set of a table/database only affects new columns/tables that are created without specifying a character set.
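If you would rather run the column change as SQL than click through phpMyAdmin, the statement can be sketched as below. This assumes MySQL; buildColumnCharsetSql() and quoteIdentifier() are hypothetical helpers, the table and column names are placeholders, and utf8mb4 is used because it is MySQL's name for full UTF-8. Note that MODIFY replaces the whole column definition, so the existing type and constraints have to be repeated.

```php
<?php
// Quote a MySQL identifier with backticks, doubling any backtick inside it.
function quoteIdentifier(string $identifier): string
{
    return '`' . str_replace('`', '``', $identifier) . '`';
}

// Build an ALTER TABLE statement that changes the character set of one existing column.
// $type is the column's current data type, $constraints its current constraints.
function buildColumnCharsetSql(
    string $table,
    string $column,
    string $type,
    string $constraints = '',
    string $charset = 'utf8mb4'
): string {
    // The character set name cannot be bound as a query parameter, so restrict it to plain names.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException('Unexpected character set name');
    }

    $sql = sprintf(
        'ALTER TABLE %s MODIFY %s %s CHARACTER SET %s',
        quoteIdentifier($table),
        quoteIdentifier($column),
        $type,
        $charset
    );

    return $constraints === '' ? $sql : $sql . ' ' . $constraints;
}

// Placeholder table and column; run the resulting statement once per column you want to change.
echo buildColumnCharsetSql('users', 'name', 'VARCHAR(255)', 'NOT NULL'), PHP_EOL;
```

Take a backup before running a statement like this against real data.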
When you are outputting data to the client, you will also need to tell it that you are outputting UTF-8 so it knows how to display the characters correctly. You do this by ensuring you append ; charset=utf-8 to the Content-Type: header you send along with text-based content.
For example, at the top of a PHP script that produces HTML that is encoded with UTF-8, you would add this line:
header('Content-Type: text/html; charset=utf-8');
It is also recommended that you declare the character set of the document within the document itself. This declaration must appear before any non-ASCII characters that exist within the document - as a result, it is recommended that you place the following <meta> tag as the first child of the <head>:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
If you are producing XHTML with an XML declaration at the top, the character set may be declared there, instead of using a <meta> tag:
<?xml version="1.0" encoding="UTF-8" ?>
Remember, the use of a character set definition in the Content-Type: header is not limited to text/html - it makes sense in the context of any text/* family MIME type.
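One way to apply this consistently is a small helper that adds the charset parameter only for text/* types. This is a sketch; contentTypeHeader() is a hypothetical function, not something PHP provides.

```php
<?php
// Build a Content-Type header line, adding a charset parameter for text/* types only.
function contentTypeHeader(string $mimeType, string $charset = 'utf-8'): string
{
    // MIME types are case-insensitive, so compare the "text/" prefix without regard to case.
    $isText = strncasecmp($mimeType, 'text/', 5) === 0;

    return 'Content-Type: ' . $mimeType . ($isText ? '; charset=' . $charset : '');
}

// Headers must be sent before any output, so guard the call.
if (!headers_sent()) {
    header(contentTypeHeader('text/css'));
}
```

With this helper, contentTypeHeader('text/plain') yields a header with ; charset=utf-8 appended, while contentTypeHeader('image/png') is returned without one.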
Further reading: What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text
Also, make sure you validate your markup.