I'm going to store data (mostly Wikipedia page titles) in a table. The data can contain characters that need full UTF-8 support. The schema I'm using is:

CREATE TABLE `en_brands` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `name` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
 `name_encoded` varchar(255) NOT NULL,
 `inserted` datetime NOT NULL,
 PRIMARY KEY (`id`),
 UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;

As you can see, name is only 191 characters long. With larger values, MySQL refuses to create the UNIQUE KEY name, because such keys can be at most 767 bytes long. My questions:

  • Does name_encoded have to be at least TEXT to fully store a URL-encoded (PHP rawurlencode) UTF-8 string? (I think a 255-character string could become a 3060-character URL in the worst case: 255 chars x 4 bytes x 3 chars per encoded byte.)
  • Does it matter which collation I use for name_encoded? (I think not, because URL-encoded strings should fit into Latin.)
  • Which data type and collation should I use for name to store at least 255 characters with full UTF-8 support and still be able to create a UNIQUE KEY? (I'd like to use a collation that allows native-language sorting.)

BTW: I'm using MySQL 5.6 Percona on Debian Wheezy

1 Answer

To answer my own questions:

Does name_encoded have to be at least TEXT to fully store a URL-encoded (PHP rawurlencode) UTF-8 string?

Yes. A URL-encoded UTF-8 string can be up to 3060 characters long, so TEXT is needed.
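The worst case can be checked with a short sketch. It uses Python rather than PHP, on the assumption that `urllib.parse.quote` with `safe=""` percent-encodes UTF-8 bytes the same way `rawurlencode` does for this input; the emoji is just an arbitrary character that needs 4 bytes in UTF-8.

```python
from urllib.parse import quote

# Assumption: quote(..., safe="") stands in for PHP's rawurlencode here.
# Worst case for a 255-character name: every character takes 4 bytes in UTF-8.
name = "\U0001F600" * 255

# Each UTF-8 byte becomes a three-character "%XX" sequence.
encoded = quote(name, safe="")

print(len(name))                  # number of characters in the name
print(len(name.encode("utf-8")))  # number of bytes in UTF-8
print(len(encoded))               # number of characters after URL encoding
print(encoded.isascii())          # whether the encoded form is pure ASCII
```

This prints 255, 1020, 3060 and True, matching the 255 x 4 x 3 estimate from the question.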

Does it matter which collation I use for name_encoded? (I think not, because URL-encoded strings should fit into Latin.)

URL-encoded strings fit into ASCII.

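Which data type and collation should I use for name to store at least 255 characters with full UTF-8 support and still create a UNIQUE KEY?

A UNIQUE KEY is not possible here, because of the length limit on text columns in keys.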
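The arithmetic behind that limit, as a small sketch. The 767-byte figure comes from the question; the 4 bytes per character is the maximum for utf8mb4. The constant names are only illustrative.

```python
# Figures from the question: the key can be at most 767 bytes long,
# and a utf8mb4 character can take up to 4 bytes.
MAX_KEY_BYTES = 767
BYTES_PER_CHAR_UTF8MB4 = 4

# Longest fully indexable utf8mb4 column under that limit.
max_indexable_chars = MAX_KEY_BYTES // BYTES_PER_CHAR_UTF8MB4

# Worst-case byte length of the 255 characters the question asks for.
wanted_bytes = 255 * BYTES_PER_CHAR_UTF8MB4

print(max_indexable_chars)           # 191
print(wanted_bytes)                  # 1020
print(wanted_bytes > MAX_KEY_BYTES)  # True
```

So 191 characters is the most that fits, which is where the varchar(191) in the schema comes from.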

I check for duplicates by running a SELECT first, so a UNIQUE KEY is not needed, but the application then has to ensure data integrity.
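A minimal sketch of that SELECT-first check. Everything beyond the table and column names is an assumption made for illustration: an in-memory SQLite database stands in for MySQL so the example runs anywhere, `insert_if_missing` is a hypothetical helper, and name_encoded is left out for brevity.

```python
import sqlite3
from datetime import datetime, timezone

# Assumption: in-memory SQLite stands in for MySQL. The table mirrors
# en_brands but deliberately has no UNIQUE KEY on name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE en_brands ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT NOT NULL,"
    " inserted TEXT NOT NULL)"
)

def insert_if_missing(conn, name):
    """Hypothetical helper: SELECT first, INSERT only if no row matches."""
    row = conn.execute(
        "SELECT id FROM en_brands WHERE name = ?", (name,)
    ).fetchone()
    if row is not None:
        return False  # duplicate found, nothing inserted
    conn.execute(
        "INSERT INTO en_brands (name, inserted) VALUES (?, ?)",
        (name, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return True

print(insert_if_missing(conn, "Coca-Cola"))  # True
print(insert_if_missing(conn, "Coca-Cola"))  # False
```

Note that the SELECT and the INSERT are two separate statements, so two concurrent writers could still both insert the same name. That is the integrity burden the application takes on without a UNIQUE KEY.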

answered 2015-02-17T07:17:54.373