I'm going to store data (mostly Wikipedia page titles) in a table. The titles can contain characters that need full UTF-8 support (utf8mb4). The schema I'm using is:
CREATE TABLE `en_brands` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  `name_encoded` varchar(255) NOT NULL,
  `inserted` datetime NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;
As you can see, name is only 191 characters long. With larger values, MySQL refuses to create the UNIQUE KEY name, because such keys can be at most 767 bytes long (191 characters x 4 bytes = 764 bytes). My questions:
- Does name_encoded have to be at least TEXT to fully store a URL-encoded (PHP rawurlencode) UTF-8 string? (I think a 255-character string could become a 3060-character URL in the worst case: 255 chars x 4 bytes x 3 chars per encoded byte.)
- Does it matter which collation I use for name_encoded? (I think not, because URL-encoded strings should fit into latin1.)
- Which data type and collation should I use for name to store at least 255 characters with full UTF-8 support and still be able to create a UNIQUE KEY? (I'd like to use a collation that allows native-language sorting.)
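The worst-case estimate for the URL-encoded length can be sanity-checked with a short script. This is only a sketch in Python rather than PHP: it assumes that urllib.parse.quote with safe="" behaves like rawurlencode for this purpose (both percent-encode every byte except ASCII letters, digits and -_.~). The helper name encoded_length is made up for illustration.

```python
from urllib.parse import quote


def encoded_length(title: str) -> int:
    """Length of the percent-encoded form of `title`.

    Assumption: quote(..., safe="") stands in for PHP's rawurlencode here.
    The string is encoded as UTF-8 and every non-unreserved byte becomes %XX.
    """
    return len(quote(title, safe=""))


# Worst case: 255 characters that each need 4 bytes in UTF-8 (here U+1F600).
worst = "\U0001F600" * 255

print(len(worst.encode("utf-8")))  # 1020 bytes (255 x 4)
print(encoded_length(worst))       # 3060 characters (1020 x 3)

# The encoded form contains only ASCII characters.
print(quote(worst, safe="").isascii())  # True
```

So a 255-character title can indeed need up to 3060 characters once encoded, and the encoded value itself never contains anything outside ASCII.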
BTW: I'm using Percona Server for MySQL 5.6 on Debian Wheezy.
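For reference, here is how the 767-byte key limit translates into column lengths. This is a minimal sketch that assumes MySQL reserves the maximum byte width per character for an indexed VARCHAR (1 byte for latin1, 3 for utf8, 4 for utf8mb4); the function name max_indexable_chars is hypothetical.

```python
# Key length limit for the index, as stated in the question (in bytes).
MAX_KEY_BYTES = 767

# Maximum bytes per character for each MySQL character set.
BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}


def max_indexable_chars(charset: str) -> int:
    """Longest VARCHAR length (in characters) whose full index fits the limit."""
    return MAX_KEY_BYTES // BYTES_PER_CHAR[charset]


for charset in BYTES_PER_CHAR:
    print(charset, max_indexable_chars(charset))
# latin1 767
# utf8 255
# utf8mb4 191

# A fully indexed VARCHAR(255) in utf8mb4 would need 255 x 4 = 1020 bytes.
print(255 * BYTES_PER_CHAR["utf8mb4"] > MAX_KEY_BYTES)  # True
```

This is why 191 is the largest length that works for name with utf8mb4 under this limit, while 255 only fits with the 3-byte utf8 character set.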