2 Answers
This is because the collations you used (utf8mb4_unicode_ci, utf8mb4_unicode_520_ci and utf8mb4_0900_ai_ci) compare only the base letter of each character. For example, 'ぺ' decomposes into 'へ' + U+309A ◌゚, so 'へ' is the base letter of 'ぺ'. In your case all three characters share the same base letter, 'へ', so returning 1 is the correct result for those collations.
The MySQL team is developing a new Japanese collation for the utf8mb4 character set. It will distinguish characters carrying dakuten from their base characters, and it should be available soon.
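Until such a collation is available, one way to tell these characters apart is to force a binary collation for the comparison. The following is only a sketch: it assumes the connection character set is utf8mb4 and that the literals are sent as precomposed characters.

```sql
-- Assumption: the session talks utf8mb4, otherwise the COLLATE clauses below are rejected.
SET NAMES utf8mb4;

-- Accent-insensitive Unicode collation: only the base letter is compared.
SELECT 'へ' = 'ぺ' COLLATE utf8mb4_unicode_520_ci;

-- Binary collation: compares code points, so the two characters differ.
SELECT 'へ' = 'ぺ' COLLATE utf8mb4_bin;  -- 0

-- The stored bytes show that these really are different characters.
SELECT HEX('へ'), HEX('ぺ');  -- E381B8, E381BA
```

The trade-off is that utf8mb4_bin distinguishes everything, including upper and lower case Latin letters, so it may be stricter than you want for mixed text.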
SELECT 'へ' = 'ぺ' COLLATE utf8mb4_unicode_ci; -- 0 (ditto for general_ci)
SELECT 'へ' = 'ぺ' COLLATE utf8mb4_unicode_520_ci; -- 1
The latter is based on a newer version of the Unicode standard (5.2.0 rather than 4.0.0), so it is, in theory, more correct.
But what are you really doing? Probably comparing one column to another? Are they both utf8mb4_unicode_520_ci? (The database and the connection don't matter.)
Or is one side of = a column and the other a literal?
Do you establish the collation when connecting?
Addenda
In version 8.0.0, all of these give 1:
utf8mb4_unicode_ci -- a change from 0 in 5.6.12, but 1 in 5.7.15?
utf8mb4_unicode_520_ci
utf8mb4_0900_ai_ci