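In the following code, appending the same letter to both operands of a comparison changes the result. Although - is not greater than j, -k is greater than jk.

This only happens when one of the operands is a minus sign (-) or a single quote (').

Why does this happen? What are the rules?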

if - gtr j (echo - greater than j) else echo - less than j
if "-" gtr "j" (echo "-" greater than "j") else echo "-" less than "j"
echo.
if -k gtr jk (echo -k greater than jk) else echo -k less than jk
if "-k" gtr "jk" (echo "-k" greater than "jk") else echo "-k" less than "jk"
echo.
if ' gtr u (echo ' greater than u) else echo ' less than u
if "'" gtr "u" (echo "'" greater than "u") else echo "'" less than "u"
echo.
if 'v gtr uv (echo 'v greater than uv) else echo 'v less than uv
if "'v" gtr "uv" (echo "'v" greater than "uv") else echo "'v" less than "uv"

The output is:

- less than j
"-" less than "j"

-k greater than jk
"-k" greater than "jk"

' less than u
"'" less than "u"

'v greater than uv
"'v" greater than "uv"

1 Answer


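You might assume that strings are simply compared character by character, using their ordinal values.

That is not the case. Collation is considerably more complex than that.

In fact, you can see the same behavior in other environments, such as Windows PowerShell: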

PS Home:\> '-' -gt 'j'
False
PS Home:\> '-k' -gt 'jk'
True
PS Home:\> '''' -gt 'u'
False
PS Home:\> '''v' -gt 'uv'
True

The ordering of strings most likely also varies with your locale.

As for your particular problem here, quoting from the Unicode Collation Algorithm (UTS #10):

Collation order is not preserved under concatenation or substring operations, in general.

For example, the fact that x is less than y does not mean that x + z is less than y + z, because characters may form contractions across the substring or concatenation boundaries. In summary:

x < y does not imply that xz < yz
x < y does not imply that zx < zy
xz < yz does not imply that x < y
zx < zy does not imply that x < y

and to clear up the misconception you're likely under:

Collation is not code point (binary) order.

A simple example of this is the fact that capital Z comes before lowercase a in the code charts. As noted earlier, beginners may complain that a particular Unicode character is “not in the right place in the code chart.” That is a misunderstanding of the role of the character encoding in collation. While the Unicode Standard does not gratuitously place characters such that the binary ordering is odd, the only way to get the linguistically-correct order is to use a language-sensitive collation, not a binary ordering.

answered 2011-05-15T15:20:58.020