CollationKey is an abstract class. Your concrete type is most likely RuleBasedCollationKey. First, let's look at the JavaDoc for toByteArray():
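Converts the CollationKey to a sequence of bits. If two CollationKeys could be legitimately compared, then one could compare the byte arrays for each of those keys to obtain the same result. Byte arrays are organized most significant byte first.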
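Clearly, the collation key for "a" is not represented by the same bytes as the string "a", which should come as no surprise.
The next step is to look at the source to see exactly what it returns: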
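The JavaDoc's claim about comparing byte arrays can be checked with a small sketch. The class name ByteOrderCheck, the helper names, and the choice of Locale.US are assumptions made for illustration only; the bytes are compared as unsigned values, which is what "most significant byte first" implies.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Locale;

class ByteOrderCheck {
    // Lexicographic comparison of two byte arrays, treating each byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True when the byte-wise ordering has the same sign as CollationKey.compareTo.
    static boolean agrees(Collator collator, String s1, String s2) {
        CollationKey k1 = collator.getCollationKey(s1);
        CollationKey k2 = collator.getCollationKey(s2);
        int byKey = Integer.signum(k1.compareTo(k2));
        int byBytes = Integer.signum(compareUnsigned(k1.toByteArray(), k2.toByteArray()));
        return byKey == byBytes;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.US); // locale chosen arbitrarily
        System.out.println(agrees(collator, "a", "b"));
        System.out.println(agrees(collator, "apple", "Apple"));
    }
}
```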
    public byte[] toByteArray() {
        char[] src = key.toCharArray();
        byte[] dest = new byte[ 2*src.length ];
        int j = 0;
        for( int i=0; i<src.length; i++ ) {
            dest[j++] = (byte)(src[i] >>> 8);
            dest[j++] = (byte)(src[i] & 0x00ff);
        }
        return dest;
    }
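To see what that loop does in isolation, here is a minimal sketch that applies the same packing to an ordinary string. The class KeyBytesSketch and the method toBigEndianBytes are hypothetical names, and the input is a plain string standing in for the internal key, not a real collation key.

```java
import java.util.Arrays;

class KeyBytesSketch {
    // Same packing as the loop above: each 16-bit char becomes two bytes,
    // high byte first.
    static byte[] toBigEndianBytes(String key) {
        char[] src = key.toCharArray();
        byte[] dest = new byte[2 * src.length];
        int j = 0;
        for (char c : src) {
            dest[j++] = (byte) (c >>> 8);
            dest[j++] = (byte) (c & 0x00ff);
        }
        return dest;
    }

    public static void main(String[] args) {
        // 'a' is U+0061 and e-acute is U+00E9, so this prints [0, 97, 0, -23]
        // (0xE9 shows up as -23 because Java bytes are signed).
        System.out.println(Arrays.toString(toBigEndianBytes("a\u00e9")));
    }
}
```

Note that the result always has an even length, two bytes per char of the key.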
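What is key? It is passed in as the second constructor argument. The constructor is called in RuleBasedCollator#getCollationKey. The source is rather complicated, but the method's JavaDoc states: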
Transforms the string into a series of characters that can be compared with CollationKey.compareTo. This overrides java.text.Collator.getCollationKey. It can be overridden in a subclass.
The inline code comments in that method explain further:
// The basic algorithm here is to find all of the collation elements for each
// character in the source string, convert them to a char representation,
// and put them into the collation key. But it's trickier than that.
// Each collation element in a string has three components: primary (A vs B),
// secondary (A vs A-acute), and tertiary (A' vs a); and a primary difference
// at the end of a string takes precedence over a secondary or tertiary
// difference earlier in the string.
//
// To account for this, we put all of the primary orders at the beginning of the
// string, followed by the secondary and tertiary orders, separated by nulls.
This is followed by a hypothetical example:
// Here's a hypothetical example, with the collation element represented as
// a three-digit number, one digit for primary, one for secondary, etc.
//
// String:              A     a     B     \u00e9 <--(e-acute)
// Collation Elements:  101   100   201   510
//
// Collation Key:       1125<null>0001<null>1010
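The hypothetical example from the comment can be reproduced with a toy sketch. This is not the JDK's implementation: the class HypotheticalKeySketch and the method buildKey are invented for illustration, and each collation element is assumed to be a three-digit number exactly as in the comment.

```java
class HypotheticalKeySketch {
    // Builds the toy key from the comment: all primary digits first, then all
    // secondary digits, then all tertiary digits, separated by '\0'.
    static String buildKey(int[] elements) {
        StringBuilder primary = new StringBuilder();
        StringBuilder secondary = new StringBuilder();
        StringBuilder tertiary = new StringBuilder();
        for (int e : elements) {
            primary.append(e / 100);          // hundreds digit
            secondary.append((e / 10) % 10);  // tens digit
            tertiary.append(e % 10);          // units digit
        }
        return primary + "\0" + secondary + "\0" + tertiary;
    }

    public static void main(String[] args) {
        // Collation elements for "A a B e-acute" taken from the comment.
        String key = buildKey(new int[] {101, 100, 201, 510});
        // prints 1125<null>0001<null>1010
        System.out.println(key.replace("\0", "<null>"));
    }
}
```

Because the primary digits come first, a primary difference anywhere in the string decides the comparison before any secondary or tertiary digit is looked at.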
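So the assumption that a CollationKey's toByteArray() method returns the same result as a String's getBytes() method is simply wrong. "a".getBytes() is not the same as Collator.getInstance().getCollationKey("a").toByteArray(). If it were, we wouldn't really need collation, would we?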
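A quick way to confirm this is to compare the two arrays directly. The sketch below assumes a standard JDK collator for Locale.US and UTF-8 for the string bytes; the class KeyVersusStringBytes and the method sameBytes are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class KeyVersusStringBytes {
    // True only if the string's UTF-8 bytes equal the bytes of its collation key.
    static boolean sameBytes(String s) {
        byte[] stringBytes = s.getBytes(StandardCharsets.UTF_8);
        byte[] keyBytes = Collator.getInstance(Locale.US)
                .getCollationKey(s)
                .toByteArray();
        return Arrays.equals(stringBytes, keyBytes);
    }

    public static void main(String[] args) {
        // "a" encodes to a single byte, while the key packs two bytes per char,
        // so the arrays cannot match.
        System.out.println(sameBytes("a"));
    }
}
```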