As a general solution, how can we get the Unicode code point(s) of a character or a string in Swift?

Consider the following:

let A: Character = "A"     // "\u{0041}"
let Á: Character = "Á"     // "\u{0041}\u{0301}"

let sparklingHeart = "💖"  // "\u{1F496}"
let SWIFT = "SWIFT"        // "\u{0053}\u{0057}\u{0049}\u{0046}\u{0054}"

If I am not mistaken, the desired function might return an array of strings, for example:

extension Character {
    func getUnicodeCodePoints() -> [String] {
        //...
    }
}

A.getUnicodeCodePoints()
// the output should be: ["\u{0041}"]

Á.getUnicodeCodePoints()
// the output should be: ["\u{0041}", "\u{0301}"]

sparklingHeart.getUnicodeCodePoints()
// the output should be: ["\u{1F496}"]

SWIFT.getUnicodeCodePoints()
// the output should be: ["\u{0053}", "\u{0057}", "\u{0049}", "\u{0046}", "\u{0054}"]

Any suggestions for a more elegant approach would be appreciated.

1 Answer

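In general, the unicodeScalars property of a String returns a collection of its Unicode scalar values. (A Unicode scalar value is any Unicode code point except the high-surrogate and low-surrogate code points.)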

Example:

print(Array("Á".unicodeScalars))  // ["A", "\u{0301}"]
print(Array("💖".unicodeScalars)) // ["\u{0001F496}"]
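
To make the difference between a Character and its Unicode scalar values concrete, here is a minimal sketch. It assumes Swift 4 or later (where String has a count property) and spells the accented letter with explicit escapes, so the result does not depend on how the source file happens to encode "Á".

```swift
// "A" followed by U+0301 COMBINING ACUTE ACCENT: one Character, two scalars.
let accented = "A\u{0301}"

print(accented.count)                            // 1
print(accented.unicodeScalars.count)             // 2
print(accented.unicodeScalars.map { $0.value })  // [65, 769]
```

The numeric values are plain UInt32 numbers (65 is 0x41, 769 is 0x301), which is why formatting them as hexadecimal is a separate step.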

Up to Swift 3, there is no way to access the Unicode scalar values of a Character directly; it has to be converted to a String first (for the Swift 4 status, see below).

If you want to see all Unicode scalar values as hexadecimal numbers, you can access the value property (which is a UInt32 number) and format it according to your needs.

Example (using the U+NNNN notation for Unicode values):

extension String {
    func getUnicodeCodePoints() -> [String] {
        return unicodeScalars.map { "U+" + String($0.value, radix: 16, uppercase: true) }
    }
}

extension Character {
    func getUnicodeCodePoints() -> [String] {
        return String(self).getUnicodeCodePoints()
    }
}


print("A".getUnicodeCodePoints())     // ["U+41"]
print("Á".getUnicodeCodePoints())     // ["U+41", "U+301"]
print("💖".getUnicodeCodePoints())    // ["U+1F496"]
print("SWIFT".getUnicodeCodePoints()) // ["U+53", "U+57", "U+49", "U+46", "U+54"]
print("🇯🇴".getUnicodeCodePoints())    // ["U+1F1EF", "U+1F1F4"]
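
Note that the output above is not zero-padded ("U+41" rather than "U+0041"). If at least four hex digits are wanted, one possible variation is sketched below. The name paddedUnicodeCodePoints is hypothetical and not part of the code above, and the sketch assumes Swift 4 or later.

```swift
extension String {
    /// Hypothetical helper: formats each Unicode scalar as "U+" followed by
    /// at least four uppercase hex digits (longer values are left as they are).
    func paddedUnicodeCodePoints() -> [String] {
        return unicodeScalars.map { scalar -> String in
            let hex = String(scalar.value, radix: 16, uppercase: true)
            // Left-pad with zeros up to a width of four digits.
            let padding = String(repeating: "0", count: max(0, 4 - hex.count))
            return "U+" + padding + hex
        }
    }
}

print("A\u{0301}".paddedUnicodeCodePoints())  // ["U+0041", "U+0301"]
print("\u{1F496}".paddedUnicodeCodePoints())  // ["U+1F496"]
```

Padding by hand keeps the sketch free of Foundation; String(format:) would be an alternative if Foundation is already imported.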

Update for Swift 4:

As of Swift 4, the unicodeScalars of a Character can be accessed directly, see SE-0178 Add unicodeScalars property to Character. This makes the conversion to String obsolete:

let c: Character = "🇯🇴"
print(Array(c.unicodeScalars)) // ["\u{0001F1EF}", "\u{0001F1F4}"]
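
With that property available, the Character extension could presumably skip the String(self) round trip. A minimal sketch, assuming Swift 4 or later; the method name scalarCodePoints is hypothetical and chosen so that it does not clash with getUnicodeCodePoints above:

```swift
extension Character {
    /// Hypothetical helper: reads the scalars directly from the Character
    /// instead of converting it to a String first.
    func scalarCodePoints() -> [String] {
        return unicodeScalars.map { "U+" + String($0.value, radix: 16, uppercase: true) }
    }
}

// A flag is a single Character made of two regional indicator scalars.
let flag: Character = "\u{1F1EF}\u{1F1F4}"
print(flag.scalarCodePoints())  // ["U+1F1EF", "U+1F1F4"]
```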
answered 2017-07-09T09:59:03.483