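I am trying to convert a non-Unicode string like this, '¹ûº¤¡¾¢º¤ìñ©2', into Unicode like this, 'ໃຊ້ໃນຄົວເຮືອນ', which is Lao. I tried the code below, and it returns something like '??????'. Any idea how to convert the string?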

Public Shared Function ConvertAsciiToUnicode(asciiString As String) As String
    ' Create two different encodings.
    Dim encAscii As Encoding = Encoding.ASCII
    Dim encUnicode As Encoding = Encoding.Unicode

    ' Convert the string into a byte[].
    Dim asciiBytes As Byte() = encAscii.GetBytes(asciiString)

    ' Perform the conversion from one encoding to the other.
    Dim unicodeBytes As Byte() = Encoding.Convert(encAscii, encUnicode, asciiBytes)

    ' Convert the new byte[] into a char[] and then into a string.
    ' This is a slightly different approach to converting to illustrate
    ' the use of GetCharCount/GetChars.
    Dim unicodeChars As Char() = New Char(encUnicode.GetCharCount(unicodeBytes, 0, unicodeBytes.Length) - 1) {}
    encUnicode.GetChars(unicodeBytes, 0, unicodeBytes.Length, unicodeChars, 0)
    Dim unicodeString As New String(unicodeChars)

    ' Return the new unicode string
    Return unicodeString
End Function

1 Answer

Your 8-bit encoded Lao text is not ASCII. It is in some code page such as IBM CP1133 or Microsoft LC0454, or quite possibly the Thai code page 874. You have to find out which one it is.
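As a rough illustration of why the attempt in the question produces question marks, and of how candidate code pages could be tried, here is a minimal sketch. It assumes the garbled string came from bytes that were mis-decoded as ISO-8859-1, which may not be true for your data. `RecoverBytes` is a hypothetical helper name, and the candidate list is only a guess. On .NET Core and later, `Encoding.GetEncoding(874)` may require registering the code pages encoding provider first.

```vb
Imports System
Imports System.Text

Module CodePageProbe
    ' Hypothetical helper: recover the original bytes from a string that was
    ' mis-decoded as ISO-8859-1 (code page 28591), which maps U+0000..U+00FF
    ' one-to-one onto byte values. That mis-decoding is an assumption.
    Function RecoverBytes(garbled As String) As Byte()
        Return Encoding.GetEncoding(28591).GetBytes(garbled)
    End Function

    Sub Main()
        ' First four characters of the string in the question: ¹ û º ¤
        Dim chars As Char() = {ChrW(&HB9), ChrW(&HFB), ChrW(&HBA), ChrW(&HA4)}
        Dim garbled As New String(chars)

        ' Encoding.ASCII cannot represent characters above U+007F,
        ' so each of them is replaced by "?" (byte &H3F).
        Console.WriteLine(BitConverter.ToString(Encoding.ASCII.GetBytes(garbled))) ' 3F-3F-3F-3F

        Dim raw As Byte() = RecoverBytes(garbled)
        Console.WriteLine(BitConverter.ToString(raw)) ' B9-FB-BA-A4

        ' Candidate code pages (a guess); availability on this system is not assumed.
        For Each cp As Integer In {874, 1133}
            Try
                Console.WriteLine("{0}: {1}", cp, Encoding.GetEncoding(cp).GetString(raw))
            Catch ex As Exception When TypeOf ex Is ArgumentException OrElse TypeOf ex Is NotSupportedException
                Console.WriteLine("{0}: not available on this system", cp)
            End Try
        Next
    End Sub
End Module
```

If none of the candidates yields readable Lao, the text may use a font-specific encoding rather than a standard code page, in which case a custom mapping table would be needed.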

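How you obtain (read, receive, compute) the input string matters. Once you have it as a String, it is already Unicode and can easily be written out as UTF-8, for example like this: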

' Append mystring to the file as UTF-8; Using closes the writer even if Write throws.
Using writer As New StreamWriter("myfile.txt", True, System.Text.Encoding.UTF8)
    writer.Write(mystring)
End Using
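The reading side can be sketched in the same spirit: if the text arrives as a file, passing the source encoding when reading gives you a correct String from the start. This is only a sketch; `ConvertFileToUtf8` and the file names are hypothetical, and code page 874 is the guess from above, not a confirmed fact about your data.

```vb
Imports System.IO
Imports System.Text

Module LegacyFileConverter
    ' Hypothetical helper: read a file in the given source encoding and
    ' rewrite its content as UTF-8 (Encoding.UTF8 also writes a byte order mark).
    Sub ConvertFileToUtf8(inputPath As String, outputPath As String, sourceEncoding As Encoding)
        ' Decoding happens here; the resulting String is already Unicode.
        Dim content As String = File.ReadAllText(inputPath, sourceEncoding)
        File.WriteAllText(outputPath, content, Encoding.UTF8)
    End Sub

    Sub Main()
        ' Assumption: the input file is in code page 874.
        ConvertFileToUtf8("input.txt", "output.txt", Encoding.GetEncoding(874))
    End Sub
End Module
```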

Here is the whole conversion done in memory:

' Raw input bytes in code page 874 (not UTF-8 yet).
Dim cp874_input As Byte()
...
Dim converted As Byte() = Encoding.Convert(Encoding.GetEncoding(874), Encoding.UTF8, cp874_input)

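The number 874 identifies the code page your input is in. Whether a particular OS installation supports this code page is another question, but if you used it just now to write your Stack Overflow question, your own system almost certainly supports it.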
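To make the in-memory step concrete, the following sketch shows that `Encoding.Convert` gives the same bytes as decoding to a String and re-encoding as UTF-8. `ToUtf8` is a hypothetical helper name, and the two sample bytes are simply the first two Thai letters in code page 874, not data from the question.

```vb
Imports System
Imports System.Linq
Imports System.Text

Module InMemoryConversion
    ' Hypothetical helper: convert bytes from a source encoding to UTF-8 bytes.
    Function ToUtf8(legacyBytes As Byte(), source As Encoding) As Byte()
        Return Encoding.Convert(source, Encoding.UTF8, legacyBytes)
    End Function

    Sub Main()
        ' Assumption: code page 874 is available on this system.
        Dim legacy As Encoding = Encoding.GetEncoding(874)

        ' &HA1 and &HA2 are U+0E01 and U+0E02 in code page 874.
        Dim legacyBytes As Byte() = {&HA1, &HA2}

        Dim utf8Bytes As Byte() = ToUtf8(legacyBytes, legacy)
        Console.WriteLine(BitConverter.ToString(utf8Bytes)) ' E0-B8-81-E0-B8-82

        ' Same result via an intermediate String: decode, then encode.
        Dim viaString As Byte() = Encoding.UTF8.GetBytes(legacy.GetString(legacyBytes))
        Console.WriteLine(utf8Bytes.SequenceEqual(viaString)) ' True
    End Sub
End Module
```

Going through a String is usually the more convenient route when the text is needed in the program itself, while `Encoding.Convert` fits cases where only the bytes are passed on.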

answered 2012-07-13T08:28:42.393