如果您有一个 UTF-8 字符串,其中每个字节都是正确的('Ö' -> [195, 0] , [150, 0]),您可以使用以下内容:
public static string Utf8ToUtf16(string utf8String)
{
/***************************************************************
* Every .NET string will store text with the UTF-16 encoding, *
* known as Encoding.Unicode. Other encodings may exist as *
* Byte-Array or incorrectly stored with the UTF-16 encoding. *
* *
* UTF-8 = 1 bytes per char *
* ["100" for the ansi 'd'] *
* ["206" and "186" for the russian '?'] *
* *
* UTF-16 = 2 bytes per char *
* ["100, 0" for the ansi 'd'] *
* ["186, 3" for the russian '?'] *
* *
* UTF-8 inside UTF-16 *
* ["100, 0" for the ansi 'd'] *
* ["206, 0" and "186, 0" for the russian '?'] *
* *
* First we need to get the UTF-8 Byte-Array and remove all *
* 0 byte (binary 0) while doing so. *
* *
* Binary 0 means end of string on UTF-8 encoding while on *
* UTF-16 one binary 0 does not end the string. Only if there *
* are 2 binary 0, than the UTF-16 encoding will end the *
* string. Because of .NET we don't have to handle this. *
* *
* After removing binary 0 and receiving the Byte-Array, we *
* can use the UTF-8 encoding to string method now to get a *
* UTF-16 string. *
* *
***************************************************************/
// Get UTF-8 bytes and remove binary 0 bytes (filler)
List<byte> utf8Bytes = new List<byte>(utf8String.Length);
foreach (byte utf8Byte in utf8String)
{
// Remove binary 0 bytes (filler)
if (utf8Byte > 0) {
utf8Bytes.Add(utf8Byte);
}
}
// Convert UTF-8 bytes to UTF-16 string
return Encoding.UTF8.GetString(utf8Bytes.ToArray());
}
在我的情况下,DLL 结果也是一个 UTF-8 字符串,但不幸的是,UTF-8 字符串是用 UTF-16 编码('Ö' -> [195, 0], [19, 32])解释的。因此,150 的 ANSI '-' 被转换为 8211 的 UTF-16 '-'。如果您也遇到这种情况,您可以使用以下代码:
public static string Utf8ToUtf16(string utf8String)
{
// Get UTF-8 bytes by reading each byte with ANSI encoding
byte[] utf8Bytes = Encoding.Default.GetBytes(utf8String);
// Convert UTF-8 bytes to UTF-16 bytes
byte[] utf16Bytes = Encoding.Convert(Encoding.UTF8, Encoding.Unicode, utf8Bytes);
// Return UTF-16 bytes as UTF-16 string
return Encoding.Unicode.GetString(utf16Bytes);
}
或本机方法:
[DllImport("kernel32.dll")]
private static extern Int32 MultiByteToWideChar(UInt32 CodePage, UInt32 dwFlags, [MarshalAs(UnmanagedType.LPStr)] String lpMultiByteStr, Int32 cbMultiByte, [Out, MarshalAs(UnmanagedType.LPWStr)] StringBuilder lpWideCharStr, Int32 cchWideChar);
public static string Utf8ToUtf16(string utf8String)
{
Int32 iNewDataLen = MultiByteToWideChar(Convert.ToUInt32(Encoding.UTF8.CodePage), 0, utf8String, -1, null, 0);
if (iNewDataLen > 1)
{
StringBuilder utf16String = new StringBuilder(iNewDataLen);
MultiByteToWideChar(Convert.ToUInt32(Encoding.UTF8.CodePage), 0, utf8String, -1, utf16String, utf16String.Capacity);
return utf16String.ToString();
}
else
{
return String.Empty;
}
}
如果您需要它,请参阅Utf16ToUtf8。希望我能有所帮助。