c# - Representing byte values greater than 127 in a .NET string

Question

I'm writing some binary protocol messages in .NET using strings, and it mostly works, except for one particular case.

The message I'm trying to send is:

String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";  
myDevice.Write(Encoding.ASCII.GetBytes(cmdPacket));

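(To help decode it, the bytes are 253, 11, 22, followed by the ASCII characters "MBEPEXE1.".)

The problem is that when I call Encoding.ASCII.GetBytes, the 0xFD comes out as byte 0x3F (the value 253 is changed to 63).

(I should point out that \x0B and \x16 are interpreted correctly as hex 0B and hex 16.)

I've also tried Encoding.UTF8 and Encoding.UTF7, to no avail.

I feel there is probably a nice simple way of expressing values of 128 and above in strings and converting them to bytes, but I'm missing it.

Any guidance?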

score 4 · Accepted Answer

Ignoring whether what you are doing is good or bad, the ISO-8859-1 encoding maps each of its characters to the Unicode character with the same code.

// Bytes with all the possible values 0-255
var bytes = Enumerable.Range(0, 256).Select(p => (byte)p).ToArray();

// String containing the values
var all1bytechars = new string(bytes.Select(p => (char)p).ToArray());

// Sanity check
Debug.Assert(all1bytechars.Length == 256);

// The encoder, you could make it static readonly
var enc = Encoding.GetEncoding("ISO-8859-1"); // It is the codepage 28591

// string-to-bytes
var bytes2 = enc.GetBytes(all1bytechars);

// bytes-to-string
var all1bytechars2 = enc.GetString(bytes);

// check string-to-bytes
Debug.Assert(bytes.SequenceEqual(bytes2));

// check bytes-to-string
Debug.Assert(all1bytechars.SequenceEqual(all1bytechars2));

From Wikipedia:

ISO-8859-1 was incorporated as the first 256 code points of ISO/IEC 10646 and Unicode.
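Applied to the packet from the question, the same idea might look like the sketch below. The class name Latin1PacketDemo and the helper ToWireBytes are made up for illustration, and the string is written with \u escapes on the assumption that it is meant to hold exactly the 12 characters the question describes.

```csharp
using System;
using System.Text;

public static class Latin1PacketDemo
{
    // ISO-8859-1 (code page 28591) maps U+0000..U+00FF to bytes 0x00..0xFF one-to-one.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Hypothetical helper: turns a "one char per byte" string into the bytes to send.
    public static byte[] ToWireBytes(string packet)
    {
        return Latin1.GetBytes(packet);
    }

    public static void Main()
    {
        // Same characters as "\xFD\x0B\x16MBEPEXE1." in the question.
        // \u always takes exactly four hex digits; \x takes one to four, so a
        // \x escape followed by a hex digit (0-9, a-f, A-F) would absorb it.
        string cmdPacket = "\u00FD\u000B\u0016MBEPEXE1.";

        byte[] wire = ToWireBytes(cmdPacket);
        Console.WriteLine(BitConverter.ToString(wire));
        // prints FD-0B-16-4D-42-45-50-45-58-45-31-2E

        // Decoding with the same encoding restores the original string.
        Console.WriteLine(Latin1.GetString(wire) == cmdPacket); // prints True
    }
}
```

The question's literal happens to be safe because \x16 is followed by 'M', which is not a hex digit. Characters above U+00FF have no byte in this encoding and are not preserved.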

Or a simple and fast method to convert a string to a byte[] (with checked and unchecked variants):

public static byte[] StringToBytes(string str)
{
    var bytes = new byte[str.Length];

    for (int i = 0; i < str.Length; i++)
    {
        bytes[i] = checked((byte)str[i]); // Slower but throws OverflowException if there is an invalid character
        //bytes[i] = unchecked((byte)str[i]); // Faster
    }

    return bytes;
}

score 2 · Accepted Answer

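ASCII is a 7-bit code. The high-order bit used to be used as a parity bit, so "ASCII" could have even, odd, or no parity. You may notice that 0x3F (decimal 63) is the ASCII character ?. That is what the CLR's ASCII encoding converts non-ASCII characters (anything greater than 0x7F/decimal 127) to. The reason is that there is no standard ASCII character representation for code points in the range 0x80–0xFF.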
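The substitution is easy to reproduce. The following is a minimal sketch, with a made-up class name (AsciiFallbackDemo), that encodes the packet from the question with Encoding.ASCII and prints the resulting bytes:

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Encodes the question's packet with Encoding.ASCII and returns the bytes.
    public static byte[] EncodeWithAscii()
    {
        // \u00FD is the same character the question writes as \xFD.
        string cmdPacket = "\u00FD\u000B\u0016MBEPEXE1.";
        return Encoding.ASCII.GetBytes(cmdPacket);
    }

    public static void Main()
    {
        byte[] bytes = EncodeWithAscii();

        // The first byte is 0x3F ('?') instead of 0xFD; the rest are unchanged.
        Console.WriteLine(BitConverter.ToString(bytes));
        // prints 3F-0B-16-4D-42-45-50-45-58-45-31-2E
    }
}
```

Only the first character is affected, because every other character in the packet is below 0x80.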

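C# strings are UTF-16 encoded Unicode internally. If what you care about is the byte values of the string, and you know that the string really consists of characters whose Unicode code points are in the range U+0000 to U+00FF, then it's easy. Unicode's first 256 code points (0x00–0xFF), the Unicode blocks C0 Controls and Basic Latin (\x00-\x7F) and C1 Controls and Latin-1 Supplement (\x80-\xFF), are the "normal" ISO-8859-1 characters. A simple incantation like this: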

String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";  
byte[] buffer = cmdPacket.Select(c=>(byte)c).ToArray() ;
myDevice.Write(buffer);

will give you the byte[] you want, in this case

// \xFD   \x0B   \x16   M      B      E     P      E      X      E      1      .
[  0xFD , 0x0B , 0x16 , 0x4d , 0x42 , 0x45, 0x50 , 0x45 , 0x58 , 0x45 , 0x31 , 0x2E ]

score 1 · Accepted Answer

Using LINQ, you could do something like this:

String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";  
myDevice.Write(cmdPacket.Select(Convert.ToByte).ToArray());

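Edit: added an explanation

First, you recognize that your string is really just an array of characters. What you want is an "equivalent" array of bytes, where each byte corresponds to one character.

To get that array, you have to "map" each character of the original array to a byte in the new array. To do that, you can use the built-in System.Convert.ToByte(char) method.

Once you've described the mapping from characters to bytes, it's as simple as projecting the input string through that mapping into an array.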
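One difference between this mapping and a plain (byte) cast shows up only for characters above U+00FF. The sketch below compares the two; the class and method names (CastVersusConvertDemo, WithCast, WithConvert) are invented for the example, and the cast is wrapped in unchecked so the behaviour does not depend on compiler settings.

```csharp
using System;
using System.Linq;

public static class CastVersusConvertDemo
{
    // Unchecked cast: keeps only the low 8 bits of each UTF-16 code unit.
    public static byte[] WithCast(string s)
    {
        return s.Select(c => unchecked((byte)c)).ToArray();
    }

    // Convert.ToByte(char): throws OverflowException for characters above U+00FF.
    public static byte[] WithConvert(string s)
    {
        return s.Select(c => Convert.ToByte(c)).ToArray();
    }

    public static void Main()
    {
        string ok = "\u00FD\u000B\u0016";
        Console.WriteLine(BitConverter.ToString(WithCast(ok)));    // prints FD-0B-16
        Console.WriteLine(BitConverter.ToString(WithConvert(ok))); // prints FD-0B-16

        string euro = "\u20AC"; // the euro sign, outside the one-byte range
        Console.WriteLine(BitConverter.ToString(WithCast(euro)));  // prints AC
        try
        {
            WithConvert(euro);
        }
        catch (OverflowException)
        {
            Console.WriteLine("Convert.ToByte rejected U+20AC");
        }
    }
}
```

For a binary protocol, failing loudly on an out-of-range character is usually easier to debug than sending a silently truncated byte.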

Hope that helps!

score 0 · Accepted Answer

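I use Windows-1252 as it seems to give the most bang for the byte
and is compatible with all .NET string values.
You will probably want to comment out the ToLower.
This was built for compatibility with SQL char (single byte).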

namespace String1byte
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow : Window
    {
        public MainWindow()
        {
            InitializeComponent();
            String8bit s1 = new String8bit("cat");
            String8bit s2 = new String8bit("cat");
            String8bit s3 = new String8bit("\xFD\x0B\x16MBEPEXE1.");
            HashSet<String8bit> hs = new HashSet<String8bit>();
            hs.Add(s1);
            hs.Add(s2);
            hs.Add(s3);
            System.Diagnostics.Debug.WriteLine(hs.Count.ToString());
            System.Diagnostics.Debug.WriteLine(s1.Value + " " + s1.GetHashCode().ToString());
            System.Diagnostics.Debug.WriteLine(s2.Value + " " + s2.GetHashCode().ToString());
            System.Diagnostics.Debug.WriteLine(s3.Value + " " + s3.GetHashCode().ToString());
            System.Diagnostics.Debug.WriteLine(s1.Equals(s2).ToString());
            System.Diagnostics.Debug.WriteLine(s1.Equals(s3).ToString());
            System.Diagnostics.Debug.WriteLine(s1.MatchStart("ca").ToString());
            System.Diagnostics.Debug.WriteLine(s3.MatchStart("ca").ToString());
        }
    }

    public struct String8bit
    {
        private static Encoding EncodingUnicode = Encoding.Unicode;
        private static Encoding EncodingWin1252 = System.Text.Encoding.GetEncoding("Windows-1252");
        private byte[] bytes;
        public override bool Equals(Object obj)
        {
            // Check for null values and compare run-time types.
            if (obj == null) return false;
            if (!(obj is String8bit)) return false;
            String8bit comp = (String8bit)obj;
            if (comp.Bytes.Length != this.Bytes.Length) return false;
            for (Int32 i = 0; i < comp.Bytes.Length; i++)
            {
                if (comp.Bytes[i] != this.Bytes[i])
                    return false;
            }
            return true;
        }
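        public override int GetHashCode()
        {
            // XOR every byte into one of the four byte lanes of the hash.
            // Starting at 0 with i = 0 also makes an empty array hash to 0 instead of throwing.
            UInt32 hash = 0;
            for (Int32 i = 0; i < Bytes.Length; i++) hash ^= (UInt32)Bytes[i] << ((i % 4) * 8);
            return unchecked((Int32)hash);
        }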
        public bool MatchStart(string start)
        {
            if (string.IsNullOrEmpty(start)) return false;
            if (start.Length > this.Length) return false;
            start = start.ToLowerInvariant();   // SQL is case insensitive
            // Convert the string into a byte array
            byte[] unicodeBytes = EncodingUnicode.GetBytes(start);
            // Perform the conversion from one encoding to the other 
            byte[] win1252Bytes = Encoding.Convert(EncodingUnicode, EncodingWin1252, unicodeBytes);
            for (Int32 i = 0; i < win1252Bytes.Length; i++) if (Bytes[i] != win1252Bytes[i]) return false;
            return true;
        }
        public byte[] Bytes { get { return bytes; } }
        public String Value { get { return EncodingWin1252.GetString(Bytes); } }
        public Int32 Length { get { return Bytes.Length; } }
        public String8bit(string word)
        {
            word = word.ToLowerInvariant();     // SQL is case insensitive
            // Convert the string into a byte array 
            byte[] unicodeBytes = EncodingUnicode.GetBytes(word);
            // Perform the conversion from one encoding to the other 
            bytes = Encoding.Convert(EncodingUnicode, EncodingWin1252, unicodeBytes);
        }
        public String8bit(Byte[] win1252bytes)
        {   // if reading from SQL char then read as System.Data.SqlTypes.SqlBytes
            bytes = win1252bytes;
        }
    }
}
