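I was looking for a general way to solve this problem rather than one that only handles EBCDIC 37, and I didn't want to compare two code charts by eye. I wrote a short program (in VB.NET) that finds every character that exists in one code page but not in the other.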
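Imports System.Collections.Generic
Imports System.Linq
Imports System.Text

' Pick source and target codepages.
' Encoding.Default is Windows 1252 on a Western-locale Windows OS under .NET Framework;
' on .NET Core and later it is UTF-8, so request the code page explicitly there.
Dim sourceEncoding As Encoding = Encoding.Default
Dim targetEncoding As Encoding = Encoding.GetEncoding("IBM037")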
' Get every character in the codepage.
' VB.NET array declarations give the upper bound, so (255) yields 256 elements.
Dim inbytes(255) As Byte
For code As Integer = 0 To 255
    inbytes(code) = Convert.ToByte(code)
Next
' Convert the bytes from the source encoding to the target, then back again.
' Those bytes that convert back to the original value exist in both codepages.
' The bytes that change do not exist in the target encoding.
Dim input As String = sourceEncoding.GetString(inbytes)
Dim outbytes As Byte() = Encoding.Convert(sourceEncoding, targetEncoding, inbytes)
Dim convertedbytes As Byte() = Encoding.Convert(targetEncoding, sourceEncoding, outbytes)
Dim output As String = sourceEncoding.GetString(convertedbytes)
Dim diffs As New List(Of Char)()
For idx As Integer = 0 To input.Length - 1
    If input(idx) <> output(idx) Then
        diffs.Add(input(idx))
    End If
Next
' Print results.
Console.WriteLine("Source: " + input)
Console.WriteLine("(Coded): " + String.Join(" ", inbytes.Select(Function (x) Convert.ToInt32(x).ToString()).ToArray()))
Console.WriteLine()
Console.WriteLine("Target: " + output)
Console.WriteLine("(Coded): " + String.Join(" ", convertedbytes.Select(Function (x) Convert.ToInt32(x).ToString()).ToArray()))
Console.WriteLine()
Console.WriteLine("Cannot convert: " + String.Join(" ", diffs.Select(Function (x) Convert.ToInt32(x).ToString()).ToArray()))
For the Windows 1252 to EBCDIC 37 case, 27 characters do not map. For each of them I picked what I thought was the closest equivalent.
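The substitutions themselves are not listed above, so the following is only a minimal sketch of how such a table could be applied before converting. It assumes the unmapped characters include typographic punctuation such as curly quotes, dashes and the ellipsis. The `Replacements` table, the `ApplyBestFit` function and every mapping in it are hypothetical illustrations, not the choices actually made for the 27 characters.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text

Module BestFitSketch
    ' Hypothetical substitutions (assumed for illustration only):
    ' typographic punctuation mapped to plain ASCII equivalents.
    Private ReadOnly Replacements As New Dictionary(Of Char, String) From {
        {ChrW(&H2018), "'"},
        {ChrW(&H2019), "'"},
        {ChrW(&H201C), """"},
        {ChrW(&H201D), """"},
        {ChrW(&H2013), "-"},
        {ChrW(&H2014), "-"},
        {ChrW(&H2026), "..."}
    }

    ' Replace every character found in the table; copy all others unchanged.
    Function ApplyBestFit(text As String) As String
        Dim sb As New StringBuilder(text.Length)
        For Each ch As Char In text
            Dim replacement As String = Nothing
            If Replacements.TryGetValue(ch, replacement) Then
                sb.Append(replacement)
            Else
                sb.Append(ch)
            End If
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Curly quotes around "Hi" followed by an ellipsis character.
        Dim sample As String = ChrW(&H201C) & "Hi" & ChrW(&H201D) & ChrW(&H2026)
        Console.WriteLine(ApplyBestFit(sample)) ' prints "Hi"...
    End Sub
End Module
```

The cleaned string can then be passed to `Encoding.GetEncoding("IBM037").GetBytes(...)`. Running the round-trip check above on the cleaned text should confirm whether any unmapped characters remain.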