0

我有使用 win98 以旧哈萨克字体(哈萨克斯坦)编写的文件。现在我们使用 Times New Roman,但这种字体显示奇怪的 unicode 字符。我可以使用替换 (Ctrl + H) 将所有符号更改为 Times New Roman 编码,但我们有 42 个(两种情况下都是 84 个)字母。

例如,我在第一行有来自旧字体的所有符号,在第二行有来自新字体的所有符号,顺序相同。

有人可以编写一个示例脚本,逐个字符地读取这两行,在Java中制作类似字典的东西,然后进行全局替换。

更新

谢谢罗曼普利施克!

我写了一个宏,它递归地应用于某个文件夹中的所有 *.doc 文件。

Sub Substitution()
'
' Substitution of the chars from font Times/Kazakh
' to Times New Roman
' Chars to substitute are 176-255 bytes, 73 and 105 byte
Dim sTab As String
    sTab = "£ª½¥¡¯Ž¼º¾´¢¿žÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ"
    Selection.Find.Font.Shadow = False
    Selection.Find.Replacement.Font.Shadow = False
    For i = 1 To Len(sTab)
    With Selection.Find
        .Text = ChrW(i + 175)
        .Replacement.Text = Mid(sTab, i, 1)
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = True
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.Text = Selection.Find.Text
    Next i
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        .Text = ChrW(105)
        .Replacement.Text = "³"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = True
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.Text = Selection.Find.Text

    With Selection.Find
        .Text = ChrW(73)
        .Replacement.Text = "²"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = True
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.Text = Selection.Find.Text

    ' kazakh language
    Selection.WholeStory
    Selection.LanguageID = WdLanguageID.wdKazakh
    Application.CheckLanguage = False
    Selection.Collapse Direction:=wdCollapseStart
End Sub

    ' Function that Call Substitution() for all documents
    ' in folder vDirectory
Sub LoopDirectory()
    Dim vDirectory As String
    Dim oDoc As Document

    vDirectory = "E:\soft\Dedushka\not\"

    vFile = Dir(vDirectory & "*.doc")

    Do While vFile <> ""
    Set oDoc = Documents.Open(FileName:=vDirectory & vFile)

    Debug.Print ActiveDocument.Name + " Started"
    Call Zamena
    Debug.Print ActiveDocument.Name + " Finish"

    oDoc.Close SaveChanges:=True
    vFile = Dir
    Loop
End Sub
4

1 回答 1

1

我将这个子程序用于类似的转换。代码的“心脏”是字符串sTab的定义。此字符串包含代码 127 及以上的所有字符。一个一个地用新字符填充这个字符串。

如果您有旧哈萨克语编码的代码表,这很简单:在 VBA 编辑器中输入所有以 127 字符开头的字符。VBA 编辑器在 Unicode 中工作,所以这工作。

如果您没有代码表,则必须获取每个字符的旧代码(尝试选择此字符并按 Alt+X)并手动将其写入字符串的正确位置。

在这两种情况下,对于未使用(或异常)的字符,您可以填充空格或其他字符。

其余代码将每个字符替换为 127 以上的代码,用于sTab中的新字符。

Sub Convert()
    Dim sTab As String
    Dim sKod As String
    Dim i As Long
    Dim ch As String

    'new chars 127-255:
    'note: for each character above 127 fill in this table unicode character
    sTab = "ÄÃãÉ¥ÖÜá¹ÈäèÆæéŸÏí“”ëEóeôöoúÌìü†°Ê£§•¶ß®©™ê¨‡gIlÎ__îK__³Ll¼¾ÅåNnѬVñÒ_«»… òÕOõO–—“”‘’÷_OÀàØ‹›øRrŠ‚„šŒœÁÍŽžUÓÔuÙÚùÛûUuÝýk¯£¿G¡"

    'clear all shadow - we use this attrib as flag for changed characters
    Selection.Find.ClearFormatting
    Selection.Find.Font.Shadow = True
    Selection.Find.replacement.ClearFormatting
    Selection.Find.replacement.Font.Shadow = False
    With Selection.Find
        .Text = ""
        .replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = True
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    'changing characters by codetable
    Selection.Find.Font.Shadow = False
    Selection.Find.replacement.Font.Shadow = True
    For i = 1 To Len(sTab)
        With Selection.Find
            ch = Chr(126 + i)
            If ch = "^" Then ch = "^^"
            .Text = ch
            ch = Mid(sTab, i, 1)
            If ch = "^" Then ch = "^^"
            .replacement.Text = ch
            .Forward = True
            .Wrap = wdFindContinue
            .Format = True
            .MatchCase = True
            .MatchWholeWord = False
            .MatchWildcards = False
            .MatchSoundsLike = False
            .MatchAllWordForms = False
        End With
        Selection.Find.Execute Replace:=wdReplaceAll
        Selection.Find.Text = Selection.Find.Text
    Next i
    'clear shadows
    Selection.Find.Font.Shadow = True
    Selection.Find.replacement.Font.Shadow = False
    With Selection.Find
        .Text = ""
        .replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = True
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    ' kazakh language
    Selection.WholeStory
    Selection.LanguageID = WdLanguageID.wdKazakh
    Application.CheckLanguage = False
    Selection.Collapse Direction:=wdCollapseStart
End Sub
于 2013-11-11T12:25:53.833 回答