
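There is a problem with a macro I am writing. I need to convert an export file from a database (a glossary) to a different tag structure so that I can import it into another database.

I have done almost all of the steps, but I am stuck on the following. Most entries are bilingual. However, some entries have multiple English terms (up to 3).

So I need to check the tag sequence and, if I find an entry with two English terms, turn it into two entries. That part is done.

The problem is that even when the macro finds a "correct" entry, it does not skip it and move on to the next one. Instead it tries to modify it as if it were wrong.

Here is the macro code: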

Sub CheckTagSequence()
'DECLARATION OF VARIABLES
Dim textline As String
Dim SourceLang, TargetLang, EntryID As String
Dim i As String
Dim objWdRange As String

'ASSIGNING VALUES TO THE VARIABLES
SourceLang = "<enTerm>"
TargetLang = "<frTerm>"
i = "<entry id="">"

'GO TO FIRST LINE
Selection.GoTo what:=gotoline, which:=GoToFirst
' MOVE DOWN TWO LINES
Selection.MoveDown unit:=wdLine, Count:=2
CONTINUA:
If Left(textline, 8) = i Then ID = textline
Selection.MoveDown unit:=wdLine, Count:=1
If Left(textline, 8) = "<subject" Then su = textline
Selection.MoveDown unit:=wdLine, Count:=1
If Left(textline, 8) = SourcLang Then en = textline
Selection.MoveDown unit:=wdLine, Count:=1
If Left(textline, 8) = TargetLang Then fr = textline
Selection.MoveDown unit:=wdLine, Count:=1
If Left(textline, 8) = "</entry>" Then
Selection.GoTo CONTINUA
ElseIf Left(textline, 8) = SourceLang Then GoTo CORREGGI
End If

CORREGGI:
Selection.MoveUp unit:=wdLine, Count:=3
Selection.HomeKey unit:=wdLine
Selection.MoveDown unit:=wdLine, Count:=2, Extend:=wdExtend
Selection.Copy
Selection.MoveDown unit:=wdLine, Count:=1
Selection.Paste
Selection.MoveDown unit:=wdLine, Count:=1
Selection.MoveDown unit:=wdLine, Count:=2, Extend:=wdExtend
Selection.Copy
Selection.MoveUp unit:=wdLine, Count:=3
Selection.HomeKey unit:=wdLine
Selection.Paste
Selection.MoveDown unit:=wdLine, Count:=1
If Left(textline, 8) = i Then GoTo CONTINUA
End Sub

It gets stuck on these lines:

If Left(textline, 8) = TargetLang Then fr = textline
Selection.MoveDown unit:=wdLine, Count:=1
If Left(textline, 8) = "</entry>" Then
Selection.GoTo CONTINUA

Here is the content of a sample file:

<?xml version="1.0" encoding="UTF-8"?>
<body>
<entry id="">
<subject>IRECRUITMENT</subject>
<enTerm>Media Relations</enTerm>
<frTerm>Relations avec les médias</frTerm>
</entry>
<entry id="">
<subject>IRECRUITMENT</subject>
<enTerm>OCEM</enTerm>
<frTerm>Relations avec les médias</frTerm>
</entry>
<entry id="">
<subject>IRECRUITMENT</subject>
<enTerm>STATISTICS</enTerm>
<enTerm>FIPSS</enTerm>
<frTerm>STATISTIQUES</frTerm>
</entry>
<entry id="">
<subject>IRECRUITMENT</subject>
<enTerm>3rd Nationality</enTerm>
<frTerm>3ème nationalité</frTerm>
</entry>
<entry id="">
<subject>IRECRUITMENT</subject>
<enTerm>FINANCE</enTerm>
<enTerm>CSSDF</enTerm>
<frTerm>FINANCES</frTerm>
</entry>
</body>

Thanks in advance for your help!


1 Answer


I wouldn't attempt this the way you are going about it, or even the way I am going to suggest here. I'd probably try to read the text into an MSXML object, manipulate the XML tree in there, and then write it back. But if your data is as simple as you describe, you only have <enTerm> and <frTerm> elements to deal with, and you can only have 1, 2 or 3 <enTerm> elements, then I think the following code will work and will show you another way to approach this kind of task. However, if your actual data is more complex, there will still be quite a lot of work to do.

I make some comments after the code.

Sub reorgEntries()
Const strFindEnTerm As String = "(\<enTerm\>*\</enTerm\>^13)"
Dim i As Integer
Dim rngContent As Word.Range
Dim rngEntry As Word.Range
Dim strFIndEntry As String
Dim strFindEnTerms(2) As String
Dim strReplaceEnTerms(2) As String
' Finds a complete entry
strFIndEntry = "^13\<entry*\</entry\>"
' First find and replace entries with 3 En terms
strFindEnTerms(1) = "(^13*)" & strFindEnTerm & strFindEnTerm & strFindEnTerm & "(*\</entry\>)"
strReplaceEnTerms(1) = "\1\2\5\1\3\5\1\4\5"
' Then with 2 terms
strFindEnTerms(2) = "(^13*)" & strFindEnTerm & strFindEnTerm & "(*\</entry\>)"
strReplaceEnTerms(2) = "\1\2\4\1\3\4"
For i = 1 To 2
  Call ClearFindAndReplaceParameters
  Set rngContent = ActiveDocument.Range
  With rngContent.Find
    .ClearFormatting
    .Text = strFIndEntry
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindStop
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = True
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    While .Execute ' (Replace:=WdReplace.wdReplaceNone)
      Set rngEntry = rngContent.Duplicate
      With rngEntry.Find
        .ClearFormatting
        .Text = strFindEnTerms(i)
        .Replacement.Text = strReplaceEnTerms(i)
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = True
        .MatchSoundsLike = False
        .MatchAllWordForms = False
        While .Execute(Replace:=WdReplace.wdReplaceOne)
        Wend
      End With
      Set rngEntry = Nothing
    Wend
  End With
  Set rngContent = Nothing
Next
End Sub

Sub ClearFindAndReplaceParameters()

' You may need this to make wildcard searches
' work properly after a failed wildcard search
' (there is/was an error in Word)
With Selection.Find
  .ClearFormatting
  .Replacement.ClearFormatting
  .Text = ""
  .Replacement.Text = ""
  .Forward = True
  .Wrap = wdFindStop
  .Format = False
  .MatchCase = False
  .MatchWholeWord = False
  .MatchWildcards = False
  .MatchSoundsLike = False
  .MatchAllWordForms = False
End With

End Sub

As you can tell, the code uses Word's Find/Replace methods to do the replacement, and specifically uses wildcard matching. This is similar to using "regex" or "regular expressions", but Word's built-in wildcard syntax differs from (and is generally less helpful than) most other regex dialects. If you are unfamiliar with regex, it may take some time to understand how this works, but an internet search for articles about Word's wildcard searches should help you work it out. The main thing is that each "()" pair groups part of the pattern into a numbered part, and when you replace by "\1\2\4\1\3\4" you are replacing the found text with parts 1, 2 and 4, followed by parts 1, 3 and 4.
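To see the numbered parts in isolation, here is a small sketch that is separate from the conversion itself. It assumes every tag sits in its own paragraph (so each line ends in ^13), and the macro name SwapTermOrderDemo is made up for illustration. It only reorders two groups, so run it on a copy of the document.

```vba
Sub SwapTermOrderDemo()
    ' Hypothetical demo: moves the <frTerm> line in front of the
    ' <enTerm> text that precedes it, purely to show \1 and \2 at work.
    Dim rng As Word.Range
    Set rng = ActiveDocument.Range
    With rng.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        ' Group 1: text from an <enTerm> tag up to the </enTerm> line
        '          that comes directly before a <frTerm> line.
        ' Group 2: that <frTerm> line, including its paragraph mark.
        .Text = "(\<enTerm\>*\</enTerm\>^13)(\<frTerm\>*\</frTerm\>^13)"
        ' Write group 2 first, then group 1.
        .Replacement.Text = "\2\1"
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

The "<" and ">" characters are escaped with a backslash because, in Word's wildcard mode, they otherwise mean start and end of a word.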

To keep the expressions simpler, the code loops looking for "Entries", then processes each Entry. Searching only for entries with more than one enTerm element is considerably harder. In fact, I am not even sure it is possible in Word's regex dialect. If someone has a regex that works, I hope they'll tell us.

Unfortunately, this particular code is already at its limits. If you had to search for 4 EnTerms as well, you could not simply extend it, because you can only specify 10 replacement parts in the Replace string.

Because you also have to consider what happens to the Ranges after you make a replacement, it was simpler in this case to make two complete passes through the text.
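For comparison, the MSXML route mentioned at the start of this answer might look roughly like the sketch below. It is untested against your real data and makes several assumptions: the export is well-formed XML with the structure shown in the question, MSXML 6.0 is available on the machine, and the file is processed on disk rather than inside the Word document. The names SplitEntriesWithMsxml, KeepOnlyEnTerm, inPath and outPath are invented for this example.

```vba
Sub SplitEntriesWithMsxml(ByVal inPath As String, ByVal outPath As String)
    ' Assumption: inPath is well-formed XML (<body> containing <entry> elements).
    Dim doc As Object            ' MSXML2.DOMDocument60, late bound
    Dim entry As Object          ' one <entry> element
    Dim clone As Object          ' deep copy of an <entry>
    Dim entryList As New Collection
    Dim n As Long, k As Long

    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    If Not doc.Load(inPath) Then
        MsgBox "Could not parse " & inPath & ": " & doc.parseError.reason
        Exit Sub
    End If

    ' Copy the entries into a Collection first, so that adding
    ' new entries to the tree cannot affect the list being walked.
    For Each entry In doc.selectNodes("//entry")
        entryList.Add entry
    Next entry

    For Each entry In entryList
        n = entry.selectNodes("enTerm").Length
        ' One extra entry per additional enTerm. Working from the last
        ' enTerm to the second keeps the copies in the original order.
        For k = n - 1 To 1 Step -1
            Set clone = entry.cloneNode(True)
            KeepOnlyEnTerm clone, k
            If entry.nextSibling Is Nothing Then
                entry.parentNode.appendChild clone
            Else
                entry.parentNode.insertBefore clone, entry.nextSibling
            End If
        Next k
        ' The original entry keeps only its first enTerm.
        If n > 1 Then KeepOnlyEnTerm entry, 0
    Next entry

    ' The layout (line breaks) of the saved file may differ from the input.
    doc.Save outPath
End Sub

Private Sub KeepOnlyEnTerm(ByVal entry As Object, ByVal keepIndex As Long)
    ' Removes every <enTerm> child except the one at keepIndex (0-based).
    Dim terms As Object
    Dim j As Long
    Set terms = entry.selectNodes("enTerm")
    For j = terms.Length - 1 To 0 Step -1
        If j <> keepIndex Then entry.removeChild terms.Item(j)
    Next j
End Sub
```

You would call it from another macro or the Immediate window, for example SplitEntriesWithMsxml "C:\temp\export.xml", "C:\temp\export_split.xml" (both paths are placeholders). Because it works on the element tree rather than on text patterns, it has no built-in limit on the number of enTerm elements per entry.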

Now, just some comments on the code you posted here in case you would prefer to try to fix that.

  • You don't set textline (you would need to set it to Selection.Text or possibly Selection.Paragraphs(1).Text )
  • You would need to compare Left(textline,8) with Left(i,8) rather than with i, because i is longer than 8 characters and so the comparison can never be true
  • I think you would probably need Goto CONTINUA rather than Selection.Goto CONTINUA
  • As a matter of programming style, it's better to try to avoid Goto statements. Apart from anything else, it becomes very hard for people to understand what you are actually trying to achieve.

Finally, the variable name "i" is usually used as an integer variable, especially as a loop counter and so on. For a temporary string, some people would use "s" . Others would always use a longer name such as strEntry.
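As an illustration of those points, here is a minimal sketch of a GoTo-free loop that actually reads each line before testing it. It only reports entries that have more than one enTerm line and does not modify anything. It assumes each tag is in its own paragraph with no leading spaces, and the macro and variable names are made up for the example.

```vba
Sub ListEntriesWithSeveralEnTerms()
    Dim para As Word.Paragraph
    Dim textline As String
    Dim enCount As Long

    For Each para In ActiveDocument.Paragraphs
        ' Read the text of the current line; this is the step
        ' that the posted macro never performs.
        textline = para.Range.Text

        If Left(textline, 6) = "<entry" Then
            ' A new entry starts: reset the counter.
            enCount = 0
        ElseIf Left(textline, 8) = "<enTerm>" Then
            enCount = enCount + 1
        ElseIf Left(textline, 8) = "</entry>" Then
            ' End of the entry: report it only if it needs splitting.
            If enCount > 1 Then
                Debug.Print "Entry ending at position " & para.Range.End & _
                            " has " & enCount & " enTerm lines"
            End If
        End If
    Next para
End Sub
```

The results appear in the Immediate window of the VBA editor. Entries with a single enTerm fall through without any action, which is the "skip the correct entry" behaviour the original macro was missing.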

Answered 2013-10-10T13:50:30.067