I'll try to make this easy to understand:

   <!--
   <BASIC_INFO>
       KOREAN                      =   ¼®À¯
       ENGLISH                     =   OIL
       CODE                        =   AA01
       ACTIVE                      =   FALSE
       LABEL                       =   0
   </BASIC_INFO>
   <OPTION>
       ANIMATION                   =   ¿©±â¿¡ ¼³¸í
   </OPTION>
   <BUY_INFO>
       BUYABLE                     =   FALSE
       BUYTYPE                     =   9
       BUYOPTION                   =   0
       COST                        =   0
       ADD_DINAR                   =   0
       REQ_BP                      =   0
       REQ_LVL                     =   1
       RANDOM_NUM                  =   0
   </BUY_INFO>
   <USE_INFO>
       APPLY_TARGET                =   0
       APPLY_OPTION                =   0
       ADD_POING                   =   0
       DURATIONTIME                =   0
   </USE_INFO>
   <ABILITY_INFO>
   </ABILITY_INFO>
   //-->
   <!--
   <BASIC_INFO>
       KOREAN                      =   Âü³ª¹«
       ENGLISH                     =   OAK
       CODE                        =   AB01
       ACTIVE                      =   FALSE
       LABEL                       =   0
   </BASIC_INFO>
   <OPTION>
       ANIMATION                   =   ¿©±â¿¡ ¼³¸í
   </OPTION>
   <BUY_INFO>
       BUYABLE                     =   FALSE
       BUYTYPE                     =   9
       BUYOPTION                   =   0
       COST                        =   0
       ADD_DINAR                   =   0
       REQ_BP                      =   0
       REQ_LVL                     =   1
       RANDOM_NUM                  =   0
   </BUY_INFO>
   <USE_INFO>
       APPLY_TARGET                =   0
       APPLY_OPTION                =   0
       ADD_POING                   =   0
       DURATIONTIME                =   0
   </USE_INFO>
   <ABILITY_INFO>
   </ABILITY_INFO>
   //-->

I want to match everything inside <!-- //-->, but I can't find a regex that does it... The first match should look like this:

   <BASIC_INFO>
       KOREAN                      =   ¼®À¯
       ENGLISH                     =   OIL
       CODE                        =   AA01
       ACTIVE                      =   FALSE
       LABEL                       =   0
   </BASIC_INFO>
   <OPTION>
       ANIMATION                   =   ¿©±â¿¡ ¼³¸í
   </OPTION>
   <BUY_INFO>
       BUYABLE                     =   FALSE
       BUYTYPE                     =   9
       BUYOPTION                   =   0
       COST                        =   0
       ADD_DINAR                   =   0
       REQ_BP                      =   0
       REQ_LVL                     =   1
       RANDOM_NUM                  =   0
   </BUY_INFO>
   <USE_INFO>
       APPLY_TARGET                =   0
       APPLY_OPTION                =   0
       ADD_POING                   =   0
       DURATIONTIME                =   0
   </USE_INFO>
   <ABILITY_INFO>
   </ABILITY_INFO>

<!--(?<NodeContent>[^//\-\-\>]*)//-->

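This is what I tried, but the character class is checked one character at a time! That means it fails whenever /, - or > appears anywhere inside <!-- //-->. Does anyone know how to solve this?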

Edit

This is what the whole document structure looks like: http://pastebin.com/cyESrLTB - my goal is to convert it to XML.

2 Answers

Try:

<!--(?<NodeContent>.*?)//-->

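The ? makes the match lazy, so it matches as few characters as possible. Note that . does not match newlines by default, so for multi-line blocks like yours the pattern needs single-line mode (in .NET, which the named-group syntax suggests, that is RegexOptions.Singleline). Breaking it down:

  • <!-- - matches the literal <!--
  • (?<NodeContent>.*?) - lazily matches any characters and captures them in a group named NodeContent
  • //--> - matches the literal //-->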
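
To make the breakdown above concrete, here is a minimal sketch of how the pattern might be applied in C#. The class name CommentExtractor, the method ExtractBlocks, and the shortened sample input are made up for illustration; the Trim() call is an extra step that is not part of the answer's pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class CommentExtractor
{
    // Lazy match between <!-- and //-->. Singleline lets '.' match newlines,
    // which is needed because each block spans several lines.
    private static readonly Regex BlockPattern = new Regex(
        @"<!--(?<NodeContent>.*?)//-->",
        RegexOptions.Singleline);

    public static List<string> ExtractBlocks(string input)
    {
        var blocks = new List<string>();
        foreach (Match m in BlockPattern.Matches(input))
        {
            // Trim the whitespace left next to the delimiters (extra step).
            blocks.Add(m.Groups["NodeContent"].Value.Trim());
        }
        return blocks;
    }

    public static void Main()
    {
        // Shortened stand-in for the real file contents.
        string sample =
            "<!--\n<BASIC_INFO>\n    ENGLISH = OIL\n</BASIC_INFO>\n//-->\n" +
            "<!--\n<BASIC_INFO>\n    ENGLISH = OAK\n</BASIC_INFO>\n//-->\n";

        foreach (string block in ExtractBlocks(sample))
        {
            Console.WriteLine(block);
            Console.WriteLine("----");
        }
    }
}
```

Because the quantifier is lazy, each match stops at the first //--> it reaches, so the two blocks come back as separate matches even though /, - and > may appear inside them.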
answered 2013-08-29T13:38:36.077

You don't need a regex here; use an HTML parser such as HtmlAgilityPack:

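// requires: using System.Collections.Generic; using System.Linq;
var doc = new HtmlAgilityPack.HtmlDocument();
doc.Load(fname); // fname: path to the input file
// SelectNodes returns null when nothing matches, so guard before using LINQ
var nodes = doc.DocumentNode.SelectNodes("//comment()");
var comments = nodes == null
    ? new List<string>()
    : nodes.Select(n => n.InnerText).ToList();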
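
As far as I know, the InnerText of a comment node still includes the comment delimiters, so the strings collected above may need one more cleanup pass before they match the output asked for in the question. The following is a minimal sketch under that assumption; CommentText and StripDelimiters are hypothetical names and are not part of HtmlAgilityPack.

```csharp
using System;

// Hypothetical helper for illustration only; not part of HtmlAgilityPack.
public static class CommentText
{
    // Removes a leading "<!--" and a trailing "//-->" (or plain "-->") if present.
    public static string StripDelimiters(string comment)
    {
        string text = comment.Trim();
        if (text.StartsWith("<!--", StringComparison.Ordinal))
            text = text.Substring(4);
        if (text.EndsWith("//-->", StringComparison.Ordinal))
            text = text.Substring(0, text.Length - 5);
        else if (text.EndsWith("-->", StringComparison.Ordinal))
            text = text.Substring(0, text.Length - 3);
        return text.Trim();
    }

    public static void Main()
    {
        // Shortened stand-in for one comment taken from the file.
        Console.WriteLine(StripDelimiters("<!--\n<BASIC_INFO>\n</BASIC_INFO>\n//-->"));
    }
}
```

If the strings turn out to arrive without delimiters, the helper leaves them unchanged apart from trimming surrounding whitespace.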
answered 2013-08-29T13:40:07.870