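Here is a first attempt at solving this, under the following assumptions:
1. The XML string is valid (i.e. there are no invalid characters between the tags).
Like this: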
string xml = @"<ENTRY><AUTHOR>C. Qiao</AUTHOR>
<AUTHOR>R.Melhem</AUTHOR>
<TITLE>Reducing Communication </TITLE>
<DATE>1995</DATE>
</ENTRY>";
2. Splitting is done on the space character ' '.
string xml = @"<ENTRY><AUTHOR>C. Qiao</AUTHOR>
<AUTHOR>R.Melhem</AUTHOR>
<TITLE>Reducing Communication </TITLE>
<DATE>1995</DATE>
</ENTRY>";
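// Needs: using System; using System.Xml.Linq;
XElement doc = XElement.Parse(xml);
// Walk the direct children of <ENTRY> and split each one's text on spaces
foreach (XElement element in doc.Elements())
{
    var values = element.Value.Split(' ');
    foreach (string value in values)
    {
        Console.WriteLine(element.Name + " " + value);
    }
}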
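This prints the following (the bare TITLE line is the empty token produced by the trailing space in the title):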
AUTHOR C.
AUTHOR Qiao
AUTHOR R.Melhem
TITLE Reducing
TITLE Communication
TITLE
DATE 1995
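If the empty TITLE token is unwanted, one option is to pass `StringSplitOptions.RemoveEmptyEntries` to `Split`. The following is a minimal sketch of that variant, not part of the original attempt; the class name `SpaceSplitSketch` and the helper `Tokenize` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

static class SpaceSplitSketch
{
    // Hypothetical helper: returns "NAME token" lines, skipping empty tokens.
    public static List<string> Tokenize(string xml)
    {
        var lines = new List<string>();
        XElement root = XElement.Parse(xml);
        foreach (XElement element in root.Elements())
        {
            // RemoveEmptyEntries drops the empty string left by a trailing space.
            string[] values = element.Value.Split(
                new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string value in values)
            {
                lines.Add(element.Name + " " + value);
            }
        }
        return lines;
    }

    static void Main()
    {
        string xml = @"<ENTRY><AUTHOR>C. Qiao</AUTHOR>
<AUTHOR>R.Melhem</AUTHOR>
<TITLE>Reducing Communication </TITLE>
<DATE>1995</DATE>
</ENTRY>";
        foreach (string line in Tokenize(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

With the sample entry this should give the same lines as before, minus the bare TITLE line.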
Edit:
Now, to split on both "." and a space, the best way is to use a regular expression. Like this:
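// Needs: using System.Linq; using System.Text.RegularExpressions;
// The capturing group makes Regex.Split return the delimiters as tokens too;
// the Where filter then drops the space and empty entries.
var values = Regex.Split(element.Value, @"(\.| )");
foreach (string value in values.Where(x => !String.IsNullOrWhiteSpace(x)))
{
    Console.WriteLine(element.Name + " " + value);
}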
You can add more delimiters if you like. The example above gives you the following:
AUTHOR C
AUTHOR .
AUTHOR Qiao
AUTHOR R
AUTHOR .
AUTHOR Melhem
TITLE Reducing
TITLE Communication
DATE 1995
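The "." lines appear in that output because the pattern wraps the delimiters in a capturing group, and `Regex.Split` includes captured text in its result. As a rough sketch of the difference (the class and method names here are invented for illustration), compare a capturing pattern with a plain character class:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class RegexSplitSketch
{
    // Capturing group: "." comes back as its own token.
    public static string[] SplitKeepingDelimiters(string text)
    {
        return Regex.Split(text, @"(\.| )")
                    .Where(x => !String.IsNullOrWhiteSpace(x))
                    .ToArray();
    }

    // Character class without a capturing group: delimiters are discarded.
    public static string[] SplitDroppingDelimiters(string text)
    {
        return Regex.Split(text, @"[. ]")
                    .Where(x => !String.IsNullOrWhiteSpace(x))
                    .ToArray();
    }

    static void Main()
    {
        // Joined with '|' so the token boundaries are visible.
        Console.WriteLine(String.Join("|", SplitKeepingDelimiters("C. Qiao")));
        Console.WriteLine(String.Join("|", SplitDroppingDelimiters("C. Qiao")));
    }
}
```

Which form to use depends on whether the punctuation should count as a token of its own; the answer's version keeps it.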
Edit2:
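Here is an example that works with your original string. It is most likely not the best approach, since it does not keep the tokens in their original order, but it should be pretty close: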
string xml = @" <entry>
<AUTHOR>C. Qiao</AUTHOR>
and
<AUTHOR>R.Melhem</AUTHOR>,
""<TITLE>Reducing Communication </TITLE>""
,<DATE>1995</DATE>.
</entry>";
//Parse xml to XDocument
XDocument doc = XDocument.Parse(xml);
// Get first element (we only have one)
XElement element = doc.Descendants().FirstOrDefault();
//Create a copy of an element for use by child elements.
XElement copyElement = new XElement(element);
//Remove all child nodes from root leaving only text
element.Elements().Remove();
// Split the remaining text on the specified delimiters
// (the capturing group keeps the delimiters as tokens)
var values = Regex.Split(element.Value, @"(\.| |\,|\"")");
foreach (string value in values.Where(x => !String.IsNullOrWhiteSpace(x)))
{
    Console.WriteLine(value);
}
// Get the child elements and split their text the same way
foreach (XElement elem in copyElement.Elements())
{
    var val = Regex.Split(elem.Value, @"(\.| |\,|\"")");
    foreach (string value in val.Where(x => !String.IsNullOrWhiteSpace(x)))
    {
        Console.WriteLine(value + " " + elem.Name);
    }
}
// You can try playing with DescendantsAndSelf to see whether
// this can be done in a single pass with the order preserved.
//foreach (XElement elem in element.DescendantsAndSelf())
//{
//    //....
//}
This prints the following:
and
,
"
"
,
.
C AUTHOR
. AUTHOR
Qiao AUTHOR
R AUTHOR
. AUTHOR
Melhem AUTHOR
Reducing TITLE
Communication TITLE
1995 DATE
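One possible way to keep the tokens in document order is to walk the root's child nodes directly: `Nodes()` yields the text nodes and the child elements interleaved as they appear in the source. This is only a sketch under a few assumptions, not a tested replacement for the code above: the class `OrderedTokensSketch` and its helpers are hypothetical, the input has a single level of child elements, and the pattern differs slightly from the one above in that it also splits on any whitespace (including newlines) without keeping it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

static class OrderedTokensSketch
{
    // Punctuation is captured (kept as tokens); whitespace runs are dropped.
    static IEnumerable<string> Split(string text) =>
        Regex.Split(text, @"(\.|,|"")|\s+")
             .Where(x => !String.IsNullOrWhiteSpace(x));

    // Hypothetical helper: untagged tokens are returned as-is,
    // tagged tokens as "token NAME", all in document order.
    public static List<string> Tokenize(string xml)
    {
        var lines = new List<string>();
        XElement root = XElement.Parse(xml);
        foreach (XNode node in root.Nodes())
        {
            if (node is XText text)
            {
                lines.AddRange(Split(text.Value));
            }
            else if (node is XElement child)
            {
                foreach (string token in Split(child.Value))
                {
                    lines.Add(token + " " + child.Name);
                }
            }
        }
        return lines;
    }

    static void Main()
    {
        string xml = @"<entry>
<AUTHOR>C. Qiao</AUTHOR>
and
<AUTHOR>R.Melhem</AUTHOR>,
""<TITLE>Reducing Communication </TITLE>""
,<DATE>1995</DATE>.
</entry>";
        foreach (string line in Tokenize(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

With this approach the untagged "and" should land between the two authors instead of ahead of everything else, and the quotes should surround the title tokens.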