
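I plan to append <image>@matchedFilePathToAnImageHere</image> only to those <item></item> nodes whose value between the <name></name> tags, once converted to lower case with spaces replaced by underscores and so on, matches an actual image file name in a separate folder.

The code matches about 95% of the images to items correctly, but it ends up appending every matched image file name, wrapped in <image></image>, to the first <item></item>.

How would I append each <image></image> to its appropriate <item></item>? Each item needs only one image.

Image folder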

name1.jpg

name_2.jpg

name3.jpg

...

name 998.jpg

XML (before parsing)

<items>
 <item>
  <name>Name1</name>
  <price>Price1</price>
  <description>Description1</description>
 </item>
 <item>
  <name>Name2</name>
  <price>Price2</price>
  <description>Description2</description>
 </item>
 <item>
  <name>Name3</name>
  <price>Price3</price>
  <description>Description3</description>
 </item>
</items>

XML (desired result after parsing)

<items>
 <item>
  <name>Name1</name>
  <price>Price1</price>
  <description>Description1</description>
  <image>C:\path\to\name1.jpg</image>
 </item>
 <item>
  <name>Name2</name>
  <price>Price2</price>
  <description>Description2</description>
  <!-- no image file name matched `name2` (command line notice), so skip appending an image tag here, BUT I add the image tag here later by hand, because I find out that there's an image `name_2.jpg` -->
 </item>
 <item>
  <name>Name3</name>
  <price>Price3</price>
  <description>Description3</description>
  <image>C:\path\to\name3.jpg</image>
 </item>
</items>

Code

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.IO;

namespace myXmlParser
{
    class Program
    {
        static void Main(string[] args)
        {
            // load the xml
            XmlDocument doc = new XmlDocument();
            doc.Load(@"C:\items.xml");


            // retrieve the values between <name></name> for the every item element
            XmlNodeList nodes = doc.SelectNodes("//item/name/text()");

            // convert every extracted name value to lower case
            // replace spaces with underscores
            // remove the ' symbols
            // to have higher chance of matching images' file names
            // ",_the" and "_,a" replaces don't seem to work( oh well )
            for (int i = 0; i < nodes.Count; i++)
            {

                // do the magic!
                string searchKeyword = nodes.Item(i).Value.Replace(" ", "_").Replace("'","").Replace(",_the",(string)"").Replace(",_a","").ToLower();

                //Console.WriteLine(searchKeyword);

                // Now find me any images whose filenames match the searchKeyword minus the extensions
                string[] filePaths = Directory.GetFiles(@"C:\images", searchKeyword + "*", SearchOption.TopDirectoryOnly);

                // if something was found/matched then append <image>@pathToImageHere</image> to the current
                // item node, otherwise log any item nodes that didn't have a match to an image
                // ! Doesn't APPEND properly !
                if (filePaths.Length > 0)
                {
                    XmlDocumentFragment frag = doc.CreateDocumentFragment();
                    frag.InnerXml = @"<image>" + filePaths[0] + @"</image>";
                    doc.DocumentElement.FirstChild.AppendChild(frag);
                }
                else
                {
                    Console.WriteLine("NO IMAGE WAS FOUND!!! {0}", searchKeyword);
                }

                //Console.WriteLine(filePaths[j]);
                //foreach (string filePath in filePaths)
                //{
                    //blah  
                //}
            }

            // now save the new parsed xml somewhere
            doc.Save("items_with_images.xml");

            Console.ReadKey();

        }// main
    }// class
}// namespace

1 Answer

doc.DocumentElement.FirstChild.AppendChild(frag);

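Why are you appending to the first child of the document element? That is always the first <item>. Don't you want to append to the item you are currently processing? Each entry in nodes is the text node inside <name>, so its parent is <name> and its grandparent is the enclosing <item>: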

 nodes.Item(i).ParentNode.ParentNode.AppendChild(frag);
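
If it helps, here is a minimal sketch of the same idea that loops over the <item> elements directly, so there is no need to walk up from the text node. The names ImageAppender, ToKeyword, AppendImages and the findImage callback are hypothetical and not part of the original code; findImage stands in for the Directory.GetFiles lookup from the question so the sketch runs without an image folder. It also lower-cases the name before the other replacements, which I assume is why the ",_the" and ",_a" replacements never matched, and it builds the element with CreateElement so a path containing & or < cannot break the XML.

```csharp
using System;
using System.Linq;
using System.Xml;

static class ImageAppender
{
    // Hypothetical helper: turns an item name into the expected file-name stem.
    // Lower-casing first lets the ",_the" / ",_a" replacements match "The" / "A".
    public static string ToKeyword(string name)
    {
        return name.ToLowerInvariant()
                   .Replace(" ", "_")
                   .Replace("'", "")
                   .Replace(",_the", "")
                   .Replace(",_a", "");
    }

    // findImage is a hypothetical lookup: it returns an image path for a
    // keyword, or null when nothing matches.
    public static void AppendImages(XmlDocument doc, Func<string, string> findImage)
    {
        // Copy the nodes into a list first so the document is not modified
        // while the XPath result is still being enumerated.
        var items = doc.SelectNodes("//item").Cast<XmlNode>().ToList();

        foreach (XmlNode item in items)
        {
            XmlNode nameNode = item.SelectSingleNode("name");
            if (nameNode == null) continue;

            string keyword = ToKeyword(nameNode.InnerText);
            string path = findImage(keyword);
            if (path == null)
            {
                Console.WriteLine("NO IMAGE WAS FOUND!!! {0}", keyword);
                continue;
            }

            // Append to this item, not to doc.DocumentElement.FirstChild.
            XmlElement image = doc.CreateElement("image");
            image.InnerText = path; // InnerText escapes special characters
            item.AppendChild(image);
        }
    }
}

class Program
{
    static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<items><item><name>Name1</name></item>" +
                    "<item><name>Name2</name></item></items>");

        // Stand-in lookup for the demo: only "name1" has an image.
        ImageAppender.AppendImages(
            doc, k => k == "name1" ? @"C:\path\to\name1.jpg" : null);

        Console.WriteLine(doc.OuterXml);
    }
}
```

To use the real folder from the question, the callback could be something like k => Directory.GetFiles(@"C:\images", k + "*").FirstOrDefault(), with using System.IO added at the top.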
answered 2012-08-08T00:59:53.767