My source is:
<content>
<caption>text 1</caption>
<element1>Notepad is a basic text-editing program and it's most commonly used to view or edit text files. A text <bold>file</bold> is a <a>file</a> type typically identified by the .txt file name extension.</element1>
<section1>
<element2>Notepad is a basic text-editing program and it's most commonly used to view or edit text files. A text file is a file type typically identified by the .txt file name extension.</element2>
</section1>
</content>
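I am trying to extract elements and create unique IDs for them. This applies both to elements that contain child (inline) elements mixed with text, which could be any element, and to elements that contain only text. The <bold> and <a> elements should not be split out from their parent element. The desired output is: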
<caption id="id1">Text 1</caption>
<element1 id="id2">Notepad is a basic text-editing program and it's most commonly used to view or edit text files. A text <bold>file</bold> is a <a>file</a> type typically identified by the .txt file name extension.</element1>
<element2 id="id3">Notepad....</element2>
Any ideas would be greatly appreciated.
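One possible approach, sketched here in Python with the standard-library `xml.etree.ElementTree` module rather than in whatever toolchain the question targets: walk the tree in document order, treat any element that directly carries non-whitespace text as a unit, and do not descend into it, so that inline children such as <bold> and <a> stay inside their parent. The helper names (`has_own_text`, `collect_text_elements`, `assign_ids`) and the shortened sample text are assumptions made for illustration only.

```python
import copy
import xml.etree.ElementTree as ET

# Shortened version of the source document (sample text abbreviated).
SOURCE = """<content>
<caption>text 1</caption>
<element1>A text <bold>file</bold> is a <a>file</a> type.</element1>
<section1>
<element2>Notepad is a basic text-editing program.</element2>
</section1>
</content>"""


def has_own_text(elem):
    """True if the element directly contains non-whitespace text."""
    if elem.text and elem.text.strip():
        return True
    # Text that follows a child element is stored in that child's .tail.
    return any(child.tail and child.tail.strip() for child in elem)


def collect_text_elements(elem, found):
    """Collect text-bearing elements in document order."""
    if has_own_text(elem):
        # Treat the element as one unit: inline children are not visited.
        found.append(elem)
    else:
        # Pure container (e.g. content, section1): look inside it.
        for child in elem:
            collect_text_elements(child, found)
    return found


def assign_ids(xml_string):
    """Return the serialized text-bearing elements with id1, id2, ... added."""
    root = ET.fromstring(xml_string)
    result = []
    for n, elem in enumerate(collect_text_elements(root, []), start=1):
        clone = copy.deepcopy(elem)
        clone.tail = None  # drop whitespace that followed the element
        clone.set("id", f"id{n}")
        result.append(ET.tostring(clone, encoding="unicode"))
    return result


for line in assign_ids(SOURCE):
    print(line)
```

A known limitation of this sketch: an element whose only content is an inline child with no surrounding text (for example, a paragraph consisting of a single <bold>) is treated as a container, so the ID would land on the inline child instead. If that case matters, an explicit list of inline element names would be a more reliable test than checking for direct text.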