File a.xml:
<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="pivot.cs">
<DATA RECORDS="2">
<RECORD ID="1">
<INTERNALID>5510</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD</CODAL>
</RECORD>
<RECORD ID="2">
<INTERNALID>5511</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD</CODAL>
</RECORD>
<INTERNALID>5537</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD</CODAL>
</DATA>
</TABLE>
File b.xml:
<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="ALT.CS">
<DATA RECORDS="20">
<RECORD ID="53">
<RECNO>5510</RECNO>
<TOBEEXTRACTED>TIM</TOBEEXTRACTED>
</RECORD>
<RECORD ID="53">
<RECNO>5510</RECNO>
<TOBEEXTRACTED>KLM</TOBEEXTRACTED>
</RECORD>
<RECORD ID="54">
<RECNO>5510</RECNO>
<TOBEEXTRACTED>KAB</TOBEEXTRACTED>
</RECORD>
<RECORD ID="55">
<RECNO>5511</RECNO>
<TOBEEXTRACTED>BUS WEE</TOBEEXTRACTED>
</RECORD>
<RECORD ID="59">
<RECNO>5512</RECNO>
</RECORD>
<RECORD ID="60">
<RECNO>5513</RECNO>
</RECORD>
<RECORD ID="5511">
<RECNO>5598</RECNO>
<TOBEEXTRACTED>FBV</TOBEEXTRACTED>
</RECORD>
</DATA>
</TABLE>
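The output file should be file a.xml, except that when an INTERNALID matches a RECNO in b.xml once or twice, the TOBEEXTRACTED element text is appended to CODAL in []: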
<?xml version="1.0" encoding="UTF-8"?>
<TABLE NAME="pivot.cs">
<DATA RECORDS="2">
<RECORD ID="1">
<INTERNALID>5510</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD</CODAL>
</RECORD>
<RECORD ID="2">
<INTERNALID>5511</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD [BUS WEE]</CODAL>
</RECORD>
<INTERNALID>5537</INTERNALID>
<SOMED>1</SOMED>
<PEMED>1</PEMED>
<CODAL>PLACEHOLD</CODAL>
</DATA>
</TABLE>
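The join rule above could be sketched in Python with the standard-library `xml.etree.ElementTree`. This is only one reading of the requirement, not a definitive solution: the function name `join_codal`, the constant `MAX_MATCHES`, and the shortened sample strings are illustrative. How two matched values should be formatted is not shown in the question, so putting them comma-separated inside one pair of brackets is an assumption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

MAX_MATCHES = 2  # join only when a key matches once or twice


def join_codal(a_xml, b_xml):
    """Return a.xml text with matching TOBEEXTRACTED values appended to CODAL."""
    a_root = ET.fromstring(a_xml)
    b_root = ET.fromstring(b_xml)

    # Group the TOBEEXTRACTED texts found in b.xml by their RECNO.
    extracted = defaultdict(list)
    for rec in b_root.iter("RECORD"):
        recno = rec.findtext("RECNO")
        value = rec.findtext("TOBEEXTRACTED")
        if recno is not None and value is not None:
            extracted[recno.strip()].append(value.strip())

    # Append to CODAL only for records with one or two matches.
    for rec in a_root.iter("RECORD"):
        key = (rec.findtext("INTERNALID") or "").strip()
        codal = rec.find("CODAL")
        values = extracted.get(key, [])
        if codal is not None and 1 <= len(values) <= MAX_MATCHES:
            # Assumption: two values share one bracket pair, comma-separated.
            codal.text = f"{codal.text or ''} [{', '.join(values)}]"

    return ET.tostring(a_root, encoding="unicode")


# Shortened versions of the two sample files, for illustration only.
A_SAMPLE = """<TABLE NAME="pivot.cs"><DATA RECORDS="2">
<RECORD ID="1"><INTERNALID>5510</INTERNALID><CODAL>PLACEHOLD</CODAL></RECORD>
<RECORD ID="2"><INTERNALID>5511</INTERNALID><CODAL>PLACEHOLD</CODAL></RECORD>
</DATA></TABLE>"""

B_SAMPLE = """<TABLE NAME="ALT.CS"><DATA RECORDS="20">
<RECORD ID="53"><RECNO>5510</RECNO><TOBEEXTRACTED>TIM</TOBEEXTRACTED></RECORD>
<RECORD ID="53"><RECNO>5510</RECNO><TOBEEXTRACTED>KLM</TOBEEXTRACTED></RECORD>
<RECORD ID="54"><RECNO>5510</RECNO><TOBEEXTRACTED>KAB</TOBEEXTRACTED></RECORD>
<RECORD ID="55"><RECNO>5511</RECNO><TOBEEXTRACTED>BUS WEE</TOBEEXTRACTED></RECORD>
<RECORD ID="59"><RECNO>5512</RECNO></RECORD>
</DATA></TABLE>"""

print(join_codal(A_SAMPLE, B_SAMPLE))
```

The sketch only visits INTERNALID values that sit inside a RECORD element, so the unwrapped 5537 entry in the sample passes through unchanged.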
In addition, it would help a lot if we could have a txt file as output, containing the following information: from file a.xml,
INTERNALID: 5511 (and all the rest in a normal xml file) was matched.
INTERNALID: 5510 was matched more than two times, so no join took place.
INTERNALID: 5537 did not match
RECNO 5512 did not have a TOBEEXTRACTED element.
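A report like the one listed above could be assembled along these lines in Python. This is a hedged sketch: `build_report` and the sample strings are hypothetical names and data. A "match" is assumed to mean a b.xml record with the same RECNO that actually carries a TOBEEXTRACTED element. The message wording is copied from the lines above.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def build_report(a_xml, b_xml):
    """Return the report lines for the a.xml / b.xml join as a list of strings."""
    a_root = ET.fromstring(a_xml)
    b_root = ET.fromstring(b_xml)

    counts = Counter()  # RECNO -> number of records carrying TOBEEXTRACTED
    missing = []        # RECNOs whose record has no TOBEEXTRACTED element
    for rec in b_root.iter("RECORD"):
        recno = (rec.findtext("RECNO") or "").strip()
        if rec.find("TOBEEXTRACTED") is None:
            missing.append(recno)
        else:
            counts[recno] += 1

    lines = []
    # iter("INTERNALID") also finds IDs that are not wrapped in a RECORD.
    for elem in a_root.iter("INTERNALID"):
        key = (elem.text or "").strip()
        n = counts[key]
        if n == 0:
            lines.append(f"INTERNALID: {key} did not match")
        elif n <= 2:
            lines.append(f"INTERNALID: {key} was matched.")
        else:
            lines.append(
                f"INTERNALID: {key} was matched more than two times, "
                "so no join took place."
            )
    for recno in missing:
        lines.append(f"RECNO {recno} did not have a TOBEEXTRACTED element.")
    return lines


# Shortened sample data, for illustration only.
A_REPORT_SAMPLE = """<TABLE NAME="pivot.cs"><DATA RECORDS="2">
<RECORD ID="1"><INTERNALID>5510</INTERNALID><CODAL>PLACEHOLD</CODAL></RECORD>
<RECORD ID="2"><INTERNALID>5511</INTERNALID><CODAL>PLACEHOLD</CODAL></RECORD>
<INTERNALID>5537</INTERNALID><CODAL>PLACEHOLD</CODAL>
</DATA></TABLE>"""

B_REPORT_SAMPLE = """<TABLE NAME="ALT.CS"><DATA RECORDS="20">
<RECORD ID="53"><RECNO>5510</RECNO><TOBEEXTRACTED>TIM</TOBEEXTRACTED></RECORD>
<RECORD ID="53"><RECNO>5510</RECNO><TOBEEXTRACTED>KLM</TOBEEXTRACTED></RECORD>
<RECORD ID="54"><RECNO>5510</RECNO><TOBEEXTRACTED>KAB</TOBEEXTRACTED></RECORD>
<RECORD ID="55"><RECNO>5511</RECNO><TOBEEXTRACTED>BUS WEE</TOBEEXTRACTED></RECORD>
<RECORD ID="59"><RECNO>5512</RECNO></RECORD>
</DATA></TABLE>"""

report_lines = build_report(A_REPORT_SAMPLE, B_REPORT_SAMPLE)
print("\n".join(report_lines))
```

To get the txt file, the joined lines can be written with `pathlib.Path("report.txt").write_text(...)`, where `report.txt` is just a placeholder name.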