I have written the code below to read XML files (file_1.xml and file_2.xml), extract the string between the tags, and write it to a TXT file. The issue is that some strings contain a double quotation mark, and the program then treats that character as batch syntax rather than as part of the string.
Content of file_1.xml:
<AAA>C086002-T1111</AAA>
<AAA>C086002-T1222 </AAA>
<AAA>C086002-TR333 "</AAA>
<AAA>C086002-T5444 </AAA>
Content of file_2.xml:
<AAA>C086002-T5555 </AAA>
<AAA>C086002-T1666</AAA>
<AAA>C086002-T1777 "</AAA>
<AAA>C086002-T1888 "</AAA>
My code:
@echo off
setlocal enabledelayedexpansion
for /f "delims=;" %%f in ('dir /b D:\depart\*.xml') do (
    for /f "usebackq delims=;" %%z in ("D:\depart\%%f") do (
        (for /f "delims=<AAA></AAA> tokens=2" %%a in ('echo "%%z" ^| Findstr /r "<AAA>"') do (
            set code=%%a
            set code=!code:""=!
            set code=!code: =!
            echo !code!
        )) >> result.txt
    )
)
I get this in result.txt:
C086002-T1111
C086002-T1222
C086002-T5444
C086002-T5555
C086002-T1666
Three of the 8 lines are missing, and each missing line contains a double quotation mark.
How can I handle these characters so that they are treated as part of the strings?
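One direction that might avoid the problem, given here only as a sketch: the failing step appears to be `echo "%%z" ^| Findstr`, because the line is re-parsed by a child cmd and a lone quotation mark in the data unbalances the quoting. The variant below lets findstr read each file directly and trims the tags with string substitution on a variable. It assumes that the data never contains an exclamation mark (delayed expansion would drop it), that each line holds at most one AAA element, and that overwriting result.txt on each run is acceptable.

```batch
@echo off
setlocal enabledelayedexpansion
rem Sketch only, under the assumptions listed above.
(for %%f in ("D:\depart\*.xml") do (
    rem findstr reads the file itself, so no line goes through echo and a pipe.
    for /f "delims=" %%z in ('findstr /c:"<AAA>" "%%f"') do (
        set "code=%%z"
        rem Drop everything up to and including the opening tag.
        set "code=!code:*<AAA>=!"
        rem Drop the closing tag.
        set "code=!code:</AAA>=!"
        rem Drop spaces as in the original script. Quotation marks are kept.
        set "code=!code: =!"
        echo(!code!
    )
)) > result.txt
```

If the quotation marks should be stripped instead of kept, an extra `set "code=!code:"=!"` before the `echo` should do that.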