I have an annotated text file in the following format:
<paragraph><weakness>Buffer</weakness> <weakness>Overflow</weakness>
in <location>client/mysql.cc</location> in <application>Oracle</application>
<application>MySQL</application> and <application>MariaDB</application>
<version>before</version> <version>5.2</version> <vulnerability>allows
</vulnerability> <vulnerability>remote</vulnerability>
<application>database</application> <application>servers</application>
...
...
I would like to write Java code that parses the above text file and outputs one token per line, followed by its annotation, in the following format:
Buffer weakness
Overflow weakness
in O <--- 'O' means the token has no annotation
Oracle application
MySQL application
...
...
I tried tokenizing the file, but then I would have to parse and format the tokens a second time, and I could lose some useful information in the process.
Any help would be appreciated.
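
One possible approach is a single regular-expression pass over the raw text, so that no separate tokenizing step is needed. The following is only a minimal sketch under some assumptions: annotated elements are not nested inside each other, wrapper tags such as <paragraph> carry no label of their own and can be skipped, and unannotated words are split on whitespace. The class name AnnotationConverter and the method convert are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnnotationConverter {
    // Alternative 1: a leaf element such as <weakness>Buffer</weakness> (no tags inside it).
    // Alternative 2: any other tag, e.g. the enclosing <paragraph>; it is skipped (assumption).
    // Alternative 3: a run of plain text that carries no annotation.
    private static final Pattern TOKEN = Pattern.compile(
            "<(\\w+)>([^<]*)</\\1>|<[^>]*>|[^<\\s]+");

    public static List<String> convert(String text) {
        List<String> lines = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            if (m.group(1) != null) {
                // Annotated element: label every word inside it with the tag name.
                for (String word : m.group(2).trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        lines.add(word + " " + m.group(1));
                    }
                }
            } else if (!m.group().startsWith("<")) {
                // Plain word outside any annotation: label it 'O'.
                lines.add(m.group() + " O");
            }
            // Anything else is a wrapper or stray tag and is ignored.
        }
        return lines;
    }

    public static void main(String[] args) {
        String sample = "<paragraph><weakness>Buffer</weakness> <weakness>Overflow</weakness>\n"
                + "in <location>client/mysql.cc</location> in "
                + "<application>Oracle</application></paragraph>";
        for (String line : convert(sample)) {
            System.out.println(line);
        }
    }
}
```

To run this on a real file, the whole file could be read into one string first, for example with new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8), and passed to convert. Because each match keeps the tag name and its text together, the annotation is not lost between tokenizing and formatting.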