How can I implement tokenization in Java for the following input?
The input is a text file, and I need to produce the output shown below.
Sample input:
<Person>Peter Cathery</Person> lives in
<Location>Melbourne</Location>,
<Location>Australia</Location>
. He is 50 years old and he spent
<Money>$100000</Money>
on his house and any kind of disputes on his house will be part of the jurisdiction of
<Location>Melbourne</Location>
police station. His date of birth:
<Date>11-12-1982</Date> and he works for
<Organization>IBM Corporation Ltd</Organization>.
Required output:
Token 1: Peter Cathery
Token 2: lives
Token 3: in
Token 4: Melbourne
Token 5: ,
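
One possible approach, sketched here under some assumptions, is a single regular expression that treats each tagged entity as one token (keeping only the text inside the tag) and splits everything else into words and individual punctuation characters. The class name `TagTokenizer` and the method `tokenize` are hypothetical names chosen for illustration. The sketch assumes Java 11 or later, tags that are not nested, and that the input file path is passed as the first command-line argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not part of any library.
final class TagTokenizer {
    // Alternative 1: a tagged entity such as <Person>...</Person>; group 2 is the inner text.
    // Alternative 2: a run of word characters (letters, digits, underscore).
    // Alternative 3: a single non-word, non-whitespace character (punctuation or symbol).
    private static final Pattern TOKEN = Pattern.compile(
            "<(\\w+)>(.*?)</\\1>|\\w+|[^\\w\\s]", Pattern.DOTALL);

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            // group(2) is non-null only when the tagged-entity alternative matched.
            String inner = m.group(2);
            tokens.add(inner != null ? inner.trim() : m.group());
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // Assumes the path to the input text file is given as the first argument.
        String text = Files.readString(Path.of(args[0]));
        List<String> tokens = tokenize(text);
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println("Token " + (i + 1) + ": " + tokens.get(i));
        }
    }
}
```

With this pattern, tagged values such as `$100000` or `11-12-1982` stay whole because they sit inside a tag, whereas the same text outside a tag would be split at the symbol characters. If nested tags or attributes can appear in the input, an XML parser would likely be a better fit than a regular expression.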