I want to replace rare words with _RARE_ in a JSON tree using Java.
My rareWords list contains:
late
populate
convicts
So for the JSON below:
["S", ["PP", ["ADP", "In"], ["NP", ["DET", "the"], ["NP", ["ADJ", "late"], ["NOUN", "1700<s"]]]], ["S", ["NP", ["ADJ", "British"], ["NOUN", "convicts"]], ["S", ["VP", ["VERB", "were"], ["VP", ["VERB", "used"], ["S+VP", ["PRT", "to"], ["VP", ["VERB", "populate"], ["WHNP", ["DET", "which"], ["NOUN", "colony"]]]]]], [".", "?"]]]]
I should get:
["S", ["PP", ["ADP", "In"], ["NP", ["DET", "the"], ["NP", ["ADJ", "_RARE_"], ["NOUN", "1700<s"]]]], ["S", ["NP", ["ADJ", "British"], ["NOUN", "_RARE_"]], ["S", ["VP", ["VERB", "were"], ["VP", ["VERB", "used"], ["S+VP", ["PRT", "to"], ["VP", ["VERB", "_RARE_"], ["WHNP", ["DET", "which"], ["NOUN", "colony"]]]]]], [".", "?"]]]]
Notice how
["ADJ","late"]
was replaced by
["ADJ","_RARE_"]
My code so far is below. I recursively traverse the tree and, as soon as a rare word is found, I create a new JSON array and try to replace the existing node of the tree with it. See the line marked "// this Doesn't work" in the code below; that is where I got stuck. The tree remains unchanged outside of this function.
    public static void traverseTreeAndReplaceWithRare(JsonArray tree){
        //System.out.println(tree.getAsJsonArray());
        for (int x = 0; x < tree.getAsJsonArray().size(); x++)
        {
            if(!tree.get(x).isJsonArray())
            {
                if(tree.size()==2)
                {
                    //beware it will get here twice for same word
                    String word= tree.get(1).toString();
                    word=word.replaceAll("\"", ""); // removing double quotes
                    if(rareWords.contains(word))
                    {
                        JsonParser parser = new JsonParser();
                        //This works perfectly
                        System.out.println("Orig:"+tree);
                        JsonElement jsonElement = parser.parse("["+tree.get(0)+","+"_RARE_"+"]");
                        JsonArray newRareArray = jsonElement.getAsJsonArray();
                        //This works perfectly
                        System.out.println("New:"+newRareArray);
                        tree=newRareArray; // this Doesn't work
                    }
                }
                continue;
            }
            traverseTreeAndReplaceWithRare(tree.get(x).getAsJsonArray());
        }
    }
Code for calling the above (I use Google's Gson):
    JsonParser parser = new JsonParser();
    JsonElement jsonElement = parser.parse(strJSON);
    JsonArray tree = jsonElement.getAsJsonArray();
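A note on why the assignment to tree probably has no effect outside the method: Java passes object references by value, so tree=newRareArray only rebinds the local parameter and the caller's array is never touched. Below is a minimal sketch of the alternative, which is to mutate the existing node in place. It uses plain java.util lists instead of Gson so that it runs on its own; the class name RareReplacer and the node helper are hypothetical and only there for illustration. With Gson, the analogous step would presumably be something like tree.set(1, new JsonPrimitive("_RARE_")), assuming a Gson version that provides JsonArray.set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RareReplacer {

    // Walks the tree and overwrites rare words in place.
    // A leaf is assumed to be a two-element list of the form [tag, word].
    @SuppressWarnings("unchecked")
    static void replaceRare(List<Object> node, Set<String> rareWords) {
        if (node.size() == 2 && node.get(1) instanceof String) {
            if (rareWords.contains((String) node.get(1))) {
                // Mutate the existing list instead of reassigning the parameter,
                // so the change is visible to the caller.
                node.set(1, "_RARE_");
            }
            return;
        }
        for (Object child : node) {
            if (child instanceof List) {
                replaceRare((List<Object>) child, rareWords);
            }
        }
    }

    // Hypothetical helper: builds a mutable list from its arguments.
    static List<Object> node(Object... items) {
        return new ArrayList<>(Arrays.asList(items));
    }

    public static void main(String[] args) {
        Set<String> rare = new HashSet<>(Arrays.asList("late", "populate", "convicts"));
        List<Object> tree = node("NP", node("ADJ", "late"), node("NOUN", "colony"));
        replaceRare(tree, rare);
        System.out.println(tree); // prints [NP, [ADJ, _RARE_], [NOUN, colony]]
    }
}
```

The leaf check looks at the node as a whole rather than at each element, so each word is visited once instead of twice.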