-1

I have stored some JSON objects in MongoDB collections. Using the MongoDB Java driver (the Mongo jar), I ran a full-text search, and I am extracting one of my DB fields with the code below:

String tags2 = dbo.getString("Tags");

Result:

[["pdf","java","c++"]["perl","pdf","c"]["java","c++"]]

My requirement is to split all the words and remove duplicates. I require the following output:

pdf
java
c++
c
perl

Can you please suggest a way to get this?

3 Answers

1

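Since your JSON is invalid, you can parse the output with a regular expression pattern to extract the values. These can then be added to a Set to remove duplicates, for example: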

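import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Capture the text between each pair of double quotes
final Pattern p = Pattern.compile("\"(.*?)\"");
final Matcher m = p.matcher(tags2);

final Set<String> unique = new HashSet<String>();
while (m.find()) {
    unique.add(m.group(1)); // group(1) is the value without the surrounding quotes
}

// unique now contains "perl", "java", "c", "c++" and "pdf" (in no guaranteed order)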

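If ordering matters to you, you may want to use a different Set implementation, such as LinkedHashSet (insertion order) or TreeSet (sorted order).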
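To make the point about Set implementations concrete, here is a minimal sketch that uses only the JDK. The class name `TagExtractor` and the method `extractUniqueTags` are made up for illustration, and the input is hard-coded to the string shown in the question. It uses a `LinkedHashSet` so the tags come out in first-seen order; swapping in a `TreeSet` would sort them instead.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of any library.
class TagExtractor {
    // Matches a double-quoted token and captures the text between the quotes.
    private static final Pattern QUOTED = Pattern.compile("\"(.*?)\"");

    // Returns the distinct quoted values in the order they first appear.
    static Set<String> extractUniqueTags(String raw) {
        Set<String> unique = new LinkedHashSet<>();
        Matcher m = QUOTED.matcher(raw);
        while (m.find()) {
            unique.add(m.group(1)); // group(1) excludes the surrounding quotes
        }
        return unique;
    }

    public static void main(String[] args) {
        // Same shape as the value in the question (no commas between the inner arrays).
        String tags2 = "[[\"pdf\",\"java\",\"c++\"][\"perl\",\"pdf\",\"c\"][\"java\",\"c++\"]]";
        for (String tag : extractUniqueTags(tags2)) {
            System.out.println(tag); // one tag per line
        }
    }
}
```

With the input from the question this prints pdf, java, c++, perl and c, one per line. Because it only looks for quoted tokens, it does not care whether the surrounding brackets form valid JSON.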

Alternatively, if your JSON is valid, you can simply do the following:

final String[][] result = new Gson().fromJson(tags2, String[][].class);

Then add the contents of result to a Set.
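That last step could look roughly like the sketch below. It is JDK-only: the `String[][]` is hard-coded to stand in for what `fromJson` would return once the JSON is valid, and the names `TagFlattener` and `flatten` are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper class for illustration; not part of Gson or the JDK.
class TagFlattener {
    // Copies every element of every inner array into one Set, dropping duplicates.
    static Set<String> flatten(String[][] nested) {
        Set<String> unique = new LinkedHashSet<>();
        for (String[] row : nested) {
            Collections.addAll(unique, row);
        }
        return unique;
    }

    public static void main(String[] args) {
        // Stand-in for the parsed result; assumed to match the data in the question.
        String[][] result = {
            {"pdf", "java", "c++"},
            {"perl", "pdf", "c"},
            {"java", "c++"}
        };
        for (String tag : flatten(result)) {
            System.out.println(tag);
        }
    }
}
```

A `LinkedHashSet` is used here only to keep the first-seen order; a plain `HashSet` removes duplicates just as well if order does not matter.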

answered 2013-07-12T12:22:27.357
0

Various libraries, such as GSON, will help you do this.

Example with integers:

int[] ints2 = new Gson().fromJson("[1,2,3,4,5]", int[].class);

Your example can be solved as follows:

String tags2 = "[[\"pdf\",\"java\",\"c++\"],[\"perl\",\"pdf\",\"c\"],[\"java\",\"c++\"]]"; 
// added commas between the arrays to make sure the JSON is valid
// your code: String tags2=dbo.getString("Tags");
Set<String> elems = new HashSet<String>();
JsonElement rootJSonElement  = new JsonParser().parse(tags2);
for (JsonElement jsonElement : rootJSonElement.getAsJsonArray()) {
    for (JsonElement innerJsonElement : jsonElement.getAsJsonArray()) {
        elems.add(innerJsonElement.getAsString());
    } 
}
System.out.println(elems);
answered 2013-07-12T11:32:51.267
0

Use GSON like this:

JsonParser parser = new JsonParser();
String tags2 = dbo.getString("Tags");
JsonElement elem = parser.parse(tags2);
JsonArray finalResult = elem.getAsJsonArray();
for (int i = 0; i < finalResult.size(); i++)
{
    // Get the individual array, then get its fields as Strings and store them anywhere
    JsonArray inner = finalResult.get(i).getAsJsonArray();
}
answered 2013-07-12T11:36:22.673