java - Sorting and comparing XML from Java

Question

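I have been given an XML file and a schema file. My goal is to output all of the data in the XML without duplicates, and to sort that list by date of birth. At the moment I print out all of the data (with duplicates) and I don't know what to do next. I have tried different things, but without success.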

score 1 · Accepted Answer

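HashSet relies on the Node.equals() method to determine equality, and you are adding different nodes, albeit ones with the same underlying text. From the documentation:

Adds the specified element e to this set if this set contains no element e2 such that (e==null ? e2==null : e.equals(e2))

I would extract the underlying text (a String) from each Node; a HashSet<String> will then determine uniqueness correctly.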
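As a rough illustration of that suggestion, the sketch below collects the text content of each matching element into a HashSet<String>. The class name UniqueNodeText, the helper uniqueText, and the sample XML are made up for this example; the real document has a different layout, so the tag name would need adjusting.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class UniqueNodeText {

    // Hypothetical helper: returns the distinct text values of all <tagName> elements.
    static Set<String> uniqueText(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            Set<String> seen = new HashSet<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // String.equals() compares content, so repeated text is added only once.
                seen.add(nodes.item(i).getTextContent().trim());
            }
            return seen;
        } catch (Exception ex) {
            throw new IllegalStateException("Failed to parse XML", ex);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input, not the asker's file.
        String xml = "<people><name>Ann</name><name>Bob</name><name>Ann</name></people>";
        System.out.println(uniqueText(xml, "name").size()); // prints 2
    }
}
```

A plain HashSet does not preserve order, so this only addresses the duplicates; sorting still has to happen separately.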

score 0 · Accepted Answer

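It is better to create a Java Bean (POJO) that holds the details of a single node. Override equals() and hashCode() for it. Store all of the node data in a List of beans, then use a LinkedHashSet to remove the duplicates. Implement Comparable, or use a Comparator with Collections.sort(), to sort them.

Alternatively, extend or encapsulate Node in another class and override equals() and hashCode() there. Store all of the Nodes in a List of instances of the new class, then use a LinkedHashSet to remove the duplicates. Implement Comparable, or use a Comparator with Collections.sort(), to sort them.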
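The bean approach might look roughly like the sketch below. The Person class, its field names, and the sample values are assumptions made for illustration, and java.time.LocalDate is used for the date of birth on the assumption that the values are in yyyy-MM-dd form.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;

class PersonDedup {

    // Hypothetical bean holding the fields read from one node.
    static final class Person {
        final String firstName;
        final String surname;
        final LocalDate dateOfBirth;

        Person(String firstName, String surname, String dateOfBirth) {
            this.firstName = firstName;
            this.surname = surname;
            this.dateOfBirth = LocalDate.parse(dateOfBirth); // expects yyyy-MM-dd
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Person)) return false;
            Person other = (Person) obj;
            return firstName.equals(other.firstName)
                    && surname.equals(other.surname)
                    && dateOfBirth.equals(other.dateOfBirth);
        }

        @Override
        public int hashCode() {
            return Objects.hash(firstName, surname, dateOfBirth);
        }

        @Override
        public String toString() {
            return firstName + " " + surname + " (" + dateOfBirth + ")";
        }
    }

    // Drops duplicates with a LinkedHashSet, then sorts the rest by date of birth.
    static List<Person> uniqueSortedByDob(List<Person> people) {
        List<Person> result = new ArrayList<>(new LinkedHashSet<>(people));
        Collections.sort(result, Comparator.comparing((Person p) -> p.dateOfBirth));
        return result;
    }

    public static void main(String[] args) {
        List<Person> people = new ArrayList<>();
        people.add(new Person("Mari", "Tamm", "1990-05-01"));
        people.add(new Person("Jaan", "Tamm", "1960-02-11"));
        people.add(new Person("Mari", "Tamm", "1990-05-01")); // duplicate entry
        System.out.println(uniqueSortedByDob(people));
        // prints [Jaan Tamm (1960-02-11), Mari Tamm (1990-05-01)]
    }
}
```

The LinkedHashSet only works here because equals() and hashCode() are both overridden; without them every Person would count as distinct.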

score 0 · Accepted Answer

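EDIT

After reading the post again I realised that I need to remove the duplicates too, so:

You can use a TreeSet to impose uniqueness and sort by DOB - I assume that people with the same first name, surname and date of birth are the same person.

First, I would wrap your Node in a class that implements Comparable and that also reads all of the properties you have. The wrapper needs to implement Comparable because TreeSet uses compareTo both to decide whether elements are different (a.compareTo(b) != 0) and to order them.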

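// Imports for the top of the enclosing file:
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Declared as a static nested class inside your main class.
public static final class NodeWrapper implements Comparable<NodeWrapper> {

    // SimpleDateFormat is not thread-safe; a shared instance is only safe in single-threaded code.
    private static final SimpleDateFormat DOB_FORMAT = new SimpleDateFormat("yyyy-MM-dd");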
    private final Element element;
    private final Date dob;
    private final String firstName;
    private final String surName;
    private final String sex;

    public NodeWrapper(final Node node) {
        this.element = (Element) node;
        try {
            this.dob = DOB_FORMAT.parse(initDateOfBirth());
        } catch (ParseException ex) {
            throw new RuntimeException("Failed to parse dob", ex);
        }
        this.firstName = initFirstName();
        this.surName = initSurnameName();
        this.sex = initSex();
    }

    private String initFirstName() {
        return getNodeValue("firstname");
    }

    private String initSurnameName() {
        return getNodeValue("surname");
    }

    private String initDateOfBirth() {
        return getNodeValue("dateofbirth");
    }

    private String initSex() {
        return getNodeValue("sex");
    }

    private String getNodeValue(final String name) {
        return element.getElementsByTagName(name).item(0).getTextContent();
    }

    public Node getNode() {
        return element;
    }

    Date getDob() {
        return dob;
    }

    public String getFirstName() {
        return firstName;
    }

    public String getSurName() {
        return surName;
    }

    public String getDateOfBirth() {
        return DOB_FORMAT.format(dob);
    }

    public String getSex() {
        return sex;
    }

    @Override
    public int compareTo(NodeWrapper o) {
        int c;
        c = getDob().compareTo(o.getDob());
        if (c != 0) {
            return c;
        }
        c = getSurName().compareTo(o.getSurName());
        if (c != 0) {
            return c;
        }
        return getFirstName().compareTo(o.getFirstName());
    }

    @Override
    public int hashCode() {
        int hash = 5;
        hash = 47 * hash + (this.dob != null ? this.dob.hashCode() : 0);
        hash = 47 * hash + (this.firstName != null ? this.firstName.hashCode() : 0);
        hash = 47 * hash + (this.surName != null ? this.surName.hashCode() : 0);
        return hash;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == null) {
            return false;
        }
        if (getClass() != obj.getClass()) {
            return false;
        }
        final NodeWrapper other = (NodeWrapper) obj;
        if (this.dob != other.dob && (this.dob == null || !this.dob.equals(other.dob))) {
            return false;
        }
        if ((this.firstName == null) ? (other.firstName != null) : !this.firstName.equals(other.firstName)) {
            return false;
        }
        if ((this.surName == null) ? (other.surName != null) : !this.surName.equals(other.surName)) {
            return false;
        }
        return true;
    }

    @Override
    public String toString() {
        return "FirstName: " + getFirstName() + ". Surname: " + getSurName() + ". DOB: " + getDateOfBirth() + ". Sex: " + getSex() + ".";
    }
}

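So if the date of birth, surname and first name are all equal, we assume it is the same person and return 0. When compareTo is used this way it is good practice to keep it consistent with equals, so that a.compareTo(b) == 0 implies a.equals(b); I have added the required equals and hashCode methods for that reason.

Now you can use a TreeSet in your code, which sorts automatically and guarantees uniqueness: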
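To see why this consistency matters, here is a small standalone sketch using java.math.BigDecimal, a JDK class whose compareTo is not consistent with equals. It is unrelated to the XML data and the class name CompareToVersusEquals is made up; it only shows that a TreeSet and a HashSet can disagree about what counts as a duplicate.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class CompareToVersusEquals {

    // HashSet uses equals()/hashCode(): 1.0 and 1.00 differ in scale, so both are kept.
    static int hashSetSize() {
        Set<BigDecimal> set = new HashSet<>();
        set.add(new BigDecimal("1.0"));
        set.add(new BigDecimal("1.00"));
        return set.size();
    }

    // TreeSet uses compareTo(): 1.0 and 1.00 compare as 0, so the second is rejected.
    static int treeSetSize() {
        Set<BigDecimal> set = new TreeSet<>();
        set.add(new BigDecimal("1.0"));
        set.add(new BigDecimal("1.00"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("HashSet: " + hashSetSize()); // prints "HashSet: 2"
        System.out.println("TreeSet: " + treeSetSize()); // prints "TreeSet: 1"
    }
}
```

Keeping compareTo, equals and hashCode in agreement in the wrapper avoids this kind of surprise if the objects are later moved between sorted and hash-based collections.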

final Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File("file.xml"));

final Set<NodeWrapper> inimesteList = new TreeSet<NodeWrapper>();

final NodeList isa = doc.getElementsByTagName("isa");
for (int i = 0; i < isa.getLength(); i++) {
    inimesteList.add(new NodeWrapper(isa.item(i)));
}
final NodeList ema = doc.getElementsByTagName("ema");
for (int i = 0; i < ema.getLength(); i++) {
    inimesteList.add(new NodeWrapper(ema.item(i)));
}
final NodeList isik = doc.getElementsByTagName("isik");
for (int i = 0; i < isik.getLength(); i++) {
    inimesteList.add(new NodeWrapper(isik.item(i)));
}
System.out.println();
System.out.println("Total: " + inimesteList.size());

for (final NodeWrapper nw : inimesteList) {
    System.out.println(nw);
}

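I also added a toString method and used it to print the nodes - this makes the code cleaner.

The Document approach, while it seems simpler than JAXB, is riddled with this sort of tedium. As you already have a schema, I would strongly recommend moving to xjc and JAXB unmarshalling - it will make this sort of thing far easier.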
