
I am trying to read an input file. Each value from the input file is inserted into a TreeMap as follows:

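  1. If the word does not exist in the TreeMap: insert the word and associate it with an ArrayList(docId, Count).
  2. If the word already exists in the TreeMap: check whether the current DocID matches an entry in the ArrayList and, if so, increment the count.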


For the ArrayList, I created another class as follows:

public class CountPerDocument
{
    private final String documentId;
    private final int count;
    
    CountPerDocument(String documentId, int count)
    {

        this.documentId = documentId;
        this.count = count;
    }

    public String getDocumentId()
    {
        return this.documentId;
    }

    public int getCount()
    {
        return this.count;
    }

}

After that, I try to print the TreeMap to a text file as <DocID - Count>. I am not sure what I am doing wrong here, but the output I get is as follows:

The Stem is todai:[CountPerDocument@5caf905d, CountPerDocument@27716f4, CountPerDocument@8efb846, CountPerDocument@2a84aee7, CountPerDocument@a09ee92, CountPerDocument@30f39991]

Could someone point out what I am doing wrong and, if my approach is incorrect, what I should do instead?

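import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StemTreeMap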
{
    private static final String r1 = "\\$DOC";
    private static final String r2 = "\\$TITLE";
    private static final String r3 = "\\$TEXT";
    private static Pattern p1,p2,p3;
    private static Matcher m1,m2,m3;

    public static void main(String[] args)
    {
        BufferedReader rd,rd1;
        String docid = null;
        String id;
        int tf = 0;
        //CountPerDocument cp = new CountPerDocument(docid, count);
        List<CountPerDocument> ls = new ArrayList<>();
        Map<String,List<CountPerDocument>> mp = new TreeMap<>();
        
        try
        {
            rd = new BufferedReader(new FileReader(args[0]));
            rd1= new BufferedReader(new FileReader(args[0]));
            int docCount = 0;
            String line = rd.readLine();
            p1 = Pattern.compile(r1);
            p2 = Pattern.compile(r2);
            p3 = Pattern.compile(r3);
            while(line != null)
            {
                m1 = p1.matcher(line);
                m2 = p2.matcher(line);
                m3 = p3.matcher(line);
                if(m1.find())
                {
                    docid = line.substring(5, line.length());
                    docCount++;
                    //System.out.println("The Document ID is :");
                    //System.out.println(docid);
                    line = rd.readLine();
                }
                if(m2.find()||m3.find())
                {
                    line = rd.readLine();
                    
                }
                else
                {
                    if(!(mp.containsKey(line))) // if the stem is not on the TreeMap
                    {
                        //System.out.println("The stem is not present in the tree");
                        tf = 1;
                        ls.add(new CountPerDocument(docid,tf));
                        mp.put(line, ls);   
                        line = rd.readLine();
                    }
                    else
                    {
                        if(ls.indexOf(docid) > 0) //if its last entry matches the current document number
                        {
                            //System.out.println("The Stem is present for the same docid so incrementing docid");
                            tf = tf+1;
                            ls.add(new CountPerDocument(docid,tf));
                            line = rd.readLine();
                        }
                        else
                        {
                            //System.out.println("Stem is present but not the same docid so inserting new docid");
                            tf = 1;
                            ls.add(new CountPerDocument(docid,tf)); //set did to the current document number and tf to 1
                            line = rd.readLine();
                        }
                    }
                    
                    
                }
                
                
            }
            rd.close();
            System.out.println("The Number of Documents in the file is:"+ docCount);
            
            //Write to an output file
            String l = rd1.readLine();
            File f = new File("dictionary.txt");
            if (f.createNewFile())
            {
                System.out.println("File created: " + f.getName());
            }
            else 
            {
                System.out.println("File already exists.");
                Path path = Paths.get("dictionary.txt");
                Files.deleteIfExists(path);
                System.out.println("Deleted Existing File:: Creating New File");
                f.createNewFile();    
            }
            FileWriter fw = new FileWriter("dictionary.txt");
            fw.write("The Total Number of Stems: " + mp.size() +"\n");
            
            fw.close();
            System.out.println("The Stem is todai:" + mp.get("todai"));
            
        }catch(IOException e)
        {
            e.printStackTrace();
        }
        
        
        
    }
        

}

2 Answers


You have not defined a String toString() method in the class CountPerDocument. As a result, when you print a CountPerDocument object, Java falls back to the default representation, CountPerDocument@hashcode.

To control how a CountPerDocument object is represented when printed, add the following method to your class:

@Override
public String toString() {
     return "<" + this.getDocumentId() + ", " + this.getCount() + ">";
}
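
To see the effect in isolation, here is a minimal sketch. It assumes a trimmed copy of the question's CountPerDocument (getters omitted) and a hypothetical ToStringDemo class with made-up document IDs; neither the demo class nor the sample data comes from the original code. Concatenating a List into a String calls toString() on every element, so once the override is in place the @hashcode entries should be replaced by readable pairs.

```java
import java.util.ArrayList;
import java.util.List;

// Trimmed copy of the question's class, with toString() overridden.
class CountPerDocument {
    private final String documentId;
    private final int count;

    CountPerDocument(String documentId, int count) {
        this.documentId = documentId;
        this.count = count;
    }

    @Override
    public String toString() {
        return "<" + documentId + ", " + count + ">";
    }
}

// Hypothetical demo class, not part of the original code.
class ToStringDemo {
    static String render() {
        List<CountPerDocument> postings = new ArrayList<>();
        postings.add(new CountPerDocument("D1", 2)); // sample data, assumed
        postings.add(new CountPerDocument("D7", 1)); // sample data, assumed
        // String concatenation calls List.toString(), which in turn
        // calls toString() on each element of the list.
        return "The Stem is todai:" + postings;
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

With the sample data above, this prints The Stem is todai:[<D1, 2>, <D7, 1>].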
answered 2020-12-22T09:59:30.563

Try overriding the toString method in CountPerDocument. Something like this:

public class CountPerDocument
{
    private final String documentId;
    private final int count;
    
    CountPerDocument(String documentId, int count)
    {

        this.documentId = documentId;
        this.count = count;
    }

    public String getDocumentId()
    {
        return this.documentId;
    }

    public int getCount()
    {
        return this.count;
    }

    @Override
    public String toString() {
        return documentId + "-" + count;
    }
}
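
As for the approach itself, two things in the question's code look likely to give wrong counts even after toString is fixed. The single list ls is shared by every stem, so all stems end up pointing at the same entries. And ls.indexOf(docid) searches a List<CountPerDocument> for a String, so it always returns -1. One possible alternative, offered only as a sketch, swaps the list of CountPerDocument for a nested map with one inner map per stem. The StemIndex class, its method names, and the sample data are hypothetical and not taken from the original code.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: stem -> (document ID -> occurrences in that document).
class StemIndex {
    private final Map<String, Map<String, Integer>> index = new TreeMap<>();

    // Record one occurrence of stem in the document docId.
    void add(String stem, String docId) {
        index.computeIfAbsent(stem, k -> new LinkedHashMap<>()) // own map per stem
             .merge(docId, 1, Integer::sum);                    // insert 1 or increment
    }

    // Per-document counts for a stem; empty if the stem was never added.
    Map<String, Integer> countsFor(String stem) {
        return index.getOrDefault(stem, Collections.emptyMap());
    }

    int size() {
        return index.size();
    }

    public static void main(String[] args) {
        StemIndex idx = new StemIndex();
        idx.add("todai", "D1"); // sample data, assumed
        idx.add("todai", "D1");
        idx.add("todai", "D2");
        idx.add("car", "D2");
        System.out.println("The Stem is todai:" + idx.countsFor("todai"));
    }
}
```

With the sample data above, this prints The Stem is todai:{D1=2, D2=1}. Because every stem owns its inner map, counts for one stem cannot leak into another, and merge covers both the "new document" and "same document again" cases in one call.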
answered 2020-12-22T09:59:22.107