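I am trying to read an input file. Each value from the input file is inserted into a TreeMap:
- If the word is not present: insert the word into the TreeMap and associate it with an ArrayList of (docId, count) entries.
- If the word is already present in the TreeMap: check whether the current docId matches an entry in the ArrayList, and if so, increment the count.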
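The two rules above can be sketched compactly if the ArrayList is swapped for a nested map (word -> docId -> count). This is a different data structure from the one used in this post, and `PostingsSketch` and `record` are hypothetical names chosen only for illustration:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the original post.
class PostingsSketch {
    // index maps word -> (docId -> count); both levels are kept sorted by key.
    static void record(Map<String, Map<String, Integer>> index, String word, String docId) {
        index.computeIfAbsent(word, w -> new TreeMap<>()) // rule 1: create the entry if the word is new
             .merge(docId, 1, Integer::sum);              // rule 2: start at 1 or add 1 for this docId
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> index = new TreeMap<>();
        record(index, "todai", "D1");
        record(index, "todai", "D1");
        record(index, "todai", "D2");
        System.out.println(index); // prints {todai={D1=2, D2=1}}
    }
}
```

With this shape there is no need to search a list for the current docId, because `Map.merge` handles both the "first occurrence" and the "increment" case.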
For the ArrayList, I created another class, shown below:
public class CountPerDocument
{
    private final String documentId;
    private final int count;

    CountPerDocument(String documentId, int count)
    {
        this.documentId = documentId;
        this.count = count;
    }

    public String getDocumentId()
    {
        return this.documentId;
    }

    public int getCount()
    {
        return this.count;
    }
}
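Note that, as written, this class inherits `Object.toString()`, which renders an instance as the class name followed by `@` and a hexadecimal hash code. The `count` field is also `final`, so an existing entry cannot be incremented. A minimal sketch of a variant that addresses both points, assuming a mutable count is acceptable (`DocCount` and `increment` are hypothetical names, not from the original code):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of CountPerDocument; the name DocCount is illustrative.
class DocCount {
    private final String documentId;
    private int count; // not final, so it can be incremented in place

    DocCount(String documentId, int count) {
        this.documentId = documentId;
        this.count = count;
    }

    String getDocumentId() { return documentId; }

    int getCount() { return count; }

    void increment() { count++; }

    // Used by println and string concatenation, including when the object sits inside a List.
    @Override
    public String toString() {
        return documentId + " - " + count;
    }

    public static void main(String[] args) {
        List<DocCount> postings = new ArrayList<>();
        DocCount d1 = new DocCount("D1", 1);
        d1.increment();
        postings.add(d1);
        System.out.println(postings); // prints [D1 - 2]
    }
}
```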
After that, I tried to print the TreeMap to a text file as <DocID - Count>. I'm not sure what I'm doing wrong here, but the output I get looks like this:
The Stem is todai:[CountPerDocument@5caf905d, CountPerDocument@27716f4, CountPerDocument@8efb846, CountPerDocument@2a84aee7, CountPerDocument@a09ee92, CountPerDocument@30f39991]
Could someone point out what I'm doing wrong? And if my approach is incorrect, what should I do instead?
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StemTreeMap
{
    private static final String r1 = "\\$DOC";
    private static final String r2 = "\\$TITLE";
    private static final String r3 = "\\$TEXT";
    private static Pattern p1,p2,p3;
    private static Matcher m1,m2,m3;

    public static void main(String[] args)
    {
        BufferedReader rd,rd1;
        String docid = null;
        String id;
        int tf = 0;
        //CountPerDocument cp = new CountPerDocument(docid, count);
        List<CountPerDocument> ls = new ArrayList<>();
        Map<String,List<CountPerDocument>> mp = new TreeMap<>();
        try
        {
            rd = new BufferedReader(new FileReader(args[0]));
            rd1= new BufferedReader(new FileReader(args[0]));
            int docCount = 0;
            String line = rd.readLine();
            p1 = Pattern.compile(r1);
            p2 = Pattern.compile(r2);
            p3 = Pattern.compile(r3);
            while(line != null)
            {
                m1 = p1.matcher(line);
                m2 = p2.matcher(line);
                m3 = p3.matcher(line);
                if(m1.find())
                {
                    docid = line.substring(5, line.length());
                    docCount++;
                    //System.out.println("The Document ID is :");
                    //System.out.println(docid);
                    line = rd.readLine();
                }
                if(m2.find()||m3.find())
                {
                    line = rd.readLine();
                }
                else
                {
                    if(!(mp.containsKey(line))) // if the stem is not on the TreeMap
                    {
                        //System.out.println("The stem is not present in the tree");
                        tf = 1;
                        ls.add(new CountPerDocument(docid,tf));
                        mp.put(line, ls);
                        line = rd.readLine();
                    }
                    else
                    {
                        if(ls.indexOf(docid) > 0) //if its last entry matches the current document number
                        {
                            //System.out.println("The Stem is present for the same docid so incrementing docid");
                            tf = tf+1;
                            ls.add(new CountPerDocument(docid,tf));
                            line = rd.readLine();
                        }
                        else
                        {
                            //System.out.println("Stem is present but not the same docid so inserting new docid");
                            tf = 1;
                            ls.add(new CountPerDocument(docid,tf)); //set did to the current document number and tf to 1
                            line = rd.readLine();
                        }
                    }
                }
            }
            rd.close();
            System.out.println("The Number of Documents in the file is:"+ docCount);
            //Write to an output file
            String l = rd1.readLine();
            File f = new File("dictionary.txt");
            if (f.createNewFile())
            {
                System.out.println("File created: " + f.getName());
            }
            else
            {
                System.out.println("File already exists.");
                Path path = Paths.get("dictionary.txt");
                Files.deleteIfExists(path);
                System.out.println("Deleted Existing File:: Creating New File");
                f.createNewFile();
            }
            FileWriter fw = new FileWriter("dictionary.txt");
            fw.write("The Total Number of Stems: " + mp.size() +"\n");
            fw.close();
            System.out.println("The Stem is todai:" + mp.get("todai"));
        }catch(IOException e)
        {
            e.printStackTrace();
        }
    }
}
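A few things in the loop above look likely to cause trouble. The single list `ls` is shared by every stem, so all keys end up pointing at the same postings. `ls.indexOf(docid)` compares a `String` against `CountPerDocument` objects and therefore returns -1. The counter `tf` is global rather than per stem. A rough sketch of one way to restructure it follows, keeping the "check the last entry" idea but giving each stem its own list. It assumes that marker lines start with `$DOC`, `$TITLE`, or `$TEXT`, that every other non-empty line is a single stem, and that each document appears as one contiguous run. `StemIndexSketch`, `Posting`, and `buildIndex` are hypothetical names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical rewrite of the indexing loop; all names here are illustrative.
class StemIndexSketch {
    // One (docId, count) pair with a mutable count and a readable toString.
    static final class Posting {
        final String docId;
        int count = 1;

        Posting(String docId) { this.docId = docId; }

        @Override
        public String toString() { return "<" + docId + " - " + count + ">"; }
    }

    static Map<String, List<Posting>> buildIndex(Iterable<String> lines) {
        Map<String, List<Posting>> index = new TreeMap<>();
        String docId = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("$DOC")) {
                // Assumption: the id is whatever follows the "$DOC" marker.
                docId = line.substring("$DOC".length()).trim();
            } else if (line.startsWith("$TITLE") || line.startsWith("$TEXT")
                    || line.isEmpty() || docId == null) {
                continue; // marker lines, blanks, and stems before any $DOC are skipped
            } else {
                // Each stem gets its own list instead of one shared list.
                List<Posting> postings = index.computeIfAbsent(line, k -> new ArrayList<>());
                Posting last = postings.isEmpty() ? null : postings.get(postings.size() - 1);
                if (last != null && last.docId.equals(docId)) {
                    last.count++;                     // same document: increment in place
                } else {
                    postings.add(new Posting(docId)); // new document for this stem
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "$DOC D1", "$TITLE", "todai", "$TEXT", "todai", "new",
                "$DOC D2", "$TEXT", "todai");
        System.out.println(buildIndex(sample));
        // prints {new=[<D1 - 1>], todai=[<D1 - 2>, <D2 - 1>]}
    }
}
```

For a real file, the lines could be supplied with `Files.readAllLines(Paths.get(args[0]))`, and each map entry written to `dictionary.txt` inside a try-with-resources block so the writer is always closed.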