I'd maintain the counts in a HashMap. I'd also avoid using readAll(), so that you don't have to iterate over the data twice.
Just declare the map:
Map<String, Integer> countMap = new HashMap<String, Integer>();
Then keep a count for each value you encounter in the 3rd column:
String[] row;
while ((row = reader.readNext()) != null) {
    String value = row[2]; // value in 3rd column
    // default count to 0 if not in map
    Integer count = countMap.get(value) != null ? countMap.get(value) : 0;
    // increment count in map
    countMap.put(value, count + 1);
}
System.out.println("UDP count: " + countMap.get("UDP"));
System.out.println("TCP count: " + countMap.get("TCP"));
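If you are on Java 8 or later, the get/put pair can probably be collapsed into a single Map.merge call. Below is a minimal, dependency-free sketch of the same counting loop. The names ColumnCounter and countColumn are made up for illustration, and it splits each line on commas instead of using a CSV reader, so it assumes no field contains a quoted comma.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class ColumnCounter {

    // Counts how often each value occurs in the given zero-based column.
    // Assumption: no quoted commas, so a plain split(",") is good enough here.
    static Map<String, Integer> countColumn(Reader source, int column, boolean skipHeader) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                boolean isHeader = first && skipHeader;
                first = false;
                if (isHeader) {
                    continue; // do not count the header row
                }
                String[] row = line.split(",", -1);
                if (row.length <= column) {
                    continue; // skip rows that are too short
                }
                // insert 1 for a new value, otherwise add 1 to the existing count
                counts.merge(row[column], 1, Integer::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        String csv = "id,time,protocol\n1,01:23,UDP\n2,02:34,TCP\n3,03:45,TCP\n";
        Map<String, Integer> counts = countColumn(new StringReader(csv), 2, true);
        // getOrDefault avoids printing "null" for a value that never occurred
        System.out.println("UDP count: " + counts.getOrDefault("UDP", 0));
        System.out.println("TCP count: " + counts.getOrDefault("TCP", 0));
    }
}
```

For real CSV input (quoted fields, embedded newlines), keep the reader.readNext() loop and only borrow the merge and getOrDefault calls.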
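As an alternative, you could use the highly flexible and configurable Super CSV. The above solution is fine for trivial scenarios (such as keeping a count for one column), but it can easily become unreadable as you keep adding more functionality. Super CSV has a powerful cell processor API that automates conversions and constraints, which makes this a lot simpler.
For example, you could write a custom cell processor that maintains a count for each unique column value it encounters: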
package example;

import java.util.Map;

import org.supercsv.cellprocessor.CellProcessorAdaptor;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.util.CsvContext;

public class Counter extends CellProcessorAdaptor {

    private final Map<Object, Integer> countMap;

    public Counter(final Map<Object, Integer> countMap) {
        super();
        if (countMap == null) {
            throw new IllegalArgumentException("countMap should not be null");
        }
        this.countMap = countMap;
    }

    public Counter(final Map<Object, Integer> countMap, final CellProcessor next) {
        super(next);
        if (countMap == null) {
            throw new IllegalArgumentException("countMap should not be null");
        }
        this.countMap = countMap;
    }

    @Override
    public Object execute(Object value, CsvContext context) {
        validateInputNotNull(value, context);
        // get count from map (default to 0 if doesn't exist)
        Integer count = countMap.get(value) != null ? countMap.get(value) : 0;
        countMap.put(value, count + 1);
        return next.execute(value, context);
    }
}
Then use the processor on the third column:
package example;

import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.supercsv.cellprocessor.ParseDate;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvListReader;
import org.supercsv.io.ICsvListReader;
import org.supercsv.prefs.CsvPreference;

public class Counting {

    private static final String CSV = "id,time,protocol\n" + "1,01:23,UDP\n"
            + "2,02:34,TCP\n" + "3,03:45,TCP\n" + "4,04:56,UDP\n"
            + "5,05:01,TCP";

    public static void main(String[] args) throws IOException {
        final Map<Object, Integer> countMap = new HashMap<Object, Integer>();
        final CellProcessor[] processors = new CellProcessor[] {
            new NotNull(), // id
            new ParseDate("HH:mm"), // time (24-hour clock; "hh" would be 12-hour)
            new NotNull(new Counter(countMap)) // protocol
        };

        ICsvListReader listReader = null;
        try {
            listReader = new CsvListReader(new StringReader(CSV),
                    CsvPreference.STANDARD_PREFERENCE);
            listReader.getHeader(true); // skip the header row
            List<Object> row;
            while ((row = listReader.read(processors)) != null) {
                System.out.println(row);
            }
        } finally {
            // guard against a NullPointerException if the reader was never created
            if (listReader != null) {
                listReader.close();
            }
        }

        System.out.println("Protocol count = " + countMap);
    }
}
Output:
[1, Thu Jan 01 01:23:00 EST 1970, UDP]
[2, Thu Jan 01 02:34:00 EST 1970, TCP]
[3, Thu Jan 01 03:45:00 EST 1970, TCP]
[4, Thu Jan 01 04:56:00 EST 1970, UDP]
[5, Thu Jan 01 05:01:00 EST 1970, TCP]
Protocol count = {UDP=2, TCP=3}
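Note that a HashMap does not guarantee any iteration order, so the order of the entries in the last line may differ. If you want a stable report, one option is to sort the entries before printing. The sketch below does that with the standard library only; CountReport and sortedByCount are hypothetical names, not part of Super CSV.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CountReport {

    // Returns the entries ordered by count (highest first); ties are ordered by key text.
    static List<Map.Entry<Object, Integer>> sortedByCount(Map<Object, Integer> countMap) {
        List<Map.Entry<Object, Integer>> entries = new ArrayList<>(countMap.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            if (byCount != 0) {
                return byCount;
            }
            return String.valueOf(a.getKey()).compareTo(String.valueOf(b.getKey()));
        });
        return entries;
    }

    public static void main(String[] args) {
        // same counts as in the output above
        Map<Object, Integer> countMap = new HashMap<>();
        countMap.put("UDP", 2);
        countMap.put("TCP", 3);
        for (Map.Entry<Object, Integer> entry : sortedByCount(countMap)) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

With the counts from the example this prints TCP first (3), then UDP (2). If insertion order is all you need, passing a LinkedHashMap to Counter instead of a HashMap should be enough.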