
Things done so far:


Installed Hadoop by following this guide:

http://www.cloudera.com/content/cloudera/en/documentation/cdh4/v4-2-0/CDH4-Installation-Guide/cdh4ig_topic_4_4.html


Installed hping3 to generate flood requests with the following command:

sudo hping3 -c 10000 -d 120 -S -w 64 -p 8000 --flood --rand-source 192.168.1.12

Installed Snort to log the above requests with the following command:

sudo snort -ved -h 192.168.1.0/24 -l .

This generates the log file snort.log.1427021231

which I can read using

sudo snort -r snort.log.1427021231

It gives output of the form:

=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
03/22-16:17:14.259633 192.168.1.12:8000 -> 117.247.194.105:46639
TCP TTL:64 TOS:0x0 ID:0 IpLen:20 DgmLen:44 DF
A Seq: 0x6EEE4A6B Ack:0x6DF6015B Win:0x6DF6015B Win: : 24
TCP Options (1) => MSS: 1460
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+


I used

hdfs dfs -put <localsrc> ... <dst>

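to copy this log file to HDFS.

Now, the things I need help with:

How do I count the total number of source IP addresses, destination IP addresses, ports, protocols, and timestamps in the log file?

(Do I have to write my own MapReduce program, or is there a library for this?)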


I also found

https://github.com/ssallys/p3

but could not get it to run. I looked through the contents of the JAR file but still could not run it.

ratan@lenovo:~/Desktop$ hadoop jar ./p3lite.jar p3.pcap.examples.PacketCount

Exception in thread "main" java.lang.ClassNotFoundException: nflow.runner.Runner
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.main(RunJar.java:201)

Thanks.


1 Answer


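After a quick search, it looks like you will probably need a custom MapReduce job.

The algorithm would look something like the following pseudocode: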

Parse the file line by line (or parse every n lines if logs are more than one line long).

in the mapper, use regex to figure out if something is a source IP, destination IP etc.

output these with key value structure of <Type, count> 
    type is the type of text that was matched (ex. source IP)
    count is the number of times it was matched in the record

have reducer sum all of the values from the mappers, and get global totals for each type of information you want

write to file in desired format.
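
The pseudocode above could be sketched in Python roughly as follows. This is a minimal in-process illustration, not a Hadoop job: it assumes the input is the text that `snort -r` prints (the form shown in the question), not the raw log file, and the names `HEADER`, `PROTOCOL`, `map_line`, `reduce_counts` and `count_fields` are made up for this example. It also keys on (type, value) so that each distinct IP, port, protocol and timestamp gets its own total; keying on the type alone would give the per-type totals described above.

```python
import re
from collections import Counter

# Record header, e.g.
# "03/22-16:17:14.259633 192.168.1.12:8000 -> 117.247.194.105:46639".
# Ports are optional because some records (e.g. ICMP) carry none.
HEADER = re.compile(
    r"(?P<timestamp>\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<src_ip>\d{1,3}(?:\.\d{1,3}){3})(?::(?P<src_port>\d+))?\s+->\s+"
    r"(?P<dst_ip>\d{1,3}(?:\.\d{1,3}){3})(?::(?P<dst_port>\d+))?"
)
# The protocol name is the word directly before "TTL:".
PROTOCOL = re.compile(r"\b(?P<protocol>[A-Z]+)\s+TTL:")


def map_line(line):
    """Mapper: yield ((field_type, value), 1) for every field found in one line."""
    for pattern in (HEADER, PROTOCOL):
        match = pattern.search(line)
        if match:
            for field_type, value in match.groupdict().items():
                if value is not None:
                    yield (field_type, value), 1


def reduce_counts(pairs):
    """Reducer: sum the counts emitted for each key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def count_fields(lines):
    """Run the map and reduce steps in-process over an iterable of lines."""
    return reduce_counts(pair for line in lines for pair in map_line(line))


if __name__ == "__main__":
    sample = [
        "03/22-16:17:14.259633 192.168.1.12:8000 -> 117.247.194.105:46639",
        "TCP TTL:64 TOS:0x0 ID:0 IpLen:20 DgmLen:44 DF",
    ]
    for (field_type, value), total in sorted(count_fields(sample).items()):
        print(f"{field_type}\t{value}\t{total}")
```

Because each line is matched on its own, it does not matter whether the header and the protocol sit on one line or two. On a cluster, the same two functions could be wrapped as Hadoop Streaming mapper and reducer scripts that read stdin and write tab-separated lines; that wiring is not shown here.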
Answered 2015-03-31T17:57:24.150