I am new to R. I need to read data that contains a timestamp, IP addresses, and a few other fields, like this:

18:00:04.940864 129.63.50.235.53 > 129.63.71.70.1111:  udp 107
18:00:04.957456 129.63.80.240.161 > 129.63.152.10.39518:  udp 151
18:00:04.958432 129.63.152.10.39518 > 129.63.80.240.161:  udp 136 (DF)
18:00:04.963312 217.79.96.182.53 > 129.63.1.1.1564:  udp 48 (DF)
18:00:05.000976 129.63.50.235.1028 > 218.232.110.133.53:  udp 34
18:00:05.207888 129.63.50.235.1028 > 203.50.0.24.53:  udp 32

I started with

read.table(file='sample.txt',head=F,'%H:%M:%S',sep='')

but I am stuck at this point, because there are several kinds of separators: whitespace, '>' and ':', and the last field may or may not include (DF).

Can anyone give me an idea of how to parse this kind of data? Thanks a lot.

1 Answer

Here is a brute-force approach.

tt <- read.table(header=FALSE, fill=TRUE, stringsAsFactors=FALSE,
text="18:00:04.940864 129.63.50.235.53 > 129.63.71.70.1111:  udp 107
18:00:04.957456 129.63.80.240.161 > 129.63.152.10.39518:  udp 151
18:00:04.958432 129.63.152.10.39518 > 129.63.80.240.161:  udp 136 (DF)
18:00:04.963312 217.79.96.182.53 > 129.63.1.1.1564:  udp 48 (DF)
18:00:05.000976 129.63.50.235.1028 > 218.232.110.133.53:  udp 34
18:00:05.207888 129.63.50.235.1028 > 203.50.0.24.53:  udp 32")

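# Columns 5 onward hold the protocol, the length and the optional "(DF)"; paste them into one string per row
last <- apply(tt[-(1:4)], 1, paste, collapse=' ')
tt[,5] <- last
# Drop the trailing ':' from the destination address
tt[,4] <- sub(':', '', tt[,4])
# Keep time, source, destination and the combined last field (column 3 is just '>')
tt <- tt[c(1,2,4,5)]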

> tt
##               V1                  V2                  V4           V5
## 1 18:00:04.940864    129.63.50.235.53   129.63.71.70.1111     udp 107 
## 2 18:00:04.957456   129.63.80.240.161 129.63.152.10.39518     udp 151 
## 3 18:00:04.958432 129.63.152.10.39518   129.63.80.240.161 udp 136 (DF)
## 4 18:00:04.963312    217.79.96.182.53     129.63.1.1.1564 udp  48 (DF)
## 5 18:00:05.000976  129.63.50.235.1028  218.232.110.133.53     udp  34 
## 6 18:00:05.207888  129.63.50.235.1028      203.50.0.24.53     udp  32 
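
If separate columns for IP address, port, and the (DF) flag are wanted, a regular expression is one possible alternative to the brute-force split. The following is only a sketch under a few assumptions: every line follows the layout shown in the question, the port is the last dot-separated number of each address, and the column names (`src_ip`, `src_port`, `secs`, and so on) are made up here for illustration. In practice `lines` would come from something like `readLines("sample.txt")`.

```r
# Two sample lines copied from the question.
lines <- c(
  "18:00:04.940864 129.63.50.235.53 > 129.63.71.70.1111:  udp 107",
  "18:00:04.958432 129.63.152.10.39518 > 129.63.80.240.161:  udp 136 (DF)"
)

# One capture group per field (layout assumed from the sample data).
pattern <- paste0(
  "^(\\S+)\\s+",                    # time
  "([0-9.]+)\\.([0-9]+)\\s+>\\s+",  # source IP and port
  "([0-9.]+)\\.([0-9]+):\\s+",      # destination IP and port
  "(\\S+)\\s+([0-9]+)"              # protocol and length
)

# regexec() returns the full match plus the 7 groups for each matching line.
m  <- regmatches(lines, regexec(pattern, lines))
ok <- sapply(m, length) == 8        # skip lines that do not match
parts <- do.call(rbind, m[ok])

parsed <- data.frame(
  time     = parts[, 2],
  src_ip   = parts[, 3],
  src_port = as.integer(parts[, 4]),
  dst_ip   = parts[, 5],
  dst_port = as.integer(parts[, 6]),
  proto    = parts[, 7],
  length   = as.integer(parts[, 8]),
  df       = grepl("(DF)", lines[ok], fixed = TRUE),  # TRUE when the flag is present
  stringsAsFactors = FALSE
)

# Optional: time of day as seconds since midnight, for sorting or differences.
parsed$secs <- sapply(strsplit(parsed$time, ":", fixed = TRUE),
                      function(p) sum(as.numeric(p) * c(3600, 60, 1)))
```

Lines that do not fit the pattern are silently dropped by the `ok` filter, so it may be worth checking `sum(!ok)` on real data.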
answered 2013-01-05T14:34:43.187