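Unfortunately, gnuplot is not well suited to data-processing tasks like this one. You might come up with a solution, but it would be very messy and extremely hard to use. Fortunately, gnuplot can read from pipes from other programs, so the simplest solution is to write a small script that processes the input data and writes it to standard output. I would choose Python: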
from __future__ import print_function  # lets the script run on Python 2 and 3
import sys
import time
from collections import defaultdict
from datetime import datetime

def datetime_2_epoch(dt):
    # Local-time datetime -> integer seconds since the epoch
    return int(time.mktime(dt.timetuple()))

def epoch_2_datetime(epoch):
    # Inverse of datetime_2_epoch (also local time)
    return datetime.fromtimestamp(epoch)

data = defaultdict(list)
with open(sys.argv[1]) as fin:
    for line in fin:  # Parse file 1 line at a time
        try:
            # Last whitespace-separated field is the value, the rest is the timestamp
            timestr, datastr = line.rsplit(None, 1)
            dt = datetime.strptime(timestr, "%Y%m%d %H:%M:%S")
            val = float(datastr)
        except ValueError:  # couldn't read this line. must be a comment or something.
            continue
        bin = datetime_2_epoch(dt) // 300  # 300 = 60*5 -- 5 minute bin size
        data[bin].append(val)

for bin, lst in sorted(data.items()):
    cum_sum = sum(lst)
    avg = cum_sum / len(lst)
    print(epoch_2_datetime(bin * 300), avg, cum_sum)
Run on your example data, this reformats your data file as:
2013-02-06 11:45:00 5029.5 30177.0
2013-02-06 11:55:00 5029.5 30177.0
This can be plotted with boxes in gnuplot:
set xdata time
set timefmt '%Y-%m-%d %H:%M:%S'
set yrange [0:*]
plot '<python test.py test.dat' u 1:3 w boxes title "5 minute average"
or
plot '<python test.py test.dat' u 1:4 w boxes title "5 minute sum"
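If you want to check the binning logic without going through a file and gnuplot, the same idea can be sketched as a standalone function. This is only a rough illustration, not a replacement for the script above: the names `floor_to_bin` and `bin_lines` and the sample rows are made up for the example, and it rounds timestamps down relative to midnight instead of going through the epoch, which assumes the bin size divides evenly into a day.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BIN_SECONDS = 300  # 5 minute bins, same size as in the script above


def floor_to_bin(dt, bin_seconds=BIN_SECONDS):
    """Round a naive datetime down to the start of its bin (counted from midnight)."""
    seconds_into_day = dt.hour * 3600 + dt.minute * 60 + dt.second
    offset = seconds_into_day % bin_seconds
    return dt.replace(microsecond=0) - timedelta(seconds=offset)


def bin_lines(lines, fmt="%Y%m%d %H:%M:%S"):
    """Return sorted (bin_start, average, total) tuples for the parseable lines."""
    bins = defaultdict(list)
    for line in lines:
        try:
            # Last field is the value, everything before it is the timestamp
            timestr, datastr = line.rsplit(None, 1)
            start = floor_to_bin(datetime.strptime(timestr, fmt))
            bins[start].append(float(datastr))
        except ValueError:
            continue  # comment, blank line, or malformed row
    return [(start, sum(vals) / len(vals), sum(vals))
            for start, vals in sorted(bins.items())]


# Hypothetical sample rows, invented for this example
sample = [
    "# made-up sample rows",
    "20130206 11:45:57 10",
    "20130206 11:48:10 20",
    "20130206 11:55:01 5",
]

for start, avg, total in bin_lines(sample):
    print(start, avg, total)
```

With these sample rows the first two values fall into the 11:45 bin and the third into the 11:55 bin, so the loop prints `2013-02-06 11:45:00 15.0 30.0` followed by `2013-02-06 11:55:00 5.0 5.0`, the same column layout the gnuplot commands above expect.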