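Python sounds like a good choice, since it has a nice threading API (although the implementation has its problems), plus matplotlib and pylab. I'm missing some of your specs, but maybe this is a good starting point for you: matplotlib: async plotting with threads.

I would go for a single thread that handles the bulk disk I/O reads and feeds a synchronized queue to a pool of threads for data processing. If you have fixed record lengths, things may get faster if you precompute the read offsets and pass only the offsets to the thread pool.

In the diskio thread I would mmap the data source files and read a predefined number of bytes plus one more read, so that the last bytes up to the end of the current input line get picked up too. numbytes should be chosen somewhere near your average input line length. Next comes feeding the pool via the queue, and then the data processing and plotting that takes place in the thread pool. I don't have a good picture here (of what exactly you are plotting), but I hope this helps.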
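The "predefined number of bytes plus one more read" idea could look roughly like the sketch below. The name `iter_line_chunks` and the default `numbytes` are made up for illustration, and it assumes newline-terminated records in a file that can be mapped read-only.

```python
import mmap
import os


def iter_line_chunks(path, numbytes=1024):
    """Yield byte chunks of roughly numbytes that end on a line boundary."""
    with open(path, "rb") as f:
        # mmap cannot map an empty file, so bail out early
        if os.fstat(f.fileno()).st_size == 0:
            return
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            while True:
                chunk = mm.read(numbytes)
                if not chunk:
                    break  # end of file
                if not chunk.endswith(b"\n"):
                    # one more read to reach the end of the current line
                    chunk += mm.readline()
                yield chunk
        finally:
            mm.close()
```

Each chunk can then be split with `chunk.splitlines()` in the worker, so the diskio thread does nothing but read.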
EDIT: there is also file.readlines([sizehint]) to grab several lines at once; it may not be that fast, though, since the docs say it uses readline() internally.
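For what it's worth, batching with readlines() and a size hint might be sketched like this. `iter_batches` is a hypothetical helper, and the hint is only approximate: reading stops once the lines collected so far reach about that many bytes.

```python
def iter_batches(fileobj, sizehint=1024):
    """Yield lists of lines, each list holding roughly sizehint bytes."""
    while True:
        lines = fileobj.readlines(sizehint)
        if not lines:
            break  # nothing left to read
        yield lines
```

It works on any regular file object opened with open(), and each yielded list can go straight onto the queue as one batch.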
EDIT: a quick code skeleton
import threading
from collections import deque
import sys
import mmap
class processor(threading.Thread):
    """
    processor gets a batch of data at a time from the diskio thread
    """
    def __init__(self, q):
        threading.Thread.__init__(self, name="plotter")
        self._queue = q

    def run(self):
        while True:
            # block until the diskio thread hands over a batch
            for data in self.feed(self._queue.get()):
                self.plot(data)
            # exceptions raised while sanitizing the data could be handled here, maybe

    def parseline(self, line):
        """ return a data struct ready for plotting """
        raise NotImplementedError

    def feed(self, databuf):
        # yield one datastruct at a time, ready to go for plotting
        for line in databuf:
            yield self.parseline(line)

    def plot(self, data):
        """integrate
        https://www.esclab.tw/wiki/index.php/Matplotlib#Asynchronous_plotting_with_threads
        maybe
        """
class sharedq(object):
    """i don't recall where i got this implementation from;
    you may write a better one"""
    def __init__(self, maxsize=8192):
        self.queue = deque()
        self.barrier = threading.RLock()
        # both conditions share one lock, so put() and get() exclude each other
        self.read_c = threading.Condition(self.barrier)
        self.write_c = threading.Condition(self.barrier)
        self.msz = maxsize

    def put(self, item):
        with self.barrier:
            # block while the queue is full
            while len(self.queue) >= self.msz:
                self.write_c.wait()
            self.queue.append(item)
            self.read_c.notify()

    def get(self):
        with self.barrier:
            # block while the queue is empty
            while not self.queue:
                self.read_c.wait()
            item = self.queue.popleft()
            self.write_c.notify()
            return item
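q = sharedq()
# size hint in bytes for each batch of lines; you may want a better one
numbytes = 1024
for i in range(8):
    p = processor(q)
    p.start()
for fn in sys.argv[1:]:
    with open(fn, "rb") as f:
        # note: mmap raises ValueError on an empty file
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        # mmap objects have readline() but no readlines(), so batch by hand
        batch, size = [], 0
        for line in iter(mm.readline, b""):
            batch.append(line)
            size += len(line)
            if size >= numbytes:
                q.put(batch)
                batch, size = [], 0
        if batch:
            q.put(batch)
        mm.close()
# some cleanup code may be desirable, e.g. a way to tell the workers to stop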
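As for the cleanup and the "better queue": one possible variant, not the only way to do it, swaps the hand-written queue for the standard library's queue.Queue and shuts the workers down with a sentinel. `SENTINEL`, `Processor`, `run_pipeline` and the one-float-per-line format are assumptions made up for this sketch, and plot() only collects values where real plotting would go.

```python
import queue
import threading

SENTINEL = None  # hypothetical marker that tells a worker to stop


class Processor(threading.Thread):
    def __init__(self, q, results):
        super().__init__(name="plotter")
        self._queue = q
        self._results = results

    def run(self):
        while True:
            batch = self._queue.get()
            if batch is SENTINEL:
                break  # no more work for this worker
            for line in batch:
                self.plot(self.parseline(line))

    def parseline(self, line):
        # assumption: every line holds a single float
        return float(line)

    def plot(self, data):
        # placeholder for real plotting; list.append is thread-safe in CPython
        self._results.append(data)


def run_pipeline(batches, workers=4, maxsize=8192):
    """Feed batches of lines to a pool of workers and wait for them to finish."""
    q = queue.Queue(maxsize=maxsize)
    results = []
    pool = [Processor(q, results) for _ in range(workers)]
    for p in pool:
        p.start()
    for batch in batches:
        q.put(batch)
    for _ in pool:
        q.put(SENTINEL)  # one sentinel per worker
    for p in pool:
        p.join()
    return results
```

The order of the results is not deterministic, since several workers pull batches concurrently; if the plot needs ordered data, tag each batch with its index before queueing it.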