
I am currently developing a program that fits SIP data. Unfortunately, the data sits in a csv table with the following structure:

       f;      Abs(Zm);     Std(Abs);      Phi(Zm);     Std(Phi);       Re(Zm);       Im(Zm);     Time [s]
1.0000000e-001;    7712.6262;       0.0247;    -0.003774;     0.000001;    7712.5713;     -29.1074;   3418623040
2.0000000e-001;    7712.4351;       0.0030;    -0.007543;     0.000001;    7712.2157;     -58.1732;   3418623056
5.0000000e-001;    7710.8455;       0.0094;    -0.018837;     0.000002;    7709.4775;    -145.2434;   3418623063
1.0000000e+000;    7705.3763;       0.0098;    -0.037637;     0.000000;    7699.9195;    -289.9395;   3418623067
2.0000000e+000;    7683.8120;       0.0241;    -0.075058;     0.000001;    7662.1778;    -576.1935;   3418623069
5.0000000e+000;    7539.7945;       0.0080;    -0.184724;     0.000002;    7411.5201;   -1384.8720;   3418623071
1.0000000e+001;    7088.6894;       0.0060;    -0.351521;     0.000001;    6655.2169;   -2440.8206;   3418623072


         f;     Abs(Z12);     Phi(Z12);     Abs(Z34);     Phi(Z34);     Abs(Z14);     Phi(Z14);     Time [s]
1.0000000e-001;       1.7821;     3.139014;       0.2545;    -3.141592;    7710.5896;    -0.003774;   3418623040
2.0000000e-001;       1.7850;     3.133381;       0.2572;    -3.126220;    7710.3930;    -0.007543;   3418623056
5.0000000e-001;       1.7755;     3.121223;       0.2514;    -3.133763;    7708.8186;    -0.018838;   3418623063
1.0000000e+000;       1.7683;     3.100815;       0.2503;     3.139466;    7703.3580;    -0.037638;   3418623067
2.0000000e+000;       1.8091;     3.058834;       0.2538;    -3.123705;    7681.7502;    -0.075060;   3418623069
5.0000000e+000;       1.5547;     2.943611;       0.2398;    -3.136317;    7538.0045;    -0.184727;   3418623071

I am using the numpy.loadtxt() routine to collect the data from the table, like this:

def load_datafile(filename):
    try:
        x_data, y_data = numpy.loadtxt(filename, unpack=True, usecols=(0, 1))
    except IOError:
        print('There was an error opening the file: {0}'.format(filename))
        x_data = []
        y_data = []
    return x_data, y_data

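I know that the loadtxt() command has no further identifier for selecting a specific block of the table. Is there a convenient workaround, though?

Otherwise, is there a simple script that could rearrange the csv input file into columns of a single block?

Thanks in advance! Regards, Gunnar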


2 Answers


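Assuming the blocks are always separated by a pair of empty lines, as in your input, this small script should write each block to its own file and return a list of file objects: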

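def parseMultiblockCSV(filename):
    # Write each block to "<filename>.<n>"; blocks are separated by two blank lines.
    block = 0
    newlines = 0
    with open(filename, "r") as original:
        current = open(filename + "." + str(block), "w")
        for line in original:
            # Lines read from a file keep their "\n", so test the stripped line.
            if line.strip() == "":
                newlines += 1
                continue
            if newlines >= 2:
                # Two or more blank lines were seen: start the next block file.
                current.close()
                block += 1
                current = open(filename + "." + str(block), "w")
            newlines = 0
            current.write(line)
        current.close()
    # Reopen every block file for reading and return the file objects.
    return [open(filename + "." + str(n)) for n in range(block + 1)]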

If you then need them all in the same table, I assume numpy has a function for loading multiple files into a single table. Otherwise:

def combineCSVFiles(files, output):
    if len(files) == 1:
        return files[0]
    start = files[0]
    rest = files[1:]
    out = open(output, "w")
    # Copy the first file in full, including its header line.
    for line in start:
        out.write(line)
    for infile in rest:
        first = False
        for line in infile:
            # Skip the header line of every file after the first.
            if not first:
                first = True
                continue
            out.write(line)
    out.close()
    return open(output, "r")

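This should return a file object holding the concatenated contents of the given file objects, ignoring the first (header) line of every file except the first.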
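
If the temporary files are not needed, the same splitting idea can probably be done in memory. The following is only a rough sketch, not part of the scripts above: the name `load_blocks` is made up for illustration, and it assumes that blocks are separated by one or more blank lines, that each block starts with exactly one header line, and that the fields are separated by `;`.

```python
import io

import numpy


def load_blocks(text, usecols=(0, 1)):
    """Split text into blank-line-separated blocks and load each one.

    Hypothetical helper: assumes one header line per block and ';' as
    the field delimiter.
    """
    blocks = []
    current = []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the block collected so far.
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)

    arrays = []
    for lines in blocks:
        # skiprows=1 drops the header line of the block.
        arrays.append(
            numpy.loadtxt(
                io.StringIO("\n".join(lines)),
                delimiter=";",
                skiprows=1,
                usecols=usecols,
                unpack=True,
            )
        )
    return arrays
```

With `usecols=(0, 1)` each entry can be unpacked as `x_data, y_data = load_blocks(open(filename).read())[0]`, which mirrors what `load_datafile` in the question returns for the first block.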

answered 2012-05-03T07:56:49.540

You can first split the input data into blocks and then use loadtxt or genfromtxt (I prefer the latter, because it has an option for reading the header).

from io import StringIO  # on Python 2 this was: from StringIO import StringIO

from numpy import genfromtxt

def read_by_block(filename):
    blocks = []
    with open(filename) as f:
        data = f.read()
    for blk in data.split('\n\n'):  # we assume that blocks are separated by two newlines
        if not blk.strip():
            continue  # skip empty chunks left over from extra blank lines
        blocks.append(genfromtxt(StringIO(blk), delimiter=';', names=True))
    return blocks

data = read_by_block('data.txt')

print(data[0].dtype.names)  # print fields for first block
print(data[0]['StdPhi'])  # print column 'Std(Phi)' in 1st block
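
To get back to plain x/y columns like in the question, the fields of a block can be looked up by name. Below is a small sketch along those lines; `block_columns` is a made-up helper name, and it assumes a block with a `;`-separated header and at least two data rows. Note that genfromtxt strips characters such as parentheses from the header names, so `Abs(Zm)` is reached as `AbsZm`.

```python
import io

import numpy


def block_columns(block_text):
    """Return a dict mapping each cleaned-up field name to its column.

    Hypothetical helper: block_text is one block, header line included.
    """
    block = numpy.genfromtxt(io.StringIO(block_text), delimiter=";", names=True)
    return {name: block[name] for name in block.dtype.names}


# Two rows copied from the first block in the question, first three columns only.
example = (
    "f; Abs(Zm); Std(Abs)\n"
    "1.0000000e-001; 7712.6262; 0.0247\n"
    "2.0000000e-001; 7712.4351; 0.0030\n"
)
columns = block_columns(example)
x_data = columns["f"]      # frequency column
y_data = columns["AbsZm"]  # 'Abs(Zm)' after name clean-up
```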
answered 2012-05-03T12:21:07.027