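I want to remove the lines that contain strings, as well as the empty lines, from a text file. It looks like this. As you can see, the header repeats itself throughout the file. The number of data lines varies from block to block. I need to import it into numpy as an array. At first the file used commas as the decimal separator; that, at least, I was able to change.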


from types import StringType

z = open('D:\Desktop\cycle 1-20 20-50 kPa (dot).dat', 'r')
for line in z.readlines():
    for x in z:
        if type(z.readline(x)) is StringType:
            print line



bla bla

cyclical stuff                      Time:   81.095947   Sec 2012-08-02 17:05:42
stored :    1   cycle           stores for :    62  seg-cycle
Points :    4223
Servo_Hyd count Temps   Servo_Air pressure  Servo_Hyd load Servo_Hyd LVDT1  Servo_Hyd LVDT2 Servo_Hyd LVDT3
name1    name1    name1 name1   name1   name1   name1
1   60.102783   0.020013755 89.109558   0.3552089   0.4015148   -0.33822596
1   60.107666   0.020006953 89.025749   0.35519764  0.4015218   -0.33821729
1   60.112549   0.02000189  88.886292   0.3551946   0.4015184   -0.33822691
1   60.117432   0.020007374 89.559196   0.35519707  0.40151948  -0.33823174
1   60.122314   0.019991774 89.741402   0.35519552  0.40151322  -0.33822927
1   60.127197   0.020003742 89.748924   0.35520011  0.40150556  -0.33822462

bla bla

cyclical stuff                      Time:   81.095947   Sec 2012-08-02 17:05:42
stored :    1   cycle           stores for :    62  seg-cycle
Points :    4223
Servo_Hyd count Temps   Servo_Air pressure  Servo_Hyd load Servo_Hyd LVDT1  Servo_Hyd LVDT2 Servo_Hyd LVDT3
name1    name1    name1 name1   name1   name1   name1
1   60.102783   0.020013755 89.109558   0.3552089   0.4015148   -0.33822596
1   60.107666   0.020006953 89.025749   0.35519764  0.4015218   -0.33821729
1   60.112549   0.02000189  88.886292   0.3551946   0.4015184   -0.33822691
1   60.117432   0.020007374 89.559196   0.35519707  0.40151948  -0.33823174
1   60.122314   0.019991774 89.741402   0.35519552  0.40151322  -0.33822927
1   60.127197   0.020003742 89.748924   0.35520011  0.40150556  -0.33822462

2 Answers


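Python reads everything from a file as strings unless you explicitly convert it, so your type check is true for every line and your approach will not work.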


import re

f = open("datafile")
for line in f:
    # Skip lines containing anything other than digits, '-', '.', or whitespace
    if re.search(r"[^-0-9.\s]", line):
        continue
    # Skip empty lines
    if len(line.strip()) == 0:
        continue
    # Keep the rest
    print(line.rstrip())
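Since the goal is a numpy array, the same filter can feed the kept lines straight into one. The following is only a minimal sketch: the function name `load_numeric_rows` is made up for illustration, and it assumes every data line has the same number of whitespace-separated columns and uses plain decimal notation (a value such as `1e-5` would be rejected by the regex).

```python
import re
import numpy as np

# Matches any character that cannot appear in a plain numeric row
NON_NUMERIC = re.compile(r"[^-0-9.\s]")

def load_numeric_rows(path):
    """Return a 2-D float array built from the purely numeric lines of `path`.

    Hypothetical helper: assumes all numeric lines have the same column count.
    """
    rows = []
    with open(path) as f:
        for line in f:
            # Skip blank lines and any line that contains header text
            if not line.strip() or NON_NUMERIC.search(line):
                continue
            rows.append([float(tok) for tok in line.split()])
    return np.array(rows)
```

Calling `load_numeric_rows("datafile")` would then give one array holding the data rows of all blocks, stacked in file order.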

answered 2012-12-03T16:05:32.647


import numpy as np

# Skip the 7 header lines that precede the data
yourArray = np.loadtxt('yourfilename.txt', skiprows=7)

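Also, since your file contains several headers (a header normally appears only once, at the top of a file), you can split the file into multiple files. You can do that with Python, or with the UNIX command csplit. Here is how to do it, and what you get: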

oz123@:~/tmp> csplit -k data.txt   '/^bla/' '{*}'
oz123@:~/tmp> ls xx
xx00  xx01  xx02
oz123@:~/tmp> ls xx00
oz123@:~/tmp> cat xx00
oz123@:~/tmp> cat xx01
bla bla

cyclical stuff                      Time:   81.095947   Sec 2012-08-02 17:05:42
stored :    1   cycle           stores for :    62  seg-cycle
Points :    4223
Servo_Hyd count Temps   Servo_Air pressure  Servo_Hyd load Servo_Hyd LVDT1  Servo_Hyd LVDT2 Servo_Hyd LVDT3
name1    name1    name1 name1   name1   name1   name1
1   60.102783   0.020013755 89.109558   0.3552089   0.4015148   -0.33822596
1   60.107666   0.020006953 89.025749   0.35519764  0.4015218   -0.33821729
1   60.112549   0.02000189  88.886292   0.3551946   0.4015184   -0.33822691
1   60.117432   0.020007374 89.559196   0.35519707  0.40151948  -0.33823174
1   60.122314   0.019991774 89.741402   0.35519552  0.40151322  -0.33822927
1   60.127197   0.020003742 89.748924   0.35520011  0.40150556  -0.33822462

oz123@:~/tmp> cat xx02
bla bla

cyclical stuff                      Time:   81.095947   Sec 2012-08-02 17:05:42
stored :    1   cycle           stores for :    62  seg-cycle
Points :    4223
Servo_Hyd count Temps   Servo_Air pressure  Servo_Hyd load Servo_Hyd LVDT1  Servo_Hyd LVDT2 Servo_Hyd LVDT3
name1    name1    name1 name1   name1   name1   name1
1   60.102783   0.020013755 89.109558   0.3552089   0.4015148   -0.33822596
1   60.107666   0.020006953 89.025749   0.35519764  0.4015218   -0.33821729
1   60.112549   0.02000189  88.886292   0.3551946   0.4015184   -0.33822691
1   60.117432   0.020007374 89.559196   0.35519707  0.40151948  -0.33823174
1   60.122314   0.019991774 89.741402   0.35519552  0.40151322  -0.33822927
1   60.127197   0.020003742 89.748924   0.35520011  0.40150556  -0.33822462
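
The Python route mentioned above could look roughly like the sketch below. It is not a port of csplit: instead of writing `xx` files, it collects each run of consecutive numeric lines into its own array. The name `load_blocks` is hypothetical, and the sketch assumes that any blank or non-numeric line marks the end of a block.

```python
import re
import numpy as np

# Matches any character that cannot appear in a plain numeric row
NON_NUMERIC = re.compile(r"[^-0-9.\s]")

def load_blocks(path):
    """Return a list of 2-D float arrays, one per run of consecutive numeric lines.

    Hypothetical helper: assumes rows within a block share one column count.
    """
    blocks, current = [], []
    with open(path) as f:
        for line in f:
            if line.strip() and not NON_NUMERIC.search(line):
                current.append([float(tok) for tok in line.split()])
            elif current:
                # A blank or header line ends the block collected so far
                blocks.append(np.array(current))
                current = []
    # Flush the final block if the file ends on a data line
    if current:
        blocks.append(np.array(current))
    return blocks
```

Keeping the blocks separate preserves the fact that each one holds a different number of rows; `np.vstack(load_blocks("data.txt"))` would merge them if a single array is wanted.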
answered 2012-12-03T16:03:34.180