
I need to read columns of complex numbers in the following format:

# index; (real part, imaginary part); (real part, imaginary part) 

  1              (1.2, 0.16)                  (2.8, 1.1)
  2              (2.85, 6.9)                  (5.8, 2.2)

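NumPy seems great for reading columns of data with a single delimiter, but the parentheses seem to defeat any attempt at using numpy.loadtxt().

Is there a clever way to read the file with Python, or is it best to just read the file, strip all the parentheses, and then feed it to NumPy?

This will need to be done for thousands of files, so I would like an automated way, but maybe NumPy is not capable of this.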


3 Answers


Here's a more direct way than @Jeff's answer: tell loadtxt to load the data straight into a complex array, using a helper function parse_pair that maps (1.2,0.16) to 1.20+0.16j:

>>> import re
>>> import numpy as np

>>> pair = re.compile(r'\(([^,\)]+),([^,\)]+)\)')
>>> def parse_pair(s):
...    return complex(*map(float, pair.match(s).groups()))

>>> s = '''1 (1.2,0.16) (2.8,1.1)
2 (2.85,6.9) (5.8,2.2)'''
>>> from io import StringIO  # cStringIO exists only in Python 2
>>> f = StringIO(s)

>>> np.loadtxt(f, delimiter=' ', dtype=complex,  # np.complex was removed in NumPy 1.24
...            converters={1: parse_pair, 2: parse_pair})
array([[ 1.00+0.j  ,  1.20+0.16j,  2.80+1.1j ],
       [ 2.00+0.j  ,  2.85+6.9j ,  5.80+2.2j ]])

Or in pandas:

>>> import pandas as pd
>>> f.seek(0)
>>> pd.read_csv(f, delimiter=' ', index_col=0, names=['a', 'b'],
...             converters={1: parse_pair, 2: parse_pair})
             a           b
1  (1.2+0.16j)  (2.8+1.1j)
2  (2.85+6.9j)  (5.8+2.2j)
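
The loadtxt call above splits on single spaces, so it relies on the pairs being written without inner spaces. For the layout shown in the question (runs of spaces between columns and a space after each comma), a minimal sketch that skips loadtxt and pulls the pairs out with a regex might look like the following. The names load_complex_columns and PAIR are illustrative, not part of NumPy, and the sketch assumes the index column can be discarded.

```python
import re
import numpy as np

# Matches "(real, imag)" with optional whitespace around either number.
PAIR = re.compile(r'\(\s*([^,\s)]+)\s*,\s*([^,\s)]+)\s*\)')

def load_complex_columns(lines):
    """Return a 2-D complex array with one row per non-comment line (hypothetical helper)."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and the header comment
        rows.append([complex(float(real), float(imag))
                     for real, imag in PAIR.findall(line)])
    return np.array(rows, dtype=complex)

sample = """# index; (real part, imaginary part); (real part, imaginary part)
  1              (1.2, 0.16)                  (2.8, 1.1)
  2              (2.85, 6.9)                  (5.8, 2.2)
"""
data = load_complex_columns(sample.splitlines())
print(data)
```

Because the function accepts any iterable of lines, an open file object can be passed directly, which makes it straightforward to loop over thousands of files.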
Answered 2013-05-21T00:38:26.120

Since this issue is still not resolved in pandas, let me add another solution. After reading the file, you can modify your DataFrame with a one-liner:

import pandas as pd

df = pd.read_csv('data.csv')
df = df.apply(lambda col: col.apply(lambda val: complex(val.strip('()'))))
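
Note that complex() only parses strings in Python's own notation, such as 1.2+0.16j, so the one-liner appears to target files whose cells look like (1.2+0.16j) rather than the (1.2, 0.16) pairs from the question. A small self-contained check under that assumption, with made-up in-memory CSV text standing in for data.csv:

```python
import pandas as pd
from io import StringIO

# Hypothetical CSV content: every cell holds a complex number in Python notation.
csv_text = "a,b\n(1.2+0.16j),(2.8+1.1j)\n(2.85+6.9j),(5.8+2.2j)\n"

df = pd.read_csv(StringIO(csv_text))  # cells are read as plain strings
df = df.apply(lambda col: col.apply(lambda val: complex(val.strip('()'))))
print(df.dtypes)
```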
Answered 2015-10-30T15:23:52.277

If your file only has 5 columns like you've shown, you could feed it to pandas with a regex for conversion, replacing the parentheses with commas on every line. After that, you could combine them as suggested in this SO answer to get complex numbers.
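
As a rough illustration of that two-step idea (strip the punctuation, then combine the real and imaginary columns), here is a sketch that uses NumPy alone and replaces the parentheses and commas with spaces rather than commas. The sample text and variable names are assumptions made for the example.

```python
import re
import numpy as np
from io import StringIO

text = """  1              (1.2, 0.16)                  (2.8, 1.1)
  2              (2.85, 6.9)                  (5.8, 2.2)
"""

# Turn every parenthesis and comma into a space, leaving five plain numbers per line.
cleaned = re.sub(r"[(),]", " ", text)

# loadtxt splits on any run of whitespace by default; ndmin=2 keeps one-line files 2-D.
raw = np.loadtxt(StringIO(cleaned), ndmin=2)

idx = raw[:, 0]                           # index column
z = raw[:, 1::2] + 1j * raw[:, 2::2]      # columns (1, 3) are real parts, (2, 4) imaginary
print(z)
```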

Pandas makes it easier, because its read_csv method accepts a regex as the delimiter, which the numpy version does not. That lets you split on the parentheses directly and convert each field with a converter like this:

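import pandas as pd
from io import StringIO  # the StringIO module exists only in Python 2

f_str = "1 (2, 3) (5, 6)\n2 (3, 4) (4, 8)\n3 (0.2, 0.5) (0.6, 0.1)"
buf = StringIO(f_str)

def complex_converter(txt):
    # "2, 3" or "5, 6)" becomes "2+3j" / "5+6j"; "+-" covers a negative imaginary part
    txt = txt.strip("()").replace(", ", "+").replace("+-", "-") + "j"
    return complex(txt)

# A regex delimiter needs the Python parsing engine; header=None keeps the first row as data
df = pd.read_csv(buf, delimiter=r" \(|\) \(", engine="python", header=None,
                 converters={1: complex_converter, 2: complex_converter}, index_col=0)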

EDIT: Looks like @Dougal came up with this just before I posted this...really just depends on how you want to handle the complex number. I like being able to avoid the explicit use of the re module.

Answered 2013-05-21T00:20:51.717