python - How to replace certain symbols in dtype.names in Python?

Question

I have a data file in the following format:

column_1    column 2    column-3    column-4    column_5    column 6
1   2   3   1   2   3
4   3   2   3   2   4
1   4   3   1   4   3
5   6   4   5   6   4

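When I import this file, header names containing spaces are automatically changed to use underscores, which I then replace with spaces again. But how do I keep the hyphens? The code I am using is: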

import numpy as np
# Text mode ('r') so that the header line splits into str rather than bytes
with open('data.dat', 'r') as f:
    header = f.readline().split('\t')
    arr = np.genfromtxt(f, names = header, comments='#', delimiter='\t', dtype=None)
arr.dtype.names = [j.replace('_', ' ').replace('-', ' ') for j in arr.dtype.names]
print(arr.dtype.names)

Output

('column_1', 'column_2', 'column3', 'column4', 'column_5', 'column_6')

How can I get the hyphens back for columns 3 and 4 in Python?

score 0 · Accepted Answer

Hint: you can use a regular expression to pick out the columns. For the case above, the expression would look something like exp = r'column.\d'
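
As a rough illustration of the hint, the pattern can be run against the raw header line with the standard `re` module. This is only a sketch: the `header_line` string is a hypothetical copy of the first line of the sample file, not something read from disk.

```python
import re

# Hypothetical header line, copied from the sample file in the question
header_line = "column_1\tcolumn 2\tcolumn-3\tcolumn-4\tcolumn_5\tcolumn 6"

# 'column', then any single character (underscore, space or hyphen), then one digit
exp = r'column.\d'
names = re.findall(exp, header_line)
print(names)
```

Because `.` matches the separator character whatever it is, the hyphens in `column-3` and `column-4` survive in the extracted names.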

score 0 · Accepted Answer

Make sure the header names in your file are separated by \t:

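import numpy as np
# Text mode ('r') so that the header splits into str rather than bytes
with open('data.dat', 'r') as f:
    # Read only the header line, so the data rows are still left for genfromtxt
    header = f.readline().rstrip("\n").split("\t")
    arr = np.genfromtxt(f, names = header,comments='#', delimiter='\t', dtype=object)
# Put the hyphen back where the original header had one; otherwise turn '_' into ' '
arr.dtype.names = [j.replace('_', ' ') if j[:-1]+"-"+j[-1] not in header else j[:-1]+"-"+j[-1] for j in arr.dtype.names]
print(arr.dtype.names)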

>> ('column 1', 'column 2', 'column-3', 'column-4', 'column 5', 'column 6')
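
The check above depends on every name ending in a single digit. A slightly more general way to do the same thing is to build a lookup from each cleaned-up name back to its original header. The sketch below is plain Python; `sanitize` and `restore_names` are hypothetical helpers, and `sanitize` only approximates the renaming seen in the question (spaces become underscores, hyphens are dropped), not NumPy's full set of rules.

```python
def sanitize(name):
    # Assumption: mimics only the renaming observed in the question's output
    return name.replace(' ', '_').replace('-', '')

def restore_names(sanitized_names, header):
    # Map each cleaned-up name back to the header it came from
    lookup = {sanitize(h): h for h in header}
    # Names with no match are left unchanged
    return tuple(lookup.get(n, n) for n in sanitized_names)

header = ['column_1', 'column 2', 'column-3', 'column-4', 'column_5', 'column 6']
sanitized = ('column_1', 'column_2', 'column3', 'column4', 'column_5', 'column_6')
restored = restore_names(sanitized, header)
print(restored)
```

Unlike the list comprehension above, this returns the header names exactly as they appear in the file, so `column_1` keeps its underscore. The resulting tuple could then be assigned to `arr.dtype.names`. If two headers clean up to the same name, the lookup keeps only the last one, so this is only suitable when the cleaned-up names are unique.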
