python - How to find the type of each element in an array

Question

I'm trying to write code that converts CSV to ARFF. I split each line on "," and store each value in one cell of an array. For example, an instance such as:

Monday,176,49,203,27,77,38,Second

is converted to:

['Monday', '176', '49', '203', '27', '77', '38', 'Second']

The problem is that Python treats every cell as a string. These are the types Python reports for the example:

[<type 'str'>, <type 'str'>, <type 'str'>, <type 'str'>, <type 'str'>, <type 'str'>, <type 'str'>, <type 'str'>]

I'm looking for a way to distinguish nominal attributes from numeric ones.

score 3 · Accepted Answer

The best I can think of is something like this, using ast.literal_eval:

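import ast

def converter(x):
    try:
        # literal_eval only parses Python literals; it never executes code
        return ast.literal_eval(x)
    except (ValueError, SyntaxError):
        # Not a literal (e.g. 'Monday', or text containing spaces): keep the string
        return x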

This gives:

>>> seq = ['Monday', '176', '49', '203', '27', '77', '38', 'Second']
>>> newseq = [converter(x) for x in seq]
>>> newseq
['Monday', 176, 49, 203, 27, 77, 38, 'Second']
>>> map(type, newseq)
[<type 'str'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'str'>]

The advantage of using ast.literal_eval is that it handles more cases in a nice way:

>>> seq = ['Monday', '12.3', '(1, 2.3)', '[2,"fred"]']
>>> newseq = [converter(x) for x in seq]
>>> newseq
['Monday', 12.3, (1, 2.3), [2, 'fred']]
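To connect this back to the question, here is a minimal sketch of how the converted values might be used to label each CSV column as numeric or nominal. The function name `infer_attribute_types` and the second sample row are hypothetical, and the sketch assumes a column counts as numeric only if every value in it parses as an int or a float.

```python
import ast

def converter(x):
    try:
        return ast.literal_eval(x)
    except (ValueError, SyntaxError):
        return x

def infer_attribute_types(rows):
    """Return 'numeric' or 'nominal' for each column of already-split CSV rows."""
    types = []
    for column in zip(*rows):  # transpose rows into columns
        values = [converter(cell) for cell in column]
        # bool is a subclass of int, so exclude it explicitly
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        types.append("numeric" if is_numeric else "nominal")
    return types

rows = [
    ["Monday", "176", "49", "203", "27", "77", "38", "Second"],
    # Hypothetical second instance, for illustration only
    ["Tuesday", "180", "51.5", "199", "30", "80", "40", "First"],
]
column_types = infer_attribute_types(rows)
print(column_types)
```

Checking whole columns rather than single cells avoids declaring a column numeric just because one of its values happens to look like a number.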

score 2 · Accepted Answer

for i in lst:
    try:
        int(i)
        # whatever you want to do
    except ValueError:
        pass  # error handling
This will work, although this would be better:

for i in lst:
    if i[-1].isdigit():  # it is a number
        pass  # whatever
    else:
        pass  # whatever else

Taken from here.

See also: the str.isdigit() method
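A check on the last character is cheap but only approximate. The sketch below compares it with letting float() decide; both helper names are hypothetical and the sample values are made up for illustration.

```python
def looks_numeric_last_char(s):
    # The check from the loop above: only the last character is inspected
    return s[-1].isdigit()

def looks_numeric_float(s):
    # Alternative sketch: accept whatever float() can parse
    try:
        float(s)
        return True
    except ValueError:
        return False

samples = ["176", "12.3", "-5", "Room101", "5."]
for s in samples:
    print(s, looks_numeric_last_char(s), looks_numeric_float(s))
```

The two checks disagree on "Room101" (ends in a digit but is not a number) and on "5." (a valid float that does not end in a digit). Note that float() also accepts strings such as "nan" and "inf", which may or may not be what you want in an ARFF file.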

score 1 · Accepted Answer

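If performance matters here, I would try a three-step approach. By doing a simple check on the first character, it avoids needlessly converting a string to int or float only to have the conversion fail.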

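1. For each chunk, check whether the first character is a digit.
2. If it is, first try to parse it as an int; if that fails, parse it as a float.
3. If everything fails, you have a big problem :)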

Something like:

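def convert(chunk):
    # chunk[:1] instead of chunk[0] so an empty field does not raise IndexError
    if chunk[:1].isdigit():
        try:
            return int(chunk)
        except ValueError:
            return float(chunk)
    else:
        # It's a string (a non-numeric entity)
        return chunk

converted = [convert(chunk) for chunk in chunks]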

Of course you would need more special handling to support hex/octal literals in a text/csv file, but I assume that isn't a normal case for you?
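If hex or octal literals did need to be supported, one possible sketch relies on int(text, 0), which lets Python infer the base from a prefix such as 0x or 0o. The helper name `parse_chunk` is hypothetical, the sketch assumes Python 3 literal syntax, and it skips the first-character check for simplicity.

```python
def parse_chunk(chunk):
    """Try int (honouring base prefixes), then float, else return the string."""
    try:
        # Base 0 accepts decimal plus 0x (hex), 0o (octal) and 0b (binary) prefixes
        return int(chunk, 0)
    except ValueError:
        pass
    try:
        return float(chunk)
    except ValueError:
        # Neither an int nor a float: treat it as a nominal value
        return chunk

parsed = [parse_chunk(c) for c in ["176", "0x1F", "0o17", "12.3", "Monday"]]
print(parsed)  # [176, 31, 15, 12.3, 'Monday']
```

One caveat: with base 0, a decimal string with leading zeros such as "007" is rejected by int() and falls through to float(), so it comes back as 7.0 rather than 7.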

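Edit: Come to think of it, Volatility used a similar approach; the only difference is that isdigit is called on the whole string rather than just the first character. With long runs of digits, isdigit has to look at every character, which may take a bit more time, whereas my approach always checks only the first character, so it might be slightly faster.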
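Whether the difference is noticeable is easy to measure. The following is a rough timing sketch using timeit; the helper names and the 1000-digit test string are hypothetical, and the numbers will vary by machine and Python version, so no expected output is shown.

```python
import timeit

long_digits = "1234567890" * 100  # hypothetical long numeric field (1000 digits)

def check_first(s):
    # Inspect only the first character
    return s[0].isdigit()

def check_all(s):
    # Inspect every character of the string
    return s.isdigit()

t_first = timeit.timeit(lambda: check_first(long_digits), number=10_000)
t_all = timeit.timeit(lambda: check_all(long_digits), number=10_000)
print(f"first char: {t_first:.4f}s, whole string: {t_all:.4f}s")
```

For short fields like the ones in the question, any gap is likely to be small compared with the cost of the int() or float() conversion itself.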
