python - How to make a list of integers from a section of a file with Python?

Question

I have a file that looks like this:

@ junk
...
@ junk
    1.0  -100.102487081243
    1.1  -100.102497023421
    ...   ...
    3.0  -100.102473082342
&
@ junk
...

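I am only interested in the two columns of numbers given between the @ and & characters. These characters may appear anywhere else in the file, but never inside the number block.

I want to create two lists, one with the first column and one with the second column.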

List1 = [1.0, 1.1,..., 3.0]
List2 = [-100.102487081243, -100.102497023421,..., -100.102473082342]

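I have been using shell scripts to pre-process these files so that a simpler Python script can build the lists, but I am trying to migrate these processes to Python for a more consistent application. Any ideas? I have limited experience with Python and file handling.

Edit: I should mention that this number block appears in two places in the file. The two number blocks are identical.

Edit 2: A general function would be best for this, since I will put it into a custom library.

Current effort

I currently use a shell script to trim away everything except the number block and split it into two separate columns. From there it is trivial for me to use the following function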

def ReadLL(infile):
    List = open(infile).read().splitlines()
    intL = [int(i) for i in List]
    return intL

by calling it from my main script with

import sys
import eLIBc
infile = sys.argv[1]
sList = eLIBc.ReadLL(infile)

The problem is knowing how to extract the number block from the original file with Python rather than with shell scripts.

score 1 · Accepted Answer

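You want to loop over the file itself and set a flag when you find the first line without an @ character, after which you can start collecting numbers. Stop reading when you find the & character on a line.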

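def readll(infile):
    with open(infile) as data:
        floatlist1, floatlist2 = [], []
        reading = False

        for line in data:
            if not reading:
                # Skip the leading '@' lines; the block starts at the
                # first line that does not contain '@'.
                if '@' not in line:
                    reading = True
                else:
                    continue

            if '&' in line:
                # '&' ends the block; returning also closes the file.
                return floatlist1, floatlist2

            # list() is needed on Python 3, where map() returns an iterator.
            numbers = list(map(float, line.split()))
            floatlist1.append(numbers[0])
            floatlist2.append(numbers[1])

        # No '&' was found: return whatever was collected.
        return floatlist1, floatlist2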

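So the above:

- sets 'reading' to False, and only sets it to True once a line without '@' is found.
- when 'reading' is True:
  - returns the data read so far if the line contains &
  - otherwise assumes the line contains two float values separated by whitespace, which are added to their respective lists

By returning, the function ends and the file is closed automatically. Only the first block is read; the rest of the file is simply ignored.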
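Since the question asks for a general function and mentions that the block occurs more than once, the same flag idea can be extended to collect every block. The following is only a sketch under some assumptions: the name `read_blocks` and its `start`/`end` parameters are made up for illustration, every non-numeric line is assumed to contain one of the two marker characters, and every numeric line is assumed to hold at least two whitespace-separated values.

```python
def read_blocks(path, start="@", end="&"):
    """Return a list of (column1, column2) tuples, one per numeric block.

    Hypothetical helper: assumes every junk line contains `start` or `end`
    and every other non-blank line holds two whitespace-separated numbers.
    """
    blocks = []
    col1, col2 = [], []
    with open(path) as data:
        for line in data:
            stripped = line.strip()
            if not stripped:
                continue  # ignore blank lines
            if start in stripped or end in stripped:
                # A marker line closes the block collected so far, if any.
                if col1:
                    blocks.append((col1, col2))
                    col1, col2 = [], []
                continue
            first, second = stripped.split()[:2]
            col1.append(float(first))
            col2.append(float(second))
    # Keep a trailing block that was not terminated by a marker line.
    if col1:
        blocks.append((col1, col2))
    return blocks
```

If only the first block is wanted, `read_blocks(path)[0]` gives the same pair of lists as the function above, at the cost of reading the whole file.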

score 1 · Accepted Answer

Try this:

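with open("i.txt") as fp:
    lines = fp.readlines()
    data = False  # becomes True once the first data line has been seen
    List1 = []
    List2 = []
    for line in lines:
        if line[0] not in ['&', '@']:
            # Data line: split on whitespace and store both columns.
            line = line.split()
            List1.append(line[0])
            List2.append(line[1])
            data = True
        elif data:
            # First marker line after the data: the block is finished.
            break

print(List1)
print(List2)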

This should give you the first block of numbers.

Input:

@ junk
@ junk
1.0  -100.102487081243
1.1  -100.102497023421
3.0  -100.102473082342
&
@ junk
1.0  -100.102487081243
1.1  -100.102497023421

Output:

['1.0', '1.1', '3.0']
['-100.102487081243', '-100.102497023421', '-100.102473082342']
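Note that the values in the output above are strings, because `split()` does not convert anything. If numbers are needed, as in the question, one possible follow-up step is a conversion with `float`. The names `floats1` and `floats2` below are only illustrative.

```python
# The string lists as printed in the output above.
List1 = ['1.0', '1.1', '3.0']
List2 = ['-100.102487081243', '-100.102497023421', '-100.102473082342']

# Convert each column to floats; float() raises ValueError on bad input.
floats1 = [float(value) for value in List1]
floats2 = [float(value) for value in List2]

print(floats1)
print(floats2)
```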

Update

If you need both blocks, use this:

with open("i.txt") as fp:
    lines = fp.readlines()
    List1 = []
    List2 = []
    for line in lines:
        if line[0] not in ['&', '@']:
            # Data line from any block: store both columns.
            line = line.split()
            List1.append(line[0])
            List2.append(line[1])

print(List1)
print(List2)
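The version above appends the rows of both blocks to the same two lists. If the blocks should stay separate, for example to check that they really are identical, one option is `itertools.groupby`, which groups consecutive lines by whether they are data lines. This is only a sketch: `split_blocks` and `is_data` are hypothetical names, and, as in the code above, a line counts as data when its first non-blank character is neither '@' nor '&'.

```python
from itertools import groupby


def is_data(line):
    """True for non-blank lines that do not start with '@' or '&'."""
    stripped = line.strip()
    return bool(stripped) and stripped[0] not in "@&"


def split_blocks(lines):
    """Return one (column1, column2) pair of float lists per run of data lines."""
    blocks = []
    for keep, group in groupby(lines, key=is_data):
        if keep:
            rows = [row.split() for row in group]
            blocks.append(([float(r[0]) for r in rows],
                           [float(r[1]) for r in rows]))
    return blocks


# Example with in-memory lines shaped like the input shown earlier.
lines = [
    "@ junk", "@ junk",
    "1.0  -100.102487081243", "1.1  -100.102497023421",
    "&", "@ junk",
    "1.0  -100.102487081243", "1.1  -100.102497023421",
]
blocks = split_blocks(lines)
print(len(blocks))
print(blocks[0] == blocks[1])
```

With a real file, the result of `fp.readlines()` can be passed in place of the `lines` list.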
