python - Python conditional errors in code

Question

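I asked about this earlier, when I was trying to get started on this code. The command line needs to take 2 or 3 parameters:

-s: an optional parameter, or switch, indicating that the user wants the spliced gene sequence (introns removed). The user does not have to provide it (meaning he wants the entire gene sequence), but if he does provide it, it must be the first parameter

input file (with the genes)

output file (which the program will create to store the fasta file)

The file contains lines like this: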

NM_001003443 chr11 + 5925152 592608098 2 5925152,5925652, 5925404,5926898,

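Then I need to create several conditions to make sure everything entered is correct; otherwise the program exits:

The user specifies an input file name that does not end with .genes
The user specifies an output file name that does not end with .fa or .fasta
The user provides fewer than two or more than three parameters
The user's first parameter starts with a dash but is not '-s'
The input file violates any of the following:
- The first line should start with a '#' symbol
- Each line should have exactly ten columns (columns separated by one or more whitespace characters)
- Column 2 (counting from 0) should be a + or - symbol
- Column 8 should be a tab-separated list of integers
- Column 9 should be a tab-separated list of integers with exactly the same number of integers as column 8.

I have written the code for this, but there is a bug somewhere in it, and so far I have not been able to find it. Could someone look over my code and see whether there is an error somewhere? I would really appreciate it!!

In my actual code all the if statements are indented, but I had trouble pasting it in here...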

import sys

p = '(NM_\d+)\s+(chr\d+)([(\+)|(-)])\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+,\d+,)s+(\d+,\d+,)'
e = '([(\+)|(-)])'
def getGenes(spliced, infile, outfile):
spliced = False
if '-s' in sys.argv:
    spliced = True
    sys.argv.remove('s')
    infile, outfile = sys.argv[1:]
if '.genes' not in infile:
    print('Incorrect input file type')
    sys.exit(1)
if '.fa' or '.fasta' not in outfile:
    print('Incorrect output file type')
    sys.exit(1)
if len(sys.argv[0]) < 2 or len(sys.argv[0]) > 3:
    print('Command line parameters missing')
    sys.exit(1)
if sys.argv[1] != '-s':
    print('Invalid parameter, if spliced, must be -s')
    sys.exit(1)
fp = open(infile, 'r')
wp = open(outfile, 'r')
FirstLine = fp.readline().strip()
if not FirstLine.startswith('#'):
    print ('First line does not start with #')
    sys.exit(1)
n = 1
for line in fp.readlines():
    n += 1
    cols = line.strip().split('')
    if len(cols) != 10:
        print('Lenth not equal to 10')
        sys.exit(1)
    if cols[2] != '+' or '-':
        print('Column 2 is not a + or - symbol')
        sys.exit(1)
    if cols[8] != '\t\d+':
        print('Column 8 is not a tab-separated list of integers')
        sys.exit(1)
    if cols[9] != '\t\d+' and len(cols[9]) != len(cols[8]):
        print('Column 9 in not a tab-separated list of integers with the exact same number of integers in column 8')
        sys.exit(1)

score -1 · Accepted Answer

Remove this block:

if sys.argv[1] != '-s':
    print('Invalid parameter, if spliced, must be -s')
    sys.exit(1)

sys.argv[1] will never equal '-s' at that point, because if '-s' was present in argv, you removed it a few lines earlier:

if '-s' in sys.argv:
    spliced = True
    sys.argv.remove('s')
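
A quick illustration of what happens to the list, as a minimal sketch (the `argv` list below is a made-up command line standing in for `sys.argv`, not taken from the question). Note that `remove('s')` as quoted raises `ValueError`, because the list contains `'-s'`, not `'s'`. Presumably `remove('-s')` was intended, and once that runs, position 1 holds the input file name:

```python
# Hypothetical command line, standing in for sys.argv
argv = ['script.py', '-s', 'in.genes', 'out.fa']

try:
    argv.remove('s')        # no element equals 's', so this raises ValueError
except ValueError:
    removed = False
else:
    removed = True

argv.remove('-s')           # presumably what was intended
first_param = argv[1]       # now the input file name, never '-s'
```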

This line

if len(sys.argv[0]) < 2 or len(sys.argv[0]) > 3:

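does not check anything useful and will trigger more often than not. It checks whether the name of the called script is 2 or 3 characters long, which makes no sense. It looks like you want to check that both file names were passed, optionally along with the -s flag, and nothing more.

In that case, you meant: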

if not 3 <= len(sys.argv) <= 4: # len(sys.argv) - 1 is the number of parameters for the script, as sys.argv[0] is the scriptname itself
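
A minimal sketch of how the parameter checks could fit together, assuming the rules listed in the question. The function name `parse_args` and the choice to raise `ValueError` instead of calling `sys.exit` are my own, for illustration only:

```python
def parse_args(argv):
    """Return (spliced, infile, outfile); raise ValueError on bad input."""
    args = argv[1:]  # drop the script name
    if not 2 <= len(args) <= 3:
        raise ValueError('Expected 2 or 3 parameters')

    spliced = False
    if args[0].startswith('-'):
        # A leading dash is only allowed for the -s switch
        if args[0] != '-s':
            raise ValueError('Invalid parameter, if spliced, must be -s')
        spliced = True
        args = args[1:]

    # Whatever remains must be exactly the two file names
    if len(args) != 2:
        raise ValueError('Expected an input file and an output file')

    infile, outfile = args
    if not infile.endswith('.genes'):
        raise ValueError('Incorrect input file type')
    if not outfile.endswith(('.fa', '.fasta')):
        raise ValueError('Incorrect output file type')
    return spliced, infile, outfile
```

The caller can pass in `sys.argv`, catch the `ValueError`, print the message and call `sys.exit(1)`. The sketch uses `endswith` rather than `in`, since `if '.fa' or '.fasta' not in outfile` is always true: the non-empty string `'.fa'` is truthy on its own.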

If you need more help, you will have to be more precise about the misbehavior you observe.

Edit:

if cols[8] != '\t\d+':

won't work the way you'd like. It compares the value in cols[8] to the literal string '\t\d+'. You might want to learn about the re module. The next if line has the same problem.
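
A rough sketch of the per-line checks using the re module. It assumes the integer lists look like the sample line in the question (digits, each followed by a comma, e.g. `5925152,5925652,`) rather than being tab-separated, since the columns themselves are split on whitespace. The names `INT_LIST` and `check_line` are hypothetical:

```python
import re

# Assumption: an integer list is one or more "digits followed by a comma"
INT_LIST = re.compile(r'(\d+,)+')

def check_line(line):
    """Return None if the line passes all checks, else an error message."""
    cols = line.split()  # no argument: split on runs of whitespace
    if len(cols) != 10:
        return 'Length not equal to 10'
    if cols[2] not in ('+', '-'):
        return 'Column 2 is not a + or - symbol'
    if not INT_LIST.fullmatch(cols[8]):
        return 'Column 8 is not a list of integers'
    if not INT_LIST.fullmatch(cols[9]):
        return 'Column 9 is not a list of integers'
    # Each integer is followed by one comma, so counting commas counts integers
    if cols[8].count(',') != cols[9].count(','):
        return 'Columns 8 and 9 differ in length'
    return None
```

Here `cols[2] not in ('+', '-')` replaces `cols[2] != '+' or '-'`, which is always true because `'-'` is a non-empty string. Calling `split()` with no argument splits on runs of whitespace, whereas `split('')` raises `ValueError: empty separator`.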
