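Here is my basic problem.

I have the following. File name: parseFastq.py. Execution: I run it from the command line with: python3 parseFastq.py --fastq /Users/remaining_dir/test1.fastq

This code works!

However, problems arise when I copy components of parseFastq.py into a new script.

Here is the code.

First the class is defined. This part runs fine in my new script.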

import argparse
import gzip
#Example use is 
# python parseFastq.py --fastq /Users/remaining_dir/test1.fastq

################################################
# You can use this code and put it in your own script
class ParseFastQ(object):
    """Returns a read-by-read fastQ parser analogous to file.readline()"""
    def __init__(self, filePath, headerSymbols=('@', '+')):
        """Returns a read-by-read fastQ parser analogous to file.readline().
        Example: parser.__next__()
        -OR-
        It's an iterator, so you can do:
        for rec in parser:
            ... do something with rec ...

        rec is tuple: (seqHeader,seqStr,qualHeader,qualStr)
        """
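        if filePath.endswith('.gz'):
            # 'rt' makes gzip return str lines, matching the plain-text branch
            self._file = gzip.open(filePath, 'rt')
        else:
            # the 'U' mode flag is deprecated and was removed in Python 3.11
            self._file = open(filePath, 'r')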
        self._currentLineNumber = 0
        self._hdSyms = headerSymbols

    def __iter__(self):
        return self

    def __next__(self):
        """Reads in next element, parses, and does minimal verification.
        Returns: tuple: (seqHeader,seqStr,qualHeader,qualStr)"""
        # ++++ Get Next Four Lines ++++
        elemList = []
        for i in range(4):
            line = self._file.readline()
            self._currentLineNumber += 1 ## increment file position
            if line:
                elemList.append(line.strip('\n'))
            else: 
                elemList.append(None)

        # ++++ Check Lines For Expected Form ++++
        trues = [bool(x) for x in elemList].count(True)
        nones = elemList.count(None)
        # -- Check for acceptable end of file --
        if nones == 4:
            raise StopIteration
        # -- Make sure we got 4 full lines of data --
        assert trues == 4,\
               "** ERROR: It looks like I encountered a premature EOF or empty line.\n\
               Please check FastQ file near line number %s (plus or minus ~4 lines) and try again**" % (self._currentLineNumber)
        # -- Make sure we are in the correct "register" --
        assert elemList[0].startswith(self._hdSyms[0]),\
               "** ERROR: The 1st line in fastq element does not start with '%s'.\n\
               Please check FastQ file near line number %s (plus or minus ~4 lines) and try again**" % (self._hdSyms[0],self._currentLineNumber) 
        assert elemList[2].startswith(self._hdSyms[1]),\
               "** ERROR: The 3rd line in fastq element does not start with '%s'.\n\
               Please check FastQ file near line number %s (plus or minus ~4 lines) and try again**" % (self._hdSyms[1],self._currentLineNumber) 
        # -- Make sure the seq line and qual line have equal lengths --
        assert len(elemList[1]) == len(elemList[3]), "** ERROR: The length of Sequence data and Quality data of the last record aren't equal.\n\
               Please check FastQ file near line number %s (plus or minus ~4 lines) and try again**" % (self._currentLineNumber) 

        # ++++ Return fastQ data as tuple ++++
        return tuple(elemList)
##########################################################################

Here is the code that does not work when I call it from within the same script:

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Process fastq files and separate into 4 categories')
    parser.add_argument("-f",  "--fastq", required=True, help="Place fastq inside here")
    args = parser.parse_args()

    fastqfile = ParseFastQ(args.fastq)

I tried the following, but could not get fastqfile, which should contain tuples of the form (seqHeader,seqStr,qualHeader,qualStr).

Attempt:

parser.add_argument("-/Users/remaining_dir/test1.fastq",  "--fastq", required=True, help="Place fastq inside here")

Error:

argument -/Users/remaining_dir/test1.fastq/--fastq: conflicting option string: --fastq

Attempt:

parser.add_argument("-/Users/remaining_dir/test1.fastq",  "-@", required=True, help="Place fastq inside here")

Out[332]:

_StoreAction(option_strings=['-/Users/remaining_dir/test1.fastq', '-@'], dest='/Users/remaining_dir/test1.fastq', nargs=None, const=None, default=None, type=None, choices=None, help='Place fastq inside here', metavar=None)

Next line:

Error:

usage:  [-h] -/Users/remaining_dir/test1.fastq
        /USERS/REMAINING_DIR/TEST1.FASTQ
: error: the following arguments are required: -/Users/remaining_dir/test1.fastq/-@
An exception has occurred, use %tb to see the full traceback.

SystemExit: 2

When I ran %tb, the following information was given:
 File "/Users/brownbear/opt/anaconda3/lib/python3.7/argparse.py", line 2508, in error
    self.exit(2, _('%(prog)s: error: %(message)s\n') % args)

  File "/Users/brownbear/opt/anaconda3/lib/python3.7/argparse.py", line 2495, in exit
    _sys.exit(status)

In case it helps, here is some sample fastq data:

@seq13534-419
GCAGTAGCGGTCATAAGTGGTACATTACGAGATTCGGAGTACCATAGATTCGCATGAATCCCTGTGGATACGAGAGTGTGAGATATATGTACGCCAATCCAGTGTGATACCCATGAGATTTAGGACCGATGATGGTTGAGGACCAAGGATTGACCCGATGGATGCAGATTTGACCCCAGATAGAATAAATGCGATGAGATGATTTGGCCGATAGATAGATAGTGTCGTGAGGTGACGTCCGTCACTGGACGAA
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFDFFDFFDDFDFDFFFFDDFFDDFDDFF
@seq86249-867
GGATTAGCGGTCATAAGTCGTACATTACGAGATTCGGAGTACCATAGATTCGCATGAATCCCTGTGGATACGAGAGTGTGAGATATATGTACGCCAATCCAGTGTGATACCCATGAGATTTAGGACCGATGATGGTTGAGGACCAAGGATTGACCCGATGGATGCAGATTTGACCCCAGATAGAATAAATGCGATGAGATGATTTGGCCGATAGATAGATAGAGGTCAGTATAACCTCTCAAAGCTTTATCTACGGATGGATCCGCGC
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDDFDDDDDDFFDFDDFDDDFDFFDDFFFFFFFFFDDFDFFDDFDDF
@seq46647-928
GACCTAGCGGTCATAAGTGGTACATTACGAGATTCGGAGTACCATAGATTCGCATGAATCCCTGTGGATACGAGAGTGTGAGATATATGTACGCCAATCCAGTGTGATACCCATGAGATTTAGGACCGATGATGGTTGACGACCAAGGATTGACCCGATGGATGCAGATTTGACCCCAGATAGAATAAATGCGATGAGATGATTTGGCCGATAGATAGATAGTAAGTAAATGCCACGGACTCGTCACGTG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDDDFDFDFFFFFDFFDFDFDDDDDFDFF

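Any help would be greatly appreciated. Why does it work when I run the script on its own, but not when I try to incorporate it into another script?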

2 Answers

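The solution comes in two main parts.

I was trying to run argparse through an IDE (Spyder), and I was running only the selected code rather than the whole script.

For those who are new to Python and using argparse for the first time: by default argparse reads its arguments from the command line, so the script has to be invoked as a whole from the command line.

So, once you have set up your arguments, run it as follows.

From the command line: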

python3 parseFastq.py --fastq test1.fastq 
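
If you do want to stay inside an IDE console such as Spyder, one option is to hand parse_args() an explicit list instead of letting it read the command line. The following is a minimal sketch, assuming the same single --fastq option as in the question; build_parser is a hypothetical helper name, and the path is the one from the question (it does not need to exist just to be parsed).

```python
import argparse


def build_parser():
    """Build a parser with the single required --fastq option (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Process fastq files")
    parser.add_argument("-f", "--fastq", required=True, help="Path to the fastq file")
    return parser


# Called with no argument, parse_args() reads sys.argv[1:], which an IDE
# console usually does not fill in. Passing a list supplies the values directly.
args = build_parser().parse_args(["--fastq", "/Users/remaining_dir/test1.fastq"])
print(args.fastq)

# The short flag fills the same attribute.
short_args = build_parser().parse_args(["-f", "test1.fastq"])
print(short_args.fastq)
```

Note that the file path is a value given at run time; the strings passed to add_argument are only the names of the flags.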

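To break the initial setup down further: you are essentially labeling your test1.fastq file with the flag --fastq. This is crucial. If you get an error, it is because the arguments have to be given in a specific format, as flag and value pairs. In this particular example you can also use the short form -f, so it can also be run as follows.

From the command line: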

python3 parseFastq.py -f test1.fastq 

As long as you run your .py script from the same directory as the file you are passing in, you do not need the full path.
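
To see which file a bare name such as test1.fastq would actually refer to, you can expand it yourself. This is only a sketch: resolve_fastq is a hypothetical helper, and it relies on the standard behaviour that a relative path is interpreted against the current working directory.

```python
import os


def resolve_fastq(path_str):
    """Return the absolute path that a command-line value refers to (hypothetical helper)."""
    # A relative value is joined onto the current working directory;
    # a leading ~ is expanded to the home directory first.
    return os.path.abspath(os.path.expanduser(path_str))


resolved = resolve_fastq("test1.fastq")
print(resolved)
print(os.path.isabs(resolved))
```

If the printed path is not where your fastq file lives, pass the full path instead.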

Answered 2020-05-29T22:29:57.550

To answer your question as I understand it, you can simply add another argument to the parser like this.

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Process fastq files and separate into 4 categories')
    parser.add_argument("-f",  "--fastq", required=True, help="Place fastq inside here")
    parser.add_argument("-t",  "--type", required=True, help="The type of file")
    args = parser.parse_args()

    print(args.fastq)
    print(args.type)

Then call it like this.

python3 parseFastq.py --fastq /Users/remaining_dir/test1.fastq --type fastq
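
To get at the (seqHeader,seqStr,qualHeader,qualStr) tuples themselves, the parser object has to be iterated; constructing it does not read any records yet. Below is a rough sketch of that loop. So that it runs on its own, read_fastq is a hypothetical stand-in for the ParseFastQ class, the two records are made-up demo data written to a temporary file, and the arguments are passed as an explicit list rather than taken from the command line.

```python
import argparse
import os
import tempfile


def read_fastq(path):
    """Yield (seqHeader, seqStr, qualHeader, qualStr) tuples from a fastq file.

    Hypothetical stand-in for ParseFastQ, without its sanity checks.
    """
    with open(path) as handle:
        while True:
            record = tuple(handle.readline().rstrip("\n") for _ in range(4))
            if not record[0]:  # an empty first line means end of file
                return
            yield record


# Write two short made-up records to a temporary file for the demonstration.
demo = "@seq1\nGCAG\n+\nIIII\n@seq2\nGGAT\n+\nIIDF\n"
with tempfile.NamedTemporaryFile("w", suffix=".fastq", delete=False) as tmp:
    tmp.write(demo)

parser = argparse.ArgumentParser(description="Process fastq files")
parser.add_argument("-f", "--fastq", required=True, help="Path to the fastq file")
args = parser.parse_args(["--fastq", tmp.name])  # explicit list instead of sys.argv

# Each iteration yields one four-element tuple.
records = list(read_fastq(args.fastq))
for seq_header, seq_str, qual_header, qual_str in records:
    print(seq_header, seq_str, qual_header, qual_str)

os.remove(tmp.name)  # clean up the temporary demo file
```

With the class from the question defined in the same script, the equivalent loop would be "for rec in ParseFastQ(args.fastq):".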
Answered 2020-05-27T10:38:13.387