
I have a file containing a list of columns that describe particular parameters:

size magnitude luminosity

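I only need specific data from this file (particular rows and columns). So far I have Python code in which I append the necessary line numbers to a list. I just need to know how to match those numbers against the text file to get the right lines, along with the values in the (magnitude) and (luminosity) columns. Any suggestions on how to approach this?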

Here is a sample of my code (the # comments describe what I have done and what I want to do):

temp_ListMatch = (point[5]).strip() 
if temp_ListMatch:
    ListMatchaddress = (point[5]).strip()
    ListMatchaddress = re.sub(r'\s', '_', ListMatchaddress) 
    ListMatch_dirname = '/projects/XRB_Web/apmanuel/499/Lists/' + ListMatchaddress
    #print ListMatch_dirname+"\n" 

    try:
        file5 = open(ListMatch_dirname, 'r')
    except IOError:
        print('Cannot open: ' + ListMatch_dirname)

    Optparline = []
    for line in file5:
        point5 = line.split()
        j = int(point5[1])
        Optparline.append(j)
        # file5 contains the line numbers I need; each one is read
        # into j and appended to the list Optparline.
        temp_others = (point[4]).strip()
        if temp_others: 
            othersaddress = (point[4]).strip()
            othersaddress =re.sub(r'\s', '_', othersaddress) 
            othersbase_dirname = '/projects/XRB_Web/apmanuel/499/Lists/' + othersaddress
            try:
                file6 = open(othersbase_dirname, 'r')
            except IOError:
                print('Cannot open: ' + othersbase_dirname)

            gmag = []
            z = []
            rh = []
            gz = []

            for line in file6:
                point6 = line.split()
                f = float(point6[2])
                g = float(point6[4])
                h = float(point6[6])
                i = float(point6[9])
                # So now I have opened file6, where this list of data is, and
                # have identified the columns of elements that I need.
                # I only need the particular rows (given by line number)
                # with these elements chosen. That is where I'm stuck!

1 Answer


Load the whole data file into a pandas DataFrame (assuming the data file has a header row from which the column names can be taken):

import pandas as pd
df = pd.read_csv('/path/to/file') 
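
If the data file is whitespace-separated, as the `line.split()` calls in the question suggest, `read_csv` has to be told so. A minimal sketch, assuming a header row with the hypothetical column names `size`, `magnitude`, and `luminosity`, and using made-up sample values read from an in-memory string instead of a real path:

```python
import io
import pandas as pd

# Made-up sample data standing in for the real file (assumption).
sample = io.StringIO(
    "size magnitude luminosity\n"
    "1.0 14.2 3.1\n"
    "2.0 15.7 2.4\n"
    "3.0 13.9 5.6\n"
)

# sep=r'\s+' splits on any run of whitespace, much like str.split() does.
df = pd.read_csv(sample, sep=r'\s+')
print(df.columns.tolist())
```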

Load the row-number file into a pandas Series (assuming one number per line):

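# header=None: the row-number file has no header row.
# squeeze('columns') turns the single-column DataFrame into a Series
# (the squeeze=True keyword of read_csv is gone from recent pandas releases).
row_numbers = pd.read_csv('/path/to/rows_file', header=None).squeeze('columns')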

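Return only the rows listed in the row-number file, restricted to the magnitude and luminosity columns (assuming the first row is numbered 0):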

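# .ix has been removed from pandas; with the default 0-based index,
# .loc selects the listed rows and the two columns in one step.
relevant_rows = df.loc[row_numbers, ['magnitude', 'luminosity']]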
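
Putting the steps together, here is a minimal end-to-end sketch with positional `.iloc` indexing. It assumes whitespace-separated files, 0-based row numbers stored in the second column of the row-number file (as `point5[1]` in the question suggests), hypothetical column names, and made-up sample values held in memory instead of real paths:

```python
import io
import pandas as pd

# Made-up stand-ins for the two files (assumption, not the real data).
data_file = io.StringIO(
    "size magnitude luminosity\n"
    "1.0 14.2 3.1\n"
    "2.0 15.7 2.4\n"
    "3.0 13.9 5.6\n"
    "4.0 16.3 1.8\n"
)
rows_file = io.StringIO(
    "a 0\n"
    "b 2\n"
    "c 3\n"
)

# Data file: header row present, columns separated by whitespace.
df = pd.read_csv(data_file, sep=r'\s+')

# Row-number file: no header; keep only the second column (index 1)
# and take it out of the one-column DataFrame as a Series.
row_numbers = pd.read_csv(
    rows_file, sep=r'\s+', header=None, usecols=[1]
).iloc[:, 0]

# Select the listed rows by position, then the two columns of interest.
relevant_rows = df.iloc[row_numbers.tolist()][['magnitude', 'luminosity']]
print(relevant_rows)
```

If the row numbers in the file count from 1 instead of 0, subtract 1 from `row_numbers` before indexing.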
answered 2013-03-20T00:40:32.363