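I have a file with roughly 2000 lines of sunspot data. I need to find the average for each month and write it to a new file. How can I group the months so that I can compute the averages? I have read some threads that suggest using pandas, but since we haven't covered it in class yet, I would rather not use it without fully understanding what it does.

So far my code splits each date into year, month, and day. How do I group the months together to find the average number of sunspots?

Here is my code so far: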

def OpenFile(File):
    outfile = open ("Monthlytemp.txt","w")

    try:
        Lines= open(File).readlines()
    except IOError:
        Lines=[]
    for line in Lines:
        Dates = line.split()
        Year= str(Dates[0][0:4])
        Month = str(Dates[0][4:6])
        Date = str(Dates [0][6:8])
        Spots = int(Dates [2])
        if Spots == 999:
            Spots= ''
        Spots = str(Spots)
        Data = [Year, Month, Date, Spots, '\n']
        Data = ' '.join(Data)
        outfile.write(str(Data))
        #print (Data)
    outfile.close()
    return Data

2 Answers

One possible solution (with minimal changes to your approach):

def WriteAvg(outfile, year, month, avg):
    # Write one "year month average" line to the output file
    Data = [year, month, avg, '\n']
    Data = ' '.join(Data)
    outfile.write(str(Data))

def OpenFile(File):
    outfile = open ("Monthlytemp.txt","w")
    PrevMonth = ""
    PrevYear = ""
    SpotSum = 0
    Days = 0

    try:
        Lines= open(File).readlines()
    except IOError:
        Lines=[]
    for line in Lines:
        Dates = line.split()
        Year= str(Dates[0][0:4])
        Month = str(Dates[0][4:6])
        Spots = int(Dates [2])
        # A new month has started: write out the month that just finished
        if PrevMonth != Month and PrevMonth != "":
            if Days > 0:
                MonthAvg = str(SpotSum*1./Days)
                WriteAvg(outfile, PrevYear, PrevMonth, MonthAvg)
            Days = 0
            SpotSum = 0
        # 999 marks a missing observation, so leave it out of the average
        if Spots!= 999:
            Days +=1
            SpotSum += Spots
        PrevMonth = Month
        PrevYear = Year
    #one last time, for the final month in the file
    if Days > 0:
        MonthAvg = str(SpotSum*1./Days)
        WriteAvg(outfile, PrevYear, PrevMonth, MonthAvg)

    outfile.close()
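
For comparison, the same "start a new total whenever the month changes" idea can be written with `itertools.groupby` from the standard library. This is only a sketch, not part of the approach above, and it rests on assumptions: each line looks like `YYYYMMDD <something> <spots>` (as the indexing in the question implies), the lines are already in date order, and 999 means a missing value. The names `monthly_averages` and `MISSING` are made up for illustration.

```python
from itertools import groupby

MISSING = 999  # assumed sentinel for "no observation", as in the question's data


def monthly_averages(lines):
    """Yield (year, month, average) for each run of consecutive lines in the same month."""
    rows = (line.split() for line in lines if line.strip())
    # groupby only merges adjacent rows, so the input must already be in date order.
    for year_month, group in groupby(rows, key=lambda row: row[0][:6]):
        counts = [int(row[2]) for row in group if int(row[2]) != MISSING]
        if counts:  # skip months that have no valid observations at all
            yield year_month[:4], year_month[4:6], sum(counts) / float(len(counts))
```

Calling `list(monthly_averages(open(File)))` would give one tuple per month, which can then be written out line by line. Grouping on the first six characters (year plus month) also keeps January of one year separate from January of the next.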
answered 2013-10-27T17:11:27.860

You can use a dictionary.

def OpenFile(File):
    outfile = open ("Monthlytemp.txt","w")

    # maps (year, month) -> list of valid daily spot counts
    spots_by_month = dict()
    Data = ''

    try:
        Lines= open(File).readlines()
    except IOError:
        Lines=[]
    for line in Lines:
        Dates = line.split()
        Year= str(Dates[0][0:4])
        Month = str(Dates[0][4:6])
        Date = str(Dates [0][6:8])
        Spots = int(Dates [2])

        if Spots == 999:
            # 999 marks a missing value: blank it and keep it out of the average
            Spots= ''
        else:
            # setdefault stores the new list in the dict; get() would discard it
            spots_by_month.setdefault((Year, Month), []).append(Spots)

        Spots = str(Spots)

        Data = [Year, Month, Date, Spots, '\n']
        Data = ' '.join(Data)
        outfile.write(str(Data))
        #print (Data)

    # Getting averages as a dictionary
    averages = {
        date:sum(spots_list) / float(len(spots_list))
        for date, spots_list in spots_by_month.items()
    }
    print(averages)

    # Alternatively getting the averages as a sorted list
    averages = [
        (date, sum(spots_list) / float(len(spots_list)))
        for date, spots_list in spots_by_month.items()
    ]
    averages.sort()
    print(averages)

    outfile.close()
    return Data
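
Since the question asks for the averages to end up in a new file, here is a rough sketch of how the dictionary grouping could be split from the writing step. It assumes the same `YYYYMMDD <something> <spots>` line layout and the 999 sentinel seen in the question. The function names and the two-decimal output format are my own choices, not anything the question requires.

```python
from collections import defaultdict

MISSING = 999  # assumed sentinel for "no observation", as in the question's data


def group_spots_by_month(lines):
    """Return a dict mapping (year, month) to the list of valid daily counts."""
    spots_by_month = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or truncated lines
        spots = int(fields[2])
        if spots != MISSING:
            spots_by_month[(fields[0][:4], fields[0][4:6])].append(spots)
    return spots_by_month


def write_monthly_averages(spots_by_month, outfile):
    """Write one 'year month average' line per month, in chronological order."""
    for (year, month), spots in sorted(spots_by_month.items()):
        outfile.write("%s %s %.2f\n" % (year, month, sum(spots) / float(len(spots))))
```

A possible use would be `write_monthly_averages(group_spots_by_month(open(File)), open("Monthlytemp.txt", "w"))`. Unlike a running total, the dictionary does not need the input lines to be sorted by date, because the keys are sorted when writing.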
answered 2013-10-27T17:12:37.617