python - Python CSV homework program

Question

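I have a homework assignment that involves reading a file with csv and writing functions.

The basic idea is to calculate the rusher rating of football players over two years, using the data in a file provided to us. A sample file would be: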

name, ,pos,team,g,rush,ryds,rtd,rtdr,ravg,fum,fuml,fpts,year
A.J.,Feeley,QB,STL,5,3,4,0,0,1.3,3,2,20.3,2011
Aaron,Brown,RB,DET,1,1,0,0,0,0,0,0,0.9,2011
Aaron,Rodgers,QB,GB,15,60,257,3,5,4.3,4,0,403.4,2011
Adrian,Peterson,RB,MIN,12,208,970,12,5.8,4.7,1,0,188.9,2011
Ahmad,Bradshaw,RB,NYG,12,171,659,9,5.3,3.9,1,1,156.6,2011

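In other words, we have to skip the first line of the file and then read the remaining lines, split on commas.

To calculate the rusher rating, we need:

Yds is the average yardage gained per attempt. This is [total yards / (4.05 * attempts)]. If this number is greater than 2.375, then 2.375 should be used instead.

perTDs is the percentage of touchdowns per carry. This is [(39.5 * touchdowns) / attempts]. If this number is greater than 2.375, then 2.375 should be used instead.

perFumbles is the percentage of fumbles per carry. This is [2.375 - ((21.5 * fumbles) / attempts)].

The rusher rating is [Yds + perTDs + perFumbles] * (100 / 4.5).

The code I have so far: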

playerinfo = []
teaminfo10 = []
teaminfo11 = []

import csv

file = raw_input("Enter filename: ")
read = open(file,"rU")
read.readline()
fileread = csv.reader(read)

#Each line is iterated through, and if rush attempts are greater than 10, the
#player may be used for further statistics.
for playerData in fileread:
    if int(playerData[5]) > 10:
    
        attempts = int(playerData[5])
        totalYards = int(playerData[6])
        touchdowns = int(playerData[7])
        fumbles = int(playerData[10])
    
        #Rusher rating for each player is found. This rating, coupled with other
        #data about the player is formatted and appended into a list of players.
        rushRating = ratingCalc(attempts,totalYards,touchdowns,fumbles)
        rusherData = rushFunc(playerData,rushRating)
        playerinfo.append(rusherData)
    
        #Different data about the player is formatted and added to one of two
        #lists of teams, based on year. 
        teamData = teamFunc(playerData)
        if playerData[13] == '2010':
            teaminfo10.append(teamData)
        else:
            teaminfo11.append(teamData)

#The list of players is sorted in order of decreasing rusher rating.
playerinfo.sort(reverse = True)
#The two team lists of players are sorted by team.
teaminfo10.sort()
teaminfo11.sort()

print "The following statistics are only for the years 2010 and 2011."
print "Only those rushers who have rushed more than 10 times are included."
print
print "The top 50 rushers based on their rusher rating in individual years are:"

#50 players, in order of decreasing rusher ratings, are printed along with other
#data.
rushPrint(playerinfo,50)

#A similar list of running backs is created, in order of decreasing rusher
#ratings.
RBlist = []
for player in playerinfo:
    if player[2] == 'RB':
        RBlist.append(player)

print "\nThe top 20 running backs based on their rusher rating in individual \
years are:"
#The top 20 running backs on the RBlist are printed, with other data.
rushPrint(RBlist,20)


#The teams with the greatest overall rusher rating (if their attempts are
#greater than 10) are listed in order of decreasing rusher rating, for both 2010
#and 2011.
teamListFunc(teaminfo10,'2010')

teamListFunc(teaminfo11,'2011')

#The player(s) with the most yardage is printed.
yardsList = mostStat(6,fObj,False)
print "\nThe people who rushed for the most yardage are:"
for item in yardsList:
    print "%s rushing for %d yards for %s in %s."\
    % (item[1],item[0],item[2],item[3])

#The player(s) with the most touchdowns is printed.
TDlist = mostStat(7,fObj,False)
print"\nThe people who have scored the most rushing touchdowns are:"
for item in TDlist:
    print "%s rushing for %d touchdowns for %s in %s."\
    % (item[1],item[0],item[2],item[3])

#The player(s) with the most yardage per rushing attempt is printed.
ypaList = mostStat(6,fObj,True)
print "\nThe people who have the highest yards per rushing attempt with over 10 \
rushes are:"
for item in ypaList:
    print "%s with a %.2f yards per attempt rushing average for %s in %s."\
    % (item[1],item[0],item[2],item[3])

#The player(s) with the most fumbles is printed.
fmblList = mostStat(10,fObj,False)
print"\nThere are %d people with the most fumbles. They are:" % (len(fmblList))
for item in fmblList:
    print "%s with %d fumbles for %s in %s." % (item[1],item[0],item[2],item[3])


def ratingCalc(atts,totalYrds,TDs,fmbls):
    """Calculates rusher rating."""
    yrds = totalYrds / (4.05 * atts)
    if yrds > 2.375:
        yrds = 2.375

    perTDs = 39.5 * TDs / atts
    if perTDs > 2.375:
        perTDs = 2.375

    perFumbles = 2.375 - (21.5 * fmbls / atts)

    rating = (yrds + perTDs + perFumbles) * (100/4.5)

    return rating    

def rushFunc(information,rRating):
    """Formats player info into [rating,name,pos,team,yr,atts]"""
    rusherInfo = []
    rusherInfo.append(rRating)
    name = information[0] + ' ' + information[1]
    rusherInfo.append(name)
    rusherInfo.append(information[2])
    rusherInfo.append(information[3])
    rusherInfo.append(information[13])
    rusherInfo.append(information[5])

    return rusherInfo


def teamFunc(plyrInfo):
    """Formats player info into [team,atts,yrds,TDs,fmbls] for team sorting"""
    teamInfo = []
    teamInfo.append(plyrInfo[3])
    teamInfo.append(plyrInfo[5])
    teamInfo.append(plyrInfo[6])
    teamInfo.append(plyrInfo[7])
    teamInfo.append(plyrInfo[10])

    return teamInfo

def rushPrint(lst,num):
    """Prints players and their data in order of rusher rating."""
    print "Name                           Pos   Year  Attempts   Rating  Team"
    count = 0
    while count < num:
        index = lst[count]
        print "%-30s %-5s %4s  %5s      %3.2f  %s"\
              % (index[1],index[2],index[4],index[5],index[0],index[3])
        count += 1

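So, yes, there are still a lot of functions I have to define. But what do you think of the code so far? Is it inefficient? Can you tell me what is wrong with it? It looks to me like this code is going to be very long (around 300 lines or so), but the teacher said it should be a relatively short project.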

score 3 · Accepted Answer

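Here is a bit of code that can greatly simplify your whole project.

It may take a little time to get your head around, but in general, working with the correct data types and "associative arrays" (dicts) will make your life much easier: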

import csv

# Read the CSV into a list of dicts, one per row, keyed by the header names.
# A list (unlike the reader itself) is not a one-time iterator, so all of
# your info stays easily accessible.
with open('mycsv.txt', 'r') as f:
    list_of_players = [dict(row) for row in csv.DictReader(f)]

for player in list_of_players:
    # Convert the columns that should be integers from strings to ints.
    for stat in ['rush', 'ryds', 'rtd', 'fum', 'fuml', 'year']:
        player[stat] = int(player[stat])
    # Convert the columns that should be floats from strings to floats.
    for stat in ['fpts', 'ravg', 'rtdr']:
        player[stat] = float(player[stat])

# Now you can access and loop through your players with meaningful names,
# e.g. 'fpts', rather than hard-coded positions such as [5].
# (' ' is the blank header of the last-name column in the sample file.)
for player in list_of_players:
    print player['name'], player[' '], player['fpts']

This sample code shows how easy it is to work with the players' names and stats (here: first name, last name, and fpts):

>>> 
A.J. Feeley 20.3
Aaron Brown 0.9
Aaron Rodgers 403.4
Adrian Peterson 188.9
Ahmad Bradshaw 156.6

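Of course, getting all of the requested statistics (maximums and so on) will take some additional work, but keeping the data types correct from the start makes those tasks far less tedious.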
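As a rough illustration of one such statistic, here is a minimal sketch of a "who has the most of X" helper working on a list of dicts. The helper name most_stat and its signature are my own assumptions, not part of the assignment; the rows are copied from the sample file in the question and are assumed to be already converted to numbers. It uses print() with a single argument so it should run under both Python 2 and Python 3.

```python
# Rows taken from the sample file in the question, already converted to
# numeric types (the ' ' key is the blank header of the last-name column).
players = [
    {'name': 'Aaron', ' ': 'Rodgers', 'team': 'GB', 'year': 2011,
     'rush': 60, 'ryds': 257, 'rtd': 3, 'fum': 4},
    {'name': 'Adrian', ' ': 'Peterson', 'team': 'MIN', 'year': 2011,
     'rush': 208, 'ryds': 970, 'rtd': 12, 'fum': 1},
    {'name': 'Ahmad', ' ': 'Bradshaw', 'team': 'NYG', 'year': 2011,
     'rush': 171, 'ryds': 659, 'rtd': 9, 'fum': 1},
]


def most_stat(rows, stat):
    """Return every row that shares the highest value of `stat` (ties included).

    Hypothetical helper: the name and signature are illustrative only.
    """
    best = max(row[stat] for row in rows)
    return [row for row in rows if row[stat] == best]


for p in most_stat(players, 'ryds'):
    print("%s %s rushed for %d yards for %s in %d."
          % (p['name'], p[' '], p['ryds'], p['team'], p['year']))
```

With these three rows the loop prints a single line for Adrian Peterson (970 yards). Returning a list rather than a single row keeps ties, which matters for stats such as fumbles where several players can share the maximum.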

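With these structures, the assignment can be finished in far fewer than 300 lines, and the more you use Python, the more of its conventional idioms you will pick up. lambda and sorted() are examples of tools you will come to love... in time!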
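To give a feel for what that might look like, here is a small sketch that ranks rushers with sorted() and a lambda. The function name rusher_rating is my own; the formula is the one stated in the question, and the rows are copied from the question's sample file, assumed to be already converted to numbers. Treat it as one possible shape for the code, not the required solution. It should run under both Python 2 and Python 3.

```python
def rusher_rating(p):
    """Rusher rating as defined in the question; `p` is one player dict."""
    attempts = p['rush']
    yds = min(p['ryds'] / (4.05 * attempts), 2.375)      # capped at 2.375
    per_tds = min(39.5 * p['rtd'] / attempts, 2.375)     # capped at 2.375
    per_fumbles = 2.375 - 21.5 * p['fum'] / attempts
    return (yds + per_tds + per_fumbles) * (100 / 4.5)


# Rows taken from the sample file in the question (numeric columns converted).
players = [
    {'name': 'A.J.', ' ': 'Feeley', 'pos': 'QB', 'rush': 3, 'ryds': 4, 'rtd': 0, 'fum': 3},
    {'name': 'Aaron', ' ': 'Rodgers', 'pos': 'QB', 'rush': 60, 'ryds': 257, 'rtd': 3, 'fum': 4},
    {'name': 'Adrian', ' ': 'Peterson', 'pos': 'RB', 'rush': 208, 'ryds': 970, 'rtd': 12, 'fum': 1},
    {'name': 'Ahmad', ' ': 'Bradshaw', 'pos': 'RB', 'rush': 171, 'ryds': 659, 'rtd': 9, 'fum': 1},
]

# Only rushers with more than 10 attempts are eligible.
eligible = [p for p in players if p['rush'] > 10]

# Highest rating first; slicing with [:50] is safe even for short lists.
ranked = sorted(eligible, key=lambda p: rusher_rating(p), reverse=True)
for p in ranked[:50]:
    print("%-20s %-3s %7.2f" % (p['name'] + ' ' + p[' '], p['pos'], rusher_rating(p)))

# The same idea gives the running backs only, or a different sort key.
running_backs = [p for p in ranked if p['pos'] == 'RB']
by_yards_per_attempt = sorted(eligible, key=lambda p: p['ryds'] / float(p['rush']),
                              reverse=True)
```

For the sample rows, Adrian Peterson comes out on top with a rating of about 126.71, followed by Ahmad Bradshaw and Aaron Rodgers. Because the rating is computed from the dict on demand, there is no need to build separate positional lists for players, running backs, and teams.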
