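Okay, folks. My professor says there is a way to do this in Python 3 without the help of any loops, but I don't see it at the moment. She suggested using zip, enumerate, readlines, and split(";") (each rating is followed by a ';', and two in a row means that reviewer did not rate the movie). What I'm doing is taking one movie and looking up a comparison movie in the movMat list, then comparing the two on their common reviewers. After that I have to do the Pearson calculation, which involves getting the common reviewers' ratings for the current movie and for the target (comparison) movie, the mean of those ratings, their standard deviation, and finally the Pearson r correlation.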
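To make the rating format concrete, here is a small sketch of how split(";") and enumerate pick out the reviewers who actually rated a movie. The row string is made up for illustration and is not taken from the real data file.

```python
# Hypothetical ratings row for one movie (not real data):
# the 2nd and 4th reviewers left no rating, so their fields are empty.
row = "4;;3;;5;"

# Every rating is followed by ';', so the final field after the split is empty too.
fields = row.split(";")

# Column indexes (reviewer positions) that hold a valid rating from 1 to 5.
rated = [i for i, x in enumerate(fields) if x in {"1", "2", "3", "4", "5"}]

print(fields)  # ['4', '', '3', '', '5', '']
print(rated)   # [0, 2, 4]
```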

import statistics

def pCalc(movMat, movNumber, n):
    # Ratings are ';'-separated; an empty field means the reviewer gave no rating.
    valid = ('1', '2', '3', '4', '5')
    indexes1 = [i for i, x in enumerate(movMat[movNumber][1].split(';')) if x in valid]
    indexes2 = [i for i, x in enumerate(movMat[n][1].split(';')) if x in valid]

    # Reviewers (column indexes) who rated both movies.
    compare = set(indexes1).intersection(indexes2)

    # Ratings of the target movie, restricted to the common reviewers.
    xi = []
    for index, val in enumerate(movMat[movNumber][1].split(';')):
        if index in compare:
            xi.append(int(val))

    average1 = sum(xi) / len(compare)
    stdDev1 = statistics.stdev(xi)

    # Ratings of the comparison movie, in the same reviewer order as xi.
    yi = []
    for index, val in enumerate(movMat[n][1].split(';')):
        if index in compare:
            yi.append(int(val))

    average2 = sum(yi) / len(compare)
    stdDev2 = statistics.stdev(yi)

    # Pearson r: sum of products of the standardized ratings, divided by (N - 1).
    newSum = 0
    for i in range(len(compare)):
        newSum += ((xi[i] - average1) / stdDev1) * ((yi[i] - average2) / stdDev2)

    r = (1 / (len(compare) - 1)) * newSum
    return r
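For comparison, here is a rough sketch of what a version without explicit for statements might look like, pairing the two rating rows with zip as the professor hinted. It is only one possible reading of "no loops": the comprehension and generator expression still iterate internally. The function name pearson_no_loops, the choice to take the two rating strings directly rather than movMat, and returning None for degenerate cases are all assumptions made for this illustration.

```python
import statistics

VALID = {"1", "2", "3", "4", "5"}

def pearson_no_loops(row_a, row_b):
    """Pearson r over reviewers who rated both movies.

    row_a and row_b are ';'-separated rating strings, one field per reviewer.
    (Hypothetical helper; the signature is assumed, not from the assignment.)
    """
    # zip lines the two rows up reviewer by reviewer; keep only positions
    # where both fields hold a valid rating.
    pairs = [(int(a), int(b))
             for a, b in zip(row_a.split(";"), row_b.split(";"))
             if a in VALID and b in VALID]
    if len(pairs) < 2:
        return None  # assumption: stdev needs at least two common reviewers

    # zip(*pairs) turns the list of pairs back into two aligned tuples.
    xs, ys = zip(*pairs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    if sd_x == 0 or sd_y == 0:
        return None  # assumption: r is undefined when one side has no variance

    total = sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)
```

Called as pearson_no_loops(movMat[movNumber][1], movMat[n][1]), this should follow the same formula as the loop version above, using the sample standard deviation and dividing by the number of common reviewers minus one.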

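An example input is:

The main part of the program handles the argument parsing, the lines read from the file, and so on. As an example, passing the command-line argument "1" selects Toy Story and compares it with the other movies in the database, giving output like this: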

Movie number: Movie  1|Toy Story (1995)

*** No. of rows (movies) in matrix =  1682
*** No. of columns (reviewers) = 943
Output shows r-value, movie no.|name, no. of ratings

compare movie is  1|Toy Story (1995)
no. of common reviewers 452
target avg   3.8783185840707963
compare avg  3.8783185840707963
target std   0.9278967014291252
compare std  0.9278967014291252
r            0.999999999999991

compare movie is  2|GoldenEye (1995)
no. of common reviewers 104
target avg   3.8653846153846154
compare avg  3.201923076923077
target std   0.9456871165874381
compare std  0.9177833965361495
r            0.22178411018797187

compare movie is  3|Four Rooms (1995)
no. of common reviewers 78
target avg   3.717948717948718
compare avg  2.9358974358974357
target std   0.9520645495064435
compare std  1.2096982943568881
r            0.1757942980351483

compare movie is  4|Get Shorty (1995)
no. of common reviewers 149
target avg   3.87248322147651
compare avg  3.530201342281879
target std   0.9247979370536794
compare std  0.9970025819307402
r            0.10313529410109303