假设我有以下代码:
def compute_ranks(graph, k):
d = .8 #dampening factor
loops = 10
ranks = {}
npages = len(graph)
for page in graph:
ranks[page] = 1.0 / npages
for c in range(0, loops):
newranks = {}
for page in graph:
newrank = (1-d) / npages
for node in graph:
if page in graph[node]:
newrank = newrank + d * (ranks[node]/len(graph[node]))
newranks[page] = newrank
ranks = newranks
return ranks
好吧,现在假设我不想允许任何可以相互勾结的项目。如果我有一个项目字典
g = {'a': ['a', 'b', 'c'], 'b':['a'], 'c':['d'], 'd':['a']}
对于任何路径 A==>B,我不想允许来自 B==>A 的路径位于或低于我的数字 k 的距离。
例如,如果 k = 0,那么我不允许的唯一路径是 A==>A。
但是,如果 k = 2,那么我不会像以前那样允许链接 A==>A,但也不允许链接如 D==>A、B==>A 或 A==>C。
我知道这很令人困惑,我的大部分问题来自于不理解这意味着什么。
这是问题的记录:
# Question 2: Combatting Link Spam
# One of the problems with our page ranking system is pages can
# collude with each other to improve their page ranks. We consider
# A->B a reciprocal link if there is a link path from B to A of length
# equal to or below the collusion level, k. The length of a link path
# is the number of links which are taken to travel from one page to the
# other.
# If k = 0, then a link from A to A is a reciprocal link for node A,
# since no links needs to be taken to get from A to A.
# If k=1, B->A would count as a reciprocal link if there is a link
# A->B, which includes one link and so is of length 1. (it requires
# two parties, A and B, to collude to increase each others page rank).
# If k=2, B->A would count as a reciprocal link for node A if there is
# a path A->C->B, for some page C, (link path of length 2),
# or a direct link A-> B (link path of length 1).
# Modify the compute_ranks code to
# - take an extra input k, which is a non-negative integer, and
# - exclude reciprocal links of length up to and including k from
# helping the page rank.