0

假设我有以下代码:

def compute_ranks(graph, k):
    d = .8 #dampening factor
    loops = 10
    ranks = {}
    npages = len(graph)
    for page in graph:
        ranks[page] = 1.0 / npages
    for c in range(0, loops):
        newranks = {}
        for page in graph:
            newrank = (1-d) / npages
            for node in graph:
                if page in graph[node]:
                    newrank = newrank + d * (ranks[node]/len(graph[node]))
            newranks[page] = newrank
        ranks = newranks
     return ranks

好吧,现在假设我不想允许任何可以相互勾结的项目。如果我有一个项目字典

g = {'a': ['a', 'b', 'c'], 'b':['a'], 'c':['d'], 'd':['a']}

对于任何路径 A==>B,我不想允许来自 B==>A 的路径位于或低于我的数字 k 的距离。

例如,如果 k = 0,那么我不允许的唯一路径是 A==>A。

但是,如果 k = 2,那么我不会像以前那样允许链接 A==>A,但也不允许链接如 D==>A、B==>A 或 A==>C。

我知道这很令人困惑,我的大部分问题来自于不理解这意味着什么。

这是问题的记录:

# Question 2: Combatting Link Spam

# One of the problems with our page ranking system is pages can 
# collude with each other to improve their page ranks.  We consider 
# A->B a reciprocal link if there is a link path from B to A of length 
# equal to or below the collusion level, k.  The length of a link path 
# is the number of links which are taken to travel from one page to the 
# other.

# If k = 0, then a link from A to A is a reciprocal link for node A, 
# since no links needs to be taken to get from A to A.

# If k=1, B->A would count as a reciprocal link  if there is a link 
# A->B, which includes one link and so is of length 1. (it requires 
# two parties, A and B, to collude to increase each others page rank).

# If k=2, B->A would count as a reciprocal link for node A if there is
# a path A->C->B, for some page C, (link path of length 2),
# or a direct link A-> B (link path of length 1).

# Modify the compute_ranks code to 
#   - take an extra input k, which is a non-negative integer, and 
#   - exclude reciprocal links of length up to and including k from 
#     helping the page rank.
4

3 回答 3

0

一种可能的解决方案是引入一种检测合谋的递归方法。就像是:

def Colluding(p1,p2,itemDict,k):
    if p1 == p2:
        return True
    elif k == 0:
        return False
    else p2 in itemDict[p1]:
        return True
    for p in itemDict[p1]:
        if Colluding(p1,p,itemDict,k-1):
            return True
    return False

然后它说if item in itemDict[node]你会拥有if item in itemDict[node] and not Colluding(item,node,itemDict,k)或类似的东西。

如果在小深度(例如 A->B->A)处存在大量共谋链接,则进行深度优先搜索可能不是最佳选择,因为它们可能仅在经过几次全深度搜索后才能找到。在这种情况下,您也许可以找到另一种进行广度优先搜索的方法。此外,如果您有很多链接,则可能值得尝试转换为循环,因为如果您使用递归,Python 可能会出现堆栈溢出问题。递归算法是我首先想到的,因为使用递归遍历树似乎很自然。

注意:由于这是对家庭作业的帮助,因此我没有对其进行测试,并且某些比较可能不太正确,需要以某种方式进行调整。

于 2012-06-04T00:19:24.550 回答
0

我想到了:

def Colluding(p1,p2,itemDict,k, time):
if k >= len(itemDict)/2:
    return True
if k == 0:
    if time == 0:
        if p1 == p2 and p1 in itemDict[p2]:
            return True
        else:
            return False
    if time == 1:
        if p1 in itemDict[p2]:
            return True
        else:
            return False
for p in itemDict[p1]:
    if Colluding(p1,p,itemDict,k-1,1) and p == p2:
        return True
    else:
        return False
return False

def compute_ranks(graph, k):
    d = 0.8 # damping factor
    numloops = 10
    ranks = {}
    npages = len(graph)
    for page in graph:
        ranks[page] = 1.0 / npages
    for i in range(0, numloops):
        newranks = {}
        for page in graph:
            newrank = (1 - d) / npages
            for node in graph:
                if page in graph[node] and not Colluding(page, node, graph, k,0):
                    newrank = newrank + d * (ranks[node]/len(graph[node]))
            newranks[page] = newrank
        ranks = newranks
    return ranks

感谢@Ken 为我指明了正确的方向。

于 2012-06-04T06:21:29.070 回答
0

我也参加了这门课程。我提出了以下解决方案。基本上,我们需要的只是“所有互惠链接”。给定图形和级别 k,以下过程计算所有互惠链接。它返回一个元组列表,表示相互链接(node1,node2)。您需要做的就是检查(节点,页面)是否不在此列表中以计算排名。

def get_reciprocal_links(graph, k):
reci=[]
for node in g:
    for ele in g[node]:
        got = False
        work = []
        i=0
        if node == ele and i == k:
            reci.append((node,ele))
            break
        if ele in g:
            for e in g[ele]:
                work.append(e)

        while i < k:
            if node in work:
                reci.append((node,ele))
                got = True
                break
            else:
                i=i+1
                check=work.pop()
                if check in g:
                    for e in g[check]:
                        work.append(e)
        if got == True:
            continue
return reci
于 2012-09-18T13:26:33.247 回答