
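I defined a method named dist that computes the distance between two points, and it works correctly when I call it directly. However, when I call it from another function to compute the distance between two points, I get UnboundLocalError: local variable 'minkowski_distance' referenced before assignment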

Edit: Sorry, I just realized that this function does work. However, I have another method that calls it. I have put that last method at the bottom.

Here is the method:

class MinkowskiDistance(Distance):    
  def __init__(self, dist_funct_name_str = 'Minkowski distance', p=2):    
    self.p = p

  def dist(self, obj_a, obj_b):    
    distance_to_power_p=0    
    p=self.p    

    for i in range(len(obj_a)):    
      distance_to_power_p += abs((obj_a[i]-obj_b[i]))**(p)

    minkowski_distance = (distance_to_power_p)**(1/p)    
    return minkowski_distance

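And here is the function. It basically splits the tuples x and y into their numeric and string components, computes the distance between the numeric parts of x and y, then the distance between the string parts, and adds the two together.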

def total_dist(x, y, p=2, q=2):    
    jacard = QGramDistance(q=q)    
    minkowski = MinkowskiDistance(p=p)

    x_num = []    
    x_str = []    
    y_num = []    
    y_str = []

    # I am splitting each vector into its numerical parts and its string parts so that the distances
    # of each part can be found, then summed together.

    for i in range(len(x)):

        if type(x[i]) == float or type(x[i]) == int:    
            x_num.append(x[i])    
            y_num.append(y[i])    
        else:
            x_str.append(x[i])    
            y_str.append(y[i])

    num_dist = minkowski.dist(x_num,y_num)

    str_dist = I find using some more steps
    #I am simply adding the two types of distance to get the total distance:

    return num_dist + str_dist

class NearestNeighbourClustering(Clustering):

  def __init__(self, data_file,
               clust_algo_name_str='', strip_header = "no", remove = -1):

      self.data_file= data_file    
      self.header_strip = strip_header    
      self.remove_column = remove

  def run_clustering(self, max_dist, p=2, q=2):    
      K = {}    

      #dictionary of clusters    
      data_points = self.read_data_file()    
      K[0]=[data_points[0]]    
      k=0

      #I added the first point in the data to the 0th cluster    
      #k = number of clusters minus 1

      n = len(data_points)    
      for i in range(1,n):    
          data_point_in_a_cluster = "no"

          for c in range(k+1):
              distances_from_i = [total_dist(data_points[i],K[c][j], p=p, q=q) for j in range(len(K[c]))]
              d = min(distances_from_i)
              if d <= max_dist:
                  K[c].append(data_points[i])
                  data_point_in_a_cluster = "yes"

          if data_point_in_a_cluster == "no":
              k += 1
              K[k]=[data_points[i]]
      return K

3 Answers


The line minkowski_distance = (distance_to_power_p)**(1/p) is executed only when control enters the for loop.

Check len(obj_a).

If it returns 0, the statement return minkowski_distance throws the error local variable 'minkowski_distance' referenced before assignment.

You should move the line minkowski_distance = (distance_to_power_p)**(1/p) out of the for loop:

for i in range(len(obj_a)):
    distance_to_power_p += abs((obj_a[i]-obj_b[i]))**(p)

minkowski_distance = (distance_to_power_p)**(1/p) # this is only assignment, no need 
                                                  # for this to be inside the loop

return minkowski_distance
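
To see the failure in isolation, here is a minimal sketch. dist_buggy is a hypothetical stand-alone copy of the method with the assignment left inside the loop, and the exponent is written as 1.0 / p as an assumption so that the result does not depend on the Python version. It is only an illustration of the mechanism, not the asker's exact code.

```python
def dist_buggy(obj_a, obj_b, p=2):
    distance_to_power_p = 0
    for i in range(len(obj_a)):
        distance_to_power_p += abs(obj_a[i] - obj_b[i]) ** p
        # Bound only if the loop body runs at least once.
        minkowski_distance = distance_to_power_p ** (1.0 / p)
    return minkowski_distance


# Non-empty input: the loop runs, so the name is bound before the return.
non_empty_result = dist_buggy([0, 0], [3, 4])

# Empty input: the loop body never runs, so the return statement fails.
try:
    dist_buggy([], [])
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__
```

The function looks correct when tested with ordinary points, which is probably why calling it directly seemed to work.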
answered 2012-10-26T04:38:26.940

If obj_a is empty, minkowski_distance is never set:

for i in range(len(obj_a)):
  distance_to_power_p += abs((obj_a[i]-obj_b[i]))**(p)
  minkowski_distance = (distance_to_power_p)**(1/p)

return minkowski_distance

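This happens when none of the elements of x is an int or a float, because total_dist then passes empty x_num and y_num lists to dist.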
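
A small sketch of how that situation can arise. split_parts is a hypothetical helper that mirrors the splitting loop in total_dist, and the sample tuples are made up. One plausible cause (an assumption, since read_data_file is not shown) is that values read from a file arrive as strings such as "1.5" and are never converted to numbers.

```python
def split_parts(x, y):
    """Hypothetical helper that mirrors the splitting loop in total_dist."""
    x_num, x_str, y_num, y_str = [], [], [], []
    for x_i, y_i in zip(x, y):
        if type(x_i) in (int, float):
            x_num.append(x_i)
            y_num.append(y_i)
        else:
            x_str.append(x_i)
            y_str.append(y_i)
    return x_num, y_num, x_str, y_str


# Mixed tuple: the numeric lists are non-empty, so dist's loop body runs.
mixed = split_parts((1.5, "red", 3), (2.5, "blue", 7))

# All-string tuple (made-up data): the numeric lists come back empty,
# which is the case that leaves minkowski_distance unassigned.
all_strings = split_parts(("1.5", "red"), ("2.5", "blue"))
```

Printing the type of each field in data_points[0] is a quick way to check whether this is what happens in run_clustering.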

answered 2012-10-26T04:38:54.453

You seem to have an indentation problem:

class MinkowskiDistance(Distance):
  def __init__(self, dist_funct_name_str = 'Minkowski distance', p=2):
    self.p = p

  def dist(self, obj_a, obj_b):
    distance_to_power_p=0
    p=self.p

    for i in range(len(obj_a)):
      distance_to_power_p += abs((obj_a[i]-obj_b[i]))**(p)
      minkowski_distance = (distance_to_power_p)**(1/p)

    return minkowski_distance

minkowski_distance = (distance_to_power_p)**(1/p) is inside the for loop, so if the loop never runs, minkowski_distance is never set and you get your error.

Dedent that line by one indentation level (two spaces) and everything should work.
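
For reference, a sketch of the method with the assignment moved out of the loop. Two things here are assumptions rather than part of the question: the Distance base class is replaced by object because its code is not shown, and the exponent is written as 1.0 / p. The second change matters if this runs under Python 2, where 1/p with an integer p is integer division (1/2 evaluates to 0), so every non-empty result would come out as 1.

```python
class MinkowskiDistance(object):  # the Distance base class is not shown, so it is omitted here
    def __init__(self, dist_funct_name_str='Minkowski distance', p=2):
        self.p = p

    def dist(self, obj_a, obj_b):
        p = self.p
        distance_to_power_p = 0
        for a, b in zip(obj_a, obj_b):
            distance_to_power_p += abs(a - b) ** p
        # Outside the loop: always executed, even when the inputs are empty.
        # 1.0 / p forces float division on Python 2 as well as Python 3.
        return distance_to_power_p ** (1.0 / p)


euclidean = MinkowskiDistance(p=2).dist([0, 0], [3, 4])
manhattan = MinkowskiDistance(p=1).dist([1, 2], [4, 6])
no_numeric_parts = MinkowskiDistance().dist([], [])
```

With this version, two points that have no numeric components get a numeric distance of 0.0, so total_dist falls back to the string distance alone.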

answered 2012-10-26T04:39:54.277