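I'm building a while loop and have run into a problem: my function works fine outside the while loop but raises an error inside it.
Here is the while loop (still a work in progress):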
diff=1
while (diff!=0):
    cents_new
    cents_old=cents_new.copy()
    print(cents_old)
    print(cents_new)
    feat_list=df_features.values.tolist()
    print(feat_list)
    dist_cent1=[]
    dist_cent2=[]
    dist_cent3=[]
    print(dist_cent1)
    print(dist_cent2)
    print(dist_cent3)
    dist_to_cent(cents_old, feat_list)
    print(dist_cent1)
    print(dist_cent2)
    print(dist_cent3)
    print(len(dist_cent1))
    print(len(dist_cent2))
    print(len(dist_cent3))
    min_index_list=[]
    print(min_index_list)
    min_assign(dist_cent1,dist_cent2,dist_cent3)
    print(min_index_list)
    print(len(min_index_list))
    df_features1=df_features.copy()
    df_features1['Closest Centroid']=min_index_list
    print(df_features)
    print(df_features1)
    feat_list1=df_features1.values.tolist()
    print(feat_list1)
    avg_list_cent1=[]
    avg_list_cent2=[]
    avg_list_cent3=[]
    print(avg_list_cent1)
    print(avg_list_cent2)
    print(avg_list_cent3)
    append_by_cent(feat_list1)
    print(avg_list_cent1)
    print(avg_list_cent2)
    print(avg_list_cent3)
    print(len(avg_list_cent1))
    print(len(avg_list_cent2))
    print(len(avg_list_cent3))
    cents_new=[]
    avg_and_assign(avg_list_cent1)
    avg_and_assign(avg_list_cent2)
    avg_and_assign(avg_list_cent3)
    diff=0
When I run the avg_and_assign function, which has the following code:
def avg_and_assign(*lists):
    cents_new = []
    for lst in lists:
        alc = np.array(lst)
        alc_mean = np.mean(alc, axis=0)[:4]
        cents_new.append( np.ndarray.tolist(alc_mean) )
    return cents_new
cents_new = avg_and_assign(avg_list_cent1, avg_list_cent2, avg_list_cent3)
I get the following error:
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-104-2eb0f065bb1e> in <module>()
     16     min_index_list=[]
     17     #print(min_index_list)
---> 18     min_assign(dist_cent1,dist_cent2,dist_cent3)
     19     #print(min_index_list)
     20     df_features1=df_features[:]

<ipython-input-101-a817f672ae5a> in min_assign(l1, l2, l3)
     51         #Here, I make a list that will hold the same index from each of the three
     52         #centroid-to-feature distance lists.
---> 53         min_list=[l1[count], l2[count], l3[count]]
     54         #I then determine which index has the minimum value.
     55         min_index=min_list.index(min(min_list))

IndexError: list index out of range
The traceback points to another function, one that used to work fine but no longer does inside the while loop:
def min_assign(l1, l2, l3):
    #I use a count both to step through the indices in order and to drive a
    #while loop that in turn appends to my list.
    count=0
    while count<150:
        #Here, I make a list that will hold the same index from each of the three
        #centroid-to-feature distance lists.
        min_list=[l1[count], l2[count], l3[count]]
        #I then determine which index has the minimum value.
        min_index=min_list.index(min(min_list))
        #I then append the index value+1 (because I don't have a 0-labeled centroid)
        #to my list above the function
        min_index_list.append(min_index+1)
        #Increase the count because nobody likes an infinite loop. Then again, you're
        #a mathematician so you may know somebody who does, at least outside of Python.
        count+=1
#Here I run my function, being sure to input the centroid lists in numerical
#order so that each index output by the function numerically coincides to the
#centroid.
min_assign(dist_cent1,dist_cent2,dist_cent3)
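The loop bound of 150 in the function above is hard-coded, so it raises an IndexError as soon as any of the three lists holds fewer than 150 entries. As a rough sketch only (this is not the code I ran), a variant could take its length from the lists themselves and report a clearer message when they disagree. The name min_assign_checked and the returned list are assumptions made for illustration:

```python
def min_assign_checked(l1, l2, l3):
    """Return the 1-based index of the closest centroid for each observation.

    Hypothetical variant of min_assign: the loop length comes from the
    input lists rather than a hard-coded 150, and the result is returned
    instead of being appended to a module-level list.
    """
    if not (len(l1) == len(l2) == len(l3)):
        raise ValueError(
            "distance lists differ in length: "
            f"{len(l1)}, {len(l2)}, {len(l3)}"
        )
    labels = []
    for distances in zip(l1, l2, l3):
        # index() returns the first position of the minimum, so ties go to
        # the lower-numbered centroid, as in the original function.
        labels.append(distances.index(min(distances)) + 1)
    return labels


# Small made-up example: three observations, three centroids.
example = min_assign_checked([0.5, 2.0, 3.0], [1.5, 0.2, 3.0], [2.5, 2.2, 0.1])
print(example)  # [1, 2, 3]
```

With empty input lists this version returns an empty list instead of raising, which would make an empty distance list show up in the printed lengths rather than as a traceback.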
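I think the problem is that the while count<150 condition in min_assign is no longer valid. On the first pass through all of these functions, before the while loop, each list had 150 entries. But each of the 150 observations should have a distance to each of the three centroids, so I'm not sure how that could change. Honestly, I have tried reworking these functions several times and keep getting the same error. Here is all the code I ran, for reference: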
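#Imports assumed from the names used below (pd, np, array, average)
import pandas as pd
import numpy as np
from numpy import array, average
iris = pd.read_csv('Iris.csv')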
iris=iris.iloc[:,1:]
iris.columns='sepal length', 'sepal width', 'petal length', 'petal width', 'species'
df=iris
df
df_features = df[['sepal length', 'sepal width', 'petal length', 'petal width']]
df_features
cent1=[5.4, 3.9,1.3,0.4]
cent2=[5.8,2.6,4.0,1.2]
cent3=[7.7,2.8,6.7,2.0]
cents=[cent1,cent2,cent3]
cents
df_cents=pd.DataFrame(cents, columns=['sepal length', 'sepal width', 'petal length', 'petal width'])
df_cents
cent_frame=pd.DataFrame(cents)
feat_list=df_features.values.tolist()
#Here I made lists to append the measured distances for each feature point from
#each centroid
dist_cent1=[]
dist_cent2=[]
dist_cent3=[]
#This is my distance function
def dist_to_cent(l1,l2):
    #I used a count to track which list would have the distance value appended.
    #There are 150 features, and each has to be measured against three different
    #datapoints, therefore there would be 450 measurements performed.
    count=0
    for i in l1:
        for j in l2:
            dist=((i[0]-j[0])**2 + (i[1]-j[1])**2 + (i[2]-j[2])**2 + (i[3]-j[3])**2)**0.5
            if count <=149:
                dist_cent1.append(dist)
            elif count<=299:
                dist_cent2.append(dist)
            else:
                dist_cent3.append(dist)
            count+=1
dist_to_cent(cents, feat_list)
#min_list= [dist_cent1[0],dist_cent2[0],dist_cent3[0]]
#print(min_list)
#print(min(min_list))
#print(min_list.index(min(min_list)))
#Now I have to determine the centroid with the minimum distance to each of the
#features. I did this with the below function. I started by making a blank list
#to hold my minimum values.
min_index_list=[]
def min_assign(l1, l2, l3):
    #I use a count both to step through the indices in order and to drive a
    #while loop that in turn appends to my list.
    count=0
    while count<150:
        #Here, I make a list that will hold the same index from each of the three
        #centroid-to-feature distance lists.
        min_list=[l1[count], l2[count], l3[count]]
        #I then determine which index has the minimum value.
        min_index=min_list.index(min(min_list))
        #I then append the index value+1 (because I don't have a 0-labeled centroid)
        #to my list above the function
        min_index_list.append(min_index+1)
        #Increase the count because nobody likes an infinite loop. Then again, you're
        #a mathematician so you may know somebody who does, at least outside of Python.
        count+=1
#Here I run my function, being sure to input the centroid lists in numerical
#order so that each index output by the function numerically coincides to the
#centroid.
min_assign(dist_cent1,dist_cent2,dist_cent3)
min_index_list
#Finally, I append the list to my features dataframe with the below column
#label.
df_features1=df_features.copy()
df_features1['Closest Centroid']=min_index_list
df_features
print(dist_cent1)
print(min_index_list)
feat_list1=df_features1.values.tolist()
#print(feat_list)
avg_list_cent1=[]
avg_list_cent2=[]
avg_list_cent3=[]
def append_by_cent(a_list):
    for i in a_list:
        if i[4]==1.0:
            avg_list_cent1.append(i)
        elif i[4]==2.0:
            avg_list_cent2.append(i)
        else:
            avg_list_cent3.append(i)
append_by_cent(feat_list1)
#I will be basing my averages off using numpy as below:
#a = numpy.array([[240, 240, 239],
# [250, 249, 237],
# [242, 239, 237],
# [240, 234, 233]])
#print numpy.mean(a, axis=0)
#def avg_and_assign(*lists):
# cents_new = []
# for lst in lists:
# alc = np.array(lst)
# alc_mean = np.mean(alc, axis=0)[:4]
# cents_new.append( np.ndarray.tolist(alc_mean) )
# return cents_new
cents_new=[]
def avg_and_assign(l1):
    l1a=[i.pop(4) for i in l1]
    #print(l1)
    arrl1=array(l1)
    #print(arrl1)
    arrl1=average(arrl1, axis=0)
    new_cent1=arrl1.tolist()
    cents_new.append(new_cent1)
avg_and_assign(avg_list_cent1)
avg_and_assign(avg_list_cent2)
avg_and_assign(avg_list_cent3)
print(cents_new)
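Why would the avg_and_assign function suddenly trigger an error in the min_assign function, which ran fine before I pasted the former into the loop? How can these functions run in the same order outside the loop but break inside it?
I just don't understand what's going on here.
Any help would be great.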
By the way: I know my code is bad, and I'm new to Python. Please have mercy.
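One possible explanation, offered as an unconfirmed guess rather than a diagnosis: the version of avg_and_assign that takes *lists builds a local cents_new and returns it, so calling it without assigning the result leaves the module-level cents_new empty. If the loop then runs again, cents_old would be empty, no distances would be appended, and the first index lookup in min_assign would fail. A minimal sketch of that chain, using simplified stand-in functions (the names ending in _demo are hypothetical, and the data rows are made up):

```python
import numpy as np


def avg_and_assign_demo(*lists):
    """Stand-in for the *lists version: builds and returns a LOCAL list."""
    cents_new = []  # local name; the module-level cents_new is untouched
    for lst in lists:
        cents_new.append(np.mean(np.array(lst), axis=0)[:4].tolist())
    return cents_new


def distances_demo(centroids, features):
    """Stand-in for dist_to_cent: one list of distances per centroid."""
    return [
        [float(np.linalg.norm(np.array(c) - np.array(f))) for f in features]
        for c in centroids
    ]


features = [[5.1, 3.5, 1.4, 0.2], [7.0, 3.2, 4.7, 1.4]]  # made-up rows

cents_new = []
avg_and_assign_demo(features)   # return value discarded
print(cents_new)                # [] : the module-level list is still empty

# With no centroids there is nothing to measure against.
print(distances_demo(cents_new, features))  # []

# Indexing an empty distance list gives the error from the traceback.
dist_cent1 = []
try:
    dist_cent1[0]
except IndexError as err:
    print("IndexError:", err)   # IndexError: list index out of range

cents_new = avg_and_assign_demo(features)   # result captured this time
print(len(cents_new))           # 1
```

If this is what happens, printing len(cents_old) at the top of the loop should show 0 on the failing run.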