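I'm trying to write a numerical clustering tool. Basically, I have a list (called "product" here) that should be transformed from an ascending list into a list that indicates the links between the numbers in the data set. Reading the data set and removing the carriage returns and hyphens both work fine, but manipulating the list based on the data set is giving me problems.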
# opening file and returning raw data
file = input('Data file: ')
with open(file) as t:
    nums = t.readlines()  # the with block closes the file on exit
print(f'Raw data: {nums}')
# counting pairs in raw data
count = 0
for i in nums:
    count += 1
print(f'Count of number pairs: {count}')
# removing carriage returns and hyphens
one = []
for i in nums:
    one.append(i.rsplit())
new = []
for i in one:
    for a in i:
        new.append(a.split('-'))
print(f'Data sets: {new}')
# finding the range of the final list
my_list = []
for i in new:
    for e in i:
        my_list.append(int(e))
ran = max(my_list) + 1
print(f'Range of final list: {ran}')
# setting up the product list
rcount = count-1
product = list(range(ran))
print(f'Unchanged product: {product}')
for i in product:
    for e in range(rcount):
        if product[int(new[e][0])] < product[int(new[e][1])]:
            product[int(new[e][1])] = product[int(new[e][0])]
        else:
            product[int(new[e][0])] = product[int(new[e][1])]
print(f'Resulting product: {product}')
I expect the result to be [0, 1, 1, 1, 1, 5, 5, 7, 7, 9, 1, 5, 5], but things go wrong when I use a different data set.
The data set that gives the desired product above is: '1-2\n', '2-3\n', '3-4\n', '5-6\n', '7-8 \n', '2-10\n', '11-12\n', '5-12\n', '\n'
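To make the input format concrete, here is a minimal sketch of how those raw lines could be turned into integer pairs. The helper name `parse_pairs` is hypothetical (it is not part of the code above), and the sketch assumes every non-blank line has the form `a-b`.

```python
def parse_pairs(lines):
    """Turn lines such as '7-8 \n' into (7, 8) tuples, skipping blank lines."""
    pairs = []
    for line in lines:
        text = line.strip()  # drops the newline and any stray spaces
        if not text:
            continue  # ignore blank lines wherever they appear
        left, right = text.split('-')
        pairs.append((int(left), int(right)))
    return pairs


raw = ['1-2\n', '2-3\n', '3-4\n', '5-6\n', '7-8 \n', '2-10\n',
       '11-12\n', '5-12\n', '\n']
pairs = parse_pairs(raw)
print(pairs)
# [(1, 2), (2, 3), (3, 4), (5, 6), (7, 8), (2, 10), (11, 12), (5, 12)]
```

Because blank lines are skipped while parsing, the number of pairs is simply `len(pairs)` and does not depend on how the file ends.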
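However, the biggest problem I'm facing is using other data sets. If the data does not have the extra carriage return, it turns out I get a "list index out of range" error.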
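One thing worth noting about the code above: the loop bound `count - 1` comes from the raw line count, so it equals the number of parsed pairs only when the file contains exactly one blank line. The sketch below avoids that dependence by looping over the pairs themselves. It is an illustration under assumptions, not a confirmed diagnosis: the function name `link_labels` is hypothetical, and it expects the pairs to be already parsed into integer tuples.

```python
def link_labels(pairs):
    """Label every number 0..max with the smallest number it is linked to."""
    if not pairs:
        return []
    size = max(max(a, b) for a, b in pairs) + 1
    labels = list(range(size))
    changed = True
    while changed:  # repeat until a full pass changes nothing
        changed = False
        for a, b in pairs:  # iterate over the pairs, not range(count - 1)
            low = min(labels[a], labels[b])
            if labels[a] != low or labels[b] != low:
                labels[a] = labels[b] = low
                changed = True
    return labels


pairs = [(1, 2), (2, 3), (3, 4), (5, 6), (7, 8), (2, 10), (11, 12), (5, 12)]
print(link_labels(pairs))
# [0, 1, 1, 1, 1, 5, 5, 7, 7, 9, 1, 5, 5]
```

Repeating the pass until nothing changes lets a label that arrives late (for example 12 picking up 5 after 11-12 was already processed) propagate back to its neighbours, which a single pass would miss.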