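Many statements in my code follow the same pattern shown below. I am looking for a method or built-in function that would condense these examples further than the existing list comprehensions. For example: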

sample_1_combined = [i for i in zip(sample_1_genes, mean_values)]
sample_2_combined = [i for i in zip(sample_2_genes, mean_values)]
sample_3_combined = [i for i in zip(sample_3_genes, mean_values)]
sample_4_combined = [i for i in zip(sample_4_genes, mean_values)]
sample_5_combined = [i for i in zip(sample_5_genes, mean_values)]
sample_6_combined = [i for i in zip(sample_6_genes, mean_values)]

sample_1_final = sorted(sample_1_combined, key=lambda expvalues: expvalues[0])
sample_2_final = sorted(sample_2_combined, key=lambda expvalues: expvalues[0])
sample_3_final = sorted(sample_3_combined, key=lambda expvalues: expvalues[0])
sample_4_final = sorted(sample_4_combined, key=lambda expvalues: expvalues[0])
sample_5_final = sorted(sample_5_combined, key=lambda expvalues: expvalues[0])
sample_6_final = sorted(sample_6_combined, key=lambda expvalues: expvalues[0])

Elsewhere in the application there are more blocks that use each individual list, for example:

sample_1_graph = [j for i, j in sample_1_final]
sample_2_graph = [j for i, j in sample_2_final]
sample_3_graph = [j for i, j in sample_3_final]
sample_4_graph = [j for i, j in sample_4_final]
sample_5_graph = [j for i, j in sample_5_final]
sample_6_graph = [j for i, j in sample_6_final]

The last block in this format:

plt.hist(sample_1_graph, bins=21, histtype='stepfilled', normed=True, color='b', label='278')
plt.hist(sample_2_graph, bins=21, histtype='stepfilled', normed=True, color='g', alpha=0.5, label='470')
plt.hist(sample_3_graph, bins=21, histtype='stepfilled', normed=True, color='r', alpha=0.5, label='543')
plt.hist(sample_4_graph, bins=21, histtype='stepfilled', normed=True, color='c', alpha=0.5, label='5934')
plt.hist(sample_5_graph, bins=21, histtype='stepfilled', normed=True, color='m', alpha=0.5, label='6102')
plt.hist(sample_6_graph, bins=21, histtype='stepfilled', normed=True, color='y', alpha=0.5, label='17163')

The code above, after revision, is now:

# Compute row means.
mean_values = []
for i, (a, b, c, d, e, f) in enumerate(zip(sample_1_values, sample_2_values, sample_3_values, sample_4_values, sample_5_values, sample_6_values)):
    mean_values.append((a + b + c + d + e + f)/6)

# Provide proper gene names for mean values and replace original data values by corresponding means.
sample_genes_list = [sample_1_genes, sample_2_genes, sample_3_genes, sample_4_genes, sample_5_genes, sample_6_genes]

sample_final_list = [sorted(zip(sg, mean_values)) for sg in sample_genes_list]

# Plot an overlaid histogram of normalized data.
sample_graph_list = [[j for i, j in sample_final] for sample_final in sample_final_list]

colors = 'bgrcmy'
alphas = ['0.5', '0.5', '0.5', '0.5', '0.5', '0.5']
labels = ['278', '470', '543', '5934', '6102', '17163']

for graph, color, alpha, label in zip(sample_graph_list, colors, alphas, labels):
    plt.hist(graph, bins=21, histtype='stepfilled',
             normed=True, color=color, alpha=float(alpha), label=label)

1 Answer

If possible, make a nested list sample_genes_list = [sample_1_genes, ...]

Then

sample_final_list = [sorted(zip(sg, mean_values)) for sg in sample_genes_list]

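This should be equivalent to your current code, because:

  1. The list comprehension around zip does nothing in Python 2 and is equivalent to list() in Python 3. sorted accepts any iterable, so it can be dropped.

  2. Tuples naturally sort by their 0th element first (ties are then broken by the next element).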
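
To illustrate the two points above, here is a minimal sketch using made-up gene names and values (they are assumptions for illustration, not data from the question). It compares the original two-step form with the condensed one, and shows the one case where they can differ: duplicate gene names, where the keyed sort keeps input order while the plain tuple sort falls back to comparing the second element.

```python
# Hypothetical sample data; the real lists come from the question's input.
genes = ["TP53", "BRCA1", "MYC"]
means = [2.5, 1.0, 4.0]

# Original two-step form from the question.
combined = [pair for pair in zip(genes, means)]
two_step = sorted(combined, key=lambda pair: pair[0])

# Condensed form: sorted() consumes the zip iterable directly,
# and tuples compare by their first element first.
one_step = sorted(zip(genes, means))

print(one_step)              # [('BRCA1', 1.0), ('MYC', 4.0), ('TP53', 2.5)]
print(one_step == two_step)  # True

# Caveat: with duplicate names the two forms may order ties differently.
dup_genes = ["MYC", "MYC"]
dup_means = [4.0, 1.0]
keyed = sorted(zip(dup_genes, dup_means), key=lambda pair: pair[0])
plain = sorted(zip(dup_genes, dup_means))
print(keyed)  # [('MYC', 4.0), ('MYC', 1.0)]  (stable: input order kept)
print(plain)  # [('MYC', 1.0), ('MYC', 4.0)]  (tie broken by the value)
```

If gene names are unique within each sample, the two forms give identical results.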

Update in response to the question edit:

sample_graph_list = [[j for i, j in sample_final]
        for sample_final in sample_final_list]

Edit 2: Finally:

colors = 'bgrcmy'
labels = ['278', '470', '543', '5934', '6102', '17163']
for graph, color, label in zip(sample_graph_list, colors, labels):
    plt.hist(graph, bins=21, histtype='stepfilled',
             normed=True, color=color, label=label)
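
In the same spirit, the row-mean loop in the question could probably be condensed too. The sketch below assumes the six sample_N_values lists are gathered into one nested list; the name sample_values_list and the numbers are hypothetical, and only three short samples are shown.

```python
# Hypothetical stand-ins for sample_1_values ... sample_6_values.
sample_values_list = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [5.0, 6.0, 7.0],
]

# zip(*...) yields one tuple per row position across all samples,
# so the mean no longer hard-codes the number of samples.
mean_values = [sum(row) / len(row) for row in zip(*sample_values_list)]

print(mean_values)  # [3.0, 4.0, 5.0]
```

Note that on Python 2 the values need to be floats (as here) for / to perform true division.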
answered 2013-05-24T12:15:46.913