
I have this code:

 data = np.genfromtxt('csv_data.csv', dtype=None, names=True)

 print data

It produces the following output:

 [('westin,390,291,70,43,19,215,27,813',)
  ('ramada,136,67,53,30,24,149,49,310',)
  ('sutton,489,293,106,39,20,299,24,947',)
  ('loden,681,134,17,5,0,199,4,837',) ('hampton,241,166,26,5,1,159,21,439',)
  ('shangrila,332,45,20,8,2,325,8,407',) ('mariott,22,15,5,0,0,179,35,42',)
  ('pan_pacific,475,262,86,29,16,249,15,868',)
  ('sheraton,277,346,150,80,26,249,45,879',)
  ('westin_bayshore,390,291,70,43,19,199,27,813',)]

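The output does not include the column headers from the file:

  Hotel,excellent,verygood,average,poor,terrible,cheapest,rank,reviews

What I want to do is save the output into a dictionary data structure in Python. Is there a way to convert this output into a dictionary?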

I could write a function to parse it, but I would like to know whether Python has a built-in function for this.

Thanks


4 Answers


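You did not pass a value for the delimiter parameter, so np.genfromtxt falls back to the default, None, and tries to split the fields on whitespace.

You need to use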

np.genfromtxt(your_file, dtype=None, delimiter=',', names=True)
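
Once the delimiter is set, the result is a structured array with one named field per column, and that can be turned into a dictionary directly. The following is a minimal sketch, assuming NumPy 1.14 or later (for the `encoding` argument) and using a two-row in-memory sample in place of csv_data.csv. The names `sample`, `by_column`, `fields`, and `by_hotel` are illustrative and not part of the question.

```python
import io
import numpy as np

# Two-row stand-in for csv_data.csv: the header line plus the first two rows from the question.
sample = io.StringIO(
    "Hotel,excellent,verygood,average,poor,terrible,cheapest,rank,reviews\n"
    "westin,390,291,70,43,19,215,27,813\n"
    "ramada,136,67,53,30,24,149,49,310\n"
)

data = np.genfromtxt(sample, dtype=None, delimiter=',', names=True, encoding='utf-8')

# Column-oriented: field name -> list of the values in that column.
by_column = {name: data[name].tolist() for name in data.dtype.names}

# Row-oriented: hotel name -> {field name: value} for the remaining fields.
fields = data.dtype.names[1:]
by_hotel = {
    str(row['Hotel']): {name: row[name].item() for name in fields}
    for row in data
}

print(by_column['excellent'])
print(by_hotel['ramada']['reviews'])
```

Which of the two layouts is more convenient depends on whether you mostly work with whole columns or look up individual hotels.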
Answered 2012-10-23T14:39:05.463

Simple version:

d = {  item[0].split(',')[0] : item[0].split(',')[1:] for item in data  }

Returns:

{'sutton': ['489', '293', '106', '39', '20', '299', '24', '947'], 'hampton': ['241', '166', '26', '5', '1', '159', '21', '439'], 'westin_bayshore': ['390', '291', '70', '43', '19', '199', '27', '813'], 'sheraton': ['277', '346', '150', '80', '26', '249', '45', '879'], 'ramada': ['136', '67', '53', '30', '24', '149', '49', '310'], 'mariott': ['22', '15', '5', '0', '0', '179', '35', '42'], 'loden': ['681', '134', '17', '5', '0', '199', '4', "837'"], 'shangrila': ['332', '45', '20', '8', '2', '325', '8', '407'], 'pan_pacific': ['475', '262', '86', '29', '16', '249', '15', '868']}

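More complex (dict of dicts):

# headers is assumed to hold the column names that follow 'Hotel' in the file's header line
headers = ['excellent', 'verygood', 'average', 'poor', 'terrible', 'cheapest', 'rank', 'reviews']
d = {  item[0].split(',')[0] : { headers[i] : int( item[0].split(',')[i+1].strip("'") ) for i in range(len( item[0].split(',')[1:] ) )   }  for item in data  }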

Returns:

{'sutton': {'poor': 39, 'cheapest': 299, 'average': 106, 'terrible': 20, 'rank': 24, 'reviews': 947, 'excellent': 489, 'verygood': 293}, 'hampton': {'poor': 5, 'cheapest': 159, 'average': 26, 'terrible': 1, 'rank': 21, 'reviews': 439, 'excellent': 241, 'verygood': 166}, 'westin_bayshore': {'poor': 43, 'cheapest': 199, 'average': 70, 'terrible': 19, 'rank': 27, 'reviews': 813, 'excellent': 390, 'verygood': 291}, 'sheraton': {'poor': 80, 'cheapest': 249, 'average': 150, 'terrible': 26, 'rank': 45, 'reviews': 879, 'excellent': 277, 'verygood': 346}, 'ramada': {'poor': 30, 'cheapest': 149, 'average': 53, 'terrible': 24, 'rank': 49, 'reviews': 310, 'excellent': 136, 'verygood': 67}, 'mariott': {'poor': 0, 'cheapest': 179, 'average': 5, 'terrible': 0, 'rank': 35, 'reviews': 42, 'excellent': 22, 'verygood': 15}, 'loden': {'poor': 5, 'cheapest': 199, 'average': 17, 'terrible': 0, 'rank': 4, 'reviews': 837, 'excellent': 681, 'verygood': 134}, 'shangrila': {'poor': 8, 'cheapest': 325, 'average': 20, 'terrible': 2, 'rank': 8, 'reviews': 407, 'excellent': 332, 'verygood': 45}, 'pan_pacific': {'poor': 29, 'cheapest': 249, 'average': 86, 'terrible': 16, 'rank': 15, 'reviews': 868, 'excellent': 475, 'verygood': 262}}
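
Both comprehensions split the same string several times per record. As a rough sketch of the same idea that splits each record only once, assuming `data` has the shape shown in the question (one comma-joined string per record) and that the header names are supplied by hand; the function name `records_to_dict` and the two sample records are illustrative only:

```python
def records_to_dict(records, headers):
    """Map hotel name -> {header: int value}, splitting each record only once."""
    result = {}
    for record in records:
        name, *values = record[0].split(',')
        # strip("'") mirrors the answer above, which guards against a stray quote.
        result[name] = {h: int(v.strip("'")) for h, v in zip(headers, values)}
    return result


headers = ['excellent', 'verygood', 'average', 'poor',
           'terrible', 'cheapest', 'rank', 'reviews']

# Two records in the same shape as the question's output.
records = [('westin,390,291,70,43,19,215,27,813',),
           ("loden,681,134,17,5,0,199,4,837'",)]

d = records_to_dict(records, headers)
print(d['loden']['reviews'])
```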
Answered 2012-10-23T14:46:41.687
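import csv

data_dict = {}   # column header -> list of that column's values
headers = []     # column headers, in file order
first_row = True

with open('csv_data.csv') as f:
    holder = csv.reader(f, delimiter=',')
    for row in holder:
        if first_row:
            # The first row of the file holds the column headers.
            first_row = False
            for header in row:
                colname = str(header)
                headers.append(colname)
                data_dict[colname] = []
        else:
            for colname, datapoint in zip(headers, row):
                # Hotel names are not numeric, so only convert digit strings.
                value = int(datapoint) if datapoint.isdigit() else datapoint
                data_dict[colname].append(value)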

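This gives you a dictionary whose keys are the column headers (the first row of the CSV file) and whose values are lists holding the remaining data of each column. In addition, headers is a list of all the column headers.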
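
Because every column list keeps the rows in file order, a single index addresses a whole row. Here is a small sketch of how such a column-oriented dictionary might be used, with a two-row in-memory sample standing in for the file; the names `sample`, `columns`, `ramada`, and `total_reviews` are illustrative only:

```python
import csv
import io

# In-memory stand-in for the CSV file: the header line plus two rows from the question.
sample = io.StringIO(
    "Hotel,excellent,verygood,average,poor,terrible,cheapest,rank,reviews\n"
    "westin,390,291,70,43,19,215,27,813\n"
    "ramada,136,67,53,30,24,149,49,310\n"
)

reader = csv.reader(sample)
headers = next(reader)               # first row: column headers
columns = {h: [] for h in headers}   # header -> list of column values
for row in reader:
    for h, value in zip(headers, row):
        columns[h].append(int(value) if value.isdigit() else value)

# Whole-column operations are direct.
total_reviews = sum(columns['reviews'])

# To recover one hotel's row, find its index and read that position in every column.
i = columns['Hotel'].index('ramada')
ramada = {h: columns[h][i] for h in headers}

print(total_reviews)
print(ramada['rank'])
```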

Answered 2012-10-23T14:48:21.657

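Use the csv module to process the file yourself.

The following reads the file and builds a dictionary called by_hotel, whose keys are hotel names and whose values are dictionaries mapping each field name to its value in the original row (note that this also includes the hotel name, but anyway...)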

import csv

with open('csv_data.csv') as fin:
    csvin = csv.DictReader(fin)
    headers = csvin.fieldnames
    by_hotel = {row['Hotel']: row for row in csvin}

print(by_hotel['sutton']['excellent'])
# 489

If you want the values back as a list in the original column order, you can do the following:

print([by_hotel['sutton'][fname] for fname in headers])

Note: you may want to convert the values to integers for calculations.
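
One possible way to do that conversion while building the dictionary is sketched below. It assumes every column other than Hotel holds an integer, and it uses a two-row in-memory sample in place of the file; the name `sample` is illustrative only. Unlike the version above, it removes the hotel name from the inner dictionary so that only numeric fields remain.

```python
import csv
import io

# In-memory stand-in for csv_data.csv: the header line plus two rows from the question.
sample = io.StringIO(
    "Hotel,excellent,verygood,average,poor,terrible,cheapest,rank,reviews\n"
    "westin,390,291,70,43,19,215,27,813\n"
    "ramada,136,67,53,30,24,149,49,310\n"
)

by_hotel = {}
for row in csv.DictReader(sample):
    name = row.pop('Hotel')  # keep the hotel name as the key only
    # Assumes all remaining fields are integers; int() raises ValueError otherwise.
    by_hotel[name] = {field: int(value) for field, value in row.items()}

print(by_hotel['westin']['excellent'] + by_hotel['ramada']['excellent'])
```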

Answered 2012-10-23T17:00:56.590