I am hitting a very serious error in a join operation. I also tried merge (with left_index and right_index), with the same result.

The indexes are identical (by design), and both pass the checks index.is_unique (True) and index.get_duplicates() (empty).
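For readers on a current pandas release, the same uniqueness check can be sketched as follows. As far as I know, `Index.get_duplicates()` is no longer available in recent versions, so this sketch uses `Index.duplicated()` instead. The helper name `duplicate_labels` and the sample timestamps are assumptions made for illustration, not part of the original code.

```python
import pandas as pd


def duplicate_labels(index):
    """Return the labels that occur more than once in a pandas Index.

    Hypothetical helper: a stand-in for the old index.get_duplicates().
    """
    # duplicated() marks every occurrence after the first as True
    return index[index.duplicated()].unique().tolist()


# Illustrative index only; the real one has 89 timestamps
idx = pd.DatetimeIndex(["2013-01-14 17:04:45", "2013-01-14 17:04:35"])

print(idx.is_unique)          # expected to be True for a clean index
print(duplicate_labels(idx))  # expected to be an empty list
```

The helper accepts any pandas Index, so it can be pointed at `df.columns` as well as `df.index`.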

Basic version:

df1.join(series)
merge(df1, series_as_df, 

print tempres.index

<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-14 17:04:45, ..., 2013-01-14 16:53:05]
Length: 89, Freq: None, Timezone: None


What is strange is the printed values:

print tempres.index.values

[1970-01-16 121:04:45 1970-01-16 121:04:35 1970-01-16 121:04:25 1970-01-16 121:04:15 1970-01-16 121:04:05 1970-01-16 121:03:55 1970-01-16 121:03:45 1970-01-16 121:03:35 1970-01-16 121:03:25 1970-01-16 121:03:15 1970-01-16 121:03:05 1970-01-16 121:02:55 1970-01-16 121:02:45 1970-01-16 121:02:35 1970-01-16 121:02:25 1970-01-16 121:02:15 1970-01-16 121:02:05 1970-01-16 121:01:55 1970-01-16 121:01:45 1970-01-16 121:01:35 1970-01-16 121:01:25 ...]

If necessary, I can add the pickled Series and df...

I am using the latest pandas version, 0.10.x.

Thanks,

Luke

My code (excerpted from a larger program):

# XYTparams is an existing DataFrame
# prep_functions[funcname] is a list of values, same length as XYTparams

iSeries = Series(prep_functions[funcname], index = XYTparams.index, name = funcname)
XYTparams = XYTparams.join(iSeries)

Revisiting my problem:

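I am using merge and successive joins on a base DataFrame. When I attempt the next merge/join, I start getting the error. I could not reproduce this in a simple test, but I saved the DataFrames from just before the problem starts.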

I cannot find what is wrong.

base_df = load('SPOparams.pic')
lookup_df = load('lookup.pic')

print base_df
print lookup_df

print base_df.count()

print base_df['VKCSKEY1']
print lookup_df['traf_key']

# reset index does not change a thing
base_df = base_df.reset_index(drop=True)

print base_df.index
print base_df.index.get_duplicates()
print lookup_df.index
print lookup_df.index.get_duplicates()


# checking value matches
for k in lookup_df['traf_key']:
    print k, k in  base_df['VKCSKEY1'].values

# why is this merge unsuccessful ???
# in any combination of the parameters
df_result =merge(base_df, lookup_df, 
             how='left', 
             #how = 'outer',
             left_on ='VKCSKEY1', 
             right_on ='traf_key',
             #left_index=True, 
             #right_index = True,
             #sort=True, 
             #suffixes=('', '.m'), copy=True
             )
print df_result

Output:

1.6.1
0.10.1
<class 'pandas.core.frame.DataFrame'>
Int64Index: 89 entries, 0 to 88
Data columns:
T                        89  non-null values
X                        89  non-null values
Y                        89  non-null values
precip_quantity_1hour    89  non-null values
pressure                 89  non-null values
rel_humidity             89  non-null values
temp                     89  non-null values
temp_max                 0  non-null values
temp_min                 0  non-null values
wind_direction           89  non-null values
wind_speed               89  non-null values
BC_TRAF                  89  non-null values
closest                  89  non-null values
closest.m                89  non-null values
AGGP.P50_ID              89  non-null values
AGGP.FUNC_CLASS          89  non-null values
AGGP.SPEED_CAT           89  non-null values
LINK_ID                  89  non-null values
FUNC_CLASS               89  non-null values
SPEED_CAT                89  non-null values
AR_AUTO                  89  non-null values
AR_BUS                   89  non-null values
AR_TAXIS                 89  non-null values
AR_CARPOOL               89  non-null values
AR_PEDEST                89  non-null values
AR_TRUCKS                89  non-null values
STCA20_PCT               89  non-null values
VKC_LINKNR               89  non-null values
TRVIC150R1               89  non-null values
closest.m                89  non-null values
closest.m.m              89  non-null values
VKCP.LINK_ID             89  non-null values
VKCP.FUNC_CLASS          89  non-null values
VKCP.SPEED               89  non-null values
VKCP.LINKNR              89  non-null values
VKCP.TWIN_ID             89  non-null values
VKCSKEY1                 89  non-null values
dtypes: datetime64[ns](1), float64(13), int64(9), object(14)
<class 'pandas.core.frame.DataFrame'>
Index: 30 entries, (60744, 0) to (58314, 0)
Data columns:
traf_key      30  non-null values
weekday_nr    30  non-null values
linknr        30  non-null values
 weekday      30  non-null values
vr0           30  non-null values
vr1           30  non-null values
vr2           30  non-null values
vr3           30  non-null values
vr4           30  non-null values
vr5           30  non-null values
vr6           30  non-null values
vr7           30  non-null values
vr8           30  non-null values
vr9           30  non-null values
vr10          30  non-null values
vr11          30  non-null values
vr12          30  non-null values
vr13          30  non-null values
vr14          30  non-null values
vr15          30  non-null values
vr16          30  non-null values
vr17          30  non-null values
vr18          30  non-null values
vr19          30  non-null values
vr20          30  non-null values
vr21          30  non-null values
vr22          30  non-null values
vr23          30  non-null values
au0           30  non-null values
au1           30  non-null values
au2           30  non-null values
au3           30  non-null values
au4           30  non-null values
au5           30  non-null values
au6           30  non-null values
au7           30  non-null values
au8           30  non-null values
au9           30  non-null values
au10          30  non-null values
au11          30  non-null values
au12          30  non-null values
au13          30  non-null values
au14          30  non-null values
au15          30  non-null values
au16          30  non-null values
au17          30  non-null values
au18          30  non-null values
au19          30  non-null values
au20          30  non-null values
au21          30  non-null values
au22          30  non-null values
au23          30  non-null values
sn0           30  non-null values
sn1           30  non-null values
sn2           30  non-null values
sn3           30  non-null values
sn4           30  non-null values
sn5           30  non-null values
sn6           30  non-null values
sn7           30  non-null values
sn8           30  non-null values
sn9           30  non-null values
sn10          30  non-null values
sn11          30  non-null values
sn12          30  non-null values
sn13          30  non-null values
sn14          30  non-null values
sn15          30  non-null values
sn16          30  non-null values
sn17          30  non-null values
sn18          30  non-null values
sn19          30  non-null values
sn20          30  non-null values
sn21          30  non-null values
sn22          30  non-null values
sn23          30  non-null values
dtypes: float64(24), int64(50), object(2)
T                        89
X                        89
Y                        89
precip_quantity_1hour    89
pressure                 89
rel_humidity             89
temp                     89
temp_max                  0
temp_min                  0
wind_direction           89
wind_speed               89
BC_TRAF                  89
closest                  89
closest.m                89
AGGP.P50_ID              89
AGGP.FUNC_CLASS          89
AGGP.SPEED_CAT           89
LINK_ID                  89
FUNC_CLASS               89
SPEED_CAT                89
AR_AUTO                  89
AR_BUS                   89
AR_TAXIS                 89
AR_CARPOOL               89
AR_PEDEST                89
AR_TRUCKS                89
STCA20_PCT               89
VKC_LINKNR               89
TRVIC150R1               89
closest.m                89
closest.m.m              89
VKCP.LINK_ID             89
VKCP.FUNC_CLASS          89
VKCP.SPEED               89
VKCP.LINKNR              89
VKCP.TWIN_ID             89
VKCSKEY1                 89
0     (60744, 0)
1     (60744, 0)
2     (60744, 0)
3     (60750, 0)
4     (60768, 0)
5     (60768, 0)
6     (60758, 0)
7     (60758, 0)
8     (69223, 0)
9     (69223, 0)
10    (69223, 0)
11    (64265, 0)
12    (64265, 0)
13    (64265, 0)
14    (64265, 0)
15    (64265, 0)
16    (64265, 0)
17    (64265, 0)
18    (64265, 0)
19    (64265, 0)
20    (64216, 0)
21    (64216, 0)
22    (64216, 0)
23    (64216, 0)
24    (64216, 0)
25    (64216, 0)
26    (64216, 0)
27    (64216, 0)
28    (64216, 0)
29    (57085, 0)
30    (57085, 0)
31    (57085, 0)
32    (57085, 0)
33    (57085, 0)
34    (57085, 0)
35    (57014, 0)
36    (57033, 0)
37    (57033, 0)
38    (64065, 0)
39    (64065, 0)
40    (64065, 0)
41    (64065, 0)
42    (64065, 0)
43    (57070, 0)
44    (64062, 0)
45    (64062, 0)
46    (64062, 0)
47    (64062, 0)
48    (57070, 0)
49    (64061, 0)
50    (64061, 0)
51    (64061, 0)
52    (64061, 0)
53    (59849, 0)
54    (59415, 0)
55    (58487, 0)
56    (58054, 0)
57    (58054, 0)
58    (58054, 0)
59    (52551, 0)
60    (58054, 0)
61    (58054, 0)
62    (58054, 0)
63    (58054, 0)
64    (52551, 0)
65    (58054, 0)
66    (58488, 0)
67    (58488, 0)
68    (58028, 0)
69    (58464, 0)
70    (58028, 0)
71    (57989, 0)
72    (58595, 0)
73    (58027, 0)
74    (57989, 0)
75    (58595, 0)
76    (58595, 0)
77    (58019, 0)
78    (58595, 0)
79    (58595, 0)
80    (58019, 0)
81    (58595, 0)
82    (58595, 0)
83    (66715, 0)
84    (58595, 0)
85    (59295, 0)
86    (67614, 0)
87    (58314, 0)
88    (58314, 0)
Name: VKCSKEY1, Length: 89
VKCSKEY1
(60744, 0)    (60744, 0)
(60750, 0)    (60750, 0)
(60768, 0)    (60768, 0)
(60758, 0)    (60758, 0)
(69223, 0)    (69223, 0)
(64265, 0)    (64265, 0)
(64216, 0)    (64216, 0)
(57085, 0)    (57085, 0)
(57014, 0)    (57014, 0)
(57033, 0)    (57033, 0)
(64065, 0)    (64065, 0)
(57070, 0)    (57070, 0)
(64062, 0)    (64062, 0)
(64061, 0)    (64061, 0)
(59849, 0)    (59849, 0)
(59415, 0)    (59415, 0)
(58487, 0)    (58487, 0)
(58054, 0)    (58054, 0)
(52551, 0)    (52551, 0)
(58488, 0)    (58488, 0)
(58028, 0)    (58028, 0)
(58464, 0)    (58464, 0)
(57989, 0)    (57989, 0)
(58595, 0)    (58595, 0)
(58027, 0)    (58027, 0)
(58019, 0)    (58019, 0)
(66715, 0)    (66715, 0)
(59295, 0)    (59295, 0)
(67614, 0)    (67614, 0)
(58314, 0)    (58314, 0)
Name: traf_key
Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88], dtype=int64)
[]
Index([(60744, 0), (60750, 0), (60768, 0), (60758, 0), (69223, 0), (64265, 0), (64216, 0), (57085, 0), (57014, 0), (57033, 0), (64065, 0), (57070, 0), (64062, 0), (64061, 0), (59849, 0), (59415, 0), (58487, 0), (58054, 0), (52551, 0), (58488, 0), (58028, 0), (58464, 0), (57989, 0), (58595, 0), (58027, 0), (58019, 0), (66715, 0), (59295, 0), (67614, 0), (58314, 0)], dtype=object)
[]
(60744, 0) True
(60750, 0) True
(60768, 0) True
(60758, 0) True
(69223, 0) True
(64265, 0) True
(64216, 0) True
(57085, 0) True
(57014, 0) True
(57033, 0) True
(64065, 0) True
(57070, 0) True
(64062, 0) True
(64061, 0) True
(59849, 0) True
(59415, 0) True
(58487, 0) True
(58054, 0) True
(52551, 0) True
(58488, 0) True
(58028, 0) True
(58464, 0) True
(57989, 0) True
(58595, 0) True
(58027, 0) True
(58019, 0) True
(66715, 0) True
(59295, 0) True
(67614, 0) True
(58314, 0) True
Traceback (most recent call last):
  File "L:\temp\pandas_join_bug.py", line 43, in <module>
    right_on ='traf_key',
  File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 36, in merge
    return op.get_result()
  File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 185, in get_result
    ldata, rdata = self._get_merge_data()
  File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 277, in _get_merge_data
    copydata=False)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 1194, in _maybe_rename_join
    to_rename = self.items.intersection(other.items)
  File "C:\Python27\lib\site-packages\pandas\core\index.py", line 666, in intersection
    indexer = self.get_indexer(other.values)
  File "C:\Python27\lib\site-packages\pandas\core\index.py", line 812, in get_indexer
    raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects

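Once the error occurs, I cannot get any merge or join statement to succeed. At first I did not see that the error was related to repeated merge/join operations. Any single merge/join on the latest set now works. As soon as I need another merge/join, I get the same error.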

1 Answer

Duplicate column names cause this error. Try eliminating the duplicate column names.
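In the output above, base_df lists the column closest.m twice, which matches this diagnosis. A minimal sketch of how one might detect and drop repeated column labels before merging, written against a current pandas release rather than 0.10.1: the small frames, the integer keys, and the choice to keep the first occurrence of each label are assumptions made for illustration. Renaming the repeated columns is an equally valid fix if both copies carry data you need.

```python
import pandas as pd

# Hypothetical stand-ins for base_df and lookup_df; integer keys are used
# here instead of the tuple keys shown in the question.
base = pd.DataFrame(
    [[60744, 1.0, 2.0], [60750, 3.0, 4.0]],
    columns=["VKCSKEY1", "closest.m", "closest.m"],  # repeated label
)
lookup = pd.DataFrame({"traf_key": [60744, 60750], "vr0": [10, 20]})

# True for the second and later occurrences of a column label
dup_mask = base.columns.duplicated()
dup_labels = base.columns[dup_mask].tolist()
print("duplicate columns:", dup_labels)

# Keep only the first occurrence of each label (an assumption; rename
# the columns instead if both copies are needed)
deduped = base.loc[:, ~dup_mask]

merged = pd.merge(
    deduped, lookup, how="left", left_on="VKCSKEY1", right_on="traf_key"
)
print(merged)
```

Running the duplicate check after every merge/join in a chain should point to the step that first introduces a repeated label.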

answered 2013-08-29T20:11:48.220