我在连接操作中遇到了一个非常严重的错误。我也尝试了合并(left_index,right_index),结果相同。
索引是相同的(按设计),由两个索引上的 index.is_unique (TRUE) 和 index.get_duplicates() (EMPTY) 检查。
基础版:
df1.join(series)
merge(df1, series_as_df,
print tempres.index
<class 'pandas.tseries.index.DatetimeIndex'> [2013-01-14 17:04:45, ..., 2013-01-14 16:53:05] 长度:89,频率:无,时区:无
奇怪的是打印值: print tempres.index.values [1970-01-16 121:04:45 1970-01-16 121:04:35 1970-01-16 121:04:25 1970-01-16 121:04:15 1970-01-16 121:04:05 1970-01-16 121:03:55 1970-01-16 121:03:45 1970-01-16 121:03:35 1970-01-16 121:03:25 1970-01-16 121:03:15 1970-01-16 121:03:05 1970-01-16 121:02:55 1970-01-16 121:02:45 1970-01-16 121:02:35 1970-01-16 121:02:25 1970-01-16 121:02:15 1970-01-16 121:02:05 1970-01-16 121:01:55 1970-01-16 121:01:45 1970-01-16 121:01:35 1970-01-16 121:01:25 ...]
如有必要,我可以添加腌制系列和 df...
使用最新的熊猫版本 0.10.x
谢谢,
卢克
我的代码(从较大的代码中截取)
XYTparams (existing dataframe)
prep_functions[funcname] = [list of values, same length as XYTparams]
iSeries = Series(prep_functions[funcname], index = XYTparams.index, name = funcname)
XYTparams = XYTparams.join(iSeries)
审查我的问题:
我在基本 DataFrame 上使用合并和连续连接。在尝试下一次合并/加入时,我开始遇到错误。我无法在简单的测试中重现这一点,但我在问题开始之前保存了数据帧。
我找不到什么问题。
base_df = load('SPOparams.pic')
lookup_df = load('lookup.pic')
print base_df
print lookup_df
print base_df.count()
print base_df['VKCSKEY1']
print lookup_df['traf_key']
# reset index does not change a thing
base_df = base_df.reset_index(drop=True)
print base_df.index
print base_df.index.get_duplicates()
print lookup_df.index
print lookup_df.index.get_duplicates()
# checking value matches
for k in lookup_df['traf_key']:
print k, k in base_df['VKCSKEY1'].values
# why does this merge is unsuccesfull ???
# in any combination of the parameters
df_result =merge(base_df, lookup_df,
how='left',
#how = 'outer',
left_on ='VKCSKEY1',
right_on ='traf_key',
#left_index=True,
#right_index = True,
#sort=True,
#suffixes=('', '.m'), copy=True
)
print df_result
输出:
1.6.1
0.10.1
<class 'pandas.core.frame.DataFrame'>
Int64Index: 89 entries, 0 to 88
Data columns:
T 89 non-null values
X 89 non-null values
Y 89 non-null values
precip_quantity_1hour 89 non-null values
pressure 89 non-null values
rel_humidity 89 non-null values
temp 89 non-null values
temp_max 0 non-null values
temp_min 0 non-null values
wind_direction 89 non-null values
wind_speed 89 non-null values
BC_TRAF 89 non-null values
closest 89 non-null values
closest.m 89 non-null values
AGGP.P50_ID 89 non-null values
AGGP.FUNC_CLASS 89 non-null values
AGGP.SPEED_CAT 89 non-null values
LINK_ID 89 non-null values
FUNC_CLASS 89 non-null values
SPEED_CAT 89 non-null values
AR_AUTO 89 non-null values
AR_BUS 89 non-null values
AR_TAXIS 89 non-null values
AR_CARPOOL 89 non-null values
AR_PEDEST 89 non-null values
AR_TRUCKS 89 non-null values
STCA20_PCT 89 non-null values
VKC_LINKNR 89 non-null values
TRVIC150R1 89 non-null values
closest.m 89 non-null values
closest.m.m 89 non-null values
VKCP.LINK_ID 89 non-null values
VKCP.FUNC_CLASS 89 non-null values
VKCP.SPEED 89 non-null values
VKCP.LINKNR 89 non-null values
VKCP.TWIN_ID 89 non-null values
VKCSKEY1 89 non-null values
dtypes: datetime64[ns](1), float64(13), int64(9), object(14)
<class 'pandas.core.frame.DataFrame'>
Index: 30 entries, (60744, 0) to (58314, 0)
Data columns:
traf_key 30 non-null values
weekday_nr 30 non-null values
linknr 30 non-null values
weekday 30 non-null values
vr0 30 non-null values
vr1 30 non-null values
vr2 30 non-null values
vr3 30 non-null values
vr4 30 non-null values
vr5 30 non-null values
vr6 30 non-null values
vr7 30 non-null values
vr8 30 non-null values
vr9 30 non-null values
vr10 30 non-null values
vr11 30 non-null values
vr12 30 non-null values
vr13 30 non-null values
vr14 30 non-null values
vr15 30 non-null values
vr16 30 non-null values
vr17 30 non-null values
vr18 30 non-null values
vr19 30 non-null values
vr20 30 non-null values
vr21 30 non-null values
vr22 30 non-null values
vr23 30 non-null values
au0 30 non-null values
au1 30 non-null values
au2 30 non-null values
au3 30 non-null values
au4 30 non-null values
au5 30 non-null values
au6 30 non-null values
au7 30 non-null values
au8 30 non-null values
au9 30 non-null values
au10 30 non-null values
au11 30 non-null values
au12 30 non-null values
au13 30 non-null values
au14 30 non-null values
au15 30 non-null values
au16 30 non-null values
au17 30 non-null values
au18 30 non-null values
au19 30 non-null values
au20 30 non-null values
au21 30 non-null values
au22 30 non-null values
au23 30 non-null values
sn0 30 non-null values
sn1 30 non-null values
sn2 30 non-null values
sn3 30 non-null values
sn4 30 non-null values
sn5 30 non-null values
sn6 30 non-null values
sn7 30 non-null values
sn8 30 non-null values
sn9 30 non-null values
sn10 30 non-null values
sn11 30 non-null values
sn12 30 non-null values
sn13 30 non-null values
sn14 30 non-null values
sn15 30 non-null values
sn16 30 non-null values
sn17 30 non-null values
sn18 30 non-null values
sn19 30 non-null values
sn20 30 non-null values
sn21 30 non-null values
sn22 30 non-null values
sn23 30 non-null values
dtypes: float64(24), int64(50), object(2)
T 89
X 89
Y 89
precip_quantity_1hour 89
pressure 89
rel_humidity 89
temp 89
temp_max 0
temp_min 0
wind_direction 89
wind_speed 89
BC_TRAF 89
closest 89
closest.m 89
AGGP.P50_ID 89
AGGP.FUNC_CLASS 89
AGGP.SPEED_CAT 89
LINK_ID 89
FUNC_CLASS 89
SPEED_CAT 89
AR_AUTO 89
AR_BUS 89
AR_TAXIS 89
AR_CARPOOL 89
AR_PEDEST 89
AR_TRUCKS 89
STCA20_PCT 89
VKC_LINKNR 89
TRVIC150R1 89
closest.m 89
closest.m.m 89
VKCP.LINK_ID 89
VKCP.FUNC_CLASS 89
VKCP.SPEED 89
VKCP.LINKNR 89
VKCP.TWIN_ID 89
VKCSKEY1 89
0 (60744, 0)
1 (60744, 0)
2 (60744, 0)
3 (60750, 0)
4 (60768, 0)
5 (60768, 0)
6 (60758, 0)
7 (60758, 0)
8 (69223, 0)
9 (69223, 0)
10 (69223, 0)
11 (64265, 0)
12 (64265, 0)
13 (64265, 0)
14 (64265, 0)
15 (64265, 0)
16 (64265, 0)
17 (64265, 0)
18 (64265, 0)
19 (64265, 0)
20 (64216, 0)
21 (64216, 0)
22 (64216, 0)
23 (64216, 0)
24 (64216, 0)
25 (64216, 0)
26 (64216, 0)
27 (64216, 0)
28 (64216, 0)
29 (57085, 0)
30 (57085, 0)
31 (57085, 0)
32 (57085, 0)
33 (57085, 0)
34 (57085, 0)
35 (57014, 0)
36 (57033, 0)
37 (57033, 0)
38 (64065, 0)
39 (64065, 0)
40 (64065, 0)
41 (64065, 0)
42 (64065, 0)
43 (57070, 0)
44 (64062, 0)
45 (64062, 0)
46 (64062, 0)
47 (64062, 0)
48 (57070, 0)
49 (64061, 0)
50 (64061, 0)
51 (64061, 0)
52 (64061, 0)
53 (59849, 0)
54 (59415, 0)
55 (58487, 0)
56 (58054, 0)
57 (58054, 0)
58 (58054, 0)
59 (52551, 0)
60 (58054, 0)
61 (58054, 0)
62 (58054, 0)
63 (58054, 0)
64 (52551, 0)
65 (58054, 0)
66 (58488, 0)
67 (58488, 0)
68 (58028, 0)
69 (58464, 0)
70 (58028, 0)
71 (57989, 0)
72 (58595, 0)
73 (58027, 0)
74 (57989, 0)
75 (58595, 0)
76 (58595, 0)
77 (58019, 0)
78 (58595, 0)
79 (58595, 0)
80 (58019, 0)
81 (58595, 0)
82 (58595, 0)
83 (66715, 0)
84 (58595, 0)
85 (59295, 0)
86 (67614, 0)
87 (58314, 0)
88 (58314, 0)
Name: VKCSKEY1, Length: 89
VKCSKEY1
(60744, 0) (60744, 0)
(60750, 0) (60750, 0)
(60768, 0) (60768, 0)
(60758, 0) (60758, 0)
(69223, 0) (69223, 0)
(64265, 0) (64265, 0)
(64216, 0) (64216, 0)
(57085, 0) (57085, 0)
(57014, 0) (57014, 0)
(57033, 0) (57033, 0)
(64065, 0) (64065, 0)
(57070, 0) (57070, 0)
(64062, 0) (64062, 0)
(64061, 0) (64061, 0)
(59849, 0) (59849, 0)
(59415, 0) (59415, 0)
(58487, 0) (58487, 0)
(58054, 0) (58054, 0)
(52551, 0) (52551, 0)
(58488, 0) (58488, 0)
(58028, 0) (58028, 0)
(58464, 0) (58464, 0)
(57989, 0) (57989, 0)
(58595, 0) (58595, 0)
(58027, 0) (58027, 0)
(58019, 0) (58019, 0)
(66715, 0) (66715, 0)
(59295, 0) (59295, 0)
(67614, 0) (67614, 0)
(58314, 0) (58314, 0)
Name: traf_key
Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88], dtype=int64)
[]
Index([(60744, 0), (60750, 0), (60768, 0), (60758, 0), (69223, 0), (64265, 0), (64216, 0), (57085, 0), (57014, 0), (57033, 0), (64065, 0), (57070, 0), (64062, 0), (64061, 0), (59849, 0), (59415, 0), (58487, 0), (58054, 0), (52551, 0), (58488, 0), (58028, 0), (58464, 0), (57989, 0), (58595, 0), (58027, 0), (58019, 0), (66715, 0), (59295, 0), (67614, 0), (58314, 0)], dtype=object)
[]
(60744, 0) True
(60750, 0) True
(60768, 0) True
(60758, 0) True
(69223, 0) True
(64265, 0) True
(64216, 0) True
(57085, 0) True
(57014, 0) True
(57033, 0) True
(64065, 0) True
(57070, 0) True
(64062, 0) True
(64061, 0) True
(59849, 0) True
(59415, 0) True
(58487, 0) True
(58054, 0) True
(52551, 0) True
(58488, 0) True
(58028, 0) True
(58464, 0) True
(57989, 0) True
(58595, 0) True
(58027, 0) True
(58019, 0) True
(66715, 0) True
(59295, 0) True
(67614, 0) True
(58314, 0) True
Traceback (most recent call last):
File "L:\temp\pandas_join_bug.py", line 43, in <module>
right_on ='traf_key',
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 36, in merge
return op.get_result()
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 185, in get_result
ldata, rdata = self._get_merge_data()
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 277, in _get_merge_data
copydata=False)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 1194, in _maybe_rename_join
to_rename = self.items.intersection(other.items)
File "C:\Python27\lib\site-packages\pandas\core\index.py", line 666, in intersection
indexer = self.get_indexer(other.values)
File "C:\Python27\lib\site-packages\pandas\core\index.py", line 812, in get_indexer
raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects
一旦发生错误,我就无法获得任何成功的合并或连接语句。起初我没有看到错误与重复的合并/加入操作有关。最新集合的任何单个合并/加入现在都可以使用。一旦我需要另一个合并/加入,我就会得到同样的错误。