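I know how to join tables in pandas in various ways (concat, merge, etc.), but I would also like to know how to do it with pandasql. Specifically, I want to join two pandas DataFrames on their index. Is this possible? When I run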
new_df = pysqldf("SELECT a.*, b.list3 from df1 as a INNER JOIN df2 as b ON a.key=b.key;")
I get the correct result. (Both tables have a "key" column.) However, when I try
new_df = pysqldf("SELECT a.*, b.list3 from df1 as a INNER JOIN df2 as b ON a.index=b.index;")
I get this error:
---------------------------------------------------------------------------
PandaSQLException Traceback (most recent call last)
<ipython-input-154-ecab230d4dc9> in <module>()
----> 1 new_df = pysqldf("SELECT a.*, b.list3 from df1 as a INNER JOIN df2 as b ON a.index=b.index;")
<ipython-input-100-adc122e97ed8> in <lambda>(q)
1 from pandasql import sqldf
----> 2 pysqldf = lambda q: sqldf(q, globals())
/Users/jwesley/anaconda/lib/python2.7/site-packages/pandasql/sqldf.pyc in sqldf(query, env, db_uri)
154 >>> sqldf("select avg(x) from df;", locals())
155 """
--> 156 return PandaSQL(db_uri)(query, env)
/Users/jwesley/anaconda/lib/python2.7/site-packages/pandasql/sqldf.pyc in __call__(self, query, env)
61 result = read_sql(query, conn)
62 except DatabaseError as ex:
---> 63 raise PandaSQLException(ex)
64 except ResourceClosedError:
65 # query returns nothing
PandaSQLException: (sqlite3.OperationalError) near "index": syntax error [SQL: 'SELECT a.*, b.list3 from df1 as a INNER JOIN df2 as b ON a.index=b.index;']
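
The last line of the traceback points at the token index, which is a reserved keyword in SQLite. The following is a minimal sketch that uses only the standard sqlite3 module, not pandasql, to reproduce the same syntax error and to show that double-quoting the identifier lets the query parse. The tables, the list1 column, and the sample rows are hypothetical stand-ins for df1 and df2. Whether pandasql actually exposes a DataFrame's index as a column named index is an assumption that this sketch does not verify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-ins for df1 and df2. The column is deliberately
# named "index" (quoted in the DDL) to mirror the failing query.
conn.execute('CREATE TABLE df1 ("index" INTEGER, list1 TEXT)')
conn.execute('CREATE TABLE df2 ("index" INTEGER, list3 TEXT)')
conn.executemany("INSERT INTO df1 VALUES (?, ?)", [(0, "a"), (1, "b")])
conn.executemany("INSERT INTO df2 VALUES (?, ?)", [(1, "x"), (2, "y")])

unquoted = ("SELECT a.*, b.list3 FROM df1 AS a "
            "INNER JOIN df2 AS b ON a.index = b.index;")
quoted = ('SELECT a.*, b.list3 FROM df1 AS a '
          'INNER JOIN df2 AS b ON a."index" = b."index";')

# The unquoted form is rejected by the SQLite parser.
try:
    conn.execute(unquoted)
    unquoted_error = None
except sqlite3.OperationalError as exc:
    unquoted_error = str(exc)

# The quoted form parses and returns the rows whose "index" values match.
rows = conn.execute(quoted).fetchall()

print(unquoted_error)
print(rows)
```

If the index does not reach SQLite as a column at all, quoting alone would not be enough; in that case one option to try is copying the index into an ordinary column first (for example with reset_index()) and joining on that column, as in the working key-based query.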