
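I'm building a web scraper with Python 2.6.4 and the Scrapy toolkit. I need it for data analysis, and it is also my first Python learning project. I'm having trouble creating the SQL INSERT statement in my pipelines.py. The real query has about 30 attributes to insert.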

First, is there a better way to write this UPDATE-or-INSERT logic? I'm open to improvements.

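Second, here are two syntax variants and the different errors they produce. I have tried many variations based on examples, but I can't find an example that uses "INSERT ... SET" spanning multiple lines. What is the correct syntax?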

The DB is empty, so for now we always branch into the "INSERT" block.

def _conditional_insert(self, tx, item):
 # create record if doesn't exist.
 tx.execute("SELECT username  FROM profiles_flat WHERE username = %s", (item['username'][0], ))
 result = tx.fetchone()

 if result:
    # do row UPDATE
     tx.execute( \
        """UPDATE profiles_flat SET
        username=`%s`, 
        headline=`%s`,
        age=`%s`
        WHERE username=`%s`""", (  \
        item['username'],
        item['headline'],
        item['age'],
        item['username'],)
     )         
 else: 
   # do row INSERT
   tx.execute( \
   """INSERT INTO profiles_flat SET
        username=`%s`, 
        headline=`%s`,
        age=`%s` """, ( \
        item['username'],
        item['headline'],
        item['age'], )   # line 222
   )

Error:

[Failure instance: Traceback: <class '_mysql_exceptions.OperationalError'>: (1054, "Unknown column ''missLovely92 '' in 'field list'")
  /usr/lib/python2.6/threading.py:497:__bootstrap
  /usr/lib/python2.6/threading.py:525:__bootstrap_inner
  /usr/lib/python2.6/threading.py:477:run
  --- <exception caught here> ---
  /usr/lib/python2.6/vendor-packages/twisted/python/threadpool.py:210:_worker
  /usr/lib/python2.6/vendor-packages/twisted/python/context.py:59:callWithContext
  /usr/lib/python2.6/vendor-packages/twisted/python/context.py:37:callWithContext
  /usr/lib/python2.6/vendor-packages/twisted/enterprise/adbapi.py:429:_runInteraction
  /export/home/raven/scrapy/project/project/pipelines.py:222:_conditional_insert
  /usr/lib/python2.6/vendor-packages/MySQLdb/cursors.py:166:execute
  /usr/lib/python2.6/vendor-packages/MySQLdb/connections.py:35:defaulterrorhandler
  ]

Alternative syntax:

  query = """INSERT INTO profiles_flat SET
        username=`%s`, 
        headline=`%s`,
        age=`%s` """ % \
   item['username'], # line 196
   item['headline'],
   item['age']

   tx.execute(query)

Error:

  [Failure instance: Traceback: <type 'exceptions.TypeError'>: not enough arguments for format string
  /usr/lib/python2.6/threading.py:497:__bootstrap
  /usr/lib/python2.6/threading.py:525:__bootstrap_inner
  /usr/lib/python2.6/threading.py:477:run
  --- <exception caught here> ---
  /usr/lib/python2.6/vendor-packages/twisted/python/threadpool.py:210:_worker
  /usr/lib/python2.6/vendor-packages/twisted/python/context.py:59:callWithContext
  /usr/lib/python2.6/vendor-packages/twisted/python/context.py:37:callWithContext
  /usr/lib/python2.6/vendor-packages/twisted/enterprise/adbapi.py:429:_runInteraction
  /export/home/raven/scrapy/project/project/pipelines.py:196:_conditional_insert
  ]    

1 Answer


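You should not surround the values with backticks. Backticks are for quoting column names. The driver already quotes and escapes each %s parameter, so wrapping a placeholder in backticks makes MySQL read the value as a column name, which is where the 1054 "Unknown column" error comes from. Use plain placeholders instead: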

INSERT INTO profiles_flat (username, headline, age)
VALUES (%s, %s, %s)
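
With around 30 attributes, it may be easier to generate the statement from a list of field names than to type it out by hand. The following is only a minimal sketch under some assumptions: `build_upsert` is a hypothetical helper (it is not part of Scrapy or MySQLdb), the table is assumed to have a UNIQUE or PRIMARY KEY index on `username` so that MySQL's `INSERT ... ON DUPLICATE KEY UPDATE` can replace the separate SELECT/UPDATE/INSERT branches, and the item values are assumed to be plain scalars rather than lists.

```python
def build_upsert(table, fields, item, key="username"):
    """Return (query, params) for a parameterized MySQL upsert.

    Hypothetical helper for illustration. Assumes `key` has a UNIQUE or
    PRIMARY KEY index and that table/field names come from trusted code,
    never from scraped data.
    """
    # Backticks quote identifiers (column names), never values.
    columns = ", ".join("`%s`" % name for name in fields)
    # One bare %s placeholder per value; the driver quotes and escapes them.
    placeholders = ", ".join(["%s"] * len(fields))
    # On a key collision, overwrite every non-key column with the new value.
    updates = ", ".join(
        "`%s`=VALUES(`%s`)" % (name, name) for name in fields if name != key
    )
    query = "INSERT INTO `%s` (%s) VALUES (%s) ON DUPLICATE KEY UPDATE %s" % (
        table, columns, placeholders, updates)
    params = tuple(item[name] for name in fields)
    return query, params


# Example with made-up sample values.
FIELDS = ["username", "headline", "age"]
sample_item = {"username": "alice", "headline": "hello", "age": 30}
query, params = build_upsert("profiles_flat", FIELDS, sample_item)
# Inside the pipeline this would be run as: tx.execute(query, params)
```

Passing the values as the second argument to `tx.execute` keeps quoting and escaping in the driver. The second variant in the question appears to fail for a separate reason: `%` binds more tightly than the commas, so only `item['username']` is formatted against three placeholders, which raises the "not enough arguments for format string" TypeError.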
Answered 2012-04-29T16:21:32.693