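I have some Python code in which I set a few variable values inside the methods of a class. Now I need to read those values outside the methods and use them. However, instead of the values set in the methods, I get the values I assigned when I declared the variables. Here is my code: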

from datetime import datetime
import MySQLdb
from scrapy import signals
from twisted.internet.task import LoopingCall


class SpiderDetails(object):
    #"""Extension for collect spider information like start/stop time."""

    update_interval = 5  # in seconds
    spiderStartTime = ''
    spiderStopTime = ''
    spiderUpdateTime = ''

    def __init__(self, crawler):
        # keep a reference to the crawler in case is needed to access to more information
        self.crawler = crawler
        # keep track of polling calls per spider
        self.pollers = {}

    @classmethod
    def from_crawler(cls, crawler):
        instance = cls(crawler)
        crawler.signals.connect(instance.spider_opened, signal=signals.spider_opened)
        crawler.signals.connect(instance.spider_closed, signal=signals.spider_closed)
        return instance

    def spider_opened(self, spider):
        # store curent timestamp in db as 'start time' for this spider
        # TODO: complete db calls
        spiderStartTime = datetime.now()
        spiderStartTime = spiderStartTime.strftime("%Y-%m-%d %H:%M:%S")
        print spiderStartTime

        # start activity poller
        poller = self.pollers[spider.name] = LoopingCall(self.spider_update, spider)
        poller.start(self.update_interval)

    def spider_closed(self, spider, reason):
        spiderStopTime = datetime.now()
        spiderStopTime = spiderStopTime.strftime("%Y-%m-%d %H:%M:%S")
        print spiderStopTime
        # store curent timestamp in db as 'end time' for this spider
        # TODO: complete db calls

        # remove and stop activity poller
        poller = self.pollers.pop(spider.name)
        poller.stop()

    def spider_update(self, spider):
        spiderUpdateTime = datetime.now()
        spiderUpdateTime = spiderUpdateTime.strftime("%Y-%m-%d %H:%M:%S")
        print spiderUpdateTime
        # update 'last update time' for this spider
        # TODO: complete db calls
        #pass

    # Open database connection
    print spiderStopTime
    db = MySQLdb.connect("localhost","root","","numismatics")
    # prepare a cursor object using cursor() method
    cursor = db.cursor()

    # Prepare SQL query to INSERT a record into the database.
    #sql = "INSERT INTO test(ID, startDate) VALUES ('', spider_start)"
    try:
        # Execute the SQL command
        cursor.execute("INSERT INTO crawlertimes (`ID`, `spiderStartTime`, `spiderStopTime`, `spiderUpdateTime`) VALUES (%s,%s,%s,%s)",('',spiderStartTime,spiderStopTime,spiderUpdateTime))
        # Commit your changes in the database
        db.commit()
    except:
        # Rollback in case there is any error
        db.rollback()

    # disconnect from server
    db.close()

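In this code I set the variable spiderStopTime in the spider_closed method, but when I print it outside all the methods (the print statement near the end), it comes out blank. How can I get the updated value?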

3 Answers

If these values are meant to be attributes of the instance, set them on self:

def spider_opened(self, spider):
    # store the formatted start time on the instance so other methods can read it
    self.spiderStartTime = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    print self.spiderStartTime

If you need them as global variables instead, you have to mark them as global inside the method itself, with global spiderStartTime.
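
To see the three kinds of assignment side by side, here is a minimal sketch that does not depend on Scrapy. The class Timer, its method names, and the variable last_stop are hypothetical, invented only for illustration; the point is which name each assignment actually changes.

```python
last_stop = ''  # module-level (global) variable


class Timer(object):
    stop_time = ''  # class attribute, used as the default value

    def set_local(self):
        # Plain assignment creates a local name that is discarded on return.
        stop_time = 'local value'

    def set_on_instance(self):
        # Assignment through self creates an instance attribute
        # that shadows the class-level default.
        self.stop_time = 'instance value'

    def set_global(self):
        # The global statement makes the assignment target the module-level name.
        global last_stop
        last_stop = 'global value'


timer = Timer()
timer.set_local()
after_local = timer.stop_time        # still the class default
timer.set_on_instance()
after_instance = timer.stop_time     # the value stored on the instance
timer.set_global()
print("%r %r %r" % (after_local, after_instance, last_stop))
```

Only set_on_instance and set_global leave a value that is visible after the method returns; set_local changes nothing outside itself, which matches what happens in the question.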

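The second half of your code, where you open the database connection, is executed when the class is loaded, because it sits directly in the class body. That code runs before any scraping happens, and at that point spiderStopTime is still the empty string it was defined as.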
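
The timing can be checked with a small sketch, again independent of Scrapy. The names Demo, set_value and events are assumptions made up for this example; the list simply records the order in which the statements run.

```python
events = []  # records the order in which things run


class Demo(object):
    value = ''

    def set_value(self):
        self.value = 'set later'
        events.append('method ran')

    # This statement sits in the class body, not in a method,
    # so it runs once, while the class is being defined.
    events.append('class body saw value=%r' % value)


events.append('class defined')
Demo().set_value()
print(events)
```

The class-body statement is recorded first and sees the empty default, before any instance exists and before any method has been called.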

Move that code into the spider_closed() method. That is where the spider is closed and where you actually record the stop time:

def spider_closed(self, spider, reason):
    # record the stop time on the instance so the INSERT below can read it
    self.spiderStopTime = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    # remove and stop activity poller
    poller = self.pollers.pop(spider.name)
    poller.stop()

    db = MySQLdb.connect("localhost", "root", "", "numismatics")
    cursor = db.cursor()

    try:
        cursor.execute(
            "INSERT INTO crawlertimes (ID, spiderStartTime, spiderStopTime, spiderUpdateTime) VALUES (%s,%s,%s,%s)",
            ('', self.spiderStartTime, self.spiderStopTime, self.spiderUpdateTime))
        db.commit()
    except Exception:
        # roll back if the insert fails
        db.rollback()

    db.close()

answered 2013-10-17T07:10:37.253

You need to use self to access these variables, for example:

self.spiderStopTime = datetime.now()

answered 2013-10-17T07:21:28.847

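The problem is that spiderStopTime is a local variable of the function. Once the function finishes executing, the name goes out of scope and its value becomes eligible for garbage collection.

Why not return the value of spiderStopTime at the end of the function?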

return spiderStopTime

When you call the function, you get the value back.
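
As a rough sketch of that idea, assuming a standalone helper (the function name stamp_stop_time is hypothetical and not part of the question's code):

```python
from datetime import datetime


def stamp_stop_time():
    # Build the formatted timestamp locally, then hand it back to the caller.
    stop_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    return stop_time


# The caller keeps the value by binding the return value to its own name.
spider_stop_time = stamp_stop_time()
print(spider_stop_time)
```

This helps only when your own code calls the function. In the question, spider_closed is connected to a Scrapy signal, so the framework is the caller and the return value would not reach the rest of your code.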

answered 2013-10-17T07:13:57.887