I wrote a Scrapy crawler that collects data from forum threads. The list page shows each thread's last-modified date, and I want to use that date to decide whether to crawl the thread again. I store the data in MySQL through an item pipeline. While my CrawlSpider processes the list page, I want to look up the thread's record in MySQL and, based on that record, either yield a Request or skip the thread. (I do NOT want to load the thread URL unless there is a new post.)
What's the best way to do this?
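
To make the question concrete, here is a minimal sketch of the check I have in mind. It is not working spider code. The `threads` table, its `url` and `last_modified` columns, the example URLs, and the helper names `needs_crawl` and `select_threads` are all made up for illustration, and an in-memory SQLite database stands in for MySQL so the snippet runs on its own.

```python
import sqlite3
from datetime import datetime


def needs_crawl(conn, thread_url, last_modified):
    """Return True if the thread is unknown or newer than the stored date.

    `conn` is any DB-API connection. The table and column names are
    hypothetical. With a MySQL driver the placeholder would be %s, not ?.
    """
    cur = conn.cursor()
    cur.execute("SELECT last_modified FROM threads WHERE url = ?", (thread_url,))
    row = cur.fetchone()
    cur.close()
    if row is None:
        return True  # never crawled before
    stored = datetime.fromisoformat(row[0])
    return last_modified > stored


def select_threads(conn, listing):
    """Yield only the URLs that should be requested.

    In the spider's list-page callback this would instead be something like
    `yield scrapy.Request(url, callback=self.parse_thread)`.
    """
    for url, last_modified in listing:
        if needs_crawl(conn, url, last_modified):
            yield url


# Stand-in for the MySQL table that the pipeline fills (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE threads (url TEXT PRIMARY KEY, last_modified TEXT)")
conn.execute(
    "INSERT INTO threads VALUES (?, ?)",
    ("http://forum.example/t/1", "2024-01-10T12:00:00"),
)

# (url, last-modified date) pairs as they might be scraped from the list page.
listing = [
    ("http://forum.example/t/1", datetime(2024, 1, 10, 12, 0)),  # unchanged
    ("http://forum.example/t/2", datetime(2024, 1, 11, 9, 0)),   # never seen
]
to_crawl = list(select_threads(conn, listing))
```

One thing I am unsure about with this approach: a synchronous query inside the callback blocks Scrapy's event loop while it runs, so I don't know whether this is acceptable or whether the lookup belongs somewhere else.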