10

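I'm trying to build a datastore on Google App Engine to collect streaming data from StockTwits for a set of companies. I'm basically replicating one I built for Twitter, but this one gives me an "HTTPException: Invalid and/or missing SSL certificate" error for the URLs. I changed the URL to look at a different company, but got the same result.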

Here is the code that pulls the data:

import json
import urllib2

import webapp2

# streamdata is the datastore model class (its definition is not shown here).


class StreamHandler(webapp2.RequestHandler):

    def get(self):

        tickers = ['AAPL','GOOG', 'IBM', 'BAC', 'INTC',
                   'DELL', 'C', 'JPM', 'WFM', 'WMT',
                   'AMZN', 'HOT', 'SPG', 'SWY', 'HTSI',
                   'DUK', 'CEG', 'XOM', 'F', 'WFC',
                   'CSCO', 'UAL', 'LUV', 'DAL', 'COST', 'YUM',
                   'TLT', 'HYG', 'JNK', 'LQD', 'MSFT',
                   'GE', 'LVS', 'MGM', 'TWX', 'DIS', 'CMCSA',
                   'TWC', 'ORCL', 'WPO', 'NYT', 'GM', 'JCP',
                   'LNKD', 'OPEN', 'NFLX', 'SBUX', 'GMCR',
                   'SPLS', 'BBY', 'BBBY', 'YHOO', 'MAR',
                   'L', 'LOW', 'HD', 'HOV', 'TOL', 'NVR', 'RYL',
                   'GIS', 'K', 'POST', 'KRFT', 'CHK', 'GGP',
                   'RSE', 'RWT', 'AIG', 'CB', 'BRK.A', 'CAT']

        for i in set(tickers):

            # Build the per-symbol stream URL and fetch it.
            urlst = 'https://api.stocktwits.com/api/2/streams/symbol/'
            tickerstringst = urlst + i + '.json'
            tickurlst = urllib2.Request(tickerstringst)
            sttweets = urllib2.urlopen(tickurlst)
            stcode = sttweets.getcode()

            if stcode == 200:
                stresults = json.load(sttweets, 'utf-8')
                if "messages" in stresults:
                    stentries = stresults["messages"]
                    # Store one datastore entity per message.
                    for stentry in stentries:
                        sttweet = streamdata()
                        stcreated = stentry['created_at']
                        sttweetid = str(stentry['id'])
                        sttweettxt = stentry['body']
                        sttweet.ticker = i
                        sttweet.created_at = stcreated
                        sttweet.tweet_id = sttweetid
                        sttweet.text = sttweettxt
                        sttweet.source = "StockTwits"
                        sttweet.put()

Here is the log showing the error. I'm running this on the local Python development server, by the way:

WARNING  2012-12-06 23:20:12,993 dev_appserver.py:3655] Could not initialize images API; you are likely missing the Python "PIL" module. ImportError: No module named _imaging
INFO     2012-12-06 23:20:13,017 dev_appserver_multiprocess.py:655] Running application dev~jibdantestv2 on port 8088: http://localhost:8088
INFO     2012-12-06 23:20:13,017 dev_appserver_multiprocess.py:657] Admin console is available at: http://localhost:8088/_ah/admin   
INFO     2012-12-06 23:20:54,776 dev_appserver.py:3092] "GET /_ah/admin HTTP/1.1" 302 -
INFO     2012-12-06 23:20:54,953 dev_appserver.py:3092] "GET /_ah/admin/datastore HTTP/1.1" 200 -
INFO     2012-12-06 23:20:55,280 dev_appserver.py:3092] "GET /_ah/admin/images/google.gif HTTP/1.1" 200 -
INFO     2012-12-06 23:21:04,617 dev_appserver.py:3092] "GET /_ah/admin/cron HTTP/1.1" 200 -
INFO     2012-12-06 23:21:04,815 dev_appserver.py:3092] "GET /_ah/admin/images/google.gif HTTP/1.1" 200 -
WARNING  2012-12-06 23:21:07,392 urlfetch_stub.py:448] Stripped prohibited headers from URLFetch request: ['Host']
ERROR    2012-12-06 23:21:09,921 webapp2.py:1553] Invalid and/or missing SSL certificate for URL: https://api.stocktwits.com/api/2/streams/symbol/GIS.json
Traceback (most recent call last):
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1536, in __call__
    rv = self.handle_exception(request, response, e)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1530, in __call__
    rv = self.router.dispatch(request, response)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "C:\Users\Tank\Documents\Aptana Studio 3 Workspace\jibdantestv2\main.py", line 38, in get
    sttweets = urllib2.urlopen(tickurlst)
  File "C:\Python27\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 400, in open
    response = self._open(req, data)
  File "C:\Python27\lib\urllib2.py", line 418, in _open
    '_open', req)
  File "C:\Python27\lib\urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 1215, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "C:\Python27\lib\urllib2.py", line 1180, in do_open
    r = h.getresponse(buffering=True)
  File "C:\Program Files (x86)\Google\google_appengine\google\appengine\dist27\httplib.py", line 502, in getresponse
    raise HTTPException(str(e))
HTTPException: Invalid and/or missing SSL certificate for URL: https://api.stocktwits.com/api/2/streams/symbol/GIS.json
INFO     2012-12-06 23:21:09,937 dev_appserver.py:3092] "GET /add_data HTTP/1.1" 500 -

3 Answers

11

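I don't know why GAE is having a problem, but I note that api.stocktwits.com returns a certificate that does not match the server name in the Common Name of its Subject (which is ssl2361.cloudflare.com), only in one of its Subject Alternative Names ("DNS Name=*.stocktwits.com"). It may be that Subject Alternative Names aren't supported, or don't work for the wildcard name used here. (That would be a Google bug or missing feature.)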
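To make the mismatch concrete, here is a minimal sketch of the two checks a client can make against a certificate. The dictionary is hand-built from the two names quoted above in the shape that Python's ssl module uses for getpeercert(); it is not a dump of the real certificate, and the helper names are made up for illustration. The wildcard rule is simplified to "a leading *. covers exactly one label".

```python
# Hand-built from the names quoted in this answer (an assumption, not a real dump).
CERT = {
    'subject': ((('commonName', 'ssl2361.cloudflare.com'),),),
    'subjectAltName': (('DNS', '*.stocktwits.com'),),
}
HOST = 'api.stocktwits.com'


def dns_name_matches(pattern, hostname):
    """True if a certificate DNS name (optionally starting with '*.') covers hostname."""
    pattern = pattern.lower()
    hostname = hostname.lower()
    if pattern.startswith('*.'):
        # Simplified rule: the wildcard stands in for exactly one left-most label.
        head, _, tail = hostname.partition('.')
        return bool(head) and tail == pattern[2:]
    return pattern == hostname


def common_names(cert):
    """Collect every commonName value from the certificate subject."""
    return [value for rdn in cert.get('subject', ())
            for key, value in rdn if key == 'commonName']


def alt_dns_names(cert):
    """Collect every DNS entry from the Subject Alternative Name extension."""
    return [value for kind, value in cert.get('subjectAltName', ())
            if kind == 'DNS']


cn_match = any(dns_name_matches(name, HOST) for name in common_names(CERT))
san_match = any(dns_name_matches(name, HOST) for name in alt_dns_names(CERT))
```

With these inputs the Common Name check fails and the Subject Alternative Name check succeeds, so a client that looks only at the Common Name would reject the certificate.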

I was able to reproduce your problem and find a workaround by calling the GAE urlfetch.fetch API directly. (As you may know, on GAE, urllib2 is implemented as a wrapper around urlfetch.)

Replace everything from your urllib2.Request line through your json.load line with:

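# At the top of main.py:
from google.appengine.api import urlfetch

# Inside the loop, in place of the urllib2 calls:
sttweets = urlfetch.fetch(tickerstringst, validate_certificate=False)
stcode = sttweets.status_code

if stcode == 200:
    stresults = json.loads(sttweets.content, 'utf-8')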

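and your error goes away, along with any assurance that you are actually talking to the real site (though the traffic should still be encrypted).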
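If it helps, the workaround can be kept in one small helper so that the decision to skip validation is explicit at each call site. This is only a sketch: fetch_symbol_messages and its fetch parameter are hypothetical names, and fetch is assumed to be any callable that accepts a URL plus a validate_certificate keyword and returns an object with status_code and content, which is how urlfetch.fetch is used above. Passing the callable in also lets the helper run outside App Engine.

```python
import json

STREAM_URL = 'https://api.stocktwits.com/api/2/streams/symbol/%s.json'


def fetch_symbol_messages(ticker, fetch, validate_certificate=True):
    """Return the list of messages for one ticker, or [] on a non-200 reply.

    `fetch` is a hypothetical injection point; on App Engine you would pass
    urlfetch.fetch. Validation stays on unless the caller turns it off.
    """
    result = fetch(STREAM_URL % ticker,
                   validate_certificate=validate_certificate)
    if result.status_code != 200:
        return []
    return json.loads(result.content).get('messages', [])


# On App Engine (not runnable elsewhere):
#   from google.appengine.api import urlfetch
#   messages = fetch_symbol_messages('GIS', urlfetch.fetch,
#                                    validate_certificate=False)
```

Keeping validate_certificate=True as the default means only the calls that explicitly opt out give up the certificate check.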

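The GAE API documentation for urlfetch.fetch currently says:

validate_certificate: The underlying implementation currently defaults to False, but will default to True in the near future.

Well, welcome to the future, because validate_certificate now appears to default to True.

This may be a bug in GAE's urlfetch.fetch (or a missing feature, if you prefer), and I encourage you to report it to Google.

answered 2012-12-07T21:26:06.700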
-1

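I'm not very familiar with GAE, so this may be a problem with how it accesses the API endpoint. It could also be that you are not using the right Python library for SSL requests, but since you say you use the same code for your Twitter requests, that is probably not the case.

Can you try the same code locally or on another server, rather than on GAE?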
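One way to run that comparison is a small standalone probe that uses only the standard library, outside the App Engine SDK. The sketch below assumes a current Python 3 interpreter with network access; probe and summarize_cert are hypothetical names, and nothing is fetched until probe is called.

```python
import socket
import ssl


def summarize_cert(cert):
    """Return (common names, DNS alt names) from a getpeercert() dictionary."""
    common = [value for rdn in cert.get('subject', ())
              for key, value in rdn if key == 'commonName']
    alt = [value for kind, value in cert.get('subjectAltName', ())
           if kind == 'DNS']
    return common, alt


def probe(host, port=443, timeout=10):
    """Open a validated TLS connection to host and summarize its certificate.

    Raises ssl.SSLError (or a subclass) if this interpreter also rejects
    the certificate.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return summarize_cert(tls.getpeercert())
```

Calling probe('api.stocktwits.com') either returns the names on the certificate or raises, which shows whether the rejection is specific to the GAE development server.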

answered 2012-12-07T18:57:30.433
-1

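I had the same problem, and I had already set the validate_certificate parameter to False in the urlfetch.fetch() call.

    urlfetch.fetch(url, validate_certificate=False) # validates certificate

That did not fix the problem. It looks like an internal bug in how the validate_certificate parameter is handled: if it is set to False, the certificate is validated, and if it is set to True, it is not, or at least that is what it seems to be doing.

    urlfetch.fetch(url, validate_certificate=True) # does not validate certificate

answered 2014-07-05T03:21:48.303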