
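When I run the code below, a TypeError is displayed in the browser. The error occurs on the last line and says 'NoneType' object is not subscriptable (I am trying to get the URLs of all the items). This is strange, because on the command line every URL in the feed is printed. Any idea why the items print on the command line but produce an error in the browser? How can I fix it?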

#reddit parse
try:
    f = urllib.urlopen("http://www.reddit.com/r/videos/top/.json");
except Exception:
    print("ERROR: malformed JSON response from reddit.com")
reddit_posts = json.loads(f.read().decode("utf-8"))["data"]["children"]
reddit_feed=[]
for post in reddit_posts:
    if "oembed" in post['data']['media']:
        print post["data"]["media"]["oembed"]["url"]
        reddit_feed.append(post["data"]["media"]["oembed"]["url"])  
print reddit_feed

Edit:

if post["data"]["media"]["oembed"]["url"]:
    print post["data"]["media"]["oembed"]["url"]

1 Answer


Some posts in the returned JSON have media=null, so post['data']['media'] has no oembed field (and therefore no url field either):

     {
        "kind" : "t3",
        "data" : {
           "downs" : 24050,
           "link_flair_text" : null,
           "media" : null,
           "url" : "http://youtu.be/aNJgX3qH148?t=4m20s",
           "link_flair_css_class" : null,
           "id" : "rymif",
           "edited" : false,
           "num_reports" : null,
           "created_utc" : 1333847562,
           "banned_by" : null,
           "name" : "t3_rymif",
           "subreddit" : "videos",
           "title" : "An awesome young man",
           "author_flair_text" : null,
           "is_self" : false,
           "author" : "Lostinfrustration",
           "media_embed" : {},
           "permalink" : "/r/videos/comments/rymif/an_awesome_young_man/",
           "author_flair_css_class" : null,
           "selftext" : "",
           "domain" : "youtu.be",
           "num_comments" : 2260,
           "likes" : null,
           "clicked" : false,
           "thumbnail" : "http://a.thumbs.redditmedia.com/xUDtCtRFDRAP5gQr.jpg",
           "saved" : false,
           "ups" : 32312,
           "subreddit_id" : "t5_2qh1e",
           "approved_by" : null,
           "score" : 8262,
           "selftext_html" : null,
           "created" : 1333847562,
           "hidden" : false,
           "over_18" : false
        }
     },
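To see why a null media field ends in a TypeError, here is a minimal reproduction. It assumes a trimmed-down, hypothetical post that keeps only the media and url fields of the one shown above; the exact wording of the error message depends on the Python version, so the sketch only relies on the exception type.

```python
import json

# Hypothetical, trimmed-down post with the same shape as the one above.
raw = '{"kind": "t3", "data": {"media": null, "url": "http://youtu.be/aNJgX3qH148?t=4m20s"}}'
post = json.loads(raw)

# JSON null is decoded to Python's None.
media = post["data"]["media"]
print(media is None)

error = None
try:
    # Subscripting None is what fails; the lookup never reaches "url".
    media["oembed"]["url"]
except TypeError as exc:
    error = exc
    print("TypeError: %s" % exc)
```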

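Also, your exception message does not really fit: several kinds of exception can be thrown when urlopen blows up, such as IOError. Contrary to what your error message implies, that block does not check whether the returned data is valid JSON.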
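One way to make the messages match the actual failure is to handle the network step and the parsing step separately. The sketch below is only an illustration under a few assumptions: load_posts and its fetch parameter are hypothetical names, and fetch stands for any zero-argument callable that returns the raw response bytes (for example, a small wrapper around urlopen(...).read()), which keeps the example runnable without network access.

```python
import json


def load_posts(fetch):
    """Return (payload, error_message); exactly one of the two is None.

    `fetch` is a hypothetical zero-argument callable returning raw bytes.
    """
    try:
        raw = fetch()
    except IOError as exc:
        # Network-level failure: nothing was downloaded at all.
        return None, "could not reach reddit.com: %s" % exc
    try:
        payload = json.loads(raw.decode("utf-8"))
    except ValueError as exc:
        # The download worked, but the body is not valid JSON.
        return None, "malformed JSON response: %s" % exc
    return payload, None


def failing_fetch():
    raise IOError("connection refused")


print(load_posts(failing_fetch)[1])
print(load_posts(lambda: b"<html>not json</html>")[1])
print(load_posts(lambda: b'{"data": {"children": []}}')[0])
```

Returning the error as a value rather than printing it lets the caller decide how to report it, which matters when the same code runs both on the command line and behind a web server.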

Now, to mitigate this, you need to check whether "oembed" in post['data']['media'], and only call post['data']['media']['oembed']['url'] if it is. Note that I am assuming every oembed blob has a url (mainly because you need a URL to embed media on reddit).

Update: that is, something like this should fix your problem:

for post in reddit_posts:
    if isinstance(post['data']['media'], dict) \
           and "oembed" in post['data']['media'] \
           and isinstance(post['data']['media']['oembed'], dict) \
           and 'url' in post['data']['media']['oembed']:
        print post["data"]["media"]["oembed"]["url"]
        reddit_feed.append(post["data"]["media"]["oembed"]["url"])
print reddit_feed

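The reason you get this error is that for some posts post["data"]["media"] is None, so you are effectively evaluating None["oembed"]. Hence the error: 'NoneType' object is not subscriptable. I also realized that post['data']['media']['oembed'] might not be a dict, so you need to verify that it is a dict and that url is in it as well.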
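The chain of isinstance checks can also be pulled into a small helper, which keeps the loop short. This is a minimal sketch rather than the only way to do it: oembed_url and the sample posts are hypothetical, and it assumes that a post without a usable oembed URL should simply be skipped.

```python
def oembed_url(post):
    """Return post['data']['media']['oembed']['url'], or None if any level is missing."""
    data = post.get("data") if isinstance(post, dict) else None
    media = data.get("media") if isinstance(data, dict) else None
    oembed = media.get("oembed") if isinstance(media, dict) else None
    if isinstance(oembed, dict):
        return oembed.get("url")
    return None


# Hypothetical sample data covering the cases discussed above.
sample_posts = [
    {"data": {"media": None}},                                        # media is null
    {"data": {"media": {"oembed": {"url": "http://example.com/v"}}}}, # complete post
    {"data": {"media": {"oembed": "not a dict"}}},                    # oembed has the wrong type
    {},                                                               # no data at all
]

reddit_feed = []
for post in sample_posts:
    url = oembed_url(post)
    if url is not None:
        reddit_feed.append(url)
print(reddit_feed)
```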

Update 2:

It looks like data is sometimes missing as well, so here is a fix:

import json
import urllib

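try:
    f = urllib.urlopen("http://www.reddit.com/r/videos/top/.json")
    reddit_posts = json.loads(f.read().decode("utf-8"))
except IOError:
    # Network-level failure: the request itself did not succeed.
    print("ERROR: could not reach reddit.com")
    reddit_posts = None
except ValueError:
    # The response arrived but could not be parsed as JSON.
    print("ERROR: malformed JSON response from reddit.com")
    reddit_posts = None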

if isinstance(reddit_posts, dict) and "data" in reddit_posts \
   and isinstance(reddit_posts['data'], dict) \
   and 'children' in reddit_posts['data']:
    reddit_posts = reddit_posts["data"]["children"]
    reddit_feed = []
    for post in reddit_posts:
        if isinstance(post['data']['media'], dict) \
               and "oembed" in post['data']['media'] \
               and isinstance(post['data']['media']['oembed'], dict) \
               and 'url' in post['data']['media']['oembed']:
            print post["data"]["media"]["oembed"]["url"]
            reddit_feed.append(post["data"]["media"]["oembed"]["url"])
    print reddit_feed
answered 2012-10-15T00:29:02.757