I need to extract a JavaScript variable containing multiline JSON from a remote page using a Python (2.7) script. I want to use a regex for this, but my pattern does not return anything.

What am I doing wrong?

Here's my code:

import re
import urllib2

request = urllib2.Request("http://somesite.com/affiliates/")
result = urllib2.urlopen(request)
affiliates = re.findall('#var affiliates = (.*?);\s*$#m', result.read())
print affiliates
1 Answer

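Python's re module does not use PHP-style pattern delimiters, so the # characters and the trailing m in your pattern are matched as literal text. If you look at the docs for re.findall(pattern, string, flags=0), you'll see that flags are passed as a separate argument. Since the JSON spans several lines, you also need re.S so that . matches newlines:

affiliates = re.findall(r'var affiliates = (.*?);\s*$', result.read(), re.M | re.S)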
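
To see the difference without hitting the network, here is a small sketch that runs both patterns against a made-up two-line page. The page string is purely illustrative and is not taken from the real site.

```python
import re

# Hypothetical stand-in for result.read(); the real page content is unknown.
page = 'var affiliates = {"a": 1,\n "b": 2};\nvar other = 3;\n'

# PHP-style: "#" and the trailing "m" are treated as literal characters,
# so nothing in the page can match.
php_style = re.findall(r'#var affiliates = (.*?);\s*$#m', page)

# Python-style: no delimiters, flags passed as the third argument.
# re.M lets "$" match at the end of any line; re.S lets "." cross the
# line break inside the JSON.
python_style = re.findall(r'var affiliates = (.*?);\s*$', page, re.M | re.S)
```

The first call returns an empty list, which is the behaviour described in the question, while the second captures the JSON text.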

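You might also want to allow for sloppy whitespace in the JavaScript, for example no spaces around the = or extra spaces before the semicolon. Using \s* and \s+ in place of literal spaces makes the pattern more tolerant.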
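
A whitespace-tolerant version could look roughly like the sketch below. SAMPLE_PAGE, AFFILIATES_RE and extract_affiliates are hypothetical names made up for illustration, and the sample markup is only a guess at what such a page might contain. It also assumes the assigned value is strict JSON, so that json.loads can parse it.

```python
import json
import re

# Hypothetical page source standing in for result.read(); the real page's
# layout is not shown in the question.
SAMPLE_PAGE = """<script>
var  affiliates={
    "alpha": {"id": 1, "url": "http://example.com/a"},
    "beta": {"id": 2, "url": "http://example.com/b"}
} ;
var other = 1;
</script>"""

# \s+ and \s* tolerate uneven spacing; re.S lets "." cross newlines and
# re.M lets "$" match at the end of any line.
AFFILIATES_RE = re.compile(r'var\s+affiliates\s*=\s*(.*?)\s*;\s*$', re.M | re.S)


def extract_affiliates(page_source):
    """Return the parsed JSON assigned to `affiliates`, or None if absent."""
    match = AFFILIATES_RE.search(page_source)
    if match is None:
        return None
    return json.loads(match.group(1))


affiliates = extract_affiliates(SAMPLE_PAGE)
```

Note that the lazy match stops at the first semicolon that ends a line, so a JSON string value that itself ends a line with a semicolon would cut the capture short.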

Answered 2013-07-25T12:14:20.913