Facebook provides demographic data through its advertising platform. How can I scrape it with Python?
1.) Go to http://www.facebook.com/ads/create/
2.) Fill in the forms.
3.) The page now displays the estimated number for the selected demographics.
See sample image: http://www.webdistortion.com/wp-content/uploads/2010/10/fb4.jpg (I am a new user, so I can't post an image)
Problem: how to scrape it?
My ideas:
1.) Use mechanize. It can probably fill in the forms, but the estimated number (112,960 in the example) does not appear in the page source, so it cannot be parsed from the HTML. Some other trick is needed, but what?
2.) Use Selenium (or Windmill). My Selenium IDE recording was: open facebook.com --> click Advertising --> click Create Ad --> ...
Unfortunately, playback already failed at this point. Log:
[info] Executing: |open | / | |
[info] Executing: |clickAndWait | link=Advertising | |
[error] isNewPageLoaded found an old pageLoadError: Error: Permission denied for >> to get property Location.href
[error] Permission denied for to get property Location.href
[info] Executing: |clickAndWait | css=span.uiButtonText | |
[error] Unexpected Exception: fileName -> chrome://selenium-ide/content/selenium-core/scripts/selenium-browserbot.js, lineNumber -> 840
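To make the observation in idea 1 concrete, here is a minimal sketch of how I check whether a comma-grouped estimate such as 112,960 is present in a given chunk of HTML. The two HTML strings are made-up stand-ins, not Facebook's real markup, and `find_counts` is just a helper name I chose. My assumption is that the number is filled in by JavaScript after the page loads, which would explain why it is missing from the source that mechanize sees.

```python
import re

# Matches comma-grouped integers such as "112,960" or "1,204,300".
COUNT_PATTERN = re.compile(r"\b\d{1,3}(?:,\d{3})+\b")


def find_counts(html):
    """Return every comma-grouped integer found in the text, as ints."""
    return [int(match.replace(",", "")) for match in COUNT_PATTERN.findall(html)]


# Hypothetical page source as a plain HTTP client would receive it:
# the estimate placeholder is still empty.
static_html = '<div id="estimate"></div><script>loadEstimate();</script>'

# Hypothetical markup after the browser has run the page's scripts.
rendered_html = '<div id="estimate">112,960 people</div>'

print(find_counts(static_html))    # nothing to parse in the raw source
print(find_counts(rendered_html))  # the estimate is present once rendered
```

If the first call comes back empty on the real page source while the browser shows a number, the value must arrive some other way than in the initial HTML, which is exactly where I am stuck.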
There is evidence that it is possible to scrape this data: http://www.checkfacebook.com/
Solving the problem is more interesting to me than the data itself (though the data is certainly interesting too). I know that solutions exist, but I cannot come up with one. It is killing me, please help.