I wrote a short program that uses the Discogs API from Python, but it is so damn slow that it is not usable for real web applications. Here are the Python code and the profiler results (only the time-consuming spots are shown):
# -*- coding: utf-8 -*-
import profile
import discogs_client as discogs

def main():
    discogs.user_agent = 'Mozilla/5.0'
    # Dump the released albums into a file. You could also print them to the console.
    f = open('DiscogsTestResult.txt', 'w+')
    # Use another band if you like, but if you decide to take "beatles"
    # you will wait an hour (because of the number of releases)!
    artist = discogs.Artist('Faust')
    print >> f, artist
    print >> f, " "
    artistReleases = artist.releases
    for r in artistReleases:
        # Accessing r.data fetches the release over HTTP.
        print >> f, r.data
        print >> f, "---------------------------------------------"
    f.close()

print 'Performance Analysis of Discogs API'
print '=' * 80
profile.run('print main(); print')
And here is the output of Python's profile module:
Performance Analysis of Discogs API
================================================================================
82807 function calls (282219 primitive calls) in 177.544 seconds
Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      188  121.013    0.644  121.013    0.644 :0(connect)
      206   52.080    0.253   52.080    0.253 :0(recv)
        1    0.036    0.036  177.494  177.494 <string>:1(<module>)
      188    0.013    0.000  175.234    0.932 adapters.py:261(send)
      376    0.005    0.000    0.083    0.000 adapters.py:94(init_poolmanager)
      188    0.008    0.000  176.569    0.939 api.py:17(request)
      188    0.007    0.000  176.577    0.939 api.py:47(get)
      188    0.015    0.000  173.922    0.925 connectionpool.py:268(_make_request)
      188    0.015    0.000  174.034    0.926 connectionpool.py:332(urlopen)
        1    0.496    0.496  177.457  177.457 discogsTestFullDump.py:6(main)
      564    0.009    0.000  176.613    0.313 discogs_client.py:66(_response)
      188    0.012    0.000  176.955    0.941 discogs_client.py:83(data)
      188    0.011    0.000   51.759    0.275 httplib.py:363(_read_status)
      188    0.017    0.000   52.520    0.279 httplib.py:400(begin)
      188    0.003    0.000  121.198    0.645 httplib.py:754(connect)
      188    0.007    0.000  121.270    0.645 httplib.py:772(send)
      188    0.005    0.000  121.276    0.645 httplib.py:799(_send_output)
      188    0.003    0.000  121.279    0.645 httplib.py:941(endheaders)
      188    0.003    0.000  121.348    0.645 httplib.py:956(request)
      188    0.016    0.000  121.345    0.645 httplib.py:977(_send_request)
      188    0.009    0.000   52.541    0.279 httplib.py:994(getresponse)
        1    0.000    0.000  177.544  177.544 profile:0(print main(); print)
      188    0.032    0.000  176.322    0.938 sessions.py:225(request)
      188    0.030    0.000  175.513    0.934 sessions.py:408(send)
      752    0.015    0.000  121.088    0.161 socket.py:223(meth)
     2256    0.224    0.000   52.127    0.023 socket.py:406(readline)
      188    0.009    0.000  121.195    0.645 socket.py:537(create_connection)
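The listing above is ordered by "standard name", which makes the expensive calls hard to spot. A minimal sketch of sorting the report by cumulative time instead, assuming Python 3's `cProfile` and `pstats`; `slow_task` and `profile_top` are hypothetical names, with `slow_task` standing in for `main()`:

```python
import cProfile
import io
import pstats
import time


def slow_task():
    # Hypothetical stand-in for main(); the sleep mimics waiting on the network.
    time.sleep(0.05)
    return "done"


def profile_top(func, limit=10):
    """Run func under the profiler and return (result, report text).

    The report lists at most `limit` entries, sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(limit)
    return result, buf.getvalue()


result, report = profile_top(slow_task)
print(report)
```

With this ordering the entries that dominate the run time should appear at the top of the report, so the rest does not have to be trimmed by hand.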
Does anybody have an idea how to speed this up? I hope that with some changes in discogs_client.py it would be faster, maybe by switching from httplib to something else. Or maybe it is faster to use another protocol instead of HTTP?
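Since the profile attributes about 121 of the 177 seconds to 188 separate `connect` calls, one thing that might be worth trying is keeping a single HTTP connection open across requests. The sketch below is only an illustration under assumptions: it uses the Python 3 standard library (`http.client` is the Python 3 name of `httplib`) and talks to a throwaway local server, not to the Discogs API. `CountingHandler`, `fetch_with_new_connections` and `fetch_with_one_connection` are hypothetical names and not part of discogs_client, and whether the real server honours keep-alive is an assumption.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections alive
    connections = 0                # TCP connections accepted so far

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        CountingHandler.connections += 1
        super().setup()

    def do_GET(self):
        body = b'{"ok": true}'
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_with_new_connections(host, port, paths):
    # One connect per request, which is what the profile above shows.
    for path in paths:
        conn = http.client.HTTPConnection(host, port)
        conn.request("GET", path)
        conn.getresponse().read()
        conn.close()


def fetch_with_one_connection(host, port, paths):
    # A single connect, reused for every request.
    conn = http.client.HTTPConnection(host, port)
    for path in paths:
        conn.request("GET", path)
        conn.getresponse().read()  # drain the body before the next request
    conn.close()


# Local test server on a free port (assumption: loopback sockets are allowed).
server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address
paths = ["/release/%d" % i for i in range(5)]

fetch_with_new_connections(host, port, paths)
per_request = CountingHandler.connections

CountingHandler.connections = 0
fetch_with_one_connection(host, port, paths)
reused = CountingHandler.connections

server.shutdown()
server.server_close()
print(per_request, reused)
```

Against this local server the first helper opens one connection per path and the second opens a single one. How much time that would save against the real API depends on how much of each `connect` is DNS lookup and TCP setup, which the profile alone does not show.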
(The source of discogs_client.py is available here: https://github.com/discogs/discogs_client/blob/master/discogs_client.py)
If anybody has any ideas, please respond. A lot of people would benefit from this.
Regards, Daniel