I am working on a web backend that frequently fetches real-time market data from the web and stores it in a MySQL database.
Currently, my main thread pushes tasks onto a Queue object, and about 20 worker threads read from that queue and execute tasks as they become available.
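A simplified sketch of the current setup (the worker body here is a placeholder for my real fetch-and-store code):

```python
from queue import Queue
from threading import Thread

task_queue = Queue()

def worker():
    # Each worker blocks until a task is available, then executes it.
    while True:
        task = task_queue.get()
        try:
            task()  # placeholder: fetch market data and write it to MySQL
        finally:
            task_queue.task_done()

# Roughly 20 consumer threads share the one queue.
for _ in range(20):
    Thread(target=worker, daemon=True).start()

# The main thread is the producer:
# task_queue.put(some_task)
```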
Unfortunately, I am running into performance issues, and after a lot of research I still can't decide on the best approach.
As I see it, I have three options:

1. Take a distributed-task approach with something like Celery.
2. Switch to Jython or IronPython to avoid the GIL.
3. Spawn processes instead of threads using the processing (multiprocessing) module.

If I go with the third option, how many processes is a good amount, and what does a good multi-process producer/consumer design look like? I've sketched what I have in mind below.
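This is roughly what I imagine the process-based version would look like, using the multiprocessing module; `handle()` stands in for my real fetch-and-store code, and the worker count is just a starting guess:

```python
import multiprocessing as mp

def handle(task):
    # Placeholder for the real work: fetch market data, insert into MySQL.
    print("handled", task)

def worker(task_queue):
    # Each worker process pulls tasks until it sees the None sentinel.
    for task in iter(task_queue.get, None):
        handle(task)

if __name__ == "__main__":
    task_queue = mp.Queue()
    # One process per core seems like a common starting point; tune from there.
    workers = [mp.Process(target=worker, args=(task_queue,))
               for _ in range(mp.cpu_count())]
    for w in workers:
        w.start()

    for i in range(100):          # the main process acts as the producer
        task_queue.put(i)

    for _ in workers:             # one sentinel per worker to shut down cleanly
        task_queue.put(None)
    for w in workers:
        w.join()
```

Is this a reasonable shape for the producer/consumer design, or is there a better pattern for this kind of workload?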
Thanks!