I downloaded the Wikipedia dumps (the first torrent on this page) and tried to index all the links by storing them in a Python dictionary. I stored the links as a list of destinations in a dictionary keyed by the current page. However, as I processed the dump I ran into a MemoryError, so I decided to assign each page an integer ID instead of using the title strings. This got me farther, but I still ended up with a MemoryError. What can I do to process the dump without running out of memory? I would prefer to keep it all in memory. As my code is reasonably long, I posted it here.
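For reference, the approach I describe above (interning each page title to a compact integer ID and storing the link graph as ID → list of destination IDs) looks roughly like this minimal sketch; the titles and the `add_link` helper are just illustrative, not my actual parsing code:

```python
from collections import defaultdict

ids = {}                   # page title -> integer ID
links = defaultdict(list)  # source page ID -> list of destination page IDs

def intern_id(title):
    """Map a page title to a compact integer ID, assigning a new one on first sight."""
    if title not in ids:
        ids[title] = len(ids)
    return ids[title]

def add_link(src_title, dst_title):
    """Record one link from src_title to dst_title using integer IDs."""
    links[intern_id(src_title)].append(intern_id(dst_title))

# Hypothetical example links:
add_link("Python (programming language)", "Guido van Rossum")
add_link("Python (programming language)", "CPython")
```

This saves memory over storing full title strings in every destination list, since each title string is kept only once as a dictionary key, but as noted it was still not enough for the full dump.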