Ok, here is the real world scenario: I'm writing an application, and I have a class that represents a certain type of files (in my case this is photographs but that detail is irrelevant to the problem). Each instance of the Photograph class should be unique to the photo's filename.
The problem is, when a user tells my application to load a file, I need to be able to identify when files are already loaded, and use the existing instance for that filename, rather than create duplicate instances on the same filename.
To me this seems like a good situation to use memoization, and there's a lot of examples of that out there, but in this case I'm not just memoizing an ordinary function, I need to be memoizing __init__()
. This poses a problem, because by the time __init__()
gets called it's already too late as there's a new instance created already.
In my research I found Python's __new__()
method, and I was actually able to write a working trivial example, but it fell apart when I tried to use it on my real-world objects, and I'm not sure why (the only thing I can think of is that my real world objects were subclasses of other objects that I can't really control, and so there were some incompatibilities with this approach). This is what I had:
class Flub(object):
instances = {}
def __new__(cls, flubid):
try:
self = Flub.instances[flubid]
except KeyError:
self = Flub.instances[flubid] = super(Flub, cls).__new__(cls)
print 'making a new one!'
self.flubid = flubid
print id(self)
return self
@staticmethod
def destroy_all():
for flub in Flub.instances.values():
print 'killing', flub
a = Flub('foo')
b = Flub('foo')
c = Flub('bar')
print a
print b
print c
print a is b, b is c
Flub.destroy_all()
Which output this:
making a new one!
139958663753808
139958663753808
making a new one!
139958663753872
<__main__.Flub object at 0x7f4aaa6fb050>
<__main__.Flub object at 0x7f4aaa6fb050>
<__main__.Flub object at 0x7f4aaa6fb090>
True False
killing <__main__.Flub object at 0x7f4aaa6fb050>
killing <__main__.Flub object at 0x7f4aaa6fb090>
It's perfect! Only two instances were made for the two unique id's given, and Flub.instances clearly only has two listed.
But when I tried to take this approach with the objects I was using, I got all kinds of nonsensical errors about how __init__()
took only 0 arguments, not 2. So I'd change some things around and then it would tell me that __init__()
needed an argument. Totally bizarre.
After a while of fighting with it, I basically just gave up and moved all the __new__()
black magic into a staticmethod called get
, such that I could call Photograph.get(filename)
and it would only call Photograph(filename)
if filename wasn't already in Photograph.instances
.
Does anybody know where I went wrong here? Is there some better way to do this?
Another way of thinking about it is that it's similar to a singleton, except it's not globally singleton, just singleton-per-filename.
Here's my real-world code using the staticmethod get if you want to see it all together.