I have a rule-based system with several hundred thousand facts, and I'm getting very poor performance from PyCLIPS just loading the facts.
I've narrowed it down to a simple example with two templates and a single rule that joins them (and does nothing else):
import clips
import timeit
env = clips.Environment()
env.BuildTemplate('F1', '(slot x (type INTEGER))')
env.BuildTemplate('F2', '(slot x (type INTEGER))')
env.BuildRule('Rule1', '(F1 (x ?val)) (F2 (x ?val))', '')
N = 20000
with open('F1.txt', 'w') as f1:
    with open('F2.txt', 'w') as f2:
        for n in xrange(N):
            print >>f1, '(F1 (x {}))'.format(n)
            print >>f2, '(F2 (x {}))'.format(n)
print timeit.timeit(lambda : env.LoadFacts('F1.txt'), number=1)
print timeit.timeit(lambda : env.LoadFacts('F2.txt'), number=1)
Output:
0.0951321125031
14.6272768974
So the second batch of 20K facts takes 14.6 seconds to load, whereas loading the same fact files from the CLIPS console is instantaneous. Trying different values of N shows that the loading time is roughly proportional to N squared, which makes this completely unusable for large numbers of facts.
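One way to put a number on the scaling is to fit the slope of log(time) against log(N): a slope near 1 suggests linear growth, and a slope near 2 suggests quadratic growth. The sketch below is only an illustration. `estimate_exponent` is a hypothetical helper, not part of PyCLIPS, and the timings fed to it are synthetic values generated from a quadratic formula, not measurements from the run above.

```python
import math


def estimate_exponent(sizes, times):
    """Return the least-squares slope of log(time) against log(size).

    Hypothetical helper for illustration; it is not part of PyCLIPS.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least-squares slope: covariance(x, y) / variance(x).
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Synthetic timings that follow t = c * N**2 exactly (made up, not measured).
sizes = [5000, 10000, 20000]
synthetic_times = [1e-8 * n ** 2 for n in sizes]

exponent = estimate_exponent(sizes, synthetic_times)
```

In practice, `synthetic_times` would be replaced by `timeit` results for the second `LoadFacts` call at each value of N, with a fresh environment per run so earlier facts do not skew the measurement.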
Switching the order of operations and defining the rule after loading the facts does not help: whichever operation comes last is always the slow one.
Is anyone familiar with this issue? Am I using PyCLIPS in a wrong way?
I am running PyCLIPS v1.0.7.348 and CLIPS v6.3.