I have a rule-based system with several hundred thousand facts, and I'm getting very poor performance from PyCLIPS just loading the facts.

I've narrowed it down to a simple example with two templates and a single rule that joins them (and does nothing else):

import clips
import timeit

env = clips.Environment()
env.BuildTemplate('F1', '(slot x (type INTEGER))')
env.BuildTemplate('F2', '(slot x (type INTEGER))')
env.BuildRule('Rule1', '(F1 (x ?val)) (F2 (x ?val))', '')

N = 20000
with open('F1.txt', 'w') as f1:
    with open('F2.txt', 'w') as f2:
        for n in xrange(N):
            print >>f1, '(F1 (x {}))'.format(n)
            print >>f2, '(F2 (x {}))'.format(n)

print timeit.timeit(lambda : env.LoadFacts('F1.txt'), number=1)
print timeit.timeit(lambda : env.LoadFacts('F2.txt'), number=1)

Output:

0.0951321125031
14.6272768974

So the second batch of 20K facts takes 14.6 seconds to load. Loading the same fact files from the CLIPS console is instantaneous. Trying different values of N shows that the loading time is roughly proportional to N^2, which makes this completely unusable for large numbers of facts.

Switching the order of operations and defining the rule after loading the facts does not help (whichever operation comes last is always the slow one).

Is anyone familiar with this issue? Am I using PyCLIPS in a wrong way?

I am running PyCLIPS v1.0.7.348 and CLIPS v6.3.

1 Answer

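CLIPS 6.3 uses hashing in the joins that compare variables from one pattern to another. This can significantly improve performance when there are large numbers of facts and rules similar to those in your example. In prior versions of CLIPS, when a new F1 fact was asserted, an iteration would occur over all F2 facts matching the second pattern (and a similar iteration would occur for every new F2 fact). In version 6.3, the iteration only occurs over the facts whose ?val values hash into the same bucket. The readme page on the PyCLIPS website indicates that it is compiled against CLIPS 6.24, so that would explain the performance difference. Incidentally, I don't recall any significant API differences between 6.24 and 6.3, so it may be possible to recompile PyCLIPS with the more recent version of CLIPS to get the performance improvement.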
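To make the difference concrete, here is a toy model of the two join strategies in plain Python. It is only a sketch and not CLIPS's actual implementation: the function names are made up for illustration, and a Python dict keyed directly on the slot value stands in for the hash buckets (real buckets can also hold colliding values). It counts how many fact-to-fact comparisons each strategy performs when N F2 facts are joined against N F1 facts.

```python
from collections import defaultdict


def nested_loop_join(f1_values, f2_values):
    """Pre-6.3 style (toy model): compare each new F2 fact with every F1 fact."""
    comparisons = 0
    matches = 0
    for v2 in f2_values:
        for v1 in f1_values:
            comparisons += 1
            if v1 == v2:
                matches += 1
    return matches, comparisons


def hashed_join(f1_values, f2_values):
    """6.3 style (toy model): only compare facts that land in the same bucket."""
    # Assumption: a dict keyed on the slot value stands in for the hash table.
    buckets = defaultdict(list)
    for v1 in f1_values:
        buckets[v1].append(v1)

    comparisons = 0
    matches = 0
    for v2 in f2_values:
        for v1 in buckets.get(v2, ()):
            comparisons += 1
            if v1 == v2:
                matches += 1
    return matches, comparisons


N = 1000
values = list(range(N))
naive_result = nested_loop_join(values, values)
hashed_result = hashed_join(values, values)
```

Both strategies find the same N matches, but the nested loop performs N * N comparisons while the hashed version performs N, which is consistent with the quadratic loading time observed in the question.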

Answered 2014-06-29T17:53:28.183