mysql - Any particular rules for converting MySQL data into Prolog rules for exploratory mining?

Question

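I have three tables (so far): one with 2000 rows and two others with about 1.6 million rows each. They have columns that relate them to one another, but these are not formal FK fields. I wrote a C++ program that generates a rule file from the source MySQL data, as follows: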

if T{ C1, C2...Cn } is the table definition
then the rule would be:
    T(C1, C2, Cn).

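My conversion utility emits numeric values bare and puts anything else inside single quotes, so INT(n), DECIMAL, etc. become Prolog numbers and everything else becomes an atom.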

So my question is this: if I want to write search rules for a table/rule with 26 fields, is there a "meta-level" way of expressing this:

findStuffById(I,FieldIWant1,FieldIWant2etc) :-
    tablename(_,_,I,_,FieldIWant1,_,_,_,FieldIWant2etc,_,_,....).

Or do I have to create "simpler" rules and then use "_" or variables to capture the things I want?

The prospect of writing rules such as...

findThisById(X, This)         :- tablename(X,_,_,_,_,This,_,_).
findThatById(X, That)         :- tablename(X,_,_,That,_,_,_,_).
findTheOtherById(X, TheOther) :- tablename(X,_,_,_,_,_,_,TheOther).

...is sickening... disturbing, even!

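My only thought so far is that I may need to create compound terms within the rules to group things together, i.e. reduce the number of variables in a rule, but might that limit the "degrees of freedom" of future queries? I am no Prolog expert; I have played with it for many hours and am eager to find a real-world use for it in my own day job.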

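I suppose my conversion program could also generate the grunt-work rules so that I don't have to code them by hand. The large tables have 26 and 28 columns respectively, so you can see where this is going!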

Any suggestions on how to proceed are welcome; I haven't used Prolog as much as I would like, and I always want to learn more!

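I want to create a SWI-Prolog web server and run it head-to-head against an ElasticSearch server on the same raw data, to see which responds faster to ad hoc queries. ES also does not seem to do traditional joins unless you pre-create a compound document with the indexes embedded, which may also favor Prolog.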

Thanks. Sean.

score 2 · Accepted Answer

You could maybe use nth1/3 and the "univ" operator (=..), doing something like this:

fieldnames(t, [id,this,that]).
get_field(Field, Tuple, Value) :- 
    Tuple =.. [Table|Fields],
    fieldnames(Table, Names),
    nth1(Idx, Names, Field),
    nth1(Idx, Fields, Value).

You'd need to create fieldnames/2 records for each table structure, and you'd have to pass the table structure along to this query. It wouldn't be terrifically efficient, but it would work.

?- get_field(this, t(testId, testThis, testThat), Value).
Value = testThis.

You could then build your accessors on top of this pretty easily:

findThisById(X, This) :- get_field(this, X, This).

Edit: Boris points out rightly that arg/3 will do this with even less work:

get_field(Field, Tuple, Value) :-
    functor(Tuple, Table, _),
    fieldnames(Table, Names),
    nth1(Idx, Names, Field),
    arg(Idx, Tuple, Value).
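
To connect this back to the question's findStuffById, here is a minimal sketch that builds the positional goal from a list of Field=Value pairs and then calls it against the stored facts. The predicates lookup/2 and bind_fields/3 and the sample t/3 rows are hypothetical names invented for illustration; only fieldnames/2 comes from the code above.

```prolog
:- use_module(library(lists)).

% Hypothetical three-column table with two sample rows.
fieldnames(t, [id, this, that]).

t(1, apple, red).
t(2, pear, green).

% lookup(+Table, +Pairs): Pairs is a list of Field=Value.
% Builds a fresh skeleton such as t(_,_,_), binds the named
% columns by position, then calls the skeleton as a goal.
lookup(Table, Pairs) :-
    fieldnames(Table, Names),
    length(Names, Arity),
    functor(Row, Table, Arity),
    bind_fields(Pairs, Names, Row),
    call(Row).

% Unify each named field with the matching argument of Row.
bind_fields([], _, _).
bind_fields([Field=Value|Rest], Names, Row) :-
    once(nth1(Idx, Names, Field)),
    arg(Idx, Row, Value),
    bind_fields(Rest, Names, Row).
```

With these facts, ?- lookup(t, [id=2, that=That]). binds That = green. The name-to-position work happens on every call, so this is convenient for exploration rather than fast.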

Prolog is so awesome.

score 1 · Accepted Answer

On a real e-commerce database, I used code like this:

update_price(File, Pid, Cn, Manu) :-
    product(Pid, [tax_class_id = Tid, /*cost = Cost,*/ price = CurrPrice]),
    tax_rate(_, [tax_class_id = Tid, rate = R]),
    manufacturer(Manu, name = NameM),
    (   ( NameM == 'Gruppo Aboca' ; NameM == 'L\'Amande' )
        ->  % Pr is Cost * 3 / 2
        togli_iva(R, Cn, Pr)
    ;   togli_iva(R, Cn, NoIva),
        Pr is NoIva * 2
    ),
    Delta is abs(CurrPrice - Pr),
    (   Delta > 0.01
    ->  Prx is round(Pr * 100) / 100,
        format(File, 'update product set price=~w where product_id=~d~@', [Prx, Pid, eol])
    ;   true
    ).

product, tax_rate, and manufacturer are actual table names with known structure. For example, product has 26 columns, tax_rate has 8, ...

I have the declarations

:- dynamic 
    ...
    product/2,product/26,
    ...
    tax_rate/2,tax_rate/8,
    ...

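The data is read from an SQL dump, and once it has been asserted in memory I build the product/2 accessors, which take care of converting field names to positions.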

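I don't recommend this approach, because it would be too slow for your data. Instead, you could use goal_expansion/2 and, at compile time, translate any call to table(field1=Value1, field2=Value2) into the corresponding call to table(_,_,Value1,_,Value2), the Prolog positional way.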

This should get the best performance out of SWI-Prolog's indexing, which was recently updated to work on all columns.
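
A minimal sketch of that goal_expansion/2 idea, under some assumptions: the fields are passed as a single list argument (as in the product/2 style above, not as separate arguments), the column list is stored in a hypothetical fieldnames/2 fact, and the table is shrunk to three made-up columns. The helper bind_fields/3 and the example price_of/2 are illustrative names, not part of my actual code.

```prolog
:- use_module(library(lists)).

% Hypothetical column list and one sample row; the real ones
% would come from the schema and the SQL dump.
fieldnames(product, [product_id, tax_class_id, price]).
product(1, 7, 9.99).

% Unify each Field=Value pair with its position in Row.
bind_fields([], _, _).
bind_fields([Field=Value|Rest], Names, Row) :-
    once(nth1(Idx, Names, Field)),
    arg(Idx, Row, Value),
    bind_fields(Rest, Names, Row).

:- multifile user:goal_expansion/2.
:- dynamic user:goal_expansion/2.

% Rewrite Table([f1=V1, ...]) into the positional goal
% Table(_, ..., V1, ...) while the clause is being compiled.
user:goal_expansion(Goal, Row) :-
    callable(Goal),
    Goal =.. [Table, Pairs],
    is_list(Pairs),
    fieldnames(Table, Names),
    length(Names, Arity),
    functor(Row, Table, Arity),
    bind_fields(Pairs, Names, Row).

% Compiled as: price_of(Id, Price) :- product(Id, _, Price).
price_of(Id, Price) :-
    product([product_id = Id, price = Price]).
```

Here ?- price_of(1, P). gives P = 9.99. The rewrite only fires when the field list is written out literally in the source; a list that is still unbound at compile time is left alone and would fail at run time with an unknown product/1.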

Of course, to get more detailed help you should post your metadata format...

Edit: if you are interested in simple single-field accessors (as I understand from the comments on Daniel's answer), you could try library(record).
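
A rough sketch of how library(record) might be used here, assuming a hypothetical three-column table t/3 whose record constructor has the same name and arity as the stored facts. The directive generates default_t/1 and one accessor per field (t_id/2, t_this/2, t_that/2); this_by_id/2 is an illustrative name.

```prolog
:- use_module(library(record)).

% Hypothetical table: declare its columns as record fields.
:- record t(id, this, that).

t(1, apple, red).
t(2, pear, green).

% Look up the 'this' column by id using generated accessors.
this_by_id(Id, This) :-
    default_t(Row),     % fresh t(_,_,_), no defaults declared
    t_id(Row, Id),      % bind the id column
    call(Row),          % match against the stored facts
    t_this(Row, This).  % read the 'this' column
```

With these facts, ?- this_by_id(2, This). gives This = pear.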
