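I am trying to write a simple grammar that covers both plain statements and statements enclosed in a block. A block is introduced by its own keyword. I have given the block rule an explicit precedence with prec, but tree-sitter still does not match it. It even reports an error, which means the other rules do not match either, and yet it will not match the block. Why is that, and how can I fix it?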

The code to parse:

area = pi*r^2;

block {
    r=12;
}

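tree-sitter matches the whole sequence block { r=12; as a statement, although curly braces are not allowed inside a statement. So it reports an error, but it does not try the block rule, even though that rule applies.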

The grammar:

module.exports = grammar({
    name: 'test',

    rules: {
        source_file: $ => seq(
            repeat(choice($.block, $.statement_with_semicolon)),
            optional($.statement_without_semicolon)
        ),

        block: $ => prec(1, seq(
            "block",
            "{",
            repeat( $.statement_with_semicolon ),
            optional( $.statement_without_semicolon),
            "}",
            optional(";")
        )),

        statement_without_semicolon: $ => $.token_chain,

        statement_with_semicolon: $ => seq(
            $.token_chain,
            ";"
        ),

        token_chain: $ => repeat1(
            $.token
        ),

        token: $ => choice(
            $.alphanumeric,
            $.punctuation
        ),

        alphanumeric: $ => /[a-zA-Zα-ωΑ-Ωа-яА-Я0-9]+/,

        punctuation: $ => /[^a-zA-Zα-ωΑ-Ωа-яА-Я0-9"{}\(\)\[\];]+/
    }
});

The output:

>tree-sitter parse example-file
(source_file [0, 0] - [4, 1]
  (statement_with_semicolon [0, 0] - [0, 14]
    (token_chain [0, 0] - [0, 13]
      (token [0, 0] - [0, 4]
        (alphanumeric [0, 0] - [0, 4]))
      (token [0, 4] - [0, 7]
        (punctuation [0, 4] - [0, 7]))
      (token [0, 7] - [0, 9]
        (alphanumeric [0, 7] - [0, 9]))
      (token [0, 9] - [0, 10]
        (punctuation [0, 9] - [0, 10]))
      (token [0, 10] - [0, 11]
        (alphanumeric [0, 10] - [0, 11]))
      (token [0, 11] - [0, 12]
        (punctuation [0, 11] - [0, 12]))
      (token [0, 12] - [0, 13]
        (alphanumeric [0, 12] - [0, 13]))))
  (statement_with_semicolon [0, 14] - [3, 9]
    (token_chain [0, 14] - [3, 8]
      (token [0, 14] - [2, 0]
        (punctuation [0, 14] - [2, 0]))
      (token [2, 0] - [2, 5]
        (alphanumeric [2, 0] - [2, 5]))
      (token [2, 5] - [2, 6]
        (punctuation [2, 5] - [2, 6]))
      (ERROR [2, 6] - [2, 7])
      (token [2, 7] - [3, 4]
        (punctuation [2, 7] - [3, 4]))
      (token [3, 4] - [3, 5]
        (alphanumeric [3, 4] - [3, 5]))
      (token [3, 5] - [3, 6]
        (punctuation [3, 5] - [3, 6]))
      (token [3, 6] - [3, 8]
        (alphanumeric [3, 6] - [3, 8]))))
  (statement_without_semicolon [3, 9] - [4, 0]
    (token_chain [3, 9] - [4, 0]
      (token [3, 9] - [4, 0]
        (punctuation [3, 9] - [4, 0]))))
  (ERROR [4, 0] - [4, 1]))
example-file    0 ms    (ERROR [2, 6] - [2, 7])

1 Answer

Your problem is that your punctuation regex matches the newline characters \n and \r, as you can see here:

  (statement_with_semicolon [0, 14] - [3, 9]
    (token_chain [0, 14] - [3, 8]
      (punctuation [0, 14] - [2, 0]))

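See how it matches the end of line zero and the blank line one? By the time the parser reaches block, it is already inside a statement_with_semicolon match, so it treats block as just another alphanumeric token. You can fix this by changing your punctuation definition to: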

punctuation: $ => /[^a-zA-Zα-ωΑ-Ωа-яА-Я0-9"{}\(\)\[\];\n\r]+/
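
As a quick check outside tree-sitter, the two character classes can be compared in plain JavaScript. This is only a sketch: matchAt is a hypothetical helper, the sticky flag y is added here to anchor the match at one offset, and JavaScript's regex engine stands in for tree-sitter's lexer.

```javascript
// Plain JavaScript regex check, independent of tree-sitter.
// The character classes are copied from the question and the fix above;
// the `y` (sticky) flag is an addition so that a match must start at `lastIndex`.
const oldPunctuation = /[^a-zA-Zα-ωΑ-Ωа-яА-Я0-9"{}\(\)\[\];]+/y;
const newPunctuation = /[^a-zA-Zα-ωΑ-Ωа-яА-Я0-9"{}\(\)\[\];\n\r]+/y;

// The example file from the question.
const input = "area = pi*r^2;\n\nblock {\n    r=12;\n}\n";

// Hypothetical helper: try to match `regex` exactly at `offset`.
// Returns the matched text, or null if nothing matches there.
function matchAt(regex, text, offset) {
  regex.lastIndex = offset;
  const m = regex.exec(text);
  return m === null ? null : m[0];
}

// Position right after the first ";", i.e. [0, 14] in the parse output.
const afterSemicolon = input.indexOf(";") + 1;

const oldMatch = matchAt(oldPunctuation, input, afterSemicolon);
const newMatch = matchAt(newPunctuation, input, afterSemicolon);

console.log(JSON.stringify(oldMatch)); // "\n\n"
console.log(JSON.stringify(newMatch)); // null
```

The old class consumes the two newlines as a punctuation token, which corresponds to the node spanning [0, 14] - [2, 0] in the output. The new class finds nothing at that offset.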

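However, this probably won't be the last issue of this kind that you run into, so you may want to rewrite your grammar to be more explicit about which punctuation it accepts and where. For example, define a set of valid operators.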
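
One way the "set of valid operators" idea might look is sketched below. It keeps the rules from the question and only replaces the negated character class with an explicit list. The particular operators are an assumption chosen to cover the example file, not a list the answer prescribes, and the alphanumeric class is shortened to ASCII for brevity. Whitespace is no longer part of any token here and is left to tree-sitter's default extras, which skip whitespace.

```javascript
// grammar.js sketch; meant to be processed by `tree-sitter generate`,
// not run directly with node.
module.exports = grammar({
    name: 'test',

    rules: {
        source_file: $ => seq(
            repeat(choice($.block, $.statement_with_semicolon)),
            optional($.statement_without_semicolon)
        ),

        block: $ => seq(
            "block",
            "{",
            repeat($.statement_with_semicolon),
            optional($.statement_without_semicolon),
            "}",
            optional(";")
        ),

        statement_without_semicolon: $ => $.token_chain,

        statement_with_semicolon: $ => seq($.token_chain, ";"),

        token_chain: $ => repeat1($.token),

        token: $ => choice($.alphanumeric, $.punctuation),

        // Assumption: ASCII only, to keep the sketch short.
        alphanumeric: $ => /[a-zA-Z0-9]+/,

        // Assumption: an explicit, hypothetical operator list instead of
        // "anything that is not alphanumeric or a bracket".
        punctuation: $ => choice("=", "+", "-", "*", "/", "^")
    }
});
```

With an explicit list, a newline or a space can never end up inside a punctuation token, so a new statement or block always starts from a clean state.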

This also answers your other question.

answered 2021-04-04T14:14:10.393