json - Should the PARSE dialect be used for tasks that fundamentally modify the input?

Question

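In honor of Rebol 3 going open source any minute now (?), I'm back to messing with it. As an exercise, I'm trying to write my own JSON parser in the PARSE dialect.

Since Douglas Crockford credits Rebol's influence on his discovery of JSON, I thought this would be easy. Apart from replacing braces with brackets and getting rid of all those commas, one of the barriers to simply using LOAD on the string is that when JSON wants the equivalent of a SET-WORD!, it uses something that looks like a string to Rebol's tokenizer, followed by an illegal stray colon: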

{
    "key one": {
         "summary": "This is the string content for key one's summary",
         "value": 7
    },
    "key two": {
         "summary": "Another actually string, not supposed to be a 'symbol'",
         "value": 100
    }
}

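Basically, I want to find every case like "foo bar": and turn it into foo-bar: while leaving alone any matched pair of quotes that is not followed by a colon.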
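To make the goal concrete, here is a sketch of what the first entry of the sample above might look like once every step is done. It assumes the bracket and comma substitutions mentioned earlier have also been applied; the layout is only an illustration, not the output of any code in this question.

```rebol
; Hypothetical target form for "key one" from the sample JSON.
; Each quoted key followed by a colon has become a SET-WORD!,
; spaces in keys have become dashes, and ordinary strings are untouched.
[
    key-one: [
        summary: "This is the string content for key one's summary"
        value: 7
    ]
]
```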

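When I tackled this in PARSE (which I understand fairly well in principle but still haven't used much), a couple of questions came up. Mainly: what are the guaranteed conditions under which you can escape into code and modify the series out from under the parser, specifically in Rebol 3? More generally, is it the "right tool for the job"?

Here is the rule I tried, which appears to work for this part of the task: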

any [
    ; require a matched pair of quotes & capture series positions before
    ; and after the first quote, and before the last quote

    to {"} beforePos: skip startPos: to {"} endPos: skip

    ; optional colon next (if not there the rest of the next rule is skipped)

    opt [
        {:}

        ; if we got to this part of the optional match rule, there was a colon.
        ; we escape to code changing spaces to dashes in the range we captured

        (
            setWordString: copy/part startPos endPos
            replace/all setWordString space "-"
            change startPos setWordString
        )

        ; break back out into the parse dialect, and instead of changing the 
        ; series length out from under the parser we jump it back to the position
        ; before that first quote that we saw

        :beforePos

        ; Now do the removals through a match rule.  We know they are there and
        ; this will not cause this "colon-case" match rule to fail...because we
        ; saw those two quotes on the first time through!

        remove [{"}] to {"} remove [{"}]
    ]
]

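Is that okay to use? Is there any chance that the change startPos setWordString in the open code here breaks the outer parse, if not in this case, then in something subtly different?

As always, any didactic "it's cleaner/shorter/better this other way" advice is appreciated.

P.S. Why isn't there a replace/all/part?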

score 2 · Accepted Answer

Another way is to think of parse as a compiler-compiler with EBNF. If I recall the R2 syntax correctly:

copy token [rule] (append output token)

Assuming the JSON syntax is correct and there is no {"} inside the strings:

thru {"} skip copy key to {"} skip
; we know ":" must be there, no check
thru {"} copy content to {"} skip
(append output rejoin[ {"} your-magic-with key {":"} content {"} ])

More exact, going character by character instead of using to:

any space  {"} copy key some [ string-char | "\" skip ] {"} 
any space ":" any space {"} copy content any [ string-char  | "\" skip ] {"} 
(append output rejoin[ {"} your-magic-with key {":"} content {"} ])
; content can be empty -> any, key not -> some

string-char would be a charset matching anything except {\} and {"}. Not sure about the exact syntax.
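One possible definition, as a sketch: it uses the standard charset and complement functions, and string-char is just the placeholder name from the rule above. Inside a braced string, neither the backslash nor the double quote needs escaping.

```rebol
; A bitset of every character except backslash and double quote.
; The name string-char comes from the rule above; this definition is an assumption.
string-char: complement charset {\"}
```

With that in place, some [string-char | "\" skip] consumes ordinary characters one at a time and treats a backslash plus the character after it as an escaped pair.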

I don't know whether R3 still works that way... :-/

score 2 · Accepted Answer

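The new keywords like change, insert and remove should facilitate this type of thing. I'd imagine the main downside of this approach is latency from shifting series contents around (I've seen it mentioned that building new strings is faster than manipulating existing ones).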

token: [
    and [{"} thru {"} any " " ":"]
    remove {"} copy key to {"} remove {"} remove any " "
    (key: replace/all key " " "-")
]

parse/all json [
    any [
        to {"} [
            and change token key
            ; next rule here, example:
            copy new-key thru ":" (probe new-key)
            | skip
        ]
    ]
]

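This is somewhat convoluted, because I can't seem to get 'change to work as I'd expect (it behaves like change, not change/part), but in theory you should be able to shorten it along these lines and end up with a fairly clean rule. The ideal might be: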

token: [
    {"} copy key to {"} skip any " " and ":"
    (key: replace/all key " " "-")
]

parse/all json [
    any [
        to {"} change token key
        | thru {"}
    ]
]

Edit: another fudge around change:

token: [
    and [{"} key: to {"} key.: skip any " " ":"]
    (key: replace/all copy/part key key. " " "-")
    remove to ":" insert key
]

parse/all json [
    any [to {"} [token | skip]]
]
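For comparison with the in-place rules above, and given the note about the cost of shifting series contents, here is a minimal sketch that builds a new string instead of modifying json. It is not taken from any of the answers; the names output, start, finish, str and ch are made up for illustration. It assumes R2-style parse/all and strings that contain no escaped quotes.

```rebol
json: {{"key one": {"value": 7}}}
output: copy ""

parse/all json [
    any [
        ; a quoted string: mark both ends, then copy what lies between
        {"} start: to {"} finish: skip
        (str: copy/part start finish)
        [
            ; followed by a colon: emit the key with dashes and no quotes
            any " " ":"
            (
                replace/all str " " "-"
                append output str
                append output ":"
            )
            |
            ; otherwise emit the string unchanged, quotes included
            (append output rejoin [{"} str {"}])
        ]
        |
        ; any other character is copied through as-is
        copy ch skip (append output ch)
    ]
]

print output
```

For the one-line sample this should leave {key-one: {value: 7}} in output, while json itself is never modified, so the question of changing a series underneath the parser does not arise.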

score 0 · Accepted Answer

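Since others have already answered the parse question, I'll answer the P.S.:

Several options have been proposed for replace that were never added. The main reason is that processing options has overhead, and the function already needs some interesting optimizations to handle the options it has. We're going to try to replace the function with a native once we've refined its API a bit. It's basically the same situation as with the reword function, where we didn't settle on the final API until recently. For replace, we haven't even had the discussion yet.

As for the /part option, nobody has suggested it before, and it might be a bit awkward conceptually to unify it with the existing internal length calculations. It would be possible to have a limited /part option that takes only an integer rather than an offset reference. It would probably be best if the /part length took precedence over the internally calculated length. Still, if we end up with a tweaked API, a /part option might not be needed.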
