USING: accessors html.parser.analyzer io kernel math namespaces
  present regexp sequences ;
IN: all-roads-to-wiki

SYMBOL: G

: match-good-pages ( a -- ?/f )
  R/ \/wiki\/[^:]*$/ first-match ;

: filter-urls ( tags -- urls )
  find-hrefs [ present ]     map
  [ match-good-pages ]       filter
  [ match-good-pages seq>> ] map ;

: findpath ( url -- url )
  G get =
  [
     ! false
  ]
  [ scrape-html nip
    [
      dup "title" find-by-name drop 1 + swap nth
      text>> R/ - Wikipedia,/ re-split first print
    ]
    [
      "bodyContent" find-by-id-between filter-urls [ findpath ] map
    ] bi
  ] if ; inline recursive

: allroads-entry ( -- a )
  readln "http://en.wikipedia.org/wiki/" prepend G set-global
  "enwp.org/Special:Random" findpath ; inline

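The code above will walk every link on Wikipedia until it finds the one it is looking for.

That's fine, because (hopefully) findpath will eventually "return" (i.e. stop calling itself) and leave a huge nested data structure on the stack. But when I try to compile this, I get an unbalanced-recursion error:

The recursive word "findpath" leaves with the stack having the wrong height

unbalanced-recursion: Thrown when stack effect inference determines that an inline recursive word has an incorrect stack effect declaration.

No matter what I do, Factor (understandably) complains that the stack effect doesn't match. What do I have to do to get this to recurse properly?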


1 Answer


Look closely at the findpath word. I have added comments so you can see what is on the stack at each step:

: findpath ( url -- url )
    ! 1 item: { url }
    G 
    ! 2 items: { url G }
    get 
    ! 2 items: { url value-of-G }
    =
    ! 1 item: { t/f }
    [
       ! 0 items!!!!
       ! false
    ]
    [ scrape-html nip
        [
            dup "title" find-by-name drop 1 + swap nth
            text>> R/ - Wikipedia,/ re-split first print
        ]
        [
            "bodyContent" find-by-id-between filter-urls 
            [ findpath ] map
        ] bi
    ] if ; inline recursive

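The if combinator consumes the last item on the stack (the boolean), and = has already consumed the URL, so neither branch has anything left to work with and this code cannot possibly work. Here is working code for the findpath word: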

: page-title ( seq -- title )
    dup "title" find-by-name drop 1 + swap nth
    text>> R/ - Wikipedia,/ re-split first ;

: page-links ( seq -- links )
    "bodyContent" find-by-id-between filter-urls ;

: scrape-en-wiki-url ( wiki-url -- seq )
    "https://en.wikipedia.org" prepend
    dup print flush scrape-html nip ;

: found-url? ( wiki-url -- ? )
    G get [ = ] [ drop t ] if* ;

: findpath ( wiki-url -- seq/f )
    dup found-url?
    [ drop f G set f ] [
        scrape-en-wiki-url
        [ page-title print flush ] [
            page-links [ findpath ] map
        ] bi
    ] if ; inline recursive
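
The balancing rule behind the fix is easier to see on a smaller word. The sketch below is only an illustration: sum-to is a hypothetical word that does not appear in the question, and it is an ordinary (non-inline) word, so it needs no recursive declaration. It uses dup to keep a copy of the input for the branches, the same way findpath above does before calling found-url?, and both quotations passed to if turn one number into one number.

```factor
USING: kernel math prettyprint ;
IN: scratchpad

! Hypothetical example word, not part of the original question.
! Sums the integers n, n-1, ..., 1.
: sum-to ( n -- sum )
    dup 0 >                 ! ( n ? )  dup keeps n for the branches
    [ dup 1 - sum-to + ]    ! recursive case: one number in, one out
    [ drop 0 ]              ! base case:      one number in, one out
    if ;

4 sum-to .    ! prints 10
```

If the base case were written as [ ] instead of [ drop 0 ], it would have the same effect as the declaration only by accident of leaving n in place; if it were written as [ drop ], the two branches would leave different stack heights and the stack checker should reject the word, which is the same kind of mismatch the original findpath ran into.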

Also take a look at the Wikipedia vocab, which is intended for tasks like this.

Answered 2016-04-30T15:59:29.910