
I have the following problem. I'm writing a web crawler for a school assignment, and I'm doing it in Clojure. Here is the code.

(defn crawl [url current-depth max-depth]
  (def hrefs (get-links url))
  (if (< current-depth max-depth)
    (map crawl hrefs (iterate eval (inc current-depth)) (iterate eval max-depth))
    hrefs))

(defn get-links [page]
  ($ (get! page) td "a[href]" (attr "abs:href")))

The get! and $ functions are not written by me; I've taken them from here: https://github.com/mfornos/clojure-soup/blob/master/src/jsoup/soup.clj

My problem is that when I call (crawl "http://bard.bg" 0 0) from the REPL, I get the following output:

("http://www.bard.bg/genres/?id=1" "http://www.bard.bg/genres/?id=2" "http://www.bard.bg/genres/?id=4" "http://www.bard.bg/genres/?id=5" "http:/
("http://www.bard.bg/genres/?id=1" "http://www.bard.bg/genres/?id=2" "http://www.bard.bg/genres/?id=4" "http://www.bard.bg/genres/?id=5" "http:/
("http://www.bard.bg/genres/?id=1" "http://www.bard.bg/genres/?id=2" "http://www.bard.bg/genres/?id=4" "http://www.bard.bg/genres/?id=5" "http://www.bard.bg/genres/?id=6" "http://www.bard.bg/genres/?id=10" "http://www.bard.bg/genres/?id=17" "http://www.bard.bg/genres/?id=24"
...

So where are the first two lazy seqs coming from? Why are they unfinished?

It seems like the problem is in Clojure-Soup, more specifically here:

(defmacro $ [doc & forms]
  (let [exprs# (map #(if (string? %) `(select ~%)
                       (if (symbol? %) `(select ~(str %))
                         (if (keyword? %) `(select ~(str "#" (name %)))
                           %))) forms)]
    `(->> ~doc ~@exprs#)))
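
Reading that macro, the call in get-links would effectively thread the document through select calls, roughly like this (a sketch of the expansion; namespace qualification of the symbols is omitted):

;; ($ (get! page) td "a[href]" (attr "abs:href")) expands roughly to:
(->> (get! page)
     (select "td")        ; a bare symbol becomes a selector string
     (select "a[href]")   ; a string form becomes a select call
     (attr "abs:href"))   ; any other form is threaded through unchanged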

1 Answer


I can't reproduce the problem you describe. In my case, (crawl "http://bard.bg" 0 0) returns a list of 174 strings.

However, I'd like to take this opportunity to point out the incorrect use of def in the crawl function. Instead of def you should use let. Also, instead of (iterate eval ...) use repeat.

(defn crawl [url current-depth max-depth]
  (let [hrefs (get-links url)]
    (if (< current-depth max-depth)
      (map crawl hrefs (repeat (inc current-depth)) (repeat max-depth))
      hrefs)))
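
As a quick REPL sketch (just an illustration, not part of the crawler): both forms produce an infinite lazy sequence of the same value, so map keeps feeding the depths to every recursive call, but repeat states that intent directly and never calls eval.

(take 3 (iterate eval 5)) ; => (5 5 5), since (eval 5) is just 5
(take 3 (repeat 5))       ; => (5 5 5)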

For a discussion, see let vs def in Clojure.
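
To make the difference concrete, here is a minimal sketch with hypothetical names: def inside a function creates (or re-binds) a namespace-level var shared by every call, whereas let introduces a binding local to that single invocation.

;; def inside a function (re)binds a var at the namespace level on every
;; call, so concurrent or recursive calls all share and overwrite it:
(defn scale-bad [x]
  (def doubled (* 2 x)) ; doubled now exists in the whole namespace
  doubled)

;; let keeps the binding local to this one invocation:
(defn scale-good [x]
  (let [doubled (* 2 x)]
    doubled))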

answered 2013-01-12T15:00:06.730