I have the following problem: I'm writing a web crawler for a school assignment, and I'm doing it in Clojure. Here is the code.
(defn get-links [page]
  ($ (get! page) td "a[href]" (attr "abs:href")))

(defn crawl [url current-depth max-depth]
  (def hrefs (get-links url))
  (if (< current-depth max-depth)
    (map crawl hrefs (iterate eval (inc current-depth)) (iterate eval max-depth))
    hrefs))
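To clarify the map line: eval of a number is just the number itself, so (iterate eval n) is an infinite sequence of n, and I'm only using it to pass the same depth arguments into every recursive call. A quick sketch of what I mean:

;; eval of a number is the number itself, so (iterate eval 3)
;; behaves like (repeat 3)
(take 5 (iterate eval 3))
;; => (3 3 3 3 3)

;; so the map line in crawl amounts to calling crawl on every link,
;; one level deeper, with the same max-depth, roughly (hypothetical):
;; (map #(crawl % (inc current-depth) max-depth) hrefs)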
The get! and $ functions are not written by me; I've taken them from here: https://github.com/mfornos/clojure-soup/blob/master/src/jsoup/soup.clj
My problem is that when I call (crawl "http://bard.bg" 0 0) from the REPL, I get the following output:
("http://www.bard.bg/genres/?id=1" "http://www.bard.bg/genres/?id=2" "http://www.bard.bg/genres/?id=4" "http://www.bard.bg/genres/?id=5" "http:/
("http://www.bard.bg/genres/?id=1" "http://www.bard.bg/genres/?id=2" "http://www.bard.bg/genres/?id=4" "http://www.bard.bg/genres/?id=5" "http:/
("http://www.bard.bg/genres/?id=1" "http://www.bard.bg/genres/?id=2" "http://www.bard.bg/genres/?id=4" "http://www.bard.bg/genres/?id=5" "http://www.bard.bg/genres/?id=6" "http://www.bard.bg/genres/?id=10" "http://www.bard.bg/genres/?id=17" "http://www.bard.bg/genres/?id=24"
...
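Since I call it with current-depth and max-depth both 0, the (< current-depth max-depth) test is false, so I would expect crawl to return just the single seq produced by get-links. Something like this is how I would double-check the shape of the return value (just a sketch):

;; with current-depth = max-depth the if falls through to hrefs,
;; so I expect one flat lazy seq of URL strings
(let [result (crawl "http://bard.bg" 0 0)]
  [(class result) (count result) (first result)])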
So where are the first two lazy seqs coming from? Why are they unfinished?
It seems like the problem is in Clojure-Soup, more specifically here:
(defmacro $ [doc & forms]
  (let [exprs# (map #(if (string? %) `(select ~%)
                       (if (symbol? %) `(select ~(str %))
                         (if (keyword? %) `(select ~(str "#" (name %)))
                           %))) forms)]
    `(->> ~doc ~@exprs#)))
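To check that suspicion, I looked at what the macro generates for the call in get-links, roughly like this (I'm assuming the namespace from the linked file is jsoup.soup and that $, get!, select and attr are public in it):

(require '[jsoup.soup :refer [$ get! attr select]])

;; show the code the $ macro produces for the call in get-links;
;; based on the macro body above I expect roughly
;; (->> (get! page) (select "td") (select "a[href]") (attr "abs:href"))
(macroexpand-1
  '($ (get! page) td "a[href]" (attr "abs:href")))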