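The Tupelo library can easily solve problems like this using the tupelo.forest tree data structure. Please see this question for more information. API docs can be found here.
Here we load your XML data and convert it first into Enlive format, then into a tupelo.forest tree. Library and data definitions: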
(ns tst.tupelo.forest-examples
  (:use tupelo.forest tupelo.test)
  (:require
    [clojure.data.xml :as dx]
    [clojure.java.io :as io]
    [clojure.set :as cs]
    [net.cgrand.enlive-html :as en-html]
    [schema.core :as s]
    [tupelo.core :as t]
    [tupelo.string :as ts]))
(t/refer-tupelo)
(def xml-str-prod "<data>
<products>
<product>
<section>Red Section</section>
<images>
<image>img.jpg</image>
<image>img2.jpg</image>
</images>
</product>
<product>
<section>Blue Section</section>
<images>
<image>img.jpg</image>
<image>img3.jpg</image>
</images>
</product>
<product>
<section>Green Section</section>
<images>
<image>img.jpg</image>
<image>img2.jpg</image>
</images>
</product>
</products>
</data> " )
and the initialization code:
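(dotest
  (with-forest (new-forest)
    (let [enlive-tree (->> xml-str-prod
                           java.io.StringReader.
                           en-html/html-resource ; HTML parser; enlive also offers xml-resource for XML input
                           first)
          root-hid    (add-tree-enlive enlive-tree) ; load the enlive tree into the forest
          tree-1      (hid->hiccup root-hid)        ; render the loaded tree as hiccup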
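The hid suffix stands for "Hex ID", a unique hex value that acts like a pointer to a node or leaf in the tree. At this stage we have only loaded the data into the forest data structure, creating tree-1, which looks like: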
[:data
[:tupelo.forest/raw "\n "]
[:products
[:tupelo.forest/raw "\n "]
[:product
[:tupelo.forest/raw "\n "]
[:section "Red Section"]
[:tupelo.forest/raw "\n "]
[:images
[:tupelo.forest/raw "\n "]
[:image "img.jpg"]
[:tupelo.forest/raw "\n "]
[:image "img2.jpg"]
[:tupelo.forest/raw "\n "]]
[:tupelo.forest/raw "\n "]]
[:tupelo.forest/raw "\n "]
[:product
[:tupelo.forest/raw "\n "]
[:section "Blue Section"]
[:tupelo.forest/raw "\n "]
[:images
[:tupelo.forest/raw "\n "]
[:image "img.jpg"]
[:tupelo.forest/raw "\n "]
[:image "img3.jpg"]
[:tupelo.forest/raw "\n "]]
[:tupelo.forest/raw "\n "]]
[:tupelo.forest/raw "\n "]
[:product
[:tupelo.forest/raw "\n "]
[:section "Green Section"]
[:tupelo.forest/raw "\n "]
[:images
[:tupelo.forest/raw "\n "]
[:image "img.jpg"]
[:tupelo.forest/raw "\n "]
[:image "img2.jpg"]
[:tupelo.forest/raw "\n "]]
[:tupelo.forest/raw "\n "]]
[:tupelo.forest/raw "\n "]]
[:tupelo.forest/raw "\n "]]
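To make the "pointer" idea concrete, here is a minimal sketch in plain Clojure of an ID-keyed tree. It is not how tupelo.forest is implemented; the `forest` map, the `:a1`-style ids, and the `id->hiccup` helper are all hypothetical names chosen for illustration. The point is only that parents refer to children by id, so a node can be looked up, replaced, or removed without rebuilding a nested structure.

```clojure
(ns sketch.id-forest)

;; Hypothetical miniature forest: every node lives in one flat map keyed by id,
;; and a parent lists its children by id instead of nesting them.
(def forest
  {:a1 {:tag :product :kids [:b2 :c3]}
   :b2 {:tag :section :value "Red Section"}
   :c3 {:tag :images  :kids [:d4]}
   :d4 {:tag :image   :value "img.jpg"}})

(defn id->hiccup
  "Follows ids from the given node downward and builds a hiccup vector."
  [forest id]
  (let [{:keys [tag value kids]} (get forest id)]
    (if kids
      (into [tag] (map #(id->hiccup forest %)) kids) ; interior node
      [tag value])))                                 ; leaf node

(id->hiccup forest :a1)
;; => [:product [:section "Red Section"] [:images [:image "img.jpg"]]]
```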
Next, we remove all blank strings with the following code:
          blank-leaf-hid?  (fn [hid]
                             (and (leaf-hid? hid) ; ensure it is a leaf node
                                  (let [value (hid->value hid)]
                                    (and (string? value)
                                         (or (zero? (count value)) ; empty string
                                             (ts/whitespace? value)))))) ; all-whitespace string
          blank-leaf-hids  (keep-if blank-leaf-hid? (all-hids))
          >>               (apply remove-hid blank-leaf-hids) ; `>>` is a throwaway binding, used only for the side effect
          tree-2           (hid->hiccup root-hid)
which yields a much cleaner result tree (in hiccup format):
[:data
[:products
[:product
[:section "Red Section"]
[:images [:image "img.jpg"] [:image "img2.jpg"]]]
[:product
[:section "Blue Section"]
[:images [:image "img.jpg"] [:image "img3.jpg"]]]
[:product
[:section "Green Section"]
[:images [:image "img.jpg"] [:image "img2.jpg"]]]]]
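For comparison, the same cleanup can be sketched in plain Clojure on an ordinary hiccup vector, without the forest. This is only an illustration of the idea: it assumes the whitespace appears as bare string children (rather than the `[:tupelo.forest/raw ...]` leaves shown earlier), and `strip-blank-strings` is a hypothetical helper, not part of Tupelo.

```clojure
(ns sketch.strip-blanks
  (:require [clojure.string :as str]))

(defn strip-blank-strings
  "Returns a copy of a hiccup-style vector with empty and
   whitespace-only string children removed at every level."
  [node]
  (if (vector? node)
    (into [(first node)]                                   ; keep the tag
          (comp (remove #(and (string? %) (str/blank? %))) ; drop blank strings
                (map strip-blank-strings))                 ; recurse into children
          (rest node))
    node))                                                 ; non-blank leaf, keep as-is

(strip-blank-strings
  [:product "\n  " [:section "Red Section"] "\n  "
   [:images "\n " [:image "img.jpg"] "\n "]])
;; => [:product [:section "Red Section"] [:images [:image "img.jpg"]]]
```

Unlike the forest version, which removes nodes in place by hid, this sketch builds a new tree.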
The following code then computes the answers to the three questions above:
          product-hids         (find-hids root-hid [:** :product])
          product-trees-hiccup (mapv hid->hiccup product-hids)

          img2-paths           (find-paths-leaf root-hid [:data :products :product :images :image] "img2.jpg")
          img2-prod-paths      (mapv #(drop-last 2 %) img2-paths)
          img2-prod-hids       (mapv last img2-prod-paths)
          img2-trees-hiccup    (mapv hid->hiccup img2-prod-hids)

          red-sect-paths       (find-paths-leaf root-hid [:data :products :product :section] "Red Section")
          red-prod-paths       (mapv #(drop-last 1 %) red-sect-paths)
          red-prod-hids        (mapv last red-prod-paths)
          red-trees-hiccup     (mapv hid->hiccup red-prod-hids)]
with the results:
(is= product-trees-hiccup
[[:product
[:section "Red Section"]
[:images
[:image "img.jpg"]
[:image "img2.jpg"]]]
[:product
[:section "Blue Section"]
[:images
[:image "img.jpg"]
[:image "img3.jpg"]]]
[:product
[:section "Green Section"]
[:images
[:image "img.jpg"]
[:image "img2.jpg"]]]] )
(is= img2-trees-hiccup
[[:product
[:section "Red Section"]
[:images
[:image "img.jpg"]
[:image "img2.jpg"]]]
[:product
[:section "Green Section"]
[:images
[:image "img.jpg"]
[:image "img2.jpg"]]]])
(is= red-trees-hiccup
[[:product
[:section "Red Section"]
[:images
[:image "img.jpg"]
[:image "img2.jpg"]]]]))))
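As a cross-check on these results, the three queries can also be sketched in plain Clojure with `tree-seq` over the cleaned hiccup tree. This is not the tupelo.forest approach: it walks nested vectors rather than following hids, and `nodes`, `tagged`, and `has-leaf?` are hypothetical helpers introduced only for this example.

```clojure
(ns sketch.hiccup-query)

(def tree
  [:data
   [:products
    [:product [:section "Red Section"]
     [:images [:image "img.jpg"] [:image "img2.jpg"]]]
    [:product [:section "Blue Section"]
     [:images [:image "img.jpg"] [:image "img3.jpg"]]]
    [:product [:section "Green Section"]
     [:images [:image "img.jpg"] [:image "img2.jpg"]]]]])

(defn nodes
  "All hiccup vectors in the tree, depth-first, including the root."
  [tree]
  (filter vector? (tree-seq vector? rest tree)))

(defn tagged
  "All nodes whose tag (first element) equals `tag`."
  [tag tree]
  (filter #(= tag (first %)) (nodes tree)))

(defn has-leaf?
  "True if `subtree` contains the leaf [tag text] anywhere below it."
  [subtree tag text]
  (boolean (some #(= [tag text] %) (nodes subtree))))

(def products      (tagged :product tree))
(def img2-products (filter #(has-leaf? % :image "img2.jpg") products))
(def red-products  (filter #(has-leaf? % :section "Red Section") products))

;; Section names of the products that contain img2.jpg
(mapv #(second (second %)) img2-products)
;; => ["Red Section" "Green Section"]
```

This works for read-only queries on a small tree; the forest representation is more convenient when nodes must also be removed or edited by reference, as in the whitespace cleanup step.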
The complete example can be found in the forest-examples unit test.