clojure - Idiomatic Clojure for picking between random weighted choices

Question

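While dabbling in Clojure, I completed a small sample program that picks a random choice from a list of choices.

The basic idea is to iterate over the choices (which are assigned a weight), convert their weights into ranges, and then pick a random number within the total range to select one. It may not be the most elegant design, but let's take it for granted.

What would you do differently compared with my example below?

I'm not interested in advice on overall program structure, namespacing, and so on; I'm mainly interested in your approach to each function.

I'm especially interested in how a seasoned Clojurer would handle the "augment" function, in which I had to use the external "cur" variable to refer to the previous end of the range.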

    (def colors
      (hash-map
        :white 1,
        :red 10,
        :blue 20,
        :green 1,
        :yellow 1
      )
    )

    (def color-list (vec colors))

    (def cur 0)

    (defn augment [x] 
      (def name (nth x 0))
      (def val (nth x 1))
      (def newval (+ cur val))
      (def left cur)
      (def right newval)
      (def cur (+ cur val))
      [name left right]
    )

    (def color-list-augmented (map augment color-list))

    (defn within-bounds [bound]
      (def min-bound (nth bound 1))
      (def max-bound (nth bound 2))
      (and (> choice min-bound) (< choice max-bound))
    )

    (def choice (rand-nth (range cur)))

    (def answer 
      (first (filter within-bounds color-list-augmented))
    )

    (println "Random choice:" (nth answer 0))

score 8 · Accepted Answer

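I'd suggest working through some problems at http://www.4clojure.com/ while learning Clojure. You can "follow" the top users and see how they solve problems.

Here is a solution. Again, it is not the most efficient one, since my aim was to keep it simple and avoid the more advanced ideas and constructs you'll learn later.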

user=> (def colors {:white 1  :red 10  :blue 20  :green 1  :yellow 1})
#'user/colors
user=> (keys colors)
(:white :red :blue :green :yellow)   
user=> (vals colors)
(1 10 20 1 1)

To convert the weights to intervals, we just take a cumulative sum:

user=> (reductions #(+ % %2) (vals colors))
(1 11 31 32 33)

Finding a random interval:

user=> (rand-int (last *1))
13
user=> (count (take-while #(<= % *1 ) *2 ))
2

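Note that in the REPL, *1 refers to the most recently printed value, *2 to the one before that, and so on. So we asked for a random integer between 0 (inclusive) and 33 (exclusive). These 33 possible choices correspond to the sum of the weights. Next, we counted how many intervals we have to pass through to reach that number. Here the random number was 13.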

(1 11 31 32 33) 
     ^ 13 belongs here, 2 numbers in

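We find our random number 2 intervals in. Note that to land here, the number must be at least 11 but less than 31, so there are 20 possibilities, which is precisely the weight of...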

user=> (nth (keys colors) *1)
:blue
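
To see where the interval boundaries fall, the same lookup can be run with fixed numbers in place of the random one. This is only a sketch; the name interval-index is hypothetical and does not appear in the answer.

```clojure
(defn interval-index
  "Number of cumulative sums that are <= r, i.e. the index of the
  interval that r falls into. Hypothetical helper for illustration."
  [sums r]
  (count (take-while #(<= % r) sums)))

(def sums [1 11 31 32 33])

(interval-index sums 0)  ; => 0, the single value for :white
(interval-index sums 1)  ; => 1, first of the 10 values for :red
(interval-index sums 10) ; => 1, last value for :red
(interval-index sums 11) ; => 2, first of the 20 values for :blue
(interval-index sums 32) ; => 4, the single value for :yellow
```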

So, putting this all together in a function:

(defn weighted-rand-choice [m]
    (let [w (reductions #(+ % %2) (vals m))
          r (rand-int (last w))]
         (nth (keys m) (count (take-while #( <= % r ) w)))))

Let's test it:

user=> (->> #(weighted-rand-choice colors) repeatedly (take 10000) frequencies)
{:red 3008, :blue 6131, :white 280, :yellow 282, :green 299}
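
As a rough sanity check on those counts, the expected frequency of each key is its weight divided by the total weight, multiplied by the number of draws. A minimal sketch; the helper name expected-counts is an assumption made for illustration, not part of the answer.

```clojure
(def colors {:white 1 :red 10 :blue 20 :green 1 :yellow 1})

(defn expected-counts
  "Expected number of times each key of m is drawn in n weighted draws,
  rounded to the nearest integer. Hypothetical helper."
  [m n]
  (let [total (reduce + (vals m))]
    (into {}
          (for [[k w] m]
            [k (Math/round (double (/ (* n w) total)))]))))

(expected-counts colors 10000)
;; => {:white 303, :red 3030, :blue 6061, :green 303, :yellow 303}
```

The sampled frequencies above sit close to these values, which is what a correct weighted choice should produce.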

score 8 · Accepted Answer

A somewhat dated (2008) solution by Rich Hickey, from ants.clj:

(defn wrand 
  "given a vector of slice sizes, returns the index of a slice given a
  random spin of a roulette wheel with compartments proportional to
  slices."
  [slices]
  (let [total (reduce + slices)
        r (rand total)]
    (loop [i 0 sum 0]
      (if (< r (+ (slices i) sum))
        i
        (recur (inc i) (+ (slices i) sum))))))
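
wrand returns an index rather than an item, and it calls slices as a function, so it needs a vector. One way to apply it to the colors map from the question is sketched below. The helper name wrand-key is made up here, and wrand is repeated only so that the snippet runs on its own.

```clojure
(defn wrand
  "Copied from the answer above so this snippet is self-contained."
  [slices]
  (let [total (reduce + slices)
        r (rand total)]
    (loop [i 0 sum 0]
      (if (< r (+ (slices i) sum))
        i
        (recur (inc i) (+ (slices i) sum))))))

(defn wrand-key
  "Hypothetical helper: picks a key of m with probability proportional
  to its weight. keys and vals of the same map come back in matching
  order, so the index from wrand can be used on (keys m)."
  [m]
  (nth (keys m) (wrand (vec (vals m)))))

(def colors {:white 1 :red 10 :blue 20 :green 1 :yellow 1})

(wrand-key colors) ; returns one of the five keywords, :blue most often
```

Because wrand uses rand rather than rand-int, this variant also accepts non-integer weights.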

A more recent (2012) solution by Stuart Halloway, from data.generators:

(defn weighted
  "Given a map of generators and weights, return a value from one of
  the generators, selecting generator based on weights."
  [m]
  (let [weights   (reductions + (vals m))
        total   (last weights)
        choices (map vector (keys m) weights)]
    (let [choice (uniform 0 total)]
      (loop [[[c w] & more] choices]
        (when w
          (if (< choice w)
            (call-through c)
            (recur more)))))))

score 4 · Accepted Answer

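It often helps to break a problem into layers that can be solved independently. augment does a fine job of assigning ranges, but when it comes time to pick one at random, its result is hard to consume with the normal sequence functions. If you change the goal of augment so that it produces a normal sequence, the augmenting problem separates more cleanly from the problem of picking one at random. If the weights are integers, you can build a list that contains each item repeated weight times, then pick one at random: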

user> (map (fn [[item weight]] (repeat weight item)) colors)
((:white) 
 (:red :red :red :red :red :red :red :red :red :red) 
 (:blue :blue :blue :blue :blue :blue :blue :blue :blue :blue
  :blue :blue :blue :blue :blue :blue :blue :blue :blue :blue) 
 (:green) (:yellow))

Then flatten it into a single list:

user> (flatten (map (fn [[item weight]] 
                       (repeat weight item)) 
                 colors))
(:white :red :red :red :red :red :red :red :red :red :red 
 :blue :blue :blue :blue :blue :blue :blue :blue :blue :blue 
 :blue :blue :blue :blue :blue :blue :blue :blue :blue :blue 
 :green :yellow)

And pick one with rand-nth:

user> (rand-nth (flatten (map (fn [[item weight]] (repeat weight item)) colors)))
:blue
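
The three steps above can be folded into one function. A sketch, assuming non-negative integer weights: mapcat stands in for the map-then-flatten pair (unlike flatten, it leaves items that are themselves collections intact), and the name expand-choice is hypothetical.

```clojure
(defn expand-choice
  "Picks a key of m at random, weighted by its integer weight, by
  expanding each key into `weight` copies and choosing one uniformly.
  Hypothetical name; only sensible for small integer weights, since the
  expanded vector has as many elements as the weights sum to."
  [m]
  (rand-nth (vec (mapcat (fn [[item weight]] (repeat weight item)) m))))

(def colors {:white 1 :red 10 :blue 20 :green 1 :yellow 1})

(expand-choice colors) ; returns one of the five keywords, :blue most often
```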

PS: map literals make things look nicer; the reader page describes these well.

(def colors {:white 1,
             :red 10,
             :blue 20,
             :green 1,
             :yellow 1})

Use let to give names to things within a function:

(defn augment [x]
  (let [name (nth x 0)
        val (nth x 1)
        newval (+ cur val)
        left cur
        right newval
        cur (+ cur val)]
    [name left right]))
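
Note that the cur bound inside this let only shadows the global var within the function body; it does not carry the running total over to the next call. One way to drop the external cur entirely is to compute the running totals with reductions and pair each name with its left and right endpoints. A sketch, with augment-all as a hypothetical name:

```clojure
(defn augment-all
  "Turns a seq of [name weight] pairs into [name left right] triples,
  where left and right are the running totals before and after the item.
  Hypothetical replacement for mapping augment over the list."
  [pairs]
  (let [rights (reductions + (map second pairs)) ; cumulative sums
        lefts  (cons 0 rights)]                  ; same sums, shifted by one
    (map (fn [[name _] left right] [name left right])
         pairs lefts rights)))

(augment-all [[:white 1] [:red 10] [:blue 20] [:green 1] [:yellow 1]])
;; => ([:white 0 1] [:red 1 11] [:blue 11 31] [:green 31 32] [:yellow 32 33])
```

Since the function takes its input as an argument and touches no vars, calling it twice gives the same result, which is not true of the def-based version in the question.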

score 0 · Accepted Answer

The open-source bigml sampling library is another option. I have used it with success. It is better documented and has a nice API.
