For example, I have this sample data:

d=data.frame(x=c(1,1,1,2,2,3,4,4),y=c(5,6,7,8,7,5,6,5),w=c(1,2,3,4,5,6,7,8))

It looks like this:

  x y w
1 1 5 1
2 1 6 2
3 1 7 3
4 2 8 4
5 2 7 5
6 3 5 6
7 4 6 7
8 4 5 8

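x and y are indices into datax and datay. w is the score from comparing datax[x] with datay[y]. I want to maximize the total score (the sum of w) from d, where each value of x is matched with at most one value of y, and vice versa.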

The result should look like this:

  x y w
1 2 7 5
2 3 5 6
3 4 6 7

The sum of all w values is maximized, and each value of x and each value of y appears only once in the result.

How do I set up this problem in the lpSolve::lp function?

1 Answer

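You can use lpSolveAPI to solve your problem. The solution you stated satisfies your constraints, but it is not quite optimal. So let's go with your requirement that X and Y values do not repeat in the solution.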

You will need 8 new binary variables. Each one specifies whether the corresponding row of d is picked (1) or dropped (0).
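To make the encoding concrete, here is a minimal base-R sketch. The names `pick`, `total_w`, `chosen` and `no_repeats` are made up for illustration; the pick vector below simply encodes the three rows shown in the question.

```r
d <- data.frame(x = c(1, 1, 1, 2, 2, 3, 4, 4),
                y = c(5, 6, 7, 8, 7, 5, 6, 5),
                w = c(1, 2, 3, 4, 5, 6, 7, 8))

# Hypothetical pick vector: 1 keeps the row of d, 0 drops it.
# This one selects rows 5, 6 and 7 (the result shown in the question).
pick <- c(0, 0, 0, 0, 1, 1, 1, 0)

# Objective value of this selection: the sum of w over the picked rows
total_w <- sum(d$w * pick)   # 18

# The selection is valid only if no x and no y value repeats
chosen <- d[pick == 1, ]
no_repeats <- !anyDuplicated(chosen$x) && !anyDuplicated(chosen$y)   # TRUE
```

The solver's job is then to find the 0/1 vector that maximizes `total_w` while keeping `no_repeats` true.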

Update based on the OP's request

Yes, the lpSolveAPI code (below) makes it look more complicated than it really is. This LP formulation (output from lpSolveAPI) should make things clearer:

/* Objective function */
max: +pick_1 +2 pick_2 +3 pick_3 +4 pick_4 +5 pick_5 +6 pick_6 +7 pick_7 +8 pick_8;

/* Constraints */
OneX_1: +pick_1 +pick_2 +pick_3 <= 1;
OneX_2: +pick_4 +pick_5 <= 1;
OneX_4: +pick_7 +pick_8 <= 1;
OneY_5: +pick_1 +pick_6 +pick_8 <= 1;
OneY_6: +pick_2 +pick_7 <= 1;
OneY_7: +pick_3 +pick_5 <= 1;

/* Variable bounds */
pick_1 <= 1;
pick_2 <= 1;
pick_3 <= 1;
pick_4 <= 1;
pick_5 <= 1;
pick_6 <= 1;
pick_7 <= 1;
pick_8 <= 1;

Explanation: the second constraint (OneX_2) simply states that only one of pick_4 or pick_5 can be 1, since rows 4 and 5 of the data frame d both have X = 2.
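The constraints do not have to be written by hand. As a rough sketch, the same six rows can be derived from d in base R; `indicator_rows` is a hypothetical helper written for this example, not part of lpSolveAPI, and it assumes the key column has at least one repeated value.

```r
d <- data.frame(x = c(1, 1, 1, 2, 2, 3, 4, 4),
                y = c(5, 6, 7, 8, 7, 5, 6, 5),
                w = c(1, 2, 3, 4, 5, 6, 7, 8))

# Hypothetical helper: one 0/1 row per key value that occurs more than once.
# A 1 in column j means row j of d carries that key value.
indicator_rows <- function(key, prefix) {
  vals <- sort(unique(key[duplicated(key)]))   # values that repeat
  m <- outer(vals, key, "==") + 0L             # logical matrix -> 0/1 matrix
  rownames(m) <- paste(prefix, vals, sep = "_")
  m
}

# Stack the X constraints on top of the Y constraints
A <- rbind(indicator_rows(d$x, "OneX"),
           indicator_rows(d$y, "OneY"))
colnames(A) <- paste0("pick_", seq_len(nrow(d)))
print(A)
```

Each row of `A` is the left-hand side of one `<= 1` constraint. Values that appear only once (x = 3, y = 8) need no constraint, which is why the formulation has six rows rather than eight.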

Solution

Note that the formulation above yields an optimal solution that picks 4 rows from the data frame d:

> d[c(3,4,6,7),]
  x y w
3 1 7 3
4 2 8 4
6 3 5 6
7 4 6 7

The sum of w is 20, which is better than the solution in the question (whose w values sum to 18).
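With only 8 rows there are just 2^8 = 256 possible selections, so the optimum can be cross-checked by brute force. This is a sanity check in base R, not something that scales to larger data; all variable names are illustrative.

```r
d <- data.frame(x = c(1, 1, 1, 2, 2, 3, 4, 4),
                y = c(5, 6, 7, 8, 7, 5, 6, 5),
                w = c(1, 2, 3, 4, 5, 6, 7, 8))
n <- nrow(d)

# Every possible 0/1 pick vector, one per row of the matrix (2^n rows)
grid <- as.matrix(expand.grid(rep(list(0:1), n)))

# A selection is feasible when no x value and no y value repeats
feasible <- apply(grid, 1, function(p) {
  !anyDuplicated(d$x[p == 1]) && !anyDuplicated(d$y[p == 1])
})

# Score every selection, then rule out the infeasible ones
score <- as.vector(grid %*% d$w)
score[!feasible] <- -Inf

best <- which.max(score)
best_rows <- unname(which(grid[best, ] == 1))   # 3 4 6 7
best_score <- score[best]                       # 20
```

The enumeration agrees with the solver: rows 3, 4, 6 and 7, for a total of 20.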

Code

library(lpSolveAPI)
d <- data.frame(x=c(1,1,1,2,2,3,4,4),y=c(5,6,7,8,7,5,6,5),w=c(1,2,3,4,5,6,7,8))

n_vars <- nrow(d) #one binary variable per row of d: each row is either picked or dropped
lp_rowpicker <- make.lp(ncol=n_vars)
set.type(lp_rowpicker, columns=1:n_vars, type = "binary")

obj_vals <- d[, "w"]
set.objfn(lp_rowpicker, obj_vals) 
lp.control(lp_rowpicker,sense='max')

#Add constraints to limit X values from repeating
#xt holds the coefficients; indices says which variables (rows of d) they apply to
add.constraint(lp_rowpicker, xt=c(1,1,1),
               indices=c(1,2,3), rhs=1, type="<=") #x's in dataframe rows 1,2 & 3 are all '1'
add.constraint(lp_rowpicker, xt=c(1,1),
               indices=c(4,5), rhs=1, type="<=") #x's in dataframe rows 4 & 5 are both '2'
add.constraint(lp_rowpicker, xt=c(1,1),
               indices=c(7,8), rhs=1, type="<=") #x's in dataframe rows 7 & 8 are both '4'

#Add constraints to limit Y values from repeating
add.constraint(lp_rowpicker, xt=c(1,1,1),
               indices=c(1,6,8), rhs=1, type="<=") #y's in dataframe rows 1,6 & 8 are all '5'
add.constraint(lp_rowpicker, xt=c(1,1),
               indices=c(2,7), rhs=1, type="<=") #y's in dataframe rows 2 & 7 are both '6'
add.constraint(lp_rowpicker, xt=c(1,1),
               indices=c(3,5), rhs=1, type="<=") #y's in dataframe rows 3 & 5 are both '7'

solve(lp_rowpicker)
get.objective(lp_rowpicker) #20
get.variables(lp_rowpicker)
#[1] 0 0 1 1 0 1 1 0
#This tells you that from d you pick rows: 3,4,6 & 7 in your optimal solution.

#If you want to look at the full formulation:
rownames1 <- paste("OneX", c(1,2,4), sep="_")
rownames2 <- paste("OneY", c(5,6,7), sep="_")
colnames<- paste("pick_",c(1:8), sep="")
dimnames(lp_rowpicker) <- list(c(rownames1, rownames2), colnames)
print(lp_rowpicker)

#write it to a text file
write.lp(lp_rowpicker,filename="max_w.lp")
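
Since the question asked about lpSolve::lp specifically, here is a rough equivalent using that interface, assuming the lpSolve package is installed. For brevity this sketch adds one constraint per distinct x and y value, including the values that occur only once; those extra rows are redundant but harmless. The names `A` and `res` are illustrative.

```r
library(lpSolve)

d <- data.frame(x = c(1, 1, 1, 2, 2, 3, 4, 4),
                y = c(5, 6, 7, 8, 7, 5, 6, 5),
                w = c(1, 2, 3, 4, 5, 6, 7, 8))

# One constraint row per distinct x value, then one per distinct y value.
# Entry [i, j] is 1 when row j of d carries the value for constraint i.
A <- rbind(outer(unique(d$x), d$x, "=="),
           outer(unique(d$y), d$y, "==")) + 0

res <- lp(direction    = "max",
          objective.in = d$w,
          const.mat    = A,
          const.dir    = rep("<=", nrow(A)),
          const.rhs    = rep(1, nrow(A)),
          all.bin      = TRUE)   # every decision variable is 0/1

res$status                        # 0 means the solver reported success
res$objval                        # total w of the selection found
d[which(res$solution > 0.5), ]    # the picked rows of d
```

The result should agree with the lpSolveAPI model above, since both encode the same objective and the same no-repeat constraints.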

Hope this gives you an idea of how to use lpSolveAPI to formulate your problem.

Answered 2015-11-16T18:44:34.193