algorithm - 使用冷却时间优化动作顺序的算法

Question

我可以从“动作”列表中选择每秒执行一次。列表中的每个动作都有一个代表它的价值的数值，还有一个代表它的“冷却时间”的值——在再次使用该动作之前我必须等待的秒数。该列表可能如下所示：

动作 A 的值为 1，冷却时间为 2 秒
动作 B 的值为 1.5，冷却时间为 3 秒
动作 C 的值为 2，冷却时间为 5 秒
动作 D 的值为 3，冷却时间为 10 秒

所以在这种情况下，订单 ABA 的总值为 (1+1.5+1) = 3.5，这是可以接受的，因为 A 的第一次使用发生在 1 秒，A 的最终使用发生在 3 秒，然后这两者之差大于等于A的冷却时间，2秒。AAB 的顺序不起作用，因为您只间隔一秒钟就执行 A，比冷却时间短。

我的问题是尝试优化使用操作的顺序，在一定数量的操作上最大化总价值。显然，如果您只使用一个动作，最佳顺序是执行动作 D，总价值为 3。两个动作的最大值将来自执行 CD 或 DC，总价值为 5。当您总共执行 10 或 20 或 100 个动作时，会变得更加复杂。我找不到一种方法来优化动作顺序而不强制它，这使得它的复杂性与你想要优化顺序的动作总数呈指数级。这变得不可能过去大约 15 总数。

那么，有没有什么方法可以找到复杂度较低的最佳时间呢？有没有研究过这个问题？我想可能有某种加权图类型算法可以解决这个问题，但我不知道它是如何工作的，更不用说如何实现它了。

抱歉，如果这令人困惑——这在概念上有点奇怪，我找不到更好的方法来构建它。

score 3 · Accepted Answer

编辑：这是使用高度修改的 Dijkstra 算法的正确解决方案：

Dijkstra 算法用于找到最短路径，给定一个（Graph Abstract 的）地图，它是一系列节点（通常是位置，但对于这个例子，假设它们是动作），它们通过弧相互连接（在在这种情况下，每条弧线将有一个“值”而不是距离）

这是本质上的结构。

Graph{//in most implementations these are not Arrays, but Maps. Honestly, for your needs you don't a graph, just nodes and arcs... this is just used to keep track of them.
node[] nodes;
arc[] arcs;
}
Node{//this represents an action
arc[] options;//for this implementation, this will always be a list of all possible Actions to use.
float value;//Action value
}
Arc{
node start;//the last action used
node end;//the action after that
dist=1;//1 second
}

基于查看每条路径的最终总数，我们可以使用此数据类型制作所有可行选项的地图，以获得最佳解决方案。因此，您寻找模式的时间越多，您就越有可能找到非常理想的路径。

地图上的每一段道路都有一个距离，代表它的价值，道路上的每一站都是一秒的标记，因为那是决定下一步去哪里（执行什么动作）的时间. 为简单起见，假设 A 和 B 是唯一可行的选择。na 表示无操作，因为没有可用的操作。如果您旅行 4 秒（金额越高，结果越好）您的选择是...

A->na->A->na->A
B->na->na->B->na
A->B->A->na->B
B->A->na->B->A
...

还有更多，但我已经知道最佳路径是B->A->na->B->A，因为它的价值最高。因此，处理这种动作组合的既定最佳模式是（至少在分析 4 秒之后） B->A->na->B->A 这实际上是一个非常简单的递归算法。

    /*
     cur is the current action that you are at, it is a Node. In this example, every other action is seen as a viable option, so it's as if every 'place' on the map has a path going to every other path.
     numLeft is the amount of seconds left to run the simulation. The higher the initial value, the more desirable the results.

This won't work as written, but will give you a good idea of how the algorithm works.
*/
function getOptimal(cur,numLeft,path){
  if(numLeft==0){
    var emptyNode;//let's say, an empty node wiht a value of 0.
    return emptyNode;
  }
  var best=path;
  path.add(cur);
  for(var i=0;i<cur.options.length;i++){
    var opt=cur.options[i];//this is a COPY
    if(opt.timeCooled<opt.cooldown){
      continue;
    }
    for(var i2=0;i2<opt.length;i2++){
      opt[i2].timeCooled+=1;//everything below this in the loop is as if it is one second ahead
    }
    var potential=getOptimal(opt[i],numLeft-1,best);
    if(getTotal(potential)>getTotal(cur)){best.add(potential);}//if it makes it better, use it! getTotal will sum up the values of an array of nodes(actions)
  }
  return best;
}
function getOptimalExample(){
  log(getOptimal(someNode,4,someEmptyArrayOfNodes));//someNode will be A or B
}

结束编辑。

我对这个问题有点困惑，但是......

如果您的行动数量有限，仅此而已，那么请始终选择价值最高的行动，除非尚未达到冷却时间。

听起来你想要这样的东西（在伪代码中）：

function getOptimal(){
var a=[A,B,C,D];//A,B,C, and D are actions
a.sort()//(just pseudocode. Sort the array items by how much value they have.)
var theBest=null;
for(var i=0;i<a.length;++i){//find which action is the most valuable
     if(a[i].timeSinceLastUsed<a[i].cooldown){
        theBest=a[i];
        for(...){//now just loop through, and add time to each OTHER Action for their timeSinceLastUsed...
             //...
         }//That way, some previously used, but more valuable actions will be freed up again.
        break;
    }//because a is worth the most, and you can use it now, so why not?
}
}

score 1 · Accepted Answer

编辑：在重新阅读您的问题后，我发现需要调整加权调度算法以适应您的问题陈述；在我们的例子中，我们只想从与我们选择的动作类别匹配的集合中取出那些重叠的动作，以及那些在同一时间点开始的动作。IE 如果我们选择a1，我们想从集合中删除a2andb1而不是b2。

这看起来与此 pdf中深入讨论的加权调度问题非常相似。本质上，权重是您的操作值，间隔是（开始时间，开始时间+冷却时间）。动态规划解决方案可以被记忆，使其在 O(nlogn) 时间内运行。唯一困难的部分是修改您的问题，使其看起来像加权区间问题，这使我们能够利用预定的解决方案。

因为您的时间间隔没有设置开始和结束时间（即您可以选择何时开始某个动作），我建议在假设某个设定时间范围的情况下枚举所有给定动作的所有可能开始时间，然后使用这些静态开始/使用动态规划解决方案结束时间。假设您只能在一秒钟内开始一个动作，您可以运行动作 A 间隔 (0-2,1-3,2-4,...)，动作 B 间隔 (0-3,1-4,2 -5,...), 间隔的动作 C (0-5,1-6,2-7,...) 等。然后您可以使用联合动作的集合来获得一个看起来像原始加权的问题空间区间问题：

|---1---2---3---4---5---6---7---| time
|{--a1--}-----------------------| v=1
|---{--a2---}-------------------| v=1
|-------{--a3---}---------------| v=1
|{----b1----}-------------------| v=1.5
|---{----b2-----}---------------| v=1.5
|-------{----b3-----}-----------| v=1.5
|{--------c1--------}-----------| v=2
|---{--------c2---------}-------| v=2
|-------{-------c3----------}---| v=2
etc...

score 0 · Accepted Answer

0

始终选择最有价值的可用操作。

于 2013-03-23T00:12:22.220 回答

algorithm - 使用冷却时间优化动作顺序的算法

3 回答 3

Related

Reference