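Imagine a straightforward supervision hierarchy. The child dies. The parent decides to Restart the child. When it is Restarted, postRestart and friends are called, but what if the parent decides to Resume the child instead? Does the child actor know that it is being resumed? And by the way, does the parent have access to the message that caused the exception in the child?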
3 Answers
Resume means "nothing really happened, carry on", and in that spirit the child is not even informed. It is a directive that should rarely be used.
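To make the "carry on" semantics concrete, here is a minimal sketch using the classic Akka actor API. The names Counter, ResumingParent and ResumeDemo are hypothetical and exist only for this illustration. The child keeps a counter, fails on "boom", and the parent answers that failure with Resume.

```scala
import akka.actor.{Actor, ActorSystem, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy.Resume
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical child: counts ordinary messages and fails on "boom".
class Counter extends Actor {
  var count = 0

  // Only here to show that no restart hook runs under Resume.
  override def postRestart(reason: Throwable): Unit =
    println("postRestart called")

  def receive = {
    case "boom" => throw new IllegalStateException("boom")
    case "get"  => sender() ! count
    case _      => count += 1
  }
}

// Hypothetical parent: resumes the child on IllegalStateException.
class ResumingParent extends Actor {
  override val supervisorStrategy = OneForOneStrategy() {
    case _: IllegalStateException => Resume
  }

  val child = context.actorOf(Props(new Counter), "counter")

  // Pass everything on to the child, keeping the original sender.
  def receive = { case msg => child forward msg }
}

object ResumeDemo {
  def run(): Int = {
    val system = ActorSystem("resume-demo")
    implicit val timeout: Timeout = Timeout(3.seconds)
    try {
      val parent = system.actorOf(Props(new ResumingParent), "parent")
      parent ! "a"
      parent ! "b"
      parent ! "boom" // the child fails here and is resumed
      parent ! "c"
      Await.result((parent ? "get").mapTo[Int], 3.seconds)
    } finally system.terminate()
  }
}
```

In this sketch ResumeDemo.run() returns 3: the two messages counted before the failure are still there afterwards, the failing message itself is dropped, and postRestart never prints because the same actor instance simply continues with its mailbox.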
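The parent only gets the failure itself (i.e. the Throwable), not the message that caused the problem, because that would invite you to entangle the logic of parent and child beyond what is healthy.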
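If a supervisor genuinely needs to know which message triggered the failure, one workaround is to have the child put that message into the exception it throws. The sketch below assumes this approach; ProcessingFailed, MessageAware, WrappingWorker and MessageAwareParent are hypothetical names, not part of Akka, and the coupling this creates is exactly what the answer above advises keeping to a minimum.

```scala
import akka.actor.{Actor, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.{Decider, Escalate, Resume}
import scala.util.control.NonFatal

// Hypothetical exception type that carries the offending message.
final case class ProcessingFailed(offending: Any, cause: Throwable)
    extends RuntimeException(s"failed while handling: $offending", cause)

object MessageAware {
  // A Decider is just a PartialFunction[Throwable, Directive],
  // so the Throwable is the only input the supervisor gets.
  val decider: Decider = {
    case ProcessingFailed(offending, _) =>
      println(s"child failed on message: $offending")
      Resume
    case _: Exception => Escalate
  }
}

// Hypothetical child that wraps any failure together with the message.
class WrappingWorker extends Actor {
  def receive = {
    case msg =>
      try handle(msg)
      catch { case NonFatal(e) => throw ProcessingFailed(msg, e) }
  }

  // Stand-in for real work; always fails, like the Worker in this thread.
  private def handle(msg: Any): Unit =
    throw new IllegalStateException("error!!")
}

class MessageAwareParent extends Actor {
  override val supervisorStrategy: SupervisorStrategy =
    OneForOneStrategy()(MessageAware.decider)

  val worker = context.actorOf(Props(new WrappingWorker), "worker")

  def receive = { case msg => worker forward msg }
}
```

The decider can be exercised on its own, for example MessageAware.decider(ProcessingFailed("bar", new Exception("x"))) yields Resume, while any other Exception is escalated.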
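The term resume means to continue processing messages, and it is mentioned in two places in the documentation.
The first use is as a response to a failure. As per the akka documentation: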
As described in Actor Systems supervision describes a dependency relationship between actors: the supervisor delegates tasks to subordinates and therefore must respond to their failures. When a subordinate detects a failure (i.e. throws an exception), it suspends itself and all its subordinates and sends a message to its supervisor, signaling failure. Depending on the nature of the work to be supervised and the nature of the failure, the supervisor has a choice of the following four options:
Resume the subordinate, keeping its accumulated internal state
Restart the subordinate, clearing out its accumulated internal state
Terminate the subordinate permanently
Escalate the failure, thereby failing itself
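As a rough illustration, the four options map onto the directives Resume, Restart, Stop and Escalate of the classic SupervisorStrategy API. The mapping from exception types to directives below is an arbitrary choice made for this sketch, and Directives and FourWaySupervisor are hypothetical names.

```scala
import akka.actor.{Actor, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.{Decider, Escalate, Restart, Resume, Stop}
import scala.concurrent.duration._

object Directives {
  // One case per option from the list above.
  val decider: Decider = {
    case _: ArithmeticException      => Resume   // keep accumulated state
    case _: NullPointerException     => Restart  // fresh instance, state cleared
    case _: IllegalArgumentException => Stop     // terminate permanently
    case _: Exception                => Escalate // fail the supervisor itself
  }
}

class FourWaySupervisor extends Actor {
  override val supervisorStrategy: SupervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute)(
      Directives.decider)

  // Create a child from the given Props and hand its reference back.
  def receive = {
    case p: Props => sender() ! context.actorOf(p)
  }
}
```

Cases are tried in order, so the catch-all Exception case has to come last; otherwise the more specific exceptions would never reach their own directive.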
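Note that RESTART actually KILLS the original actor. The term resume is used again in this context, meaning to continue processing messages.
As per the akka documentation: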
The precise sequence of events during a restart is the following:
- suspend the actor (which means that it will not process normal messages until resumed), and recursively suspend all children
- call the old instance’s preRestart hook (defaults to sending termination requests to all children and calling postStop)
- wait for all children which were requested to terminate (using context.stop()) during preRestart to actually terminate; this—like all actor operations—is non-blocking, the termination notice from the last killed child will effect the progression to the next step
- create new actor instance by invoking the originally provided factory again
- invoke postRestart on the new instance (which by default also calls preStart)
- send restart request to all children which were not killed in step 3; restarted children will follow the same process recursively, from step 2
- resume the actor
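The hook order in the sequence above can be observed with a small sketch. HookLog, Traced and RestartDemo are hypothetical names for this illustration. It assumes the actor is created at the top level, where the default supervisor strategy restarts it on an ordinary Exception.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.collection.mutable.ListBuffer
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical thread-safe log of lifecycle hook invocations.
object HookLog {
  private val buf = ListBuffer.empty[String]
  def add(event: String): Unit = synchronized { buf += event }
  def snapshot: List[String] = synchronized { buf.toList }
}

class Traced extends Actor {
  override def preStart(): Unit = HookLog.add("preStart")
  override def postStop(): Unit = HookLog.add("postStop")

  override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
    HookLog.add("preRestart")
    super.preRestart(reason, message) // default: stop children, call postStop
  }

  override def postRestart(reason: Throwable): Unit = {
    HookLog.add("postRestart")
    super.postRestart(reason) // default: call preStart
  }

  def receive = {
    case "fail" => throw new RuntimeException("fail")
    case "ping" => sender() ! "pong"
  }
}

object RestartDemo {
  def run(): List[String] = {
    val system = ActorSystem("restart-demo")
    implicit val timeout: Timeout = Timeout(3.seconds)
    try {
      val traced = system.actorOf(Props(new Traced), "traced")
      traced ! "fail"
      // The reply comes from the new instance, so the restart has completed.
      Await.result(traced ? "ping", 3.seconds)
      HookLog.snapshot
    } finally system.terminate()
  }
}
```

Here RestartDemo.run() returns List("preStart", "preRestart", "postStop", "postRestart", "preStart"): the first preStart belongs to the original instance, preRestart and postStop run on the old instance, and postRestart followed by preStart run on the freshly created one.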
You can have the failure bubble up to the Supervisor if you properly set up that kind of behavior in the supervisorStrategy of the supervisor. A little example to show that behavior:
    import akka.actor.{Actor, ActorSystem, OneForOneStrategy, Props}
    import akka.actor.SupervisorStrategy._
    import scala.concurrent.duration._

    object SupervisorTest {
      def main(args: Array[String]): Unit = {
        val system = ActorSystem("test")
        val master = system.actorOf(Props[Master], "master")
        // Ask the master to create a child worker named "foo".
        master ! "foo"
        // Crude wait so that the child exists before it is looked up.
        Thread.sleep(500)
        // Note: actorFor is deprecated since Akka 2.2 in favor of actorSelection.
        val worker = system.actorFor("/user/master/foo")
        // Any message makes the worker throw.
        worker ! "bar"
      }
    }

    class Master extends Actor {
      // Escalate every child Exception, so that the Master itself fails.
      override val supervisorStrategy =
        OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute) {
          case _: Exception => Escalate
        }

      override def preRestart(ex: Throwable, msg: Option[Any]): Unit = {
        println("In master restart: " + msg)
      }

      def receive = {
        case msg: String =>
          context.actorOf(Props[Worker], msg)
      }
    }

    class Worker extends Actor {
      override def preRestart(ex: Throwable, msg: Option[Any]): Unit = {
        println("In worker restart: " + msg)
      }

      def receive = {
        case _ =>
          throw new Exception("error!!")
      }
    }
You can see that in the Master actor (the supervisor in my example), I am choosing to Escalate a failure of type Exception. This will cause the failure to bubble up to the preRestart in the Master actor. Now, I was expecting the msg param to preRestart to be the original offending message that went to the worker actor, but it wasn't. The only way I got that to show was by also overriding the preRestart of the child actor. In my example, you will see the printouts from both the supervisor and the child, in that order.