parallel-processing - Running Akka Streams stages in parallel significantly increases memory pressure

Question

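I am trying to implement an Akka Stream that reads frames from a video file and applies an SVM classifier to detect objects in each frame. Detection can run in parallel because the order of the video frames does not matter. My idea is to build a graph that follows the Akka Streams Cookbook recipe (Balancing jobs to a fixed pool of workers), with the two detection stages marked as .async.

It works as expected to a certain extent, but I noticed that the memory pressure on my system (only 8 GB available) increases dramatically and goes off the chart, slowing the system down significantly. Compared with a different approach that uses .mapAsync (Akka Docs) to integrate three actors performing the object detection into the stream, the memory pressure is significantly higher.

What am I missing? Why does running two stages in parallel increase memory pressure, while three actors running in parallel seem to work fine?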

Additional note: I am using OpenCV to read the video file. Because of the 4K resolution, each video frame of type Mat is about 26.5 MB.
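For orientation, the 26.5 MB figure is consistent with an uncompressed frame of 4096 x 2160 pixels stored with three 8-bit channels. The sketch below works through that arithmetic. The frame dimensions, the channel layout, and the helper object FrameMemory are assumptions made for illustration and are not taken from the code in this question.

```scala
// Rough per-frame memory estimate for an uncompressed OpenCV Mat.
// Assumption (not stated in the question): DCI 4K frames of 4096 x 2160
// pixels with 3 channels of 8 bits each (CV_8UC3).
object FrameMemory {

  /** Bytes needed for the pixel data of one uncompressed frame. */
  def frameBytes(width: Int, height: Int, channels: Int, bytesPerChannel: Int = 1): Long =
    width.toLong * height * channels * bytesPerChannel

  /** How many frames of the given size fit into a memory budget. */
  def framesThatFit(budgetBytes: Long, perFrameBytes: Long): Long =
    budgetBytes / perFrameBytes

  def main(args: Array[String]): Unit = {
    val colour = frameBytes(4096, 2160, channels = 3)
    val grey   = frameBytes(4096, 2160, channels = 1)
    val budget = 8L * 1000 * 1000 * 1000 // 8 GB in decimal units

    println(f"colour frame:    ${colour / 1e6}%.1f MB")
    println(f"greyscale frame: ${grey / 1e6}%.1f MB")
    println(s"colour frames that fit in 8 GB: ${framesThatFit(budget, colour)}")
  }
}
```

Under these assumptions only a few hundred colour frames fit into the whole 8 GB, so even a modest number of frames held in buffers or awaiting garbage collection would be noticeable.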

Running two stages in parallel with .async significantly increases memory pressure

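import akka.NotUsed
import akka.stream.{ActorMaterializer, ActorMaterializerSettings, FlowShape, UniformFanInShape, UniformFanOutShape}
import akka.stream.scaladsl.{Balance, Flow, GraphDSL, Merge, RunnableGraph, Sink, Source}
import org.opencv.core.{MatOfRect, Size}
import org.opencv.objdetect.{CascadeClassifier, Objdetect}

implicit val materializer = ActorMaterializer(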
  ActorMaterializerSettings(actorSystem)
    .withInputBuffer(initialSize = 1, maxSize = 1)
    .withOutputBurstLimit(1)
    .withSyncProcessingLimit(2)
  )

val greyscaleConversion: Flow[Frame, Frame, NotUsed] =
  Flow[Frame].map { el => Frame(el.videoPos, FrameTransformation.transformToGreyscale(el.frame)) }

val objectDetection: Flow[Frame, DetectedObjectPos, NotUsed] =
  Flow.fromGraph(GraphDSL.create() { implicit builder =>
    import GraphDSL.Implicits._

    val numberOfDetectors = 2
    val frameBalance: UniformFanOutShape[Frame, Frame] = builder.add(Balance[Frame](numberOfDetectors, waitForAllDownstreams = true))
    val detectionMerge: UniformFanInShape[DetectedObjectPos, DetectedObjectPos] = builder.add(Merge[DetectedObjectPos](numberOfDetectors))

    for (i <- 0 until numberOfDetectors) {
      val detectionFlow: Flow[Frame, DetectedObjectPos, NotUsed] = Flow[Frame].map { greyFrame =>
        val classifier = new CascadeClassifier()
        classifier.load("classifier.xml")
        val detectedObjects: MatOfRect = new MatOfRect()
        classifier.detectMultiScale(greyFrame.frame, detectedObjects, 1.08, 5, 0 | Objdetect.CASCADE_SCALE_IMAGE, new Size(40, 20), new Size(100, 80))
        DetectedObjectPos(greyFrame.videoPos, detectedObjects)
      }

      frameBalance.out(i) ~> detectionFlow.async ~> detectionMerge.in(i)
    }

    FlowShape(frameBalance.in, detectionMerge.out)
  })

def createGraph(videoFile: Video): RunnableGraph[NotUsed] = {
  Source.fromGraph(new VideoSource(videoFile))
    .via(greyscaleConversion).async
    .via(objectDetection)
    .to(Sink.foreach(detectionDisplayActor !))
}

Integrating actors with .mapAsync without increasing memory pressure

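import akka.NotUsed
import akka.actor.{ActorRef, Props}
import akka.pattern.ask // provides `?`, which also needs an implicit akka.util.Timeout in scope
import akka.routing.RandomPool
import akka.stream.scaladsl.{Flow, RunnableGraph, Sink, Source}

val greyscaleConversion: Flow[Frame, Frame, NotUsed] =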
  Flow[Frame].map { el => Frame(el.videoPos, FrameTransformation.transformToGreyscale(el.frame)) }

val detectionRouter: ActorRef =
  actorSystem.actorOf(RandomPool(numberOfDetectors).props(Props[DetectionActor]), "detectionRouter")

val detectionFlow: Flow[Frame, DetectedObjectPos, NotUsed] =
  Flow[Frame].mapAsyncUnordered(parallelism = 3)(el => (detectionRouter ? el).mapTo[DetectedObjectPos])

def createGraph(videoFile: Video): RunnableGraph[NotUsed] = {
  Source.fromGraph(new VideoSource(videoFile))
    .via(greyscaleConversion)
    .via(detectionFlow)
    .to(Sink.foreach(detectionDisplayActor !))
}
