scala - Akka for REST polling

Question

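I'm trying to interface a large Scala + Akka + PlayMini application with an external REST API. The idea is to poll a root URL periodically (roughly every 1 to 10 minutes), then crawl through the sub-level URLs to extract data, which is then sent to a message queue.

I have come up with two ways to do this:

First approach

Create a hierarchy of actors that matches the resource path structure of the API. In the Google Latitude case, that would mean, for example: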

Actor 'latitude/v1/currentLocation' polls https://www.googleapis.com/latitude/v1/currentLocation
Actor 'latitude/v1/location' polls https://www.googleapis.com/latitude/v1/location
Actor 'latitude/v1/location/1' polls https://www.googleapis.com/latitude/v1/location/1
Actor 'latitude/v1/location/2' polls https://www.googleapis.com/latitude/v1/location/2
Actor 'latitude/v1/location/3' polls https://www.googleapis.com/latitude/v1/location/3
etc.

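In this case, each actor is responsible for polling its associated resource periodically, and for creating and deleting child actors for the resources one path level down (i.e. actor 'latitude/v1/location' creates actors 1, 2, 3, etc. for every location it learns about by polling https://www.googleapis.com/latitude/v1/location).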
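The create/delete bookkeeping described above can be sketched without any actor code. This is only an illustration: `ChildReconciler`, `ChildDiff` and `reconcile` are hypothetical names, not part of Akka, and it assumes each poll yields the set of child ids currently present under the resource.

```scala
// Sketch only: the decision a parent actor would make after each poll.
// ChildReconciler, ChildDiff and reconcile are hypothetical names, not Akka API.
object ChildReconciler {
  /** Child ids to start pollers for, and child ids whose pollers should stop. */
  final case class ChildDiff(toCreate: Set[String], toStop: Set[String])

  /**
   * @param running ids of the child pollers that currently exist
   * @param polled  ids the parent resource reported in the latest poll
   */
  def reconcile(running: Set[String], polled: Set[String]): ChildDiff =
    ChildDiff(toCreate = polled -- running, toStop = running -- polled)
}
```

Inside an actor, one would presumably call `context.actorOf` for each id in `toCreate` and `context.stop` on the child for each id in `toStop`. Since stopping an actor also stops its children, removing one id takes the whole branch below it with it.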

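Second approach

Create a pool of identical polling actors that receive polling requests (containing the resource path) load-balanced by a router. Each actor polls the URL once, does some processing, and schedules further polling requests, both for the next-level resources and for the URL it just polled. In Google Latitude, that would mean, for instance:

1 router, n polling actors. The initial polling request for https://www.googleapis.com/latitude/v1/location leads to several new (immediate) polling requests for https://www.googleapis.com/latitude/v1/location/1, https://www.googleapis.com/latitude/v1/location/2, etc., and one (delayed) polling request for the same resource, i.e. https://www.googleapis.com/latitude/v1/location.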
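The fan-out described above (immediate requests for the children, one delayed request for the resource itself) can be sketched as a pure function. This is a minimal illustration under stated assumptions: `PollPlanner`, `PollRequest` and `followUps` are hypothetical names, and extracting the child ids from the HTTP response is not shown.

```scala
import scala.concurrent.duration._

// Sketch only: which requests a worker would emit after polling one URL.
// PollPlanner, PollRequest and followUps are hypothetical names, not Akka API.
object PollPlanner {
  /** A poll request for `url`, to be sent after `delay` (zero means immediately). */
  final case class PollRequest(url: String, delay: FiniteDuration)

  /**
   * @param url      the resource that was just polled
   * @param childIds ids found in the response (parsing is assumed, not shown)
   * @param interval how long to wait before polling `url` again
   */
  def followUps(url: String, childIds: Seq[String], interval: FiniteDuration): Seq[PollRequest] = {
    // One immediate request per next-level resource...
    val children = childIds.map(id => PollRequest(s"$url/$id", Duration.Zero))
    // ...plus one delayed request for the resource itself.
    children :+ PollRequest(url, interval)
  }
}
```

In an Akka setup, the zero-delay requests would presumably be sent straight to the router, and the delayed one handed to `system.scheduler.scheduleOnce(delay, router, request)`. Nothing here keeps per-resource state, which is why a resource that disappears simply stops generating requests.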

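I have implemented both solutions and can't immediately observe any relevant difference in performance, at least not for the API and polling frequencies I am interested in. I find the first approach somewhat easier to reason about, and perhaps easier to use with system.scheduler.schedule(...) than the second approach (where I need scheduleOnce(...)). Also, assuming resources are nested several levels deep and are somewhat short-lived (e.g. several resources may be added or removed between two polls), Akka's lifecycle management makes it easy to kill off a whole branch in the first case. The second approach should (in theory) be faster, and its code is somewhat easier to write.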

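My questions are:

Which approach seems best (in terms of performance, extensibility, code complexity, etc.)?
Do you see anything wrong with the design of either approach (especially the first one)?
Has anyone tried to implement something similar? How was it done?

Thanks!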

score 1 · Accepted Answer

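Why not create a master poller that kicks off asynchronous resource requests on a schedule?

I'm no expert with Akka, but I gave it a shot:

The poller object, which iterates through the list of resources to fetch: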

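import akka.util.duration._ // Akka 2.0 location; Akka 2.1+ uses scala.concurrent.duration._
import akka.actor._
import play.api.Play.current
import play.api.libs.concurrent.Akka

object Poller {
  // Master poller: each String it receives is the name of a resource to fetch.
  val poller = Akka.system.actorOf(Props(new Actor {
    def receive = {
      // Spawn a one-shot spider per request. Actor names must be unique among
      // siblings, so this assumes the previous spider for x has already stopped.
      case x: String =>
        Akka.system.actorOf(Props[ActingSpider], name = x.filter(_.isLetterOrDigit)) ! x
    }
  }))

  // Schedule a repeating poll for every resource (first after 3 s, then every 3 s).
  def start(l: List[String]): List[Cancellable] =
    l.map(Akka.system.scheduler.schedule(3 seconds, 3 seconds, poller, _))

  // Cancel one of the schedules returned by start.
  def stop(c: Cancellable): Unit = c.cancel()
}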

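The actor that reads the resource asynchronously and triggers more asynchronous reads. If it is kinder to the target, you could put the message dispatch on a schedule rather than sending immediately: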

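import akka.actor.{Props, Actor}
import java.io.File

class ActingSpider extends Actor {
  import context._ // brings system, stop, etc. into scope
  def receive = {
    case name: String =>
      println("reading " + name)
      // A local file stands in for the resource in this example.
      new File(name) match {
        case f if f.exists() => spider(f)
        case _ => println("File not found")
      }
      context.stop(self) // one-shot actor: stop after handling a single resource
  }

  // Treat each line of the file as another resource and hand it to a new spider.
  def spider(file: File): Unit = {
    val source = scala.io.Source.fromFile(file)
    try source.getLines().foreach { l =>
      // Spawned from the system rather than the context: stopping this actor
      // also stops its children, possibly before they process their message.
      // Left unnamed so that repeated lines cannot cause a name collision.
      system.actorOf(Props[ActingSpider]) ! l
    } finally source.close()
  }
}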
