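The problem with the Cassandra Python driver is that the "future" objects it returns register callbacks via side effect. This means these "futures" are not composable the way a Future from JavaScript or Scala is. I am wondering whether there is a pattern for turning a non-composable future into a composable one (preferably without leaking memory).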

   (my_query_object.insert(1, 2, 3, 'Fred Flinstone')
     .insert(1, 2, 3, 'Barney Rubble')
     .insert(5000, 2, 3, 'George Jetson')
     .insert(5000, 2, 3, 'Jane his wife'))

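Looking at the performance section of the DataStax Cassandra Python driver documentation, I see an example of how they create a series of insert queries that are chained one after another, i.e. a slightly more complicated version of this pattern: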

sentinel = object()  # marks the first call, when there is no previous result

def insert_next(previous_result=sentinel):
    if previous_result is not sentinel:
        if isinstance(previous_result, BaseException):
            log.error("Error on insert: %r", previous_result)

    # session, query and log are assumed to be defined elsewhere
    future = session.execute_async(query)
    # NOTE: this callback also handles errors
    future.add_callbacks(insert_next, insert_next)

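This works great as a toy example: as soon as one query completes, another equivalent query is executed. With this scheme they reach about 7k writes/s, whereas the versions that do not "chain" callbacks top out at around 2k writes/s.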
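
To make the chaining idea concrete, here is a minimal sketch that runs a fixed number of chains over a finite list of rows and waits for them to finish. `run_chained`, `FakeSession` and `FakeFuture` are hypothetical names introduced for illustration; the fakes only mimic the `execute_async` / `add_callbacks` shape so the example runs without a cluster, and a real session would take their place.

```python
import threading

_EXHAUSTED = object()  # marks the end of the row iterator


class FakeFuture(object):
    """Hypothetical stand-in for the driver's response future."""

    def add_callbacks(self, callback, errback):
        # Complete on another thread, roughly like a driver I/O thread would.
        threading.Thread(target=callback, args=(None,)).start()


class FakeSession(object):
    """Hypothetical stand-in that records what was submitted."""

    def __init__(self):
        self.executed = []
        self._lock = threading.Lock()

    def execute_async(self, query, parameters=None):
        with self._lock:
            self.executed.append(parameters)
        return FakeFuture()


def run_chained(session, query, rows, concurrency=2):
    """Start `concurrency` chains; each completion submits the next row."""
    pending = iter(rows)
    lock = threading.Lock()
    done = threading.Event()
    state = {'active': concurrency}
    errors = []

    def insert_next(previous_result=None):
        # The same function serves as callback and errback.
        if isinstance(previous_result, BaseException):
            errors.append(previous_result)
        with lock:
            params = next(pending, _EXHAUSTED)
            if params is _EXHAUSTED:
                # This chain is finished; the last one to finish signals done.
                state['active'] -= 1
                if state['active'] == 0:
                    done.set()
                return
        future = session.execute_async(query, params)
        future.add_callbacks(insert_next, insert_next)

    for _ in range(concurrency):
        insert_next()
    done.wait()
    return errors


if __name__ == "__main__":
    demo = FakeSession()
    run_chained(demo, "INSERT ...", [(1, 'a'), (2, 'b'), (3, 'c')])
    print(len(demo.executed))
```

Each chain keeps at most one request in flight, so `concurrency` bounds the number of outstanding writes. The sketch does not handle `execute_async` itself raising, which would end a chain without signalling completion.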

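I have been trying to build some mechanism that would let me recapture that exact behavior, but to no avail. Has anyone come up with something similar?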

1 Answer

It took me a while to work out how to preserve the future in some form:

import logging
try:
  from queue import Queue  # Python 3
except ImportError:
  from Queue import Queue  # Python 2
from threading import Event #hmm... this needed?


insert_logger = logging.getLogger('async_insert')
insert_logger.setLevel(logging.INFO)

def handle_err(err):
  insert_logger.warning('Failed to insert due to %s', err)


#Designed to work in a high write environment. Chained callbacks for best performance and fast fail/stop when error
#encountered. Next insert should re-up the writing. Potential loss of failed write. Some guarantee on order of write
#preservation.
class CappedQueueInserter(object):
  def __init__(self, session, max_count=0):
    self.__queue = Queue(max_count)
    self.__session = session
    self.__started = Event()

  @property
  def started(self):
    return self.__started.is_set()

  def insert(self, bound_statement):
    if not self.started:
      self._begin(bound_statement)
    else:
      self._enqueue(bound_statement)

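  def _begin(self, bound_statement):
    def callback(_result=None):
      # The driver passes the query result to the callback; it is unused here.
      try:
        # Non-blocking get: blocking here would stall the thread the driver
        # uses to run callbacks. An empty queue raises and ends the chain.
        bound = self.__queue.get(False)
        future = self.__session.execute_async(bound)
        future.add_callbacks(callback, errback)
      except Exception:
        # Queue drained (or submission failed): let the next insert restart.
        self.__started.clear()

    def errback(err):
      # Log the failure and stop the chain so the next insert can restart it.
      handle_err(err)
      self.__started.clear()

    self.__started.set()
    future = self.__session.execute_async(bound_statement)
    future.add_callbacks(callback, errback)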

  def _enqueue(self, bound_statement):
    self.__queue.put(bound_statement, True)


#Separate insert statement binding from the insertion loop
class InsertEnqueue(object):
  def __init__(self, prepared_query, insert, consistency_level=None):
    self.__statement = prepared_query
    self.__level = consistency_level
    self.__sink = insert

  def insert(self, *args):
    bound = self.bind(*args)
    self.__sink.insert(bound)

  @property
  def consistency_level(self):
    return self.__level or self.__statement.consistency_level

  @consistency_level.setter
  def consistency_level(self, value):  # setter must share the property's name
    if value:
      self.__level = value

  def bind(self, *args):
    bound = self.__statement.bind(*args)
    bound.consistency_level = self.consistency_level

    return bound

A combination of a Queue and an Event triggers things. Assuming it is acceptable for writes to happen "eventually", this should work.
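
For the composability part of the question, one possible approach (a sketch, not something the driver provides) is to bridge the callback-style future to the standard library's `concurrent.futures.Future` and build chaining on top of that. `to_future`, `then` and `FakeResponseFuture` are hypothetical names; the only thing assumed about the driver's future is that it exposes `add_callbacks(callback, errback)`.

```python
from concurrent.futures import Future


def to_future(response_future):
    """Bridge a callback-style future to concurrent.futures.Future."""
    bridged = Future()
    # Exactly one of the two hooks fires, resolving the standard Future.
    response_future.add_callbacks(bridged.set_result, bridged.set_exception)
    return bridged


def then(future, fn):
    """Return a new Future holding fn(result) once `future` succeeds."""
    chained = Future()

    def _on_done(done):
        try:
            # result() re-raises a stored failure, so errors propagate too.
            chained.set_result(fn(done.result()))
        except BaseException as exc:
            chained.set_exception(exc)

    future.add_done_callback(_on_done)
    return chained


class FakeResponseFuture(object):
    """Hypothetical stand-in for the driver's future, for demonstration only."""

    def __init__(self):
        self._callback = None
        self._errback = None

    def add_callbacks(self, callback, errback):
        self._callback, self._errback = callback, errback

    def complete(self, rows):
        self._callback(rows)

    def fail(self, exc):
        self._errback(exc)


if __name__ == "__main__":
    raw = FakeResponseFuture()
    row_count = then(to_future(raw), len)
    raw.complete(['row1', 'row2'])
    print(row_count.result())  # prints 2
```

Each bridged future is referenced only by the callbacks registered on the underlying future, so nothing accumulates once the chain resolves and the caller drops its references.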

answered 2014-03-17T01:24:50.993