4

Python-memcached is the memcached driver officially supported by Django.

Does it support:

  1. Consistent hashing
  2. The binary protocol

If so, how do I use these features from Django? I can't find any documentation.


6 Answers

2

Looking at the _get_server method in python-memcached v1.45, it does not appear to use consistent hashing, just a simple hash % len(buckets).

The same goes for the binary protocol: as far as I can see in the source, python-memcached only uses the text commands.
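
To make that concrete, here is a small illustrative sketch (not python-memcached's actual code, and the keys are made up) of how much of the keyspace moves under plain modulo hashing when a server is added:

# Illustrative sketch only -- shows why a plain `hash % len(buckets)` scheme
# remaps most keys when the pool changes size.
from binascii import crc32

def bucket_for(key, num_buckets):
    # Same style of hash python-memcached derives from crc32.
    h = ((crc32(key.encode()) & 0xffffffff) >> 16) & 0x7fff or 1
    return h % num_buckets

keys = ['key%d' % i for i in range(10000)]
moved = sum(bucket_for(k, 3) != bucket_for(k, 4) for k in keys)
print('%.0f%% of keys remapped going from 3 to 4 servers' % (100.0 * moved / len(keys)))
# Typically around 75% -- with consistent hashing it would be closer to 25%.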

Answered 2010-04-11T11:18:23.977
1

You might be able to use this: http://amix.dk/blog/post/19370

It wraps python-memcache's Client class so that keys are distributed using consistent hashing.

EDIT - I was digging through the python-memcached 1.4.5 source and it looks like it may actually support consistent hashing. Relevant code:

from binascii import crc32   # zlib version is not cross-platform
def cmemcache_hash(key):
    return((((crc32(key) & 0xffffffff) >> 16) & 0x7fff) or 1)
serverHashFunction = cmemcache_hash

-- SNIP --

def _get_server(self, key):
    if isinstance(key, tuple):
        serverhash, key = key
    else:
        serverhash = serverHashFunction(key)

    for i in range(Client._SERVER_RETRIES):
        server = self.buckets[serverhash % len(self.buckets)]
        if server.connect():
            #print "(using server %s)" % server,
            return server, key
        serverhash = serverHashFunction(str(serverhash) + str(i))
    return None, None

Based on this code it does look like it implements the algorithm, unless cmemcache_hash is not a meaningful name and this isn't really that algorithm. (The now-retired cmemcache did do consistent hashing.)

But I think the OP is referring to a more "resilient" consistent hash such as libketama. I don't think there's an off-the-shelf solution there; it looks like you'd have to roll up your sleeves, compile/install a more advanced memcached library such as pylibmc, and write a custom Django backend that uses it instead of python-memcached.

In any case, some remapping of keys will happen whenever you add/remove buckets from the pool (even with libketama, though less than with other algorithms).
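
For reference, if you do go the pylibmc route, it exposes both of the features the OP asked about through its documented client options. A minimal sketch (the server addresses are placeholders, and this is not a full Django backend):

import pylibmc

# Rough sketch using pylibmc directly (not a Django cache backend).
# The server addresses below are placeholders.
client = pylibmc.Client(
    ['10.0.0.1:11211', '10.0.0.2:11211'],
    binary=True,                       # use the binary protocol
    behaviors={'ketama': True},        # ketama-style consistent hashing
)

client.set('some_key', 'some_value')
print(client.get('some_key'))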

Answered 2010-05-07T15:13:52.773
1

If you want a plug-and-play solution for Django, use django-memcached-hashring: https://github.com/jezdez/django-memcached-hashring.

It is an adapter between django.core.cache.backends.memcached.MemcachedCache and the hash_ring library.
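
Presumably it then plugs into the usual Django CACHES setting; the BACKEND dotted path below is only a guess based on the package name, so check the project's README for the real one:

# settings.py - sketch only; the BACKEND dotted path is a guess derived from
# the package name, so verify it against django-memcached-hashring's README.
CACHES = {
    'default': {
        'BACKEND': 'memcached_hashring.MemcachedHashRingCache',
        'LOCATION': [
            '10.0.0.1:11211',
            '10.0.0.2:11211',
        ],
    },
}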

Answered 2014-05-05T16:15:32.570
1

Please check out this example Python implementation of consistent hashing.

The main idea of the implementation: imagine a continuum circle with a number of replicated server points spread across it. When we add a new server, 1/n of the total cache keys will be lost.

'''consistent_hashing.py is a simple demonstration of consistent
hashing.'''

import bisect
import hashlib


def my_hash(key):
    '''my_hash(key) returns a hash in the range [0, 1).'''
    digest = hashlib.md5(str(key).encode('utf-8')).hexdigest()
    return (int(digest, 16) % 1000000) / 1000000.0


class ConsistentHash:
    '''Imagine a continuum circle with a number of replicated server
    points spread across it. When we add a new server, roughly 1/n of
    the total cache keys will be lost.

    hash_tuples is a list of tuples (j, k, hash), where j ranges over
    machine numbers, k ranges over that machine's replicas, and hash is
    the corresponding hash value in the range [0, 1). The tuples are
    kept sorted by increasing hash value.

    get_machine(key) returns the number of the machine to which key
    should be mapped.'''

    def __init__(self, replicas=1):
        self.num_replicas = replicas
        self.hash_tuples = []

    def setup_servers(self, servers=None):
        '''Place every server (weighted by server.weight) on the circle.'''
        hash_tuples = [(index, k, my_hash(str(index) + "_" + str(k)))
                       for index, server in enumerate(servers or [])
                       for k in range(int(self.num_replicas) * int(server.weight))]
        self.hash_tuples = self.sort(hash_tuples)

    def sort(self, hash_tuples):
        '''Sort the hash tuples based on just the hash values.'''
        hash_tuples.sort(key=lambda t: t[2])
        return hash_tuples

    def add_machine(self, server, siz):
        '''Add a new machine (number siz) and update the server hashes
        in the continuum circle.'''
        new_points = [(siz, k, my_hash(str(siz) + "_" + str(k)))
                      for k in range(self.num_replicas * server.weight)]
        self.hash_tuples.extend(new_points)
        self.hash_tuples = self.sort(self.hash_tuples)

    def get_machine(self, key):
        '''Return the number of the machine which key gets sent to.'''
        h = my_hash(key)
        # Edge case: we cycle past hash value 1 and wrap back to 0.
        if h > self.hash_tuples[-1][2]:
            return self.hash_tuples[0][0]
        hash_values = [t[2] for t in self.hash_tuples]
        index = bisect.bisect_left(hash_values, h)
        return self.hash_tuples[index][0]
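
A small usage sketch of the class above (the Server namedtuple with a weight field is a made-up stand-in for whatever server objects you pass in):

import collections

# Hypothetical server record; the class above only relies on a `weight` attribute.
Server = collections.namedtuple('Server', ['host', 'weight'])

servers = [Server('10.0.0.%d' % i, weight=1) for i in range(1, 4)]

ring = ConsistentHash(replicas=100)   # more replicas -> smoother distribution
ring.setup_servers(servers)

print(ring.get_machine('user:42'))    # index of the server that owns this key

# Adding a fourth server only remaps roughly 1/4 of the keys.
ring.add_machine(Server('10.0.0.4', weight=1), siz=3)
print(ring.get_machine('user:42'))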

Answered 2017-05-10T09:01:12.423
0

vbucket is now solving the consistent-hashing problem with minimal impact on cache misses.
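
For anyone unfamiliar with the term, here is a rough conceptual sketch of the vbucket idea (as popularized by membase/Couchbase), not a real client implementation:

import zlib

# Keys hash into a fixed number of virtual buckets, and only the
# vbucket -> server map changes when servers are added or removed.
NUM_VBUCKETS = 1024
vbucket_map = {v: v % 2 for v in range(NUM_VBUCKETS)}   # two servers to start

def server_for(key):
    vbucket = zlib.crc32(key.encode()) % NUM_VBUCKETS
    return vbucket_map[vbucket]

# Adding a third server: reassign roughly a third of the vbuckets to it. Keys
# in untouched vbuckets keep hitting the same server, so they never miss.
for v in range(0, NUM_VBUCKETS, 3):
    vbucket_map[v] = 2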

Answered 2014-02-19T07:04:44.153
0

I have used a consistent hashing algorithm. The keys lost are 1/n of the total number of keys; with n = 7 this means successful key fetches will be 6/7 * 100, around 86%.

Answered 2015-09-15T10:10:27.737