python - Python：构建 LRU 缓存

Question

我有6,00,000 entries in MongoDB以下格式：

feature:category:count

在哪里

特征可以是任何词，
类别是正面的还是负面的，并且
count告诉一个特性在该类别的文档中出现了多少次。

我想缓存前1000个元组，让我们说不要每次都查询数据库。

如何在 Python 中构建 LRU 缓存？或者有任何已知的解决方案吗？

score 21 · Accepted Answer

Python3.3 中的LRU 缓存有 O(1) 的插入、删除和搜索。

该设计使用条目的循环双向链接列表（按从旧到新排列）和哈希表来定位各个链接。缓存命中使用哈希表查找相关链接并将其移动到列表的头部。缓存未命中删除最旧的链接并在链接列表的头部创建一个新链接。

这是 33 行非常基本的 Python 的简化（但快速）版本（仅使用简单的字典和列表操作）。它在 Python2.0 及更高版本（或 PyPy 或 Jython 或 Python3.x）上运行：

class LRU_Cache:

    def __init__(self, original_function, maxsize=1024):
        # Link structure: [PREV, NEXT, KEY, VALUE]
        self.root = [None, None, None, None]
        self.root[0] = self.root[1] = self.root
        self.original_function = original_function
        self.maxsize = maxsize
        self.mapping = {}

    def __call__(self, *key):
        mapping = self.mapping
        root = self.root
        link = mapping.get(key)
        if link is not None:
            link_prev, link_next, link_key, value = link
            link_prev[1] = link_next
            link_next[0] = link_prev
            last = root[0]
            last[1] = root[0] = link
            link[0] = last
            link[1] = root
            return value
        value = self.original_function(*key)
        if len(mapping) >= self.maxsize:
            oldest = root[1]
            next_oldest = oldest[1]
            root[1] = next_oldest
            next_oldest[0] = root
            del mapping[oldest[2]]
        last = root[0]
        last[1] = root[0] = mapping[key] = [last, root, key, value]
        return value


if __name__ == '__main__':
    p = LRU_Cache(ord, maxsize=3)
    for c in 'abcdecaeaa':
        print(c, p(c))

从 Python 3.1 开始，OrderedDict使得实现 LRU 缓存变得更加简单：

from collections import OrderedDict

class LRU_Cache:

    def __init__(self, original_function, maxsize=1024):
        self.original_function = original_function
        self.maxsize = maxsize
        self.mapping = OrderedDict()

    def __call__(self, *key):
        mapping = self.mapping
        try:
            value = mapping[key]
            mapping.move_to_end(key)
        except KeyError:
            value = self.original_function(*key)
            if len(mapping) >= self.maxsize:
                mapping.popitem(False)
            mapping[key] = value
        return value

score 6 · Accepted Answer

除了 Python 3.2 中包含的版本外， Python Cookbook中还有 LRU 缓存配方，其中包括Python 核心开发人员 Raymond Hettinger 的这些。

score 3 · Accepted Answer

Python 3.2functools包含一个LRU 缓存。您可以轻松地从repo中挑选它，检查是否必须对其进行调整以与 Python 2 一起使用（不应该太难 - 可能使用itertools而不是某些内置函数 - 询问您是否需要帮助）并完成。不过，您需要将查询包装成可调用的，并确保它依赖于（可散列的）函数参数。

score 1 · Accepted Answer

还有一些 lru_cache 的 python 3.3 版本的反向移植，例如在 python 2.7 上运行的这个。如果您对两层缓存感兴趣（如果没有缓存在实例中，它将检查共享缓存）我已经基于 lru_cache 的反向端口创建了 lru2cache 。

python - Python：构建 LRU 缓存

4 回答 4

Related

Reference