
I want to implement a decorator that provides per-request caching to any method, not just views. Here is an example use case.

I have a custom tag that determines whether a record in a long list of records is a "favorite". To check if an item is a favorite, you have to query the database. Ideally, you would perform one query to get all the favorites, and then just check that cached list against each record.

One solution is to get all the favorites in the view, and then pass that set into the template, and then into each tag call.

Alternatively, the tag itself could perform the query, but only on the first call. Then the results could be cached for subsequent calls. The upside is that you can use this tag from any template, on any view, without alerting the view.

In the existing caching mechanism you could cache the result for 50 ms and assume that this correlates to the current request. I want to make that correlation reliable.

Here is an example of the tag I currently have.

@register.filter()
def is_favorite(record, request):

    if "get_favorites" in request.POST:
        favorites = request.POST["get_favorites"]
    else:

        favorites = get_favorites(request.user)

        post = request.POST.copy()
        post["get_favorites"] = favorites
        request.POST = post

    return record in favorites

Is there a way to get the current request object from Django, without passing it around? From a tag, I could just pass in request, which will always exist. But I would like to use this decorator from other functions.

Is there an existing implementation of a per-request cache?


7 Answers


Using a custom middleware you can get a Django cache instance that is guaranteed to be cleared for each request.

This is what I used in a project:

from threading import currentThread
from django.core.cache.backends.locmem import LocMemCache

_request_cache = {}
_installed_middleware = False

def get_request_cache():
    assert _installed_middleware, 'RequestCacheMiddleware not loaded'
    return _request_cache[currentThread()]

# LocMemCache is a threadsafe local memory cache
class RequestCache(LocMemCache):
    def __init__(self):
        name = 'locmemcache@%i' % hash(currentThread())
        params = dict()
        super(RequestCache, self).__init__(name, params)

class RequestCacheMiddleware(object):
    def __init__(self):
        global _installed_middleware
        _installed_middleware = True

    def process_request(self, request):
        cache = _request_cache.get(currentThread()) or RequestCache()
        _request_cache[currentThread()] = cache

        cache.clear()

To use the middleware, register it in settings.py, e.g.:

MIDDLEWARE_CLASSES = (
    ...
    'myapp.request_cache.RequestCacheMiddleware'
)

Then you can use the cache as follows:

from myapp.request_cache import get_request_cache

cache = get_request_cache()

For more information, refer to the Django low-level cache API docs:

Django low-level cache API

It should be easy to modify a memoize decorator to use the request cache. Take a look at the Python Decorator Library for a good example of a memoize decorator:

Python Decorator Library
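A memoize decorator along those lines could look roughly like this. This is a sketch, not code from the answer: to keep it self-contained, a plain dict stands in for the per-request cache; with the middleware above you would fetch the cache via get_request_cache() and use its cache.get()/cache.set() API instead of dict indexing.

```python
from functools import wraps

# Stand-in for the per-request cache. With the middleware above you would
# call the real get_request_cache() and use .get()/.set() on the result.
_demo_cache = {}

def get_request_cache():
    return _demo_cache

def request_cached(func):
    """Memoize func's results for the lifetime of the current request (sketch).
    Keys are built from the function name plus its positional arguments,
    so all arguments must be hashable."""
    @wraps(func)
    def wrapper(*args):
        cache = get_request_cache()
        key = (func.__name__,) + args
        if key not in cache:
            cache[key] = func(*args)
        return cache[key]
    return wrapper

calls = []

@request_cached
def get_favorites(user):
    calls.append(user)           # track real invocations
    return {'rec-1', 'rec-2'}    # pretend this hit the database

get_favorites('alice')
get_favorites('alice')           # second call is served from the cache
print(len(calls))                # 1: the query ran only once
```

With the middleware installed, the decorated function can then be called from a template tag, a view, or any other function, without passing the request around.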

answered 2012-01-06T14:17:16.023

EDIT:

The final solution I came up with has been compiled into a PyPI package: https://pypi.org/project/django-request-cache/

EDIT 2016-06-15:

I discovered a significantly simpler solution to this problem, and am kind of facepalming for not realizing from the start how easy this should have been.

from django.core.cache.backends.base import BaseCache
from django.core.cache.backends.locmem import LocMemCache
from django.utils.synch import RWLock


class RequestCache(LocMemCache):
    """
    RequestCache is a customized LocMemCache which stores its data cache as an instance attribute, rather than
    a global. It's designed to live only as long as the request object that RequestCacheMiddleware attaches it to.
    """

    def __init__(self):
        # We explicitly do not call super() here, because while we want BaseCache.__init__() to run, we *don't*
        # want LocMemCache.__init__() to run, because that would store our caches in its globals.
        BaseCache.__init__(self, {})

        self._cache = {}
        self._expire_info = {}
        self._lock = RWLock()

class RequestCacheMiddleware(object):
    """
    Creates a fresh cache instance as request.cache. The cache instance lives only as long as request does.
    """

    def process_request(self, request):
        request.cache = RequestCache()

With this, you can use request.cache as a cache instance that lives only as long as the request does, and that will be fully cleaned up by the garbage collector once the request is done.

If you need access to the request object from contexts where it isn't normally available, you can use one of the various implementations of a so-called "global request middleware" that can be found online.
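Applied back to the question's is_favorite filter, usage could look roughly like this. This is a sketch: FakeRequest stands in for an HttpRequest that RequestCacheMiddleware has already attached request.cache to, and a plain dict plays the cache so the example is self-contained; a real RequestCache would use cache.get()/cache.set().

```python
def get_favorites(user):
    # pretend this is the expensive database query from the question
    return {'rec-1', 'rec-7'}

class FakeRequest:
    """Stand-in for an HttpRequest after RequestCacheMiddleware has run."""
    def __init__(self, user):
        self.user = user
        self.cache = {}  # the middleware would attach a RequestCache here

def is_favorite(record, request):
    favorites = request.cache.get('favorites')
    if favorites is None:
        favorites = get_favorites(request.user)
        request.cache['favorites'] = favorites  # cache.set(...) on a real cache
    return record in favorites

request = FakeRequest('alice')
print(is_favorite('rec-1', request))  # True; the first call populates the cache
print(is_favorite('rec-2', request))  # False; answered from request.cache
```

The query runs once per request no matter how many records the template checks, which is exactly the behavior the question asks for.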

**Original answer:**

One major problem that no other solution here addresses is the fact that LocMemCache leaks memory when you create and destroy several of them over the life of a single process: django.core.cache.backends.locmem defines several global dictionaries that hold references to the cache data of every LocMemCache instance, and those dictionaries are never emptied.

The following code solves this problem. It started as a combination of @href_'s answer and the cleaner logic used by the code linked in @squarelogic.hayden's comment, which I then refined further.

from uuid import uuid4
from threading import current_thread

from django.core.cache.backends.base import BaseCache
from django.core.cache.backends.locmem import LocMemCache
from django.utils.synch import RWLock


# Global in-memory store of cache data. Keyed by name, to provide multiple
# named local memory caches.
_caches = {}
_expire_info = {}
_locks = {}


class RequestCache(LocMemCache):
    """
    RequestCache is a customized LocMemCache with a destructor, ensuring that creating
    and destroying RequestCache objects over and over doesn't leak memory.
    """

    def __init__(self):
        # We explicitly do not call super() here, because while we want
        # BaseCache.__init__() to run, we *don't* want LocMemCache.__init__() to run.
        BaseCache.__init__(self, {})

        # Use a name that is guaranteed to be unique for each RequestCache instance.
        # This ensures that it will always be safe to call del _caches[self.name] in
        # the destructor, even when multiple threads are doing so at the same time.
        self.name = uuid4()
        self._cache = _caches.setdefault(self.name, {})
        self._expire_info = _expire_info.setdefault(self.name, {})
        self._lock = _locks.setdefault(self.name, RWLock())

    def __del__(self):
        del _caches[self.name]
        del _expire_info[self.name]
        del _locks[self.name]


class RequestCacheMiddleware(object):
    """
    Creates a cache instance that persists only for the duration of the current request.
    """

    _request_caches = {}

    def process_request(self, request):
        # The RequestCache object is keyed on the current thread because each request is
        # processed on a single thread, allowing us to retrieve the correct RequestCache
        # object in the other functions.
        self._request_caches[current_thread()] = RequestCache()

    def process_response(self, request, response):
        self.delete_cache()
        return response

    def process_exception(self, request, exception):
        self.delete_cache()

    @classmethod
    def get_cache(cls):
        """
        Retrieve the current request's cache.

        Returns None if RequestCacheMiddleware is not currently installed via 
        MIDDLEWARE_CLASSES, or if there is no active request.
        """
        return cls._request_caches.get(current_thread())

    @classmethod
    def clear_cache(cls):
        """
        Clear the current request's cache.
        """
        cache = cls.get_cache()
        if cache:
            cache.clear()

    @classmethod
    def delete_cache(cls):
        """
        Delete the current request's cache object to avoid leaking memory.
        """
        cache = cls._request_caches.pop(current_thread(), None)
        del cache


answered 2016-05-03T22:39:26.453

I came up with a hack for caching things straight into the request object (instead of using the standard cache, which will be tied to memcached, files, a database, etc.)

# get the request object's dictionary (rather, one of its methods' dictionaries)
mycache = request.get_host.__dict__

# check whether we already have our value cached and return it
if mycache.get( 'c_category', False ):
    return mycache['c_category']
else:
    # get some object from the database (a category object in this case)
    c = Category.objects.get( id = cid )

    # cache the database object into a new key in the request object
    mycache['c_category'] = c

    return c

So, basically I am just storing the cached value (the category object in this case) under the new key 'c_category' in the dictionary of the request. Or, to be more precise: since we can't just create a key on the request object itself, I am adding the key to one of the request object's methods, get_host().

Georgy.

answered 2012-10-05T22:43:35.753

Years later, here is a super hack for caching SELECT statements inside a single Django request. You need to execute the patch() method early on in your request scope, such as in a piece of middleware.

from threading import local
import itertools
from django.db.models.sql.constants import MULTI
from django.db.models.sql.compiler import SQLCompiler
from django.db.models.sql.datastructures import EmptyResultSet
from django.db.models.sql.constants import GET_ITERATOR_CHUNK_SIZE


_thread_locals = local()


def get_sql(compiler):
    ''' get a tuple of the SQL query and the arguments '''
    try:
        return compiler.as_sql()
    except EmptyResultSet:
        pass
    return ('', [])


def execute_sql_cache(self, result_type=MULTI):

    if hasattr(_thread_locals, 'query_cache'):

        sql = get_sql(self)  # ('SELECT * FROM ...', (50)) <= sql string, args tuple
        if sql[0][:6].upper() == 'SELECT':

            # uses the tuple of sql + args as the cache key
            if sql in _thread_locals.query_cache:
                return _thread_locals.query_cache[sql]

            result = self._execute_sql(result_type)
            if hasattr(result, 'next'):

                # only cache if this is not a full first page of a chunked set
                peek = result.next()
                result = list(itertools.chain([peek], result))

                if len(peek) == GET_ITERATOR_CHUNK_SIZE:
                    return result

            _thread_locals.query_cache[sql] = result

            return result

        else:
            # the database has been updated; throw away the cache
            _thread_locals.query_cache = {}

    return self._execute_sql(result_type)


def patch():
    ''' patch the django query runner to use our own method to execute sql '''
    _thread_locals.query_cache = {}
    if not hasattr(SQLCompiler, '_execute_sql'):
        SQLCompiler._execute_sql = SQLCompiler.execute_sql
        SQLCompiler.execute_sql = execute_sql_cache

The patch() method replaces the Django-internal execute_sql method with a stand-in called execute_sql_cache. That method looks at the sql to be run, and if it's a select statement, it checks a thread-local cache first. Only if the query is not found in the cache does it proceed to execute the SQL. On any other type of sql statement, it blows away the cache. There is some logic to not cache large result sets, meaning anything over 100 records. This is to preserve Django's lazy query set evaluation.
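One way to run patch() early in the request scope, as required above, is a tiny middleware. This is a sketch: the class name is made up, and a stub stands in for the real patch() above so that the snippet runs on its own.

```python
patched = []

def patch():
    # Stub for the real patch() defined above, which swaps
    # SQLCompiler.execute_sql for execute_sql_cache and resets
    # _thread_locals.query_cache at the start of each request.
    patched.append(True)

class QueryCacheMiddleware(object):
    """Ensures the SQL-cache patch is installed before any view runs."""
    def process_request(self, request):
        patch()

QueryCacheMiddleware().process_request(request=None)
print(len(patched))  # 1: patch() ran at the start of the request
```

Calling patch() on every request also matters for correctness: it reinitializes the thread-local query cache, so cached SELECT results cannot leak from one request to the next on the same worker thread.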

answered 2013-01-19T00:16:43.563

This one uses a plain python dict as the cache (not django's cache), and is dead simple and lightweight.

  • Whenever the thread is destroyed, its cache will be destroyed automatically with it.
  • Does not require any middleware, and the contents are not pickled and unpickled on every access, which is faster.
  • Tested, and works with gevent's monkey patching.

The same can probably be implemented with thread-local storage, too. I am not aware of any downsides of this approach; feel free to add them in the comments.

from threading import currentThread
import weakref

_request_cache = weakref.WeakKeyDictionary()

def get_request_cache():
    return _request_cache.setdefault(currentThread(), {})
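The thread-local variant mentioned above could look like this (a sketch; same per-thread semantics as the WeakKeyDictionary helper, with the dict released when its thread exits):

```python
import threading

_local = threading.local()

def get_request_cache():
    """Per-thread plain-dict cache: the threading.local equivalent of the
    WeakKeyDictionary helper above."""
    if not hasattr(_local, 'cache'):
        _local.cache = {}
    return _local.cache

cache = get_request_cache()
cache['favorites'] = {'rec-1'}
print(get_request_cache() is cache)  # True: same dict within this thread
```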

answered 2013-07-10T19:43:17.900

You can always do the caching manually.

    ...
    if "get_favorites" in request.POST:
        favorites = request.POST["get_favorites"]
    else:
        from django.core.cache import cache

        favorites = cache.get(request.user.username)
        if not favorites:
            favorites = get_favorites(request.user)
            cache.set(request.user.username, favorites, seconds)
    ...
answered 2010-06-30T19:25:11.060

The answer given by @href_ is great.

Just in case you want something shorter that could also potentially do the trick:

import time

from django.utils.lru_cache import lru_cache

def cached_call(func, *args, **kwargs):
    """Very basic temporary cache, will cache results
    for an average of 1.5 sec and no more than 3 sec"""
    return _cached_call(int(time.time() / 3), func, *args, **kwargs)


@lru_cache(maxsize=100)
def _cached_call(time, func, *args, **kwargs):
    return func(*args, **kwargs)

Then to get favourites, call it like this:

favourites = cached_call(get_favourites, request.user)

This method makes use of the lru cache, and by combining it with a timestamp we make sure that the cache doesn't hold anything for longer than a few seconds. If you need to call an expensive function several times in a short period of time, this solves the problem.

It is not a perfect way to invalidate the cache, because occasionally it will miss very recent data: int(..2.99../3) followed by int(..3.00../3). Despite this drawback, it is still very effective for the majority of hits.

Also, as a bonus, you can use it outside request/response cycles, for example in celery tasks or management command jobs.

answered 2016-04-13T10:30:30.783