
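I need to compute the average datetime of a table, and I am using aggregate with Avg, but it returns a float instead of a datetime object. What exactly does this float represent?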

And, most importantly, how do I convert it to a datetime object?


2 Answers


For further reference, I had to deal with a similar problem. Consider the models:

from django.db import models

class Championship(models.Model):
    ...

class Game(models.Model):
    date = models.DateField()
    championship = models.ForeignKey(Championship)

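Some games belong to a championship, and I want to return the average of the game dates for that championship. For example, if I have one game on January 1 and one on January 3, I want to return January 2.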
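
To make the expected result concrete, here is a minimal plain-Python sketch with no Django involved. The helper name mean_date and the year 2014 are made up for illustration; it averages the dates through their proleptic Gregorian ordinals.

```python
import datetime


def mean_date(dates):
    """Return the mean of a non-empty list of datetime.date objects."""
    # date.toordinal() maps each date to an integer day count,
    # which can be averaged like any other number.
    ordinals = [d.toordinal() for d in dates]
    # Floor division keeps the result an int, as fromordinal() requires.
    return datetime.date.fromordinal(sum(ordinals) // len(ordinals))


games = [datetime.date(2014, 1, 1), datetime.date(2014, 1, 3)]
print(mean_date(games))  # 2014-01-02
```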

On a PostgreSQL backend, naively using the built-in Avg aggregate does not work (because Avg is not designed for date/time fields):

>>> championship.game_set.aggregate(Avg('date'))
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "~/env/local/lib/python2.7/site-packages/django/db/models/manager.py", line 158, in aggregate
    return self.get_query_set().aggregate(*args, **kwargs)
  File "~/env/local/lib/python2.7/site-packages/django/db/models/query.py", line 359, in aggregate
    return query.get_aggregation(using=self.db)
  File "~/env/local/lib/python2.7/site-packages/django/db/models/sql/query.py", line 389, in get_aggregation
    result = query.get_compiler(using).execute_sql(SINGLE)
  File "~/env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 840, in execute_sql
    cursor.execute(sql, params)
  File "~/env/local/lib/python2.7/site-packages/django/db/backends/util.py", line 41, in execute
    return self.cursor.execute(sql, params)
  File "~/env/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 58, in execute
    six.reraise(utils.DatabaseError, utils.DatabaseError(*tuple(e.args)), sys.exc_info()[2])
  File "~/env/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 54, in execute
    return self.cursor.execute(query, args)
DatabaseError: function avg(date) does not exist
LINE 1: SELECT AVG("games_game"."date") AS "date__avg" FROM "games_g...
           ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

So I tried two solutions: one using Django querysets and Python, the other mostly raw SQL.

import datetime

def compute_avg_date(self):
    """
    Return the average date of the championship's game set.
    Casts dates into time deltas, in order to perform a python mean.
    """
    # Evaluate the queryset once so the sum and len() share the same rows.
    dates = list(self.game_set.values_list('date', flat=True))
    origin_date = datetime.date.min
    try:
        # Average the offsets from origin_date, then shift back to a date.
        total = sum((d - origin_date for d in dates), datetime.timedelta(0))
        return total / len(dates) + origin_date
    except ZeroDivisionError:
        # No games at all: fall back to today's date.
        return datetime.date.today()

def compute_avg_date_db(self):
    """
    Does the same as above but directly in db operations.
    """
    try:
        # Average the dates as epoch seconds in PostgreSQL, then convert back.
        return self.game_set.extra(
            select={
                'avg_time': 'to_timestamp(avg(extract(epoch from date)))'
                }).values_list(
            'avg_time', flat=True)[0].date()
    except AttributeError:
        # With no games the average is NULL (None), so .date() fails.
        return datetime.date.today()
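
The SQL expression in compute_avg_date_db converts each date to epoch seconds, averages them, and converts the mean back to a timestamp. Below is a rough plain-Python sketch of the same idea. It assumes every date is taken at midnight UTC (PostgreSQL's handling of the session time zone may differ), and the helper name mean_date_via_epoch is made up for this example.

```python
import calendar
import datetime


def mean_date_via_epoch(dates):
    """Average a non-empty list of dates through their epoch seconds."""
    # Roughly what extract(epoch from date) does: seconds since
    # 1970-01-01, with each date read as midnight UTC (an assumption).
    seconds = [calendar.timegm(d.timetuple()) for d in dates]
    avg_seconds = sum(seconds) / len(seconds)
    # Roughly what to_timestamp() does: turn the float back into a
    # datetime, then keep only the date part.
    avg_dt = datetime.datetime.fromtimestamp(avg_seconds, tz=datetime.timezone.utc)
    return avg_dt.date()


games = [datetime.date(2014, 1, 1), datetime.date(2014, 1, 3)]
print(mean_date_via_epoch(games))  # 2014-01-02
```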

I assumed the DB-only version would be faster, so I set up a small benchmark.

>>> import timeit
>>> s = """\
... from championships.models import Championship
... champ = Championship.objects.get(pk=1)
... champ.compute_avg_date_db()
... """
>>> s1 = """\
... from championships.models import Championship
... champ = Championship.objects.get(pk=1)
... champ.compute_avg_date()
... """
>>> timeit.timeit(stmt=s, number=1000)
8.195073127746582
>>> timeit.timeit(stmt=s1, number=1000)
6.377335071563721

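I ran several other similar tests, and all of them showed that compute_avg_date, the method using Django querysets and Python, is slightly faster than the raw SQL method. I am not an expert, so if someone can explain it, comments are welcome.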
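
One thing that may blur the comparison: in the benchmark above, the import and the Championship lookup sit inside the timed statement, so they run on every iteration. Below is a sketch of moving such preparation into timeit's setup argument, which runs once and is not timed. The data and statement are stand-ins in plain Python, not the Django methods themselves.

```python
import timeit

# Stand-in preparation (plays the role of the import and the
# Championship lookup); timeit runs it once and does not time it.
setup = """
import datetime
origin = datetime.date.min
dates = [datetime.date(2014, 1, 1) + datetime.timedelta(days=i) for i in range(100)]
"""

# Only this statement is timed, 1000 times.
stmt = "sum((d - origin for d in dates), datetime.timedelta(0)) / len(dates) + origin"

elapsed = timeit.timeit(stmt=stmt, setup=setup, number=1000)
print(elapsed)
```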

Answered 2014-01-06T16:39:00.893

I just tested this on my own models, and the float appears to be the average of the year.

Answered 2013-06-27T19:44:23.217