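I need to calculate the average datetime of a table. I am using aggregate with Avg, but it returns a float instead of a datetime object. What exactly does this float represent?
And, most importantly, how do I convert it to a datetime object?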
For further reference, I had to deal with a similar problem. Consider these models:
from django.db import models

class Championship(models.Model):
    ...

class Game(models.Model):
    date = models.DateField()
    championship = models.ForeignKey(Championship)
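Some games are linked to a championship, and I want to return the average of the game dates for that championship. For example, if I have one game on January 1 and one on January 3, I want to get January 2.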
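To make the goal concrete, here is a minimal plain-Python sketch of "the mean of two dates", with no Django involved. The helper name mean_date and the year 2013 are hypothetical and chosen only for illustration. It averages the proleptic Gregorian ordinals of the dates and truncates any fractional day.

```python
import datetime


def mean_date(dates):
    """Return the mean of a non-empty list of datetime.date objects.

    Hypothetical helper for illustration: averages the day ordinals and
    truncates any fractional day with integer division.
    """
    ordinals = [d.toordinal() for d in dates]
    return datetime.date.fromordinal(sum(ordinals) // len(ordinals))


# The year is arbitrary; only the day arithmetic matters here.
dates = [datetime.date(2013, 1, 1), datetime.date(2013, 1, 3)]
print(mean_date(dates))  # 2013-01-02
```

An empty list raises ZeroDivisionError here, so a caller would need to decide what "average of no dates" should mean.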
On a PostgreSQL backend, naively using the built-in Avg aggregate does not work, because Avg is not designed for date fields:
>>> championship.game_set.aggregate(Avg('date'))
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "~/env/local/lib/python2.7/site-packages/django/db/models/manager.py", line 158, in aggregate
    return self.get_query_set().aggregate(*args, **kwargs)
  File "~/env/local/lib/python2.7/site-packages/django/db/models/query.py", line 359, in aggregate
    return query.get_aggregation(using=self.db)
  File "~/env/local/lib/python2.7/site-packages/django/db/models/sql/query.py", line 389, in get_aggregation
    result = query.get_compiler(using).execute_sql(SINGLE)
  File "~/env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 840, in execute_sql
    cursor.execute(sql, params)
  File "~/env/local/lib/python2.7/site-packages/django/db/backends/util.py", line 41, in execute
    return self.cursor.execute(sql, params)
  File "~/env/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 58, in execute
    six.reraise(utils.DatabaseError, utils.DatabaseError(*tuple(e.args)), sys.exc_info()[2])
  File "~/env/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 54, in execute
    return self.cursor.execute(query, args)
DatabaseError: function avg(date) does not exist
LINE 1: SELECT AVG("games_game"."date") AS "date__avg" FROM "games_g...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
So I tried two solutions: one using a Django queryset plus Python, and a second one relying mainly on raw SQL.
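import datetime

# Methods on the Championship model.
def compute_avg_date(self):
    """
    Return the average date of the championship's game set.
    Casts dates into time deltas, in order to perform a python mean.
    """
    # Evaluate the queryset once so it can be both iterated and counted.
    dates = list(self.game_set.values_list('date', flat=True))
    origin_date = datetime.date.min
    try:
        # Average the offsets from origin_date, then shift back to a date.
        offsets = [date - origin_date for date in dates]
        return sum(offsets, datetime.timedelta(0)) / len(dates) + origin_date
    except ZeroDivisionError:
        # No games: fall back to today's date.
        return datetime.date.today()
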
def compute_avg_date_db(self):
    """
    Does the same as above but directly in db operations.
    """
    try:
        # PostgreSQL: dates -> epoch seconds -> average -> timestamp.
        return self.game_set.extra(
            select={
                'avg_time': 'to_timestamp(avg(extract(epoch from date)))'
            }).values_list(
                'avg_time', flat=True)[0].date()
    except AttributeError:
        # With no games the average is NULL, so .date() is called on None.
        return datetime.date.today()
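The SQL expression above works by turning each date into seconds since the Unix epoch, averaging those numbers, and converting the result back into a timestamp. The same round trip can be sketched in plain Python. This only answers the "how do I convert the float" question if the float really is a count of epoch seconds, which is an assumption: what Avg returns for a date or datetime column depends on the database backend. The helper names and the sample dates below are hypothetical.

```python
import datetime

# Unix epoch as a naive datetime, treated as UTC (assumption for this sketch).
EPOCH = datetime.datetime(1970, 1, 1)


def to_epoch_seconds(d):
    """Seconds between the Unix epoch and midnight on date d."""
    midnight = datetime.datetime.combine(d, datetime.time())
    return (midnight - EPOCH).total_seconds()


def from_epoch_seconds(seconds):
    """Convert a float count of epoch seconds back into a naive datetime."""
    return EPOCH + datetime.timedelta(seconds=seconds)


# Hypothetical sample data; the year is arbitrary.
dates = [datetime.date(2013, 1, 1), datetime.date(2013, 1, 3)]
avg_epoch = sum(to_epoch_seconds(d) for d in dates) / float(len(dates))
print(from_epoch_seconds(avg_epoch).date())  # 2013-01-02
```

If the float turns out to be something other than epoch seconds, from_epoch_seconds will still return a datetime, just a meaningless one, so it is worth checking the result against a couple of known rows first.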
I assumed the db-only version would be faster, so I set up a small benchmark:
>>> import timeit
>>> s = """\
... from championships.models import Championship
... champ = Championship.objects.get(pk=1)
... champ.compute_avg_date_db()
... """
>>> s1 = """\
... from championships.models import Championship
... champ = Championship.objects.get(pk=1)
... champ.compute_avg_date()
... """
>>> timeit.timeit(stmt=s, number=1000)
8.195073127746582
>>> timeit.timeit(stmt=s1, number=1000)
6.377335071563721
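I ran a few other similar tests, and all of them showed that compute_avg_date, the method using a Django queryset and Python, is slightly faster than the raw SQL method. I'm no expert, so if anyone can explain why, comments are welcome.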
I just tested this on my own model, and the float seems to be the average of the year.