3

在 Django 中,我有类似于此示例的模型:

class Currency(models.Model):
    name = models.CharField(max_length=3, unique=True)
    full_name = models.CharField(max_length=20)


class ExchangeRate(models.Model):
    currency = models.ForeignKey('Currency')
    start_date = models.DateFiled()
    end_date = models.DateField()
    exchange_rate = models.DecimalField(max_digits=12, decimal_places=4)

让我们简化一下,假设我们只有一种货币,ExchangeRate表格如下所示:

+---------------------+-------------------+------------+------------+---------------+
| currency_from__name | currency_to__name | start_date |  end_date  | exchange_rate |
+---------------------+-------------------+------------+------------+---------------+
|        PLN          |        USD        | 2014-03-01 | 2014-08-01 |    3.00000    |
|        PLN          |        USD        | 2014-08-01 | 2014-12-01 |    6.00000    |
+---------------------+-------------------+------------+------------+---------------+

请注意,这是简化数学运算的示例!

在这个表中,数据密度是每月一次,一个月的有效记录是例如 whenstart_date = 2014.03.01end_date = 2014.04.01,所以start_date是包含的和end_date不包含的。

我想计算时间段的平均汇率:

[2014.06.01;  2012.09.01)

这意味着:>= 2014.06.01< 2014.09.01

在 Django 中,我写:

start_date = date(2014, 6, 1)
end_date = date(2014, 9, 1)

ExchangeRate.objects.all().filter(
        (
            Q(start_date__lt=start_date) & 
            Q(end_date__gt=start_date)
        ) | (
            Q(start_date__gte=start_date) & 
            Q(start_date__lt=end_date) & 
            Q(end_date__gt=start_date) 
        )
).annotate(
    currency_from_name = 'currency_from__name', 
    currency_to_name = 'currency_to__name'
).values(  # GROUP BY
    'currency_from_name',
    'currency_to_name'
).aggregate(
    F('currency_from_name'), 
    F('currency_to_name'), 
    Avg('exchange_rate')
)

在此查询之后,我收到的值4.5000从数学原因是正确的,但当您需要注意时间范围时是错误的。
正确答案是4.000

我只是想出了这个解决方案来用这个公式注释额外的列,然后计算这个列的平均值:

https://www.codecogs.com/eqnedit.php?latex=\inline&space;Abs&space;\left&space;(&space;\frac{months&space;\left&space;(&space;greater(ER_{start_date}\&space;,\&space ;start_date),&space;smaller(ER_{start_date}\&space;,\&space;end_date)&space;\right&space;)&space;}{months(start_date\&space;,\&space;end_date)}&space;\right&space;) &space;*&space;ER_{exchange_rate}

在哪里:

我正在使用9.3 PostgreSQL DBDjango 1.8.4

也许有一个简单的功能?
也许我过于复杂了?

4

3 回答 3

3

1 months_between().:

create function months_of(interval)
 returns int strict immutable language sql as $$
  select extract(years from $1)::int * 12 + extract(month from $1)::int
$$;

create function months_between(date, date)
 returns int strict immutable language sql as $$
   select months_of(age($1, $2))
$$;

2.average_weight():

create function average_weight(numeric, date, date, date, date)
 returns numeric(9,2) strict immutable language sql as $$
   select abs(months_between(GREATEST($2, $4), LEAST($3, $5))/months_between($4, $5))*$1
$$;

3.AverageWeight:

from django.db.models.aggregates import Func
from django.db.models.fields import FloatField

class AverageWeight(Func):
    function = 'average_weight'

    def __init__(self, *expressions):
        super(AverageWeight, self).__init__(*expressions, output_field=FloatField())

在您看来:

ExchangeRate.objects.all().filter(
        (
            Q(start_date__lt=start_date) & 
            Q(end_date__gt=start_date)
        ) | (
            Q(start_date__gte=start_date) & 
            Q(start_date__lt=end_date) & 
            Q(end_date__gt=start_date) 
        )
).annotate(
    currency_from_name = 'currency_from__name', 
    currency_to_name = 'currency_to__name',
    weight_exchange = AverageWeight(
        F('exchange_rate'),
        start_date,
        end_date,
        F('start_date'),
        F('end_date'),
    )
).values(  # GROUP BY
    'currency_from_name',
    'currency_to_name'
).aggregate(
    F('currency_from_name'), 
    F('currency_to_name'), 
    Avg('weight_exchange')
)
于 2015-09-08T20:09:27.493 回答
2

您的应用程序的问题在于您选择存储汇率的方式。所以,回答你的问题:是的,你把这个复杂化了。

“数学”告诉你平均汇率是 4.5,因为

(3 + 6) /2 == 4.5 

无论您选择什么开始日期或结束日期,系统都会为您提供相同的值。

为了解决根本原因,让我们尝试不同的方法。(为简单起见,我将保留与获取特定日期范围内的平均值无关的外键和其他细节,您可以稍后将它们添加回来)

使用此模型:

class ExchangeRate(models.Model):
    currency1 = models.CharField(max_length=3)
    currency2 = models.CharField(max_length=3)
    start_date = models.DateField()
    exchange_rate = models.DecimalField(max_digits=12, decimal_places=4)

这个数据:

INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-03-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-04-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-05-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-06-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-07-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-08-01', 6);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-09-01', 6);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-10-01', 6);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-11-01', 6);

我们可以执行这个查询:

from django.db.models import Avg
from datetime import date

first_date = date(2014, 6, 1)
last_date = date(2014, 9, 1)
er.models.ExchangeRate.objects.filter(
    start_date__gte = first_date,
    start_date__lt = last_date

).aggregate(Avg('exchange_rate'))

要获得此输出:

{'exchange_rate__avg': 4.0}
于 2015-09-07T15:50:20.487 回答
0

您应该将此视为加权平均值,因此您要做的是计算每条线的权重,然后将它们加在一起。

我不知道足够多的 Django 来帮助你,但在 SQL 中会是这样(我现在无法测试这个,但我认为它给出了正确的想法):

SELECT SUM((LEAST(end_date, @end_date) - GREATEST(start_date, @start_date)) * exchange_rate) / (@end_date - @start_date) AS weighted_avg
FROM 
  ExchangeRate
WHERE
  (start_date, end_date) OVERLAPS (@start_date, @end_date)

这使用 OVERLAPS 运算符来查看周期是否重叠。我不确定权重计算中是否存在 of by 1 错误,但认为这应该在输入变量的定义中处理(@end_date = @end_date - 1)

于 2015-09-02T11:33:59.280 回答