I have been trying to understand how Cassandra queries work, because they don't seem to behave the way I expect.

This is the table I'm currently using:

fields: {
    stats_customer_id: {
      type: 'uuid',
      default: {
        '$db_function': 'uuid()'
      }
    },
    stats_customer_id_old: 'text',
    stats_date_id: {
      type: 'timeuuid',
      default: {
        '$db_function': 'now()'
      }
    },
    provider_id: 'uuid',
    customer_id: 'uuid',
    customer_name: 'text',
    customer_account_no: 'text',
    direct_sent: 'int',
    messages_sent: 'int',
    reminders_sent: 'int',
    reminders_pending: 'int',
    replies_sent: 'int',
    binary_sent: 'int',
  },
  key: [
    [
      'provider_id'
    ],
    'stats_date_id',
    'customer_id'
  ]

Note: I am 100% open to modifying or even completely discarding this table to get the result described below.

My query can be described as:

For a given provider_id and date range (to and from date),
return a list of Customers (distinct) with a sum of each int field
(direct_sent, messages_sent, reminders_sent, reminders_pending, binary_sent).
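
To make the desired result concrete, the aggregation can be sketched in plain JavaScript over rows that are already in memory. This only illustrates the output shape I'm after (one entry per customer, with each counter summed); it is not a Cassandra query. `sumByCustomer`, `COUNTER_FIELDS`, and the sample rows are hypothetical names and made-up data, and `replies_sent` is included because the select list in my query sums it as well.

```javascript
// Hypothetical helper (not part of express-cassandra): sums the int
// counters per customer for rows already filtered by provider and date.
const COUNTER_FIELDS = [
  'direct_sent', 'messages_sent', 'reminders_sent',
  'reminders_pending', 'replies_sent', 'binary_sent',
];

function sumByCustomer(rows) {
  const totals = new Map();
  for (const row of rows) {
    const key = String(row.customer_id);
    if (!totals.has(key)) {
      // First row seen for this customer: start every counter at zero.
      const entry = { customer_id: key, customer_name: row.customer_name };
      for (const field of COUNTER_FIELDS) entry[field] = 0;
      totals.set(key, entry);
    }
    const entry = totals.get(key);
    // Missing or null counters are treated as zero.
    for (const field of COUNTER_FIELDS) entry[field] += row[field] || 0;
  }
  return [...totals.values()];
}

// Made-up sample rows for one provider and date range.
const sampleRows = [
  { customer_id: 'c1', customer_name: 'Alice', direct_sent: 1, messages_sent: 2 },
  { customer_id: 'c2', customer_name: 'Bob', direct_sent: 5, reminders_sent: 1 },
  { customer_id: 'c1', customer_name: 'Alice', direct_sent: 3, binary_sent: 4 },
];

const perCustomer = sumByCustomer(sampleRows);
console.log(perCustomer);
```

With these sample rows I would expect two entries back, one per customer, rather than a single row holding the totals for everyone.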

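After trying several approaches with select statements, group_by, and other things, I always get back a single customer, which appears to contain the sum for all customers in the given date range.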

An example of the current query, using the express-cassandra npm library, is shown below:

let query = {
    provider_id: providerId,
    '$groupby': ['customer_id']
};

if (fromDate && toDate) {
    query.stats_date_id = {
      '$gte': models.minTimeuuid(fromDate),
      '$lte': models.maxTimeuuid(toDate)
    };
}
let selectQueries = [
    'provider_id',
    'customer_id',
    'customer_name',
    'sum(direct_sent) as direct_sent',
    'sum(messages_sent) as messages_sent',
    'sum(reminders_sent) as reminders_sent',
    'sum(reminders_pending) as reminders_pending',
    'sum(replies_sent) as replies_sent',
    'sum(binary_sent) as binary_sent',
];

// Query stats_customer table
let customerData = await models.instance.StatsCustomer.findAsync(query, {select: selectQueries});
return customerData;

I also need to be able to insert data into this table at a rate of 1k to 100k entries per day.

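I assume I'm misunderstanding something fairly fundamental about Cassandra that is causing this behavior. As noted above, I am completely open to rewriting or even dropping the table in order to satisfy the query described above.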

Thanks in advance.
