Questions tagged [hyperloglog]

0 votes
1 answer
38 views

redis - Redis - Count-distinct problem (without HyperLogLog)

I need to solve a count-distinct problem in Redis without using HyperLogLog (because of its known standard error of 0.81%).

I receive multiple requests, each carrying a list of objects [O1, O2, ... On] for a specific key A. For each list received, Redis should store the objects that have not been saved yet and return the number of newly saved objects.

For example:

  • Request 1 : Key: A - Objects: [O1, O2, O3] -> Response 1: Number of new objects : 3
  • Request 2 : Key: A - Objects: [O1, O2, O4] -> Response 2: Number of new objects : 1
  • Request 3 : Key: A - Objects: [O1, O2, O4] -> Response 3: Number of new objects : 0
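The expected behaviour above matches what the Redis SADD command reports: it returns the number of members that were actually added, not counting members already in the set. A minimal sketch of those semantics follows. It uses a small in-memory stand-in (the hypothetical `FakeRedisSets` class, not a real client) so it runs without a server. This is the exact but memory-hungry baseline, shown only to pin down the required responses.

```python
# Stand-in for a Redis set. In Redis, `SADD key m1 m2 ...` returns the number
# of members that were actually added (already-present members are not counted).
# FakeRedisSets is a hypothetical helper for illustration, not a real client.
class FakeRedisSets:
    def __init__(self):
        self._sets = {}

    def sadd(self, key, *members):
        existing = self._sets.setdefault(key, set())
        before = len(existing)
        existing.update(members)
        return len(existing) - before


store = FakeRedisSets()
r1 = store.sadd("A", "O1", "O2", "O3")  # all three objects are new
r2 = store.sadd("A", "O1", "O2", "O4")  # only O4 is new
r3 = store.sadd("A", "O1", "O2", "O4")  # nothing is new
print(r1, r2, r3)  # prints: 3 1 0
```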

I tried solving this with HyperLogLog and it works well, but as the dataset of objects grows, the reported number of new objects becomes inaccurate. With sets and hashes, memory usage grows too much.

I have read a bit about bitmaps, but it is not very clear to me. Do you have any references to projects that have already faced this problem?
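For reference, the bitmap idea can be sketched as follows, under one strong assumption that is not stated in the question: each object can be mapped to a small, dense, non-negative integer ID. Redis SETBIT returns the bit's previous value, so counting the zeros returned gives an exact number of new objects at a cost of one bit per possible ID. The `Bitmap` class and `count_new` function below are hypothetical in-memory stand-ins, not Redis client calls.

```python
# In-memory stand-in for a Redis bitmap. Like Redis SETBIT, setbit() returns
# the previous value of the bit. Bitmap and count_new are hypothetical names.
class Bitmap:
    def __init__(self):
        self._bits = bytearray()

    def setbit(self, offset):
        """Set the bit at `offset` to 1 and return its previous value (0 or 1)."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self._bits):
            # Grow with zero bytes so the target byte exists.
            self._bits.extend(bytes(byte_index + 1 - len(self._bits)))
        mask = 0x80 >> bit_index  # most significant bit first
        previous = 1 if self._bits[byte_index] & mask else 0
        self._bits[byte_index] |= mask
        return previous


def count_new(bitmap, object_ids):
    """Mark every ID as seen and return how many were not seen before."""
    return sum(1 for oid in object_ids if bitmap.setbit(oid) == 0)


# Assumption: O1..O4 have already been mapped to the integer IDs 1..4.
bitmap_a = Bitmap()
first = count_new(bitmap_a, [1, 2, 3])
second = count_new(bitmap_a, [1, 2, 4])
third = count_new(bitmap_a, [1, 2, 4])
print(first, second, third)  # prints: 3 1 0
```

With this layout, memory depends on the largest ID rather than on the number of objects: 100 million possible IDs need about 12.5 MB per key. It only pays off when IDs are dense. Sparse or hashed IDs would waste most of the bitmap.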

Thanks in advance

0 votes
0 answers
56 views

amazon-s3 - Error in Trino when processing a HyperLogLog created in Snowflake

In Trino, I get the error message Cannot deserialize HyperLogLog

I have a query on Snowflake that does the following:

visitor_hll is written to a column of type BINARY(8388608).

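I then have a process that copies this data to Parquet on S3, which I query through Trino. When I try to perform a HyperLogLog operation on the field, for example

I get the error above.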

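What do I need to do in order to use HLL data created in Snowflake?

I searched for the error message, and the only result on Google is the source code of the HLL functions in Airlift.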

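I also saw that Snowflake says "For integration with external tools, Snowflake supports converting states from BINARY format to an OBJECT (which can be printed and exported as JSON), and vice versa." (see HLL_EXPORT). This returns a JSON object, but on the S3 side I don't see any way to import it into an HLL.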

0 votes
0 answers
15 views

apache-spark - Analog of spark-alchemy HyperLogLog for generating LiquidLegions sketches

Taking spark-alchemy HLL [https://github.com/swoop-inc/spark-alchemy][1] as an example, what is the best way to generate LiquidLegions sketches in Spark?