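I have the payload shown below. I need to get the first distinct batch value every 1 minute. Please let me know how to achieve this in Stream Analytics using ISFIRST and LAG or LAST.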

Expected output:

    BATCH=01,"2015-01-01T00:00:01.0000000Z"
    BATCH=02,"2015-01-01T00:00:03.0000000Z"
    BATCH=03,"2015-01-01T00:00:06.0000000Z"
    BATCH=01,"2015-01-01T00:00:14.0000000Z"
    BATCH=02,"2015-01-01T00:00:18.0000000Z"
    BATCH=03,"2015-01-01T00:00:22.0000000Z"
    BATCH=01,"2015-01-01T00:00:27.0000000Z"
    BATCH=01,"2015-01-01T00:00:31.0000000Z"

Payload:
    [
        {"Payload": {"Make": "BATCH1", "VAL": "01", "TS": "2015-01-01T00:00:01.0000000Z"}},
        {"Payload": {"Make": "BATCH1", "VAL": "01", "TS": "2015-01-01T00:00:02.0000000Z"}},
        {"Payload": {"Make": "BATCH1", "VAL": "02", "TS": "2015-01-01T00:00:03.0000000Z"}},
        {"Payload": {"Make": "BATCH1", "VAL": "02", "TS": "2015-01-01T00:00:04.0000000Z"}},
        {"Payload": {"Make": "BATCH1", "VAL": "02", "TS": "2015-01-01T00:00:05.0000000Z"}},
        {"Payload": {"Make": "BATCH1", "VAL": "03", "TS": "2015-01-01T00:00:06.0000000Z"}},
        {"Payload": {"Make": "BATCH1", "VAL": "03", "TS": "2015-01-01T00:00:07.0000000Z"}},
        {"Payload": {"Make": "BATCH1", "VAL": "03", "TS": "2015-01-01T00:00:10.0000000Z"}},
        {"Payload": {"Make": "BATCH1", "VAL": "03", "TS": "2015-01-01T00:00:11.0000000Z"}},
        {"Payload": {"Make": "BATCH1", "VAL": "03", "TS": "2015-01-01T00:00:12.0000000Z"}},
        {"Payload": {"Make": "BATCH2", "VAL": "01", "TS": "2015-01-01T00:00:13.0000000Z"}},
        {"Payload": {"Make": "BATCH2", "VAL": "01", "TS": "2015-01-01T00:00:14.0000000Z"}},
        {"Payload": {"Make": "BATCH2", "VAL": "01", "TS": "2015-01-01T00:00:15.0000000Z"}},
        {"Payload": {"Make": "BATCH2", "VAL": "01", "TS": "2015-01-01T00:00:16.0000000Z"}},
        {"Payload": {"Make": "BATCH2", "VAL": "01", "TS": "2015-01-01T00:00:17.0000000Z"}},
        {"Payload": {"Make": "BATCH2", "VAL": "02", "TS": "2015-01-01T00:00:18.0000000Z"}},
        {"Payload": {"Make": "BATCH2", "VAL": "02", "TS": "2015-01-01T00:00:20.0000000Z"}},
        {"Payload": {"Make": "BATCH2", "VAL": "02", "TS": "2015-01-01T00:00:21.0000000Z"}},
        {"Payload": {"Make": "BATCH3", "VAL": "02", "TS": "2015-01-01T00:00:22.0000000Z"}},
        {"Payload": {"Make": "BATCH3", "VAL": "02", "TS": "2015-01-01T00:00:23.0000000Z"}},
        {"Payload": {"Make": "BATCH3", "VAL": "02", "TS": "2015-01-01T00:00:24.0000000Z"}},
        {"Payload": {"Make": "BATCH3", "VAL": "02", "TS": "2015-01-01T00:00:25.0000000Z"}},
        {"Payload": {"Make": "BATCH3", "VAL": "02", "TS": "2015-01-01T00:00:26.0000000Z"}},
        {"Payload": {"Make": "BATCH4", "VAL": "01", "TS": "2015-01-01T00:00:27.0000000Z"}},
        {"Payload": {"Make": "BATCH4", "VAL": "01", "TS": "2015-01-01T00:00:28.0000000Z"}},
        {"Payload": {"Make": "BATCH4", "VAL": "01", "TS": "2015-01-01T00:00:29.0000000Z"}},
        {"Payload": {"Make": "BATCH4", "VAL": "01", "TS": "2015-01-01T00:00:30.0000000Z"}},
        {"Payload": {"Make": "BATCH5", "VAL": "01", "TS": "2015-01-01T00:00:31.0000000Z"}}
    ]
1 Answer

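I tried to summarize your requirement as follows:

Sample input, where each batch ID (Make) can have several VAL changes within a one-minute window:

    Make:batch1,val:01
    Make:batch1,val:01
    Make:batch1,val:02
    Make:batch1,val:02
    xxxxxxxxxxxx
    Make:batch2,val:01
    Make:batch2,val:01
    xxxxxxxxxx

Desired output, one row per VAL change in each batch, with no duplicates:

    Make:batch1,val:01
    Make:batch1,val:02
    Make:batch2,val:01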

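The answer has two parts:

1. To collect data over a fixed period, you can use the built-in Tumbling Window feature, as shown below.

2. ASA has no built-in feature like DISTINCT for filtering out duplicate rows. I suggest using GROUP BY, MAX, and an ASA UDF to get close to the result you want.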

SQL:

    SELECT g.Payload.Make, g.Payload.VAL, MAX(udf.convertdate(g.Payload.TS)) AS TS
    FROM geoinput g TIMESTAMP BY g.Payload.TS
    GROUP BY g.Payload.Make, g.Payload.VAL, TumblingWindow(Duration(minute, 1))

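By the way, the UDF just uses the code below:

    function convertdate(datetime) {
        // Parse the timestamp and return it as epoch milliseconds.
        return new Date(datetime).getTime();
    }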
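
To make it easier to see what the query above computes, here is a rough plain-JavaScript simulation of grouping by Make and VAL inside one-minute tumbling windows and keeping the largest timestamp per group. It is only a sketch and not how ASA runs the query: `toEpochMs` and `maxTsPerGroup` are hypothetical names, and the window boundary rule (start exclusive, end inclusive, aligned to the epoch) is my understanding of ASA tumbling windows, written here as an assumption.

```javascript
// Hypothetical helper: parse an ISO timestamp to epoch milliseconds,
// trimming sub-millisecond digits so Date parsing stays predictable.
function toEpochMs(ts) {
  return new Date(ts.replace(/(\.\d{3})\d+/, "$1")).getTime();
}

// Group events by (window, Make, VAL) and keep the largest TS per group.
// Assumption: a window covers (end - windowMs, end], aligned to the epoch.
function maxTsPerGroup(events, windowMs = 60000) {
  const groups = new Map();
  for (const { Payload: p } of events) {
    const ts = toEpochMs(p.TS);
    const windowEnd = Math.ceil(ts / windowMs) * windowMs;
    const key = `${windowEnd}|${p.Make}|${p.VAL}`;
    const current = groups.get(key);
    if (current === undefined || ts > current.TS) {
      groups.set(key, { Make: p.Make, VAL: p.VAL, TS: ts, windowEnd });
    }
  }
  return [...groups.values()];
}

// A few rows taken from the payload in the question.
const sample = [
  { Payload: { Make: "BATCH1", VAL: "01", TS: "2015-01-01T00:00:01.0000000Z" } },
  { Payload: { Make: "BATCH1", VAL: "01", TS: "2015-01-01T00:00:02.0000000Z" } },
  { Payload: { Make: "BATCH1", VAL: "02", TS: "2015-01-01T00:00:03.0000000Z" } },
  { Payload: { Make: "BATCH2", VAL: "01", TS: "2015-01-01T00:00:13.0000000Z" } },
];
const rows = maxTsPerGroup(sample);
console.log(rows); // one row per (Make, VAL) pair in the window
```

Because the aggregate is MAX, each group reports the last timestamp seen in the window. If the first occurrence is what you want, as in the expected output in the question, swapping the comparison to keep the smallest value (MIN in the query) would be the analogous change.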

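As another workaround, you can collect all the data for 1 minute and then use an Azure Function as the output. In the Azure Function you can process the data however you need, for example by storing it in a JSON object, whose key-value structure filters out duplicate rows.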
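
A minimal sketch of that de-duplication step, assuming the function receives the one-minute batch as an array of events in timestamp order. `firstPerMakeVal` is a hypothetical name, and the wiring into an actual Azure Function handler is left out on purpose.

```javascript
// Keep only the first event seen for each (Make, VAL) pair.
// A plain object serves as the key-value store: a key that already
// exists is never overwritten, so later duplicates are dropped.
function firstPerMakeVal(events) {
  const seen = {};
  for (const { Payload: p } of events) {
    const key = `${p.Make}|${p.VAL}`;
    if (!(key in seen)) {
      seen[key] = { Make: p.Make, VAL: p.VAL, TS: p.TS };
    }
  }
  // Non-numeric string keys keep their insertion order.
  return Object.values(seen);
}

// A few rows taken from the payload in the question.
const batch = [
  { Payload: { Make: "BATCH1", VAL: "01", TS: "2015-01-01T00:00:01.0000000Z" } },
  { Payload: { Make: "BATCH1", VAL: "01", TS: "2015-01-01T00:00:02.0000000Z" } },
  { Payload: { Make: "BATCH1", VAL: "02", TS: "2015-01-01T00:00:03.0000000Z" } },
  { Payload: { Make: "BATCH1", VAL: "02", TS: "2015-01-01T00:00:04.0000000Z" } },
  { Payload: { Make: "BATCH2", VAL: "01", TS: "2015-01-01T00:00:13.0000000Z" } },
];
const distinct = firstPerMakeVal(batch);
console.log(distinct);
```

Keying on both Make and VAL means the same VAL under a new Make counts as a new row, which seems to be what the expected output in the question implies.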

Answered 2020-02-17T10:03:34.663