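I absolutely love the ADX time series capabilities, having processed large volumes of sensor data with Python. Here are the requirements for my use case:
- Handle sensor data tags that arrive at different frequencies: bring them all to a 1-second frequency (if a tag is in milliseconds, aggregate it over 1-second intervals).
- Convert the stacked data to unstacked data.
- After the unstack, join another dataset that has multiple "string labels" by timestamp.
- Linearly interpolate some columns and forward-fill the others (about 10-12 in total).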
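I think I have covered the first three with the query below, but I cannot use `series_fill_linear` directly on the columns. The documentation says this function needs a `dynamic` type as input. The error message is helpful:
series_fill_linear(): argument #1 was not of an expected data type: dynamic
Is it possible to apply `series_fill_linear` at the point where I already use `pack`, rather than using `pack` again? How can I apply this function selectively by tag, and make my overall query more readable? Note that only the `sensor_data` table needs `series_fill_linear` and `series_fill_forward`; `label_data` only needs `series_fill_forward`.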
sensor_data
| where timestamp > datetime(2020-11-24 00:59:59) and timestamp < datetime(2020-11-24 12:00:00)
| where device_number == 'PRESSURE_599'
| where tag_name in ("tag1", "tag2", "tag3", "tag4")
| make-series agg_value = avg(value) default = double(null) on timestamp in range (datetime(2020-11-24 00:59:59), datetime(2020-11-24 12:00:00), 1s) by tag_name
| extend series_fill_linear(agg_value, double(null), false) //EDIT
| mv-expand timestamp to typeof(datetime), agg_value to typeof(double)
| summarize b = make_bag(pack(tag_name, agg_value)) by timestamp
| evaluate bag_unpack(b)
| join kind = leftouter (
    label_data
    | where timestamp > datetime(2020-11-24 00:58:59) and timestamp < datetime(2020-11-24 12:00:01)
    | where device_number == 'PRESSURE_599'
    | where tag != "PRESSURE_599_label_Raw"
    | summarize x = make_bag(pack(tag, value)) by timestamp
    | evaluate bag_unpack(x)
    ) on timestamp
| project timestamp,
    MY_LINEAR_COL_1 = series_fill_linear(tag1, double(null), false),
    MY_LINEAR_COL_2 = series_fill_forward(tag2),
    MY_LABEL_1 = series_fill_forward(PRESSURE_599_label_level1),
    MY_LABEL_2 = series_fill_forward(PRESSURE_599_label_level2)
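For context on the error: after `mv-expand` and `bag_unpack`, a column such as `tag1` holds one scalar per row, whereas the `series_fill_*` functions operate on a `dynamic` array. The following is a minimal sketch, not a definitive fix, of packing a scalar column back into an array before filling it. The three-row `datatable` is made-up illustration data, and the use of `make_list_with_nulls` (so that gaps stay in the array) is an assumption about what fits this case.

```kusto
// Hypothetical sample: one scalar column with a gap at 01:00:01.
datatable(timestamp: datetime, tag1: double)
[
    datetime(2020-11-24T01:00:00Z), 1.0,
    datetime(2020-11-24T01:00:01Z), double(null),
    datetime(2020-11-24T01:00:02Z), 3.0
]
| order by timestamp asc
// Pack both columns into dynamic arrays; make_list() would drop the null.
| summarize timestamp = make_list(timestamp), tag1 = make_list_with_nulls(tag1)
// Now the argument is dynamic, which is what series_fill_linear expects.
| extend tag1 = series_fill_linear(tag1, double(null), false)
// Expand the arrays back into one row per timestamp.
| mv-expand timestamp to typeof(datetime), tag1 to typeof(double)
```

Filling while the data is still in the `make-series` output, before `mv-expand`, avoids this extra pack and unpack round trip.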
Edit: I ended up using `extend` with `case` to handle the different interpolation cases.
// let forward_tags = dynamic({"tags": ["tag2","tag4"]}); could not use this as "forward_tags.tags" in the query
sensor_data
| where timestamp > datetime(2020-11-24 00:59:59) and timestamp < datetime(2020-11-24 12:00:00)
| where device_number == "PRESSURE_599"
| where tag_name in ("tag1", "tag2", "tag3", "tag4") // use a variable here instead?
| make-series agg_value = avg(value)
    default = double(null)
    on timestamp
    in range (datetime(2020-11-24 00:59:59), datetime(2020-11-24 12:00:00), 1s)
    by tag_name
| extend agg_value = case(tag_name in ("tag2", "tag3"), // use a variable here instead?
    series_fill_forward(agg_value, double(null)),
    series_fill_linear(agg_value, double(null), false)
    )
| mv-expand timestamp to typeof(datetime), agg_value to typeof(double)
| summarize b = make_bag(pack(tag_name, agg_value)) by timestamp
| evaluate bag_unpack(b)
| join kind = leftouter (
    label_data // don't want to use make-series here; it would generate unnecessary data since this is already in 'ss' format.
    | where timestamp > datetime(2020-11-24 00:58:59) and timestamp < datetime(2020-11-24 12:00:01)
    | where tag != "PRESSURE_599_label_Raw"
    | summarize x = make_bag(pack(tag, value)) by timestamp
    | evaluate bag_unpack(x)
    )
    on timestamp
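I would like to know whether `KQL` lets me pass a `list of strings` for use inside a query/function. In the query above, I have added comments at the places where I think a `list of strings` could be passed to make the code more readable.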
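One option that may cover this is a `let` binding holding a plain `dynamic` array (rather than a property bag), since the `in` operator accepts a dynamic array. The sketch below is self-contained and uses made-up rows; the names `all_tags` and `forward_tags` and the sample values are assumptions for illustration, and whether `in` with an array variable behaves the same inside `case` on a given cluster is worth verifying.

```kusto
// Hypothetical tag lists; a flat array instead of {"tags": [...]}.
let all_tags = dynamic(["tag1", "tag2", "tag3", "tag4"]);
let forward_tags = dynamic(["tag2", "tag3"]);
datatable(timestamp: datetime, tag_name: string, value: double)
[
    datetime(2020-11-24T01:00:00Z), "tag1", 1.0,
    datetime(2020-11-24T01:00:02Z), "tag1", 3.0,
    datetime(2020-11-24T01:00:00Z), "tag2", 5.0,
    datetime(2020-11-24T01:00:02Z), "tag2", 7.0,
    datetime(2020-11-24T01:00:00Z), "tag9", 9.0
]
| where tag_name in (all_tags)           // drops the tag9 row
| make-series agg_value = avg(value) default = double(null)
    on timestamp from datetime(2020-11-24T01:00:00Z) to datetime(2020-11-24T01:00:03Z) step 1s
    by tag_name
| extend agg_value = case(tag_name in (forward_tags),
    series_fill_forward(agg_value),      // tags listed in forward_tags
    series_fill_linear(agg_value))       // every other tag
| mv-expand timestamp to typeof(datetime), agg_value to typeof(double)
```

Keeping both lists at the top of the query means the tag selection and the fill strategy are each changed in one place.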
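For now, I only need to `fill_forward` the label columns (`MY_LABEL_1`, `MY_LABEL_2`) that the main query produces. I would like to add that step to the main query so that the final result is a table containing all columns; here is a sample table based on the results for my case.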
datatable (timestamp: datetime, tag1: double, tag2: double, tag3: double, tag4: double, MY_LABEL_1: string, MY_LABEL_2: string)
[
    datetime(2020-11-24T00:01:00Z), 1, 3, 6, 9, "x", "foo",
    datetime(2020-11-24T00:01:01Z), 1, 3, 6, 9, "", "",
    datetime(2020-11-24T00:01:02Z), 1, 3, 6, 9, "", "",
    datetime(2020-11-24T00:01:03Z), 1, 3, 6, 9, "y", "bar",
    datetime(2020-11-24T00:01:04Z), 1, 3, 6, 9, "", "",
    datetime(2020-11-24T00:01:05Z), 1, 3, 6, 9, "", ""
]
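Because the label columns here are strings spread over rows rather than a `dynamic` array, one possible way to forward-fill them without repacking is the `scan` operator. This is a sketch under assumptions: it presumes `scan` is available on the cluster, it treats an empty string as "missing", and the helper column names `LABEL_1_filled` and `LABEL_2_filled` are hypothetical. It swaps in `scan` for `series_fill_forward`, so it is an alternative technique rather than the approach used in the queries above.

```kusto
// Trimmed copy of the sample table (one numeric column kept for brevity).
datatable(timestamp: datetime, tag1: double, MY_LABEL_1: string, MY_LABEL_2: string)
[
    datetime(2020-11-24T00:01:00Z), 1.0, "x", "foo",
    datetime(2020-11-24T00:01:01Z), 1.0, "", "",
    datetime(2020-11-24T00:01:02Z), 1.0, "", "",
    datetime(2020-11-24T00:01:03Z), 1.0, "y", "bar",
    datetime(2020-11-24T00:01:04Z), 1.0, "", ""
]
| sort by timestamp asc                  // scan walks the rows in this order
| scan declare (LABEL_1_filled: string = "", LABEL_2_filled: string = "") with
(
    // Carry the previous filled value forward whenever the label is empty.
    step s: true =>
        LABEL_1_filled = iff(isempty(MY_LABEL_1), s.LABEL_1_filled, MY_LABEL_1),
        LABEL_2_filled = iff(isempty(MY_LABEL_2), s.LABEL_2_filled, MY_LABEL_2);
)
| project timestamp, tag1, MY_LABEL_1 = LABEL_1_filled, MY_LABEL_2 = LABEL_2_filled
```

Appended after the `join` in the main query, a step like this would leave the numeric columns untouched and only rewrite the label columns.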