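I am joining 2 tables that each have hundreds of similarly named columns. I want to rename every column in each table so that it includes the table name. To keep the query simple, I don't want to list each column name explicitly. The query below achieves this, but it is very slow when applied to large datasets. I assume the slowdown comes from replace_regex() running over the entire dataset. Is there another way to get the same result with better performance on larger datasets?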

let T1 = datatable (Key:string , Col2:string , Col3:string )
[
  "1", "b", "c",
  "2", "e", "f",
  "3", "h", "i"
] 
| project PackedRecord = todynamic(replace_regex(tostring(pack_all()), '"([a-zA-Z0-9_]*)":"', @'"T1_\1":"'))
| evaluate bag_unpack(PackedRecord);
let T2 = datatable (Key:string , Col2:string , Col3:string )
[
  "1", "B", "C",
  "2", "E", "F",
  "4", "H", "I"
] 
| project PackedRecord = todynamic(replace_regex(tostring(pack_all()), '"([a-zA-Z0-9_]*)":"', @'"T2_\1":"'))
| evaluate bag_unpack(PackedRecord);
let JoinTable = T1 | join kind=inner T2 on $left.T1_Key == $right.T2_Key;
JoinTable

Previous question, for reference:

Rename all column names by adding a string in KQL/Kusto/Data Explorer

1 Answer

You can get the same result without replace_regex() by relying on the OutputColumnPrefix parameter when calling bag_unpack. I modified a few lines of the original KQL snippet.

According to the Kusto documentation, the OutputColumnPrefix parameter lets you pass a common prefix that is added to every column the plugin produces.
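To see the prefix behavior in isolation, here is a minimal sketch modeled on the kind of example shown in the Kusto documentation. The sample table, the column name `d`, and the `Property_` prefix are illustrative assumptions, and the prefix is passed positionally as the second argument of bag_unpack.

```kusto
// Illustrative data only: one dynamic column holding a property bag per row.
datatable(d:dynamic)
[
    dynamic({"Name": "John", "Age": 20}),
    dynamic({"Name": "Dave", "Age": 40}),
    dynamic({"Name": "Jasmine", "Age": 30}),
]
// The second argument is the common prefix applied to every unpacked column.
| evaluate bag_unpack(d, 'Property_')
```

Each unpacked column should then carry the prefix (Property_Name and Property_Age rather than Name and Age), with no regex pass over the serialized rows.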

let T1 = datatable (Key:string , Col2:string , Col3:string )
[
  "1", "b", "c",
  "2", "e", "f",
  "3", "h", "i"
]
| project Key, PackedRecord = pack_all()
| evaluate bag_unpack(PackedRecord, OutputColumnPrefix = "T1_") | project-away T1_Key; // get rid of additional key;
let T2 = datatable (Key:string , Col2:string , Col3:string )
[
  "1", "B", "C",
  "2", "E", "F",
  "4", "H", "I"
]
| project Key, PackedRecord = pack_all()
| evaluate bag_unpack(PackedRecord, OutputColumnPrefix = "T2_") | project-away T2_Key; // get rid of additional key;
T1 
| join kind=inner T2 on $left.Key == $right.Key | project-away Key1 // get rid of second key after join
Answered 2021-12-06T23:49:35.653