json - Combining JSON with different key names

Question

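Edit: I'll try to simplify my question and cut the JSON examples down to the relevant elements.

I'm building a playbook in Ansible, and one of the tasks I'm trying to perform involves pulling data from 4 separate Qradar API endpoints and combining some of the details from each.

There are 4 different JSON sources, one from each endpoint:

"regex_properties.json": has a unique "identifier"; I need to reference its "name" and "property_type" values.
"log_source_types.json": has a unique "id" field; I need to reference its "name".
"log_sources.json": has a unique "id" field, and may contain a "type_id" field if it is part of a log_source_type grouping (matching the "id" above). I need its "name" field, and possibly "last_event_time" for filtering (though I can get by without it).
"property_expressions.json": has a unique "identifier" field. It also has the "log_source_type_id" and/or "log_source_id" that each "regex_property_identifier" maps to. These values map to the unique identifiers in the other sources.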

Examples from the lab:

regex_properties.json
[
  {
    "identifier": "59723052-d96c-4cef-ba7b-69d426602e04",
    "property_type": "numeric",
    "name": "indexTotalSize",
  }
]

log_sources.json
[
  {
    "id": 64,
    "name": "SIM Audit-2 :: eng-qradar-aio-01",
    "type_id": 105,
    "last_event_time": 1588628234930,
  }
]

log_source_types.json
[
  {
    "name": "SIM Audit",
    "id": 105
  },
]

property_expressions.json
[
  {
    "identifier": "0311c65b-d5b5-483e-943f-b539543a8e95",
    "log_source_type_id": 105,
    "log_source_id": 65,
    "regex_property_identifier": "59723052-d96c-4cef-ba7b-69d426602e04",
  }
]

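I would like to pull these 4 sources and output a file containing the following data, linked by property_expressions.json:

"name" and "property_type" from regex_properties.json (with "name" renamed to regex_name or similar)
"name" from log_sources.json and log_source_types.json (renamed to ls_name and lst_name respectively)

For example: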

merged_example.json
[
  {
    "identifier": "0311c65b-d5b5-483e-943f-b539543a8e95",
    "log_source_type_id": 105,
    "log_source_id": 65,
    "regex_property_identifier": "59723052-d96c-4cef-ba7b-69d426602e04",
    "property_type": "numeric",
    "regex_name": "indexTotalSize",
    "lst_name": "SIM Audit",
    "ls_name": "SIM Audit-2 :: eng-qradar-aio-01",
  }
]

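Or into a CSV with the same data; that is the end goal of the export, but it can wait.

I tried renaming "identifier" to "regex_property_identifier" in regex_properties.json and then running "jq -s regex_properties.json property_expressions.json", but I still just see both contents as separate arrays in the same output/file.

I also tried using Ansible and doing the following: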

  - name: use JQ to reformat json to csv
    shell: cat /tmp/property_expressions.json | jq -r '.[]' | jq 'select(.enabled == true)' | jq '[.identifier,.regex_property_identifier,.log_source_id,.log_source_type_id] | @csv' > /tmp/props.csv

  - name: Read CSV into dictionary
    read_csv:
      path: "/tmp/props.csv"
      fieldnames: "prop_id,regex_id,ls_id,lst_id"
      delimiter: ","
    register: props

  - name: Loop Prop Dictionary and replace in CSV the regex_id
    replace:
      path: "/tmp/props.csv"
      regexp: "{{ item.regex_id }}"
      replace: "{{ regex_properties.json | json_query(regex_name_q) }},{{ regex_properties.json | json_query(regex_type_q) }}"
    loop: "{{ props.list }}"
    vars:
      regex_name_q: "{{ item.regex_id }}.name"
      regex_type_q: "{{ item.regex_id }}.property_type"

The idea was to just make a CSV and find/replace the terms item by item. But it would be much cleaner if I could do this within the JSON arrays.

score 0 · Accepted Answer

Modulo the minor errors in the JSON examples in the question, the following bash script produces the output shown:

#!/bin/bash

jq -n \
 --argfile lst log_source_types.json \
 --argfile ls  log_sources.json \
 --argfile pe  property_expressions.json \
 --argfile rp  regex_properties.json '
  [ range(0; $pe|length) as $i
    | {identifier: $pe[$i].identifier,
       log_source_type_id: $lst[$i].id,
       log_source_id: $pe[$i].log_source_id,
       regex_property_identifier: $pe[$i].regex_property_identifier,
       property_type: $rp[$i].property_type,
       regex_name: $rp[$i].name,
       lst_name: $lst[$i].name,
       ls_name: $ls[$i].name
     }
  ]
'
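The script above pairs the four arrays by position, which works for the one-element samples but relies on every file listing its entries in the same order. If that cannot be assumed, a key-based lookup may be safer. The following is only a sketch under some assumptions: jq 1.6 or later (for INDEX and --slurpfile), sample data taken from the question with the trailing commas removed, and the "id" in log_sources.json set to 65 so that it matches "log_source_id" in property_expressions.json. The names $lstById, $lsById and $rpById are made up for illustration.

```shell
#!/bin/bash
# Sketch: join on the identifiers instead of on array position.
# Assumes jq 1.6+ and valid JSON input files.
work=$(mktemp -d)

# Sample data from the question, trimmed to valid JSON.
# Assumption: log_sources.json uses id 65 to match log_source_id below.
printf '%s\n' '[{"name": "SIM Audit", "id": 105}]' > "$work/log_source_types.json"
printf '%s\n' '[{"id": 65, "name": "SIM Audit-2 :: eng-qradar-aio-01", "type_id": 105}]' > "$work/log_sources.json"
printf '%s\n' '[{"identifier": "0311c65b-d5b5-483e-943f-b539543a8e95", "log_source_type_id": 105, "log_source_id": 65, "regex_property_identifier": "59723052-d96c-4cef-ba7b-69d426602e04"}]' > "$work/property_expressions.json"
printf '%s\n' '[{"identifier": "59723052-d96c-4cef-ba7b-69d426602e04", "property_type": "numeric", "name": "indexTotalSize"}]' > "$work/regex_properties.json"

merged=$(jq -n \
  --slurpfile lst "$work/log_source_types.json" \
  --slurpfile ls  "$work/log_sources.json" \
  --slurpfile pe  "$work/property_expressions.json" \
  --slurpfile rp  "$work/regex_properties.json" '
  # --slurpfile wraps each file in an extra array, hence the [0].
  # INDEX builds an object keyed by the stringified id or identifier.
  ($lst[0] | INDEX(.id)) as $lstById
  | ($ls[0] | INDEX(.id)) as $lsById
  | ($rp[0] | INDEX(.identifier)) as $rpById
  | [ $pe[0][]
      | $rpById[.regex_property_identifier] as $r
      | . + {
          property_type: $r.property_type,
          regex_name:    $r.name,
          lst_name:      $lstById[.log_source_type_id | tostring].name,
          ls_name:       $lsById[.log_source_id | tostring].name
        }
    ]
')

printf '%s\n' "$merged"
```

Entries whose ids have no match simply get null for the corresponding name. For the CSV that the question mentions as the end goal, piping the result through jq -r '.[] | [.identifier, .regex_name, .property_type, .lst_name, .ls_name] | @csv' should be enough.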

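Note: I wouldn't worry too much that --argfile has been officially deprecated, but if that bothers you, there are plenty of workarounds, though some are version-dependent. If you want a non-deprecated solution that works with every version of jq, I'd use the form: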

jq -s -f program.jq \
 log_source_types.json \
 log_sources.json \
 property_expressions.json \
 regex_properties.json

where program.jq begins by defining the four $-variables, starting with: .[0] as $lst |
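For completeness, here is one way the whole of program.jq might look. This is a sketch, not a tested production script: it assumes the four files are passed in exactly the order shown in the command above, reuses the positional filter from the bash script, and runs against the question's sample data trimmed to valid JSON.

```shell
#!/bin/bash
# Sketch of the slurp-based form: jq -s reads all files into one outer array.
work=$(mktemp -d)

# Sample data from the question, with the trailing commas removed (assumption).
printf '%s\n' '[{"name": "SIM Audit", "id": 105}]' > "$work/log_source_types.json"
printf '%s\n' '[{"id": 64, "name": "SIM Audit-2 :: eng-qradar-aio-01", "type_id": 105}]' > "$work/log_sources.json"
printf '%s\n' '[{"identifier": "0311c65b-d5b5-483e-943f-b539543a8e95", "log_source_type_id": 105, "log_source_id": 65, "regex_property_identifier": "59723052-d96c-4cef-ba7b-69d426602e04"}]' > "$work/property_expressions.json"
printf '%s\n' '[{"identifier": "59723052-d96c-4cef-ba7b-69d426602e04", "property_type": "numeric", "name": "indexTotalSize"}]' > "$work/regex_properties.json"

cat > "$work/program.jq" <<'EOF'
# With -s the input is a single array holding the four files
# in the order they were given on the command line.
.[0] as $lst
| .[1] as $ls
| .[2] as $pe
| .[3] as $rp
| [ range(0; $pe|length) as $i
    | { identifier: $pe[$i].identifier,
        log_source_type_id: $lst[$i].id,
        log_source_id: $pe[$i].log_source_id,
        regex_property_identifier: $pe[$i].regex_property_identifier,
        property_type: $rp[$i].property_type,
        regex_name: $rp[$i].name,
        lst_name: $lst[$i].name,
        ls_name: $ls[$i].name
      }
  ]
EOF

slurped=$(cd "$work" && jq -s -f program.jq \
  log_source_types.json \
  log_sources.json \
  property_expressions.json \
  regex_properties.json)

printf '%s\n' "$slurped"
```

Because the variables are bound by position in the slurped array, reordering the file arguments silently changes which file each variable refers to, so the order in the command and in program.jq has to be kept in sync.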
