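I am using an ingest pipeline script processor to extract the day of the week from each document's local time.

I use client_ip to look up the time zone, combine it with the timestamp to get the local time, and then extract the day of the week (and other features) from that local time.

This is my ingest pipeline: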

{
    "processors" : [
      {
        "set" : {
          "field" : "@timestamp",
          "override" : false,
          "value" : "{{_ingest.timestamp}}"
        }
      },
      {
        "date" : {
          "field" : "@timestamp",
          "formats" : [
            "EEE MMM dd HH:mm:ss 'UTC' yyyy"
          ],
          "ignore_failure" : true,
          "target_field" : "@timestamp"
        }
      },
      {
        "convert" : {
          "field" : "client_ip",
          "type" : "ip",
          "ignore_failure" : true,
          "ignore_missing" : true
        }
      },
      {
        "geoip" : {
          "field" : "client_ip",
          "target_field" : "client_geo",
          "properties" : [
            "continent_name",
            "country_name",
            "country_iso_code",
            "region_iso_code",
            "region_name",
            "city_name",
            "location",
            "timezone"
          ],
          "ignore_failure" : true,
          "ignore_missing" : true
        }
      },
      {
        "script" : {
          "description" : "Extract details of Dates",
          "lang" : "painless",
          "ignore_failure" : true,
          "source" : """
            LocalDateTime local_time = LocalDateTime.ofInstant( Instant.ofEpochMilli(ctx['@timestamp']), ZoneId.of(ctx['client_geo.timezone']));
            int day_of_week = local_time.getDayOfWeek().getValue();
            int hour_of_day = local_time.getHour();
            int office_hours = 0;
            if (day_of_week<6 && day_of_week>0) { if (hour_of_day >= 7 && hour_of_day <= 19 ) {office_hours =1;}  else {office_hours = -1;}} else {office_hours = -1;}
            ctx['day_of_week'] = day_of_week;
            ctx['hour_of_day'] = hour_of_day;
            ctx['office_hours'] = office_hours;
          """
        }
      }
    ]
}

The first two processors were added for other purposes. I added the last three.

A sample document might look like this:

  "docs": [
    {
      "_source": {
        "@timestamp": 43109942361111,
        "client_ip": "89.160.20.128"
      }
    }
  ]

I now get the GeoIP fields in my data, but none of the fields created by the script processor. What am I doing wrong?

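EDIT: Some notes about the index affected by these changes: dynamic mapping is turned off, and I have manually added the client_geo.timezone field to the index mapping as a keyword. When I run the following scripted search against the index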

GET index_name/_search
{
 "script_fields": {
  "day_of_week": {
    "script": "doc['@timestamp'].value.withZoneSameInstant(ZoneId.of(doc['client_geo']['timezone'])).getDayOfWeek().getValue()"
  }
 }
}

I get the following runtime error during script execution:

          "caused_by" : {
            "type" : "illegal_argument_exception",
            "reason" : "No field found for [client_geo] in mapping"
          }
1 Answer

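Thanks for the well-formatted question and example.

I was able to reproduce your problem and figured it out.

ctx is "the document source as-is", so the ingest script does not automatically dig into dot-separated field names.

Your client data is added like this: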

"client_geo" : {
   "continent_name" : "Europe"
   //<snip>..</snip>
}

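So you have to access it directly as a nested hash map.

That means ctx['client_geo.timezone'] should actually be ctx['client_geo']['timezone']. The dotted form looks for a top-level key literally named client_geo.timezone, which does not exist, so it evaluates to null, and ZoneId.of then throws. Because the script processor has ignore_failure set to true, that error is swallowed and the document is indexed without the script's fields.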
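To make the difference concrete, here is a minimal sketch that uses a plain Python dict as a stand-in for the Painless ctx map (Python is used only so the snippet runs anywhere; it is not what the pipeline executes). The timezone value is an illustrative assumption, not something taken from the original document.

```python
# A plain dict standing in for ctx after the geoip processor has run.
ctx = {
    "@timestamp": 43109942361111,
    "client_ip": "89.160.20.128",
    "client_geo": {
        "continent_name": "Europe",
        "timezone": "Europe/Stockholm",  # illustrative value (assumption)
    },
}

# A dotted name is treated as one literal top-level key, which does not exist.
dotted = ctx.get("client_geo.timezone")

# Walking the nested map one level at a time finds the value.
nested = ctx["client_geo"]["timezone"]

print(dotted)
print(nested)
```

The first lookup yields nothing and the second yields the time zone name, which mirrors why only the ctx['client_geo']['timezone'] form works in the script.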

Here is the complete pipeline that worked for me:

"processors": [
      {
        "set": {
          "field": "@timestamp",
          "override": false,
          "value": "{{_ingest.timestamp}}"
        }
      },
      {
        "date": {
          "field": "@timestamp",
          "formats": [
            "EEE MMM dd HH:mm:ss 'UTC' yyyy"
          ],
          "ignore_failure": true,
          "target_field": "@timestamp"
        }
      },
      {
        "convert": {
          "field": "client_ip",
          "type": "ip",
          "ignore_failure": true,
          "ignore_missing": true
        }
      },
      {
        "geoip": {
          "field": "client_ip",
          "target_field": "client_geo",
          "properties": [
            "continent_name",
            "country_name",
            "country_iso_code",
            "region_iso_code",
            "region_name",
            "city_name",
            "location",
            "timezone"
          ],
          "ignore_failure": true,
          "ignore_missing": true
        }
      },
      {
        "script": {
          "description": "Extract details of Dates",
          "lang": "painless",
          "ignore_failure": true,
          "source": """
            LocalDateTime local_time = LocalDateTime.ofInstant(Instant.ofEpochMilli(ctx['@timestamp']), ZoneId.of(ctx['client_geo']['timezone']));
            int day_of_week = local_time.getDayOfWeek().getValue();
            int hour_of_day = local_time.getHour();
            int office_hours = 0;
            if (day_of_week<6 && day_of_week>0) { if (hour_of_day >= 7 && hour_of_day <= 19 ) {office_hours =1;}  else {office_hours = -1;}} else {office_hours = -1;}
            ctx['day_of_week'] = day_of_week;
            ctx['hour_of_day'] = hour_of_day;
            ctx['office_hours'] = office_hours;
          """
        }
      }
    ]
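If you want to sanity-check the values the script should produce without going through Elasticsearch, the same date logic can be sketched in Python. This is only an approximation of the Painless script under stated assumptions: date_features is a hypothetical helper name, the timestamp is taken to be epoch milliseconds, and the time zone is an IANA name such as the one the geoip processor writes. It needs Python 3.9+ and a time zone database on the system.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def date_features(epoch_millis: int, tz_name: str) -> dict:
    """Hypothetical helper mirroring the Painless script's three fields."""
    # Integer division avoids float rounding on large millisecond values.
    utc = datetime.fromtimestamp(epoch_millis // 1000, tz=timezone.utc)
    local = utc.astimezone(ZoneInfo(tz_name))

    # isoweekday(): 1 = Monday ... 7 = Sunday, same as getDayOfWeek().getValue().
    day_of_week = local.isoweekday()
    hour_of_day = local.hour

    # Weekday between 07:00 and 19:59 local time counts as office hours.
    office_hours = 1 if day_of_week < 6 and 7 <= hour_of_day <= 19 else -1

    return {
        "day_of_week": day_of_week,
        "hour_of_day": hour_of_day,
        "office_hours": office_hours,
    }


# Tuesday 2021-11-09 12:00 UTC is 13:00 in Stockholm (UTC+1 in November).
ts = int(datetime(2021, 11, 9, 12, 0, tzinfo=timezone.utc).timestamp() * 1000)
print(date_features(ts, "Europe/Stockholm"))
# {'day_of_week': 2, 'hour_of_day': 13, 'office_hours': 1}
```

Comparing these numbers with the day_of_week, hour_of_day and office_hours fields on an indexed document is a quick way to confirm the script processor is actually running.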
Answered 2021-11-09T12:33:48.193