I am using Beam SQL and trying to convert an integer into a DateTime field.

  Schema resultSchema =
    Schema.builder()
          .addInt64Field("detectedCount")
          .addStringField("sensor")
          .addInt64Field("timestamp")
          .build();

  PCollection<Row> sensorRawUnboundedTimestampedSubset = 
    sensorRowUnbounded.apply(
        SqlTransform.query(
          "select PCOLLECTION.payload.`value`.`count` detectedCount, \n"
          + "PCOLLECTION.payload.`value`.`id` sensor, \n"
          + "PCOLLECTION.`timestamp` `timestamp` \n"
          + "from PCOLLECTION "))
    .setRowSchema(resultSchema);

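For some computations and windowing, I want to cast/convert the `timestamp` field to DateTime. Please provide some pointers on converting `timestamp` in `resultSchema` to the DateTime data type.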

1 Answer

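There is no out-of-the-box way to do this in Beam (or Calcite). Short version: neither Calcite nor Beam can know how you actually store a date or timestamp in an integer. However, assuming you have epoch millis, this should work: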

@Test
public void testBlah() throws Exception {
  // input schema, has timestamps as epoch millis
  Schema schema = Schema.builder().addInt64Field("ts").addStringField("st").build();

  DateTime ts1 = new DateTime(2019, 8, 9, 10, 11, 12);
  DateTime ts2 = new DateTime(2019, 8, 9, 10, 11, 12);

  PCollection<Row> input =
    pipeline
      .apply(
          "createRows",
          Create.of(
              Row.withSchema(schema).addValues(ts1.getMillis(), "two").build(),
              Row.withSchema(schema).addValues(ts2.getMillis(), "twelve").build()))
      .setRowSchema(schema);

  PCollection<Row> result =
    input.apply(
      SqlTransform.query(
          "SELECT \n"
          + "(TIMESTAMP '1970-01-01 00:00:00' + ts * INTERVAL '0.001' SECOND) as ts, \n"
          + "st \n"
          + "FROM \n"
          + "PCOLLECTION"));

  // output schema, has timestamps as DateTime
  Schema outSchema = Schema.builder().addDateTimeField("ts").addStringField("st").build();
  PAssert.that(result)
    .containsInAnyOrder(
        Row.withSchema(outSchema).addValues(ts1, "two").build(),
        Row.withSchema(outSchema).addValues(ts2, "twelve").build());
  pipeline.run();
}

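Alternatively, you can always do this in Java rather than in SQL, by applying a custom ParDo to the output of the SqlTransform. In that ParDo, extract the integer timestamp from the Row object, convert it to a DateTime, and then emit it, e.g. as part of another Row with a different schema.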
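As a rough illustration of the per-element conversion such a ParDo would perform, here is a minimal sketch. It is not Beam code: `EpochTimestamps` and its methods are hypothetical names, and it uses `java.time.Instant` instead of Joda-Time's `DateTime` purely so that it runs without extra dependencies. It also shows why the unit has to be stated explicitly: the same integer read as seconds instead of milliseconds gives a completely different instant.

```java
import java.time.Instant;

// Hypothetical helper (not part of Beam): the conversion a custom DoFn
// would apply to the integer timestamp of each Row.
final class EpochTimestamps {
  private EpochTimestamps() {}

  // Interprets the value as milliseconds since 1970-01-01T00:00:00Z.
  static Instant fromEpochMillis(long millis) {
    return Instant.ofEpochMilli(millis);
  }

  // Interprets the same value as seconds since the epoch instead.
  static Instant fromEpochSeconds(long seconds) {
    return Instant.ofEpochSecond(seconds);
  }

  public static void main(String[] args) {
    long raw = 1_565_345_472_000L; // assumed to be epoch millis

    // Same integer, two interpretations, two very different instants.
    System.out.println("as millis:  " + fromEpochMillis(raw));
    System.out.println("as seconds: " + fromEpochSeconds(raw));
  }
}
```

Inside an actual DoFn you would presumably read the field with `row.getInt64("timestamp")`, convert it in the same way to a Joda-Time value, and build the output Row against a schema that declares the field with `addDateTimeField`.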

answered 2019-07-19T22:16:32.827