I have a DataBag in the following format: {([ChannelName#{ (bigXML,[])} ])}

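  • The DataBag contains only one tuple.
  • The tuple contains only a Map item.
  • The Map maps channel names to values.
  • The value here is of type DataBag, which contains only one tuple.
  • That tuple has two items: one is a chararray (a very big string) and the other is a map.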

I have a UDF that emits the bag above.

Now I need to call another UDF, passing it the only tuple in the DataBag stored under a given channel in the Map.

If there were no outer DataBag, i.e. the data were just ([ChannelName#{ (bigXML,[])} ]), I could access it using $0.$0#'StdOutChannel'.

Now, with the tuple inside a bag, {([ChannelName#{ (bigXML,[])} ])}, if I use $0.$0.$0#'StdOutChannel' (prepending $0), I get the following error:

ERROR 1052: Cannot cast bag with schema bag({bytearray}) to map

How do I access the data inside the DataBag?

1 Answer

Try to break this problem down a little.

Let's say you get your inner bag:

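MYBAG = FOREACH data GENERATE $0.$0#'StdOutChannel';  -- "data" stands for your relation

First, can you ILLUSTRATE or DUMP this?

What can you do with this bag? Usually FOREACH over the tuples inside.

-- Map values are untyped, so the cast assumes the types described in the question.
A = FOREACH MYBAG GENERATE
    FLATTEN((bag{tuple(chararray, map[])}) $0) AS (MyCharArray, MyMap);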

ILLUSTRATE A; -- or if this doesn't work
DUMP A;
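
For the nested case in the question, one thing that may be worth trying is to un-nest the outer bag with FLATTEN before doing the map lookup. Projecting a field out of a bag yields another bag rather than the field itself, which would explain ERROR 1052. The following is only a sketch: `data` is a placeholder for the relation whose first field is the outer bag, the other alias and field names are made up for illustration, and the cast assumes the types described in the question, so it may need adjusting for your schema and Pig version.

```pig
-- "data" is a placeholder relation whose first field is the outer bag
-- {([ChannelName#{ (bigXML,[])} ])}.

-- Un-nest the outer bag so that each row holds the map itself.
maps = FOREACH data GENERATE FLATTEN($0) AS channels;

-- Look up the channel. Map values are untyped, so cast to the expected bag.
channel_bags = FOREACH maps GENERATE
    (bag{tuple(chararray, map[])}) channels#'StdOutChannel' AS channel_bag;

-- Un-nest the inner bag to reach the (bigXML, map) tuple.
xml_rows = FOREACH channel_bags GENERATE
    FLATTEN(channel_bag) AS (big_xml, attrs);

DUMP xml_rows;
```

If this works on your data, `big_xml` and `attrs` can be passed to the second UDF in a further FOREACH ... GENERATE step.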

Can you try this interactively and then edit your question with more details based on what you find?

Some editing hints for StackOverflow:

  • put backticks around your code (`ILLUSTRATE`)
  • indent code blocks by 4 spaces on each line
answered 2011-02-10T08:36:14.860