Take this schema (of a DataFrame read from JSON) as an example:

    root
     |-- abstract: string (nullable = true)
     |-- adx_keywords: string (nullable = true)
     |-- asset_id: long (nullable = true)
     |-- byline: string (nullable = true)
     |-- column: string (nullable = true)
     |-- des_facet: array (nullable = true)
     |    |-- element: string (containsNull = true)
     |-- eta_id: long (nullable = true)
     |-- geo_facet: array (nullable = true)
     |    |-- element: string (containsNull = true)
     |-- id: long (nullable = true)
     |-- media: array (nullable = true)
     |    |-- element: struct (containsNull = true)
     |    |    |-- approved_for_syndication: long (nullable = true)
     |    |    |-- caption: string (nullable = true)
     |    |    |-- copyright: string (nullable = true)
     |    |    |-- media-metadata: array (nullable = true)
     |    |    |    |-- element: struct (containsNull = true)
     |    |    |    |    |-- format: string (nullable = true)
     |    |    |    |    |-- height: long (nullable = true)
     |    |    |    |    |-- url: string (nullable = true)
     |    |    |    |    |-- width: long (nullable = true)
     |    |    |-- subtype: string (nullable = true)
     |    |    |-- type: string (nullable = true)
     |-- nytdsection: string (nullable = true)
     |-- org_facet: array (nullable = true)
     |    |-- element: string (containsNull = true)
     |-- per_facet: array (nullable = true)
     |    |-- element: string (containsNull = true)
     |-- published_date: string (nullable = true)
     |-- section: string (nullable = true)

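Here I want to turn each array-type column into its own DataFrame. I was able to split out the first-level arrays, such as des_facet, geo_facet, and media, into separate DataFrames, but I cannot do the same for the second level, for example media-metadata (an array type) nested inside media. A way to extract a nested array into its own DataFrame would be helpful.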

Thanks in advance.
