Running the query below on Spark 3.0.2 raises `Exception in thread "main" java.lang.AssertionError: assertion failed: Found duplicate rewrite attributes`. The same query works on Spark 2.4.3.
SELECT
COALESCE(view_1_alias.name, view_2.name) AS name,
COALESCE(view_1_alias.id, view_2.id) AS id,
COALESCE(view_1_alias.second_id, view_2.second_id) AS second_id,
COALESCE(view_1_alias.local_timestamp, view_2.local_timestamp) AS local_timestamp,
COALESCE(view_1_alias.utc_timestamp, view_2.utc_timestamp) AS utc_timestamp,
view_1_alias.alias_1_column_1,
view_1_alias.alias_1_column_2,
view_1_alias.alias_1_column_3,
view_1_alias.alias_1_column_4,
view_1_alias.alias_1_column_5,
view_1_alias.alias_1_column_6,
view_1_alias.alias_1_column_7,
view_2.alias_2_coumn_1
FROM
view_1 view_1_alias FULL OUTER JOIN view_2
ON
view_1_alias.name = view_2.name AND
view_1_alias.id = view_2.id AND
view_1_alias.second_id = view_2.second_id AND
view_1_alias.local_timestamp = view_2.local_timestamp;
Header of view_1:
| name | id | second_id | local_timestamp | utc_timestamp | alias_1_column_1 | alias_1_column_2 | alias_1_column_3 | alias_1_column_4 | alias_1_column_5 | alias_1_column_6 | alias_1_column_7 |
Header of view_2:
| name | id | second_id | local_timestamp | utc_timestamp | alias_2_coumn_1 |
Interestingly, we have several queries like this one, but only this particular one fails. I came across this issue, but I am not sure how to translate the workaround into Spark SQL.
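One workaround sometimes suggested for this assertion is to break the shared attribute IDs between the two branches of the plan by re-selecting one side of the join through a derived table with fresh column aliases. This is only a sketch under that assumption (whether it helps depends on how view_1 and view_2 are defined), shortened here to the join-key columns:

```sql
-- Hypothetical workaround sketch, not a confirmed fix:
-- wrap view_1 in a subquery that re-aliases every column, so the
-- optimizer sees fresh attributes on that side of the join.
SELECT
  COALESCE(v1.name, view_2.name) AS name,
  COALESCE(v1.id, view_2.id) AS id,
  COALESCE(v1.second_id, view_2.second_id) AS second_id,
  COALESCE(v1.local_timestamp, view_2.local_timestamp) AS local_timestamp
FROM (
  SELECT
    name AS name,
    id AS id,
    second_id AS second_id,
    local_timestamp AS local_timestamp
  FROM view_1
) v1
FULL OUTER JOIN view_2
ON
  v1.name = view_2.name AND
  v1.id = view_2.id AND
  v1.second_id = view_2.second_id AND
  v1.local_timestamp = view_2.local_timestamp;
```

Alternatively, materializing one side first (for example with `CACHE TABLE` or by writing it out and reading it back) cuts the shared lineage in the same way.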