I have a project written in Python using pyspark and, more recently, dagster. We use Sphinx to build the documentation, with napoleon to parse Google-style docstrings. We have started including prepackaged dagster entities like the following:

from dagster import solid, String
from pyspark.sql import DataFrame as SparkDataFrame
import pyspark.sql.functions as sf


@solid(
    config_schema={
        "join_key": String,
        "join_style": String,
        "df1_name": String,
        "df2_name": String,
    }
)
def join_two_dfs_solid(
    context, df1: SparkDataFrame, df2: SparkDataFrame
) -> SparkDataFrame:
    """
    Solid to join two DataFrames on the sepcified key.

    Args:
        context (dict): Dagster Context Dict
        df1 (SparkDataFrame): Spark DataFrame with the same schema
        df2 (SparkDataFrame): Spark DataFrame with the same schema

    Config Parameters:
        join_key (str): name of column to join on.  Specified column must exist in both columns.
        join_style (str): spark join style, e.g., "left", "inner", "outer", etc.; default is "inner"
        df1_name (str): alias name for the first dataframe.
        df2_name (str): alias name for the second dataframe.

    Returns:
        DataFrame
    """
    key = context.solid_config["join_key"]
    join_style = context.solid_config.get("join_style", "inner")
    df1_name = context.solid_config["df1_name"]
    df2_name = context.solid_config["df2_name"]

    context.log.info(f"Running join of two dataframes on {key}")
    check_required_columns(df1, [key])
    check_required_columns(df2, [key])

    output = df1.alias(df1_name).join(
        df2.alias(df2_name),
        sf.col(f"{df1_name}.{key}") == sf.col(f"{df2_name}.{key}"),
        how=join_style,
    )
    return output

When we build with sphinx-apidoc, I can verify by inspection that the function's docstring exists (join_two_dfs_solid.__doc__) and that the dagster-attached join_two_dfs_solid._description field is empty, which should mean the docstring is used. However, when the Sphinx docs build, I get a blank .rst file for the module containing this entity. Does anyone know whether there is some other configuration setting in Sphinx, or something on the entity itself, that I need to change to make it build correctly?
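One plausible explanation (my assumption, not confirmed in this thread) is that the @solid decorator replaces the function with a SolidDefinition object, and autodoc's member discovery only treats plain functions and classes as documentable, so the wrapper is silently dropped even though its __doc__ survives. The stdlib-only sketch below reproduces that effect with a hypothetical FakeSolidDefinition stand-in; none of these names come from dagster itself:

```python
import inspect


class FakeSolidDefinition:
    """Hypothetical stand-in for the object a dagster-style decorator returns."""

    def __init__(self, fn):
        self.fn = fn
        # The wrapper copies the docstring over, so introspection still sees it.
        self.__doc__ = fn.__doc__


def solid(config_schema=None):
    """Toy decorator factory mimicking @solid(config_schema=...)."""

    def wrap(fn):
        return FakeSolidDefinition(fn)

    return wrap


@solid(config_schema={"join_key": str})
def join_two_dfs_solid(context, df1, df2):
    """Solid to join two DataFrames on the specified key."""


# The docstring is preserved on the wrapper object...
doc_preserved = join_two_dfs_solid.__doc__ is not None
# ...but it no longer passes autodoc's "is this a function?" check.
is_function = inspect.isfunction(join_two_dfs_solid)
print(doc_preserved, is_function)  # True False
```

If this is the cause, it would explain the symptom exactly: __doc__ looks fine under manual inspection, yet sphinx-apidoc emits an effectively empty page for the module.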


1 Answer


This is currently an open issue that the team is aware of: https://github.com/dagster-io/dagster/issues/2427
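Until that issue is resolved, one workaround pattern worth trying (a sketch only; I have not verified it against dagster's SolidDefinition, and the class-name check is an assumption) is to hook Sphinx's autodoc-skip-member event in conf.py and force-include the wrapped objects:

```python
# conf.py (sketch)

def autodoc_skip_member(app, what, name, obj, skip, options):
    # Hypothetical check: force-include dagster-wrapped members that
    # autodoc would otherwise filter out. Return None to defer to the
    # default behaviour for everything else.
    if type(obj).__name__ == "SolidDefinition":
        return False
    return None


def setup(app):
    app.connect("autodoc-skip-member", autodoc_skip_member)
```

Even if autodoc then includes the member, it may render it as a data attribute rather than a function signature, so this is at best a partial mitigation until the linked issue is fixed.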

Answered 2021-05-17T14:55:32.217