Can expressions in an aggregation context refer to previous expressions in the same aggregation?

import polars as pl

df = pl.DataFrame(dict(
  x=[0, 0, 1, 1],
  y=[1, 2, 3, 4],
))

df.groupby("x").agg([
  pl.col("x").sum().alias("sum_x"),
  (pl.col("sum_x") / pl.count()).alias("mean_x"),
])
# pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value:
# NotFound("Unable to get field named \"sum_x\". Valid fields: [\"x\", \"y\"]")

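This does not work naively because, as the error makes clear, expressions in a context cannot refer to earlier expressions in that same context. The workaround used in the selection context does not carry over to the groupby context, because agg does not keep all the data the way with_column does.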

1 Answer

Similar to the selection context, in the groupby context expressions are executed in parallel and thus cannot refer to each other in the same context.

You need to enforce sequential execution by adding a select:

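# Step 1: aggregate per group; pl.count() produces a column named "count".
df.groupby("x").agg([
  pl.col("x").sum().alias("sum_x"),
  pl.count(),
# Step 2: select runs after agg has finished, so it can refer to sum_x and count.
]).select([
  "sum_x",
  (pl.col("sum_x") / pl.col("count")).alias("mean_x"),
])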
Answered 2022-02-15T09:00:25.880