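I am visualizing a dataset that has, for example, a categorical field. I want to create a bar chart that shows the distinct categories of that field together with their cardinality, sorted in 'ascending'/'descending' order. This is straightforward with altair: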
import pandas as pd
import altair as alt

data = {0: {'Name': 'Mary', 'Sport': 'Tennis'},
        1: {'Name': 'Cal', 'Sport': 'Tennis'},
        2: {'Name': 'John', 'Sport': 'Tennis'},
        3: {'Name': 'Jane', 'Sport': 'Tennis'},
        4: {'Name': 'Bob', 'Sport': 'Golf'},
        5: {'Name': 'Jerry', 'Sport': 'Golf'},
        6: {'Name': 'Gustavo', 'Sport': 'Golf'},
        7: {'Name': 'Walter', 'Sport': 'Swimming'},
        8: {'Name': 'Jessy', 'Sport': 'Swimming'},
        9: {'Name': 'Patric', 'Sport': 'Running'},
        10: {'Name': 'John', 'Sport': 'Shooting'}}

# One row per person; the outer dict keys become the index after transposing.
df = pd.DataFrame(data).T

bars = alt.Chart(df).mark_bar().encode(
    x=alt.X('count():Q', axis=alt.Axis(format='.0d', tickCount=4)),
    # The sort field takes the bare column name, without the ':N' type suffix.
    y=alt.Y('Sport:N',
            sort=alt.EncodingSortField(field='Sport', op='count', order='descending'))
)
bars
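Now suppose I am only interested in the three categories with the highest counts. Filtering the data with "transform_window" and "transform_filter" seems reasonable, but I could not find a way to do it. I also went to the Vega-Lite Top K example and tried to adapt it, without success (my "best" attempt is shown below).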
bars.transform_window(window=[alt.WindowFieldDef(op='count',
                                                 field='Sport:N',
                                                 **{'as': 'cardinality'})],
                      frame=[None, None])
bars.transform_window(window=[alt.WindowFieldDef(op='rank',
                                                 field='cardinality',
                                                 **{'as': 'rank'})],
                      frame=[None, None],
                      sort=[alt.WindowSortField(field='rank',
                                                order='descending')])
bars.transform_filter( ..... what??? .....)