I am working with GraphLab Create and load my data with

data=graphlab.SFrame.read_csv('test.csv')

I am trying to get the median of one of the columns so I can use it to fill in missing values:

data_train.fillna(('Credit_History',data_train['Credit_History'].median()))

but I get this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-247-50ed3eb09dcc> in <module>()
----> 1 data_train.fillna(('Credit_History',data_train['Credit_History'].median()))

AttributeError: 'SArray' object has no attribute 'median'

data.show() displays the median for that column, but does anyone know how to get it in code?

2 Answers

I think I understand what you are trying to do. SFrame has no built-in median function. I would improvise like this:

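import numpy as np
# Take the median of the non-missing values only; fillna returns a new SFrame, so reassign it
median = np.median(data_train['Credit_History'].dropna())
data_train = data_train.fillna('Credit_History', median)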
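
To make the idea concrete without needing GraphLab, here is a minimal sketch of median imputation in plain NumPy. It assumes missing values are represented as NaN, and the helper name `fill_missing_with_median` and the sample data are hypothetical, made up for illustration only.

```python
import numpy as np

def fill_missing_with_median(values):
    """Return a copy of `values` with NaN entries replaced by the median of the rest."""
    arr = np.asarray(values, dtype=float)
    median = np.nanmedian(arr)  # median computed over the non-NaN entries only
    return np.where(np.isnan(arr), median, arr)

# Hypothetical sample column with two missing entries
credit_history = [1.0, 0.0, np.nan, 1.0, 1.0, np.nan]
filled = fill_missing_with_median(credit_history)
print(filled)
```

The point is that the median has to be computed on the observed values before it is used as the fill value; the same order of operations applies when working with an SFrame column.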
answered 2016-07-15T22:02:06.570

SArray has no median method. The best way to get the median is to call the sketch_summary method and then quantile. More information on the sketch summary is available at

https://turi.com/products/create/docs/generated/graphlab.Sketch.html

import numpy as np
import graphlab as gl

sf = gl.SFrame(np.random.rand(100))

sketch = sf['X1'].sketch_summary()
median = sketch.quantile(0.5)
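
As a sanity check on why quantile(0.5) is the right call, the 0.5 quantile of a sample is its median. The sketch below shows this with plain NumPy rather than GraphLab; the seed and sample size are arbitrary choices for illustration. As far as I understand the documentation, the sketch summary returns an estimate rather than an exact value, so a small difference from an exact median would not be surprising.

```python
import numpy as np

# Arbitrary seeded sample so the comparison is reproducible
rng = np.random.RandomState(0)
values = rng.rand(101)

exact_median = np.median(values)
half_quantile = np.percentile(values, 50)  # the 50th percentile is the 0.5 quantile

print(exact_median, half_quantile)
```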
answered 2016-07-15T21:51:28.300