When we have a column whose data type is string, with values like col1 = 1, col2 = .89, we run into an error.

So, when we use

def azureml_main(dataframe1 = None, dataframe2 = None):

    # Execution logic goes here
    print('Input pandas.DataFrame #1:')
    import pandas as pd
    import numpy as np
    from sklearn.kernel_approximation import RBFSampler
    x = dataframe1.iloc[:, 2:1080]
    print(x)
    df1 = dataframe1[['colname']]

    change = np.array(df1)
    b = change.ravel()
    print(b)
    rbf_feature = RBFSampler(gamma=1, n_components=100, random_state=1)
    print(rbf_feature)
    print("test")
    X_features = rbf_feature.fit_transform(x)

After this we get an error saying that a non-int cannot be converted to type float.

1 Answer

Use astype(float), for example:

df['col'] = df['col'].astype(float)

or convert_objects:

df = df.convert_objects(convert_numeric=True)

Example:

In [379]:

df = pd.DataFrame({'a':['1.23', '0.123']})
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 1 columns):
a    2 non-null object
dtypes: object(1)
memory usage: 32.0+ bytes
In [380]:

df['a'].astype(float)
Out[380]:
0    1.230
1    0.123
Name: a, dtype: float64

In [382]:

df = df.convert_objects(convert_numeric=True)
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 1 columns):
a    2 non-null float64
dtypes: float64(1)
memory usage: 32.0 bytes
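
Applied to the question, the same cast can be done on the whole feature slice before it is handed to fit_transform. This is only a minimal sketch: the frame and its column names below are hypothetical stand-ins for dataframe1, and it assumes every selected column holds strings that parse cleanly as numbers.

```python
import pandas as pd

# Hypothetical stand-in for the question's dataframe1:
# numeric values that were read in as strings.
frame = pd.DataFrame({
    'id': ['a', 'b'],
    'label': ['x', 'y'],
    'f1': ['1', '2'],
    'f2': ['.89', '0.5'],
})

# Select the feature columns by position (as the question does)
# and cast them all to float in one step.
features = frame.iloc[:, 2:].astype(float)
print(features.dtypes)
```

If any value in the slice is not a valid number, astype(float) raises a ValueError rather than converting silently.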

Update

If you are running pandas 0.17.0 or later, convert_objects has been replaced by the functions to_numeric, to_datetime, and to_timestamp, so instead of:

df['col'] = df['col'].astype(float)

you can do:

df['col'] = pd.to_numeric(df['col'])

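Note that by default any value that cannot be converted will raise an error. If you want such values coerced to NaN instead, do the following: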

df['col'] = pd.to_numeric(df['col'], errors='coerce')
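
To illustrate what coercion does in practice, here is a small sketch on a hypothetical column that contains one unparseable entry ('n/a' is made up for the example). It also counts the resulting NaN values, since most scikit-learn transformers reject NaN input, so those rows would likely need to be dropped or imputed first.

```python
import pandas as pd

# Hypothetical column mixing numeric strings with a value that cannot be parsed.
df = pd.DataFrame({'col': ['1', '.89', 'n/a']})

# errors='coerce' turns unparseable entries into NaN instead of raising.
df['col'] = pd.to_numeric(df['col'], errors='coerce')

# Count how many values were lost to coercion so they can be handled
# before the data is passed on to an estimator.
n_missing = int(df['col'].isna().sum())
print(df['col'].dtype, n_missing)
```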
answered 2015-05-08T10:10:15.030