I am trying to build a predictive model, but I keep getting this error: raise ValueError("Input contains NaN") ValueError: Input contains NaN. I tried np.any(np.isnan(dataframe)) and np.all(np.isfinite(dataframe)), but that only produces new errors. For example: TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Here is the code so far:

import pandas as pd
from sklearn.preprocessing import LabelEncoder
import numpy as np

dataframe = pd.read_csv('file.csv', delimiter=',')

le = LabelEncoder()
dfle = dataframe

dfle2 = dfle.apply(lambda col: le.fit_transform(col.astype(str)), axis=0, result_type='expand')

newdf = dfle2[['column1', 'column2', 'column3', 'column4', 'column5', 'column6', 'column7']]

X = dataframe[['column1', 'column2', 'column4', 'column5', 'column6', 'column7']].values

y = dfle.column3

from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
ohe = OneHotEncoder()

ColumnTransformer([('encoder', OneHotEncoder(), [0])], remainder='passthrough')
# np.all(np.isfinite(dfle))
# np.any(np.isnan(dfle))
X = ohe.fit_transform(X).toarray()

2 Answers

There are several ways to handle this error. The simplest is to fill the NaN values with 0 when you load the data:

dataframe = pd.read_csv('file.csv', delimiter=',').fillna(0)

Alternatively, you can use scikit-learn's imputation tools to fill the NaN values.

https://scikit-learn.org/stable/modules/classes.html#module-sklearn.impute

Several imputation techniques are available, but I would suggest KNNImputer.
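As a minimal sketch of what that could look like, the snippet below runs KNNImputer on a small, made-up numeric frame. The column names, the values, and n_neighbors=2 are assumptions for illustration only and are not taken from the question's file.csv. KNNImputer works on numeric data, so string columns would have to be encoded or set aside first.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Hypothetical numeric data standing in for the real CSV.
df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, 4.0],
    "b": [10.0, np.nan, 30.0, 40.0],
})

# Each missing value is replaced by the mean of that feature
# over the nearest rows (nearest by the features both rows have).
imputer = KNNImputer(n_neighbors=2)
filled = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)

print(filled)
```

Whether nearest-neighbour imputation suits the data is a modelling decision; SimpleImputer from the same module is a simpler alternative (mean, median, or a constant).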

answered 2020-12-14T17:55:16.047

The error

TypeError: ufunc 'isfinite' not supported for the input types,
and the inputs could not be safely coerced to any supported types
according to the casting rule ''safe''

is probably caused by the conversion to str in col.astype(str). Use something like astype(float) instead.
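To illustrate the kind of failure involved, here is a small sketch on made-up data (the column names are hypothetical). NumPy ufuncs such as np.isnan and np.isfinite raise a TypeError on object-dtype arrays, which is what a frame with string columns turns into, whereas the pandas isna method handles mixed types.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one numeric and one text column.
df = pd.DataFrame({"num": [1.0, np.nan, 3.0], "text": ["x", None, "z"]})

# np.isnan is not defined for object arrays, so this raises TypeError.
try:
    np.isnan(df.to_numpy(dtype=object))
    raised = False
except TypeError:
    raised = True

# pandas' own check works regardless of column dtype.
nan_mask = df.isna()
has_nan = bool(nan_mask.to_numpy().any())

# Converting a column to numbers makes NumPy checks usable again;
# errors="coerce" turns unparseable entries into NaN.
numeric = pd.to_numeric(df["num"], errors="coerce")
numeric_has_nan = bool(np.isnan(numeric.to_numpy()).any())

print(raised, has_nan, numeric_has_nan)
```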

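As for the NaN error, you need to decide whether replacing the missing values with zero (fillna(0)) is good enough, or whether you need something more sophisticated, such as a Kalman filter.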
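One way to inform that decision is to count the missing values per column first. The sketch below uses made-up data under the question's column names. The median fill is an alternative introduced here for comparison and is not something the answer proposes.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the real values would come from file.csv.
df = pd.DataFrame({
    "column1": [1.0, np.nan, 3.0, 4.0],
    "column2": [np.nan, np.nan, 1.0, 2.0],
})

# Number of missing values in each column.
nan_counts = df.isna().sum()

# Option 1: replace every NaN with 0.
filled_zero = df.fillna(0)

# Option 2 (assumption, for comparison): use each column's median.
filled_median = df.fillna(df.median(numeric_only=True))

print(nan_counts)
```

If only a handful of values are missing, a constant or median fill is often adequate; if a column is mostly NaN, dropping it may be more honest than filling it.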

answered 2020-12-14T18:05:46.720