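I am trying to run a grid search with cuML (RAPIDS 21.10) and I get a CuPy conversion error. This does not happen when I build the model on the same dataset without the grid search. The grid search also works with data that is not in video memory, but then it is obviously slower than on the CPU. X is float32 and y is int32: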

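# Assumed imports (not shown in the original post; aliases inferred from the traceback)
import cudf
from cuml.ensemble import RandomForestClassifier as RandomForestClassifier_cu
from sklearn.model_selection import GridSearchCV as GridSearchCV_cu

# Move the pandas training and test features into GPU memory
X_cudf_train = cudf.DataFrame.from_pandas(X_train)
X_cudf_test = cudf.DataFrame.from_pandas(X_test)

# Labels as a GPU-backed series
y_cudf_train = cudf.Series(y_train.values)

RF_classifier_cu = RandomForestClassifier_cu(random_state=123)
grid_search_RF_cu = GridSearchCV_cu(estimator=RF_classifier_cu, param_grid=grid_RF, cv=3, verbose=1)
grid_search_RF_cu.fit(X_cudf_train, y_cudf_train)
print(grid_search_RF_cu.best_params_)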

Error:

/home/asdanjer/miniconda3/envs/rapids-21.10/lib/python3.8/site-packages/cuml/internals/api_decorators.py:794: UserWarning: For reproducible results in Random Forest Classifier or for almost reproducible results in Random Forest Regressor, n_streams==1 is recommended. If n_streams is > 1, results may vary due to stream/thread timing differences, even when random_state is set
  return func(**kwargs)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<timed exec> in <module>

~/miniconda3/envs/rapids-21.10/lib/python3.8/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
    800         fit_params = _check_fit_params(X, fit_params)
    801 
--> 802         cv_orig = check_cv(self.cv, y, classifier=is_classifier(estimator))
    803         n_splits = cv_orig.get_n_splits(X, y, groups)
    804 

~/miniconda3/envs/rapids-21.10/lib/python3.8/site-packages/sklearn/model_selection/_split.py in check_cv(cv, y, classifier)
   2301             classifier
   2302             and (y is not None)
-> 2303             and (type_of_target(y) in ("binary", "multiclass"))
   2304         ):
   2305             return StratifiedKFold(cv)

~/miniconda3/envs/rapids-21.10/lib/python3.8/site-packages/sklearn/utils/multiclass.py in type_of_target(y)
    277         raise ValueError("y cannot be class 'SparseSeries' or 'SparseArray'")
    278 
--> 279     if is_multilabel(y):
    280         return "multilabel-indicator"
    281 

~/miniconda3/envs/rapids-21.10/lib/python3.8/site-packages/sklearn/utils/multiclass.py in is_multilabel(y)
    149             warnings.simplefilter("error", np.VisibleDeprecationWarning)
    150             try:
--> 151                 y = np.asarray(y)
    152             except np.VisibleDeprecationWarning:
    153                 # dtype=object should be provided explicitly for ragged arrays,

~/miniconda3/envs/rapids-21.10/lib/python3.8/site-packages/cudf/core/frame.py in __array__(self, dtype)
   1636 
   1637     def __array__(self, dtype=None):
-> 1638         raise TypeError(
   1639             "Implicit conversion to a host NumPy array via __array__ is not "
   1640             "allowed, To explicitly construct a GPU array, consider using "

TypeError: Implicit conversion to a host NumPy array via __array__ is not allowed, To explicitly construct a GPU array, consider using cupy.asarray(...)
To explicitly construct a host array, consider using .to_array()
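
Reading the traceback, the failure seems to come from scikit-learn calling np.asarray(y) on the labels while choosing a CV splitter, and the GPU-backed series refusing that implicit copy to host memory. Below is a minimal CPU-only sketch of that mechanism. `DeviceSeriesStandIn` and its `to_host` method are hypothetical names made up for illustration; they are not part of cudf, and the sketch does not touch a GPU.

```python
import numpy as np


class DeviceSeriesStandIn:
    """Hypothetical stand-in for a GPU-backed series (not a cudf class)."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None):
        # Mimics the behaviour shown in the traceback: implicit host
        # conversion is refused instead of silently copying the data.
        raise TypeError(
            "Implicit conversion to a host NumPy array via __array__ is not allowed"
        )

    def to_host(self):
        # Hypothetical explicit conversion; returns a plain NumPy array.
        return np.array(self._values, dtype=np.int32)


def try_asarray(y):
    """Report whether np.asarray(y) succeeds, as sklearn's label check would need."""
    try:
        np.asarray(y)
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "ok"


y_device = DeviceSeriesStandIn([0, 1, 1, 0])

# Implicit conversion, as in sklearn's is_multilabel(y): raises TypeError.
implicit_result = try_asarray(y_device)

# Explicit conversion first: np.asarray then receives an ordinary host array.
explicit_result = try_asarray(y_device.to_host())

print(implicit_result)
print(explicit_result)
```

This matches the observation that the same grid search runs when the data is not in video memory: host arrays pass the np.asarray call without complaint. Whether passing host labels together with GPU features is acceptable for cuML in this version is not something the sketch establishes.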