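I have been trying to run RAPIDS on Google Colab Pro and have successfully installed the cuml and cudf packages, but I cannot run even the example scripts.
TL;DR
Whenever I try to run the fit function of cuml on Google Colab, I get the following error. I get it even when I use the demo examples for both the installation and cuml. It happens across a range of cuml examples (I first tried to run UMAP).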
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-c06fc2c31ca3> in <module>()
     13 knn.fit(X_train, y_train)
     14
---> 15 knn.predict(X_test)
5 frames
cuml/neighbors/kneighbors_regressor.pyx in cuml.neighbors.kneighbors_regressor.KNeighborsRegressor.predict()
cuml/neighbors/nearest_neighbors.pyx in cuml.neighbors.nearest_neighbors.NearestNeighbors.kneighbors()
cuml/neighbors/nearest_neighbors.pyx in cuml.neighbors.nearest_neighbors.NearestNeighbors._kneighbors()
cuml/neighbors/nearest_neighbors.pyx in cuml.neighbors.nearest_neighbors.NearestNeighbors._kneighbors_dense()
/usr/local/lib/python3.7/site-packages/cuml/common/array.py in full(cls, shape, value, dtype, order)
    326         """
    327
--> 328         return CumlArray(cp.full(shape, value, dtype, order))
    329
    330     @classmethod
TypeError: full() takes from 2 to 3 positional arguments but 4 were given
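One way to read the traceback: cuML calls `cp.full` with four positional arguments, while the `full` function that is actually installed accepts at most three. The sketch below shows how that could be checked without triggering the error. The helper `accepts_positional` and the two stand-in functions are hypothetical names used only for illustration; they are not part of cuML or CuPy. That a mismatch between the installed cuML and CuPy versions is the cause is an assumption, not something the traceback confirms.

```python
import inspect


def accepts_positional(func, n_args):
    """Return True if func can be called with n_args positional arguments."""
    try:
        # bind() checks the arguments against the signature without calling func
        inspect.signature(func).bind(*([None] * n_args))
    except TypeError:
        return False
    return True


# Hypothetical stand-ins that mimic the two signatures implied by the error message
def full_three_args(shape, fill_value, dtype=None):
    """Accepts 2 to 3 positional arguments, like the function in the traceback."""


def full_four_args(shape, fill_value, dtype=None, order="C"):
    """Accepts the fourth positional argument that cuML passes."""


print(accepts_positional(full_three_args, 4))  # False
print(accepts_positional(full_four_args, 4))   # True
```

In the Colab runtime, the same helper could be applied to the real function with `import cupy as cp` followed by `accepts_positional(cp.full, 4)`; a result of `False` would match the error above.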
Steps taken on Google Colab Pro (to reproduce the error)
As an example, I install the relevant packages using this notebook from RAPIDS (https://colab.research.google.com/drive/1rY7Ln6rEE1pOlfSHCYOVaqt8OvDO35J0#forceEdit=true&offline=true&sandboxMode=true):
# Install RAPIDS
!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!bash rapidsai-csp-utils/colab/rapids-colab.sh stable
import sys, os, shutil
sys.path.append('/usr/local/lib/python3.7/site-packages/')
os.environ['NUMBAPRO_NVVM'] = '/usr/local/cuda/nvvm/lib64/libnvvm.so'
os.environ['NUMBAPRO_LIBDEVICE'] = '/usr/local/cuda/nvvm/libdevice/'
os.environ["CONDA_PREFIX"] = "/usr/local"
for so in ['cudf', 'rmm', 'nccl', 'cuml', 'cugraph', 'xgboost', 'cuspatial']:
    fn = 'lib'+so+'.so'
    source_fn = '/usr/local/lib/'+fn
    dest_fn = '/usr/lib/'+fn
    if os.path.exists(source_fn):
        print(f'Copying {source_fn} to {dest_fn}')
        shutil.copyfile(source_fn, dest_fn)
# fix for BlazingSQL import issue
# ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /usr/local/lib/python3.7/site-packages/../../libblazingsql-engine.so)
if not os.path.exists('/usr/lib64'):
    os.makedirs('/usr/lib64')
for so_file in os.listdir('/usr/local/lib'):
    if 'libstdc' in so_file:
        shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib64/'+so_file)
        shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib/x86_64-linux-gnu/'+so_file)
I then tried to run the example below from the cuML documentation ( https://docs.rapids.ai/api/cuml/stable/api.html#k-means-clustering ):
from cuml.neighbors import KNeighborsRegressor
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
X, y = make_blobs(n_samples=100, centers=5,
                  n_features=10)
knn = KNeighborsRegressor(n_neighbors=10)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.80)
knn.fit(X_train, y_train)
knn.predict(X_test)
This results in the error shown at the start of the question.
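Since the failure sits at the boundary between cuML and CuPy, listing the installed versions alongside the reproduction steps may help narrow it down. The following is a minimal sketch, assuming Python 3.8 or later for `importlib.metadata` (on the Python 3.7 runtime shown in the traceback, the `importlib_metadata` backport offers the same API). The helper name `installed_versions` is hypothetical, and the distribution names passed to it are assumptions, since CuPy in particular may be installed under a CUDA-specific name.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this exact name
            versions[name] = None
    return versions


# Distribution names are assumptions; adjust them to match the environment
for name, version in installed_versions(["cuml", "cudf", "cupy"]).items():
    print(f"{name}: {version}")
```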