As the title says, I am trying to use Turi Create on an AWS SageMaker Notebook instance with Python 3.6 (the conda_amazonei_mxnet_p36 environment). Although CUDA 10.0 is installed by default, CUDA 8.0 is also preinstalled and can be selected with the following commands in the notebook:
!sudo rm /usr/local/cuda
!sudo ln -s /usr/local/cuda-8.0 /usr/local/cuda
I have checked the active version with nvcc --version and also verified the installation by building and running the deviceQuery sample:
$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
$ sudo make
$ ./deviceQuery
Next, in my notebook, I installed Turi Create and the matching version of mxnet for CUDA 8.0:
!pip install turicreate==5.4
!pip uninstall -y mxnet
!pip install mxnet-cu80==1.1.0
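For reference, the mxnet wheel name has to match whichever toolkit /usr/local/cuda points at; a trivial sketch of that mapping (the helper name is mine, not part of either library):

```python
# Hypothetical helper: map the CUDA toolkit version selected via the
# /usr/local/cuda symlink to the matching mxnet wheel name on PyPI.
def mxnet_wheel_for_cuda(cuda_version: str) -> str:
    tag = cuda_version.replace(".", "")  # "8.0" -> "80", "10.0" -> "100"
    return f"mxnet-cu{tag}"

print(mxnet_wheel_for_cuda("8.0"))   # the wheel pinned above for CUDA 8.0
print(mxnet_wheel_for_cuda("10.0"))  # the wheel tried later for CUDA 10.0
```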
Then I prepare my images and try to create the model:
import turicreate as tc
tc.config.set_num_gpus(-1)
images = tc.image_analysis.load_images('images', ignore_failure=True)
data = images.join(annotations_)
train_data, test_data = data.random_split(0.8)
model = tc.object_detector.create(train_data, max_iterations=50)
Running tc.object_detector.create outputs the following:
Using 'image' as feature column
Using 'annotaion' as annotations column
Downloading https://docs-assets.developer.apple.com/turicreate/models/darknet.params
Download completed: /var/tmp/model_cache/darknet.params
Setting 'batch_size' to 32
Using GPUs to create model (Tesla K80, Tesla K80, Tesla K80, Tesla K80, Tesla K80, Tesla K80, Tesla K80, Tesla K80)
Using default 16 lambda workers.
To maximize the degree of parallelism, add the following code to the beginning of the program:
"turicreate.config.set_runtime_config('TURI_DEFAULT_NUM_PYLAMBDA_WORKERS', 32)"
Note that increasing the degree of parallelism also increases the memory footprint.
---------------------------------------------------------------------------
MXNetError Traceback (most recent call last)
_ctypes/callbacks.c in 'calling callback function'()
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/kvstore.py in updater_handle(key, lhs_handle, rhs_handle, _)
81 lhs = _ndarray_cls(NDArrayHandle(lhs_handle))
82 rhs = _ndarray_cls(NDArrayHandle(rhs_handle))
---> 83 updater(key, lhs, rhs)
84 return updater_handle
85
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/optimizer/optimizer.py in __call__(self, index, grad, weight)
1528 self.sync_state_context(self.states[index], weight.context)
1529 self.states_synced[index] = True
-> 1530 self.optimizer.update_multi_precision(index, weight, grad, self.states[index])
1531
1532 def sync_state_context(self, state, context):
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/optimizer/optimizer.py in update_multi_precision(self, index, weight, grad, state)
553 use_multi_precision = self.multi_precision and weight.dtype == numpy.float16
554 self._update_impl(index, weight, grad, state,
--> 555 multi_precision=use_multi_precision)
556
557 @register
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/optimizer/optimizer.py in _update_impl(self, index, weight, grad, state, multi_precision)
535 if state is not None:
536 sgd_mom_update(weight, grad, state, out=weight,
--> 537 lazy_update=self.lazy_update, lr=lr, wd=wd, **kwargs)
538 else:
539 sgd_update(weight, grad, out=weight, lazy_update=self.lazy_update,
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/ndarray/register.py in sgd_mom_update(weight, grad, mom, lr, momentum, wd, rescale_grad, clip_gradient, out, name, **kwargs)
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/_ctypes/ndarray.py in _imperative_invoke(handle, ndargs, keys, vals, out)
90 c_str_array(keys),
91 c_str_array([str(s) for s in vals]),
---> 92 ctypes.byref(out_stypes)))
93
94 if original_output is not None:
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/base.py in check_call(ret)
144 """
145 if ret != 0:
--> 146 raise MXNetError(py_str(_LIB.MXGetLastError()))
147
148
MXNetError: Cannot find argument 'lazy_update', Possible Arguments:
----------------
lr : float, required
Learning rate
momentum : float, optional, default=0
The decay rate of momentum estimates at each epoch.
wd : float, optional, default=0
Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight.
rescale_grad : float, optional, default=1
Rescale gradient to grad = rescale_grad*grad.
clip_gradient : float, optional, default=-1
Clip gradient to the range of [-clip_gradient, clip_gradient] If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
, in operator sgd_mom_update(name="", wd="0.0005", momentum="0.9", clip_gradient="0.025", rescale_grad="1.0", lr="0.001", lazy_update="True")
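For what it's worth, the failing call passes lazy_update to sgd_mom_update, a keyword that (as far as I can tell) only appeared around MXNet 1.2, while mxnet-cu80 is pinned to 1.1.0, so the Python frontend and the native library seem to be out of sync. A stdlib-only sketch of that version check (the 1.2.0 cutoff is my assumption, inferred from 1.1.0 rejecting the argument while 1.4.0 accepts it):

```python
def parse_version(v: str) -> tuple:
    # Keep only the leading numeric dotted part ("1.4.0.post0" -> (1, 4, 0)).
    parts = []
    for p in v.split("."):
        if not p.isdigit():
            break
        parts.append(int(p))
    return tuple(parts)

def supports_lazy_update(mxnet_version: str) -> bool:
    # Assumption: 'lazy_update' was added to the SGD operators in MXNet 1.2.0,
    # inferred from the behavior described in this question.
    return parse_version(mxnet_version) >= (1, 2, 0)

print(supports_lazy_update("1.1.0"))        # mxnet-cu80 pinned above
print(supports_lazy_update("1.4.0.post0"))  # mxnet-cu100 tried below
```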
Interestingly, if I use CUDA 10.0 with Turi Create 5.6:
!pip install turicreate==5.6
!pip uninstall -y mxnet
!pip install mxnet-cu100==1.4.0.post0
the notebook still fails, but if I then immediately uninstall turicreate and mxnet-cu100 and retry the CUDA 8.0 steps above, it works fine.
The last time it worked, I tried pip freeze > requirements.txt, and then pip install -r requirements.txt after restarting the instance, but I still hit the same error as above (unless I try CUDA 10.0 first). What is going on here? Any advice is appreciated.
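One thing I have not ruled out is that the conda env's preinstalled mxnet shadows the pip-installed wheel after a restart. A stdlib-only snippet (it does not need mxnet installed to run) that shows which file a module would actually be imported from:

```python
import importlib.util

def module_origin(name: str):
    """Return the file a module would be imported from, or None if absent."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None

# On the notebook instance this would reveal whether 'mxnet' resolves to the
# freshly pip-installed wheel or to the env's preinstalled copy:
print(module_origin("mxnet"))
```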