
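I am building a Kubernetes cluster with JupyterHub and Ray, and I want users to log in to JupyterHub and use the Ray cluster running on k8s. My plan is to connect to the Ray cluster from a JupyterHub notebook using the Ray API (https://ray.readthedocs.io/en/latest/api.html).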

kubectl get svc
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                                          AGE
kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP                                          6d19h
ray-head     ClusterIP   10.100.19.93   <none>        6379/TCP,6380/TCP,6381/TCP,12345/TCP,12346/TCP   4d21h

However, when I run

import ray
ray.init(redis_address="10.100.19.93:6379")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-f25708b1f128> in <module>
----> 1 ray.init(redis_address="10.100.19.93:6379")

/opt/conda/lib/python3.7/site-packages/ray/worker.py in init(redis_address, num_cpus, num_gpus, resources, object_store_memory, redis_max_memory, log_to_driver, node_ip_address, object_id_seed, local_mode, redirect_worker_output, redirect_output, ignore_reinit_error, num_redis_shards, redis_max_clients, redis_password, plasma_directory, huge_pages, include_webui, driver_id, configure_logging, logging_level, logging_format, plasma_store_socket_name, raylet_socket_name, temp_dir, load_code_from_local, _internal_config)
   1434             load_code_from_local=load_code_from_local)
   1435         _global_node = ray.node.Node(
-> 1436             ray_params, head=False, shutdown_at_exit=False, connect_only=True)
   1437 
   1438     connect(

/opt/conda/lib/python3.7/site-packages/ray/node.py in __init__(self, ray_params, head, shutdown_at_exit, connect_only)
    100             redis_client = self.create_redis_client()
    101             self.session_name = ray.utils.decode(
--> 102                 redis_client.get("session_name"))
    103 
    104         self._init_temp(redis_client)

/opt/conda/lib/python3.7/site-packages/ray/utils.py in decode(byte_str, allow_none)
    175     if not isinstance(byte_str, bytes):
    176         raise ValueError(
--> 177             "The argument {} must be a bytes object.".format(byte_str))
    178     if sys.version_info >= (3, 0):
    179         return byte_str.decode("ascii")

ValueError: The argument None must be a bytes object.

I would like to know whether my approach is the right one, and how to fix this error.


1 Answer


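As long as you are running ray.init(redis_address="10.100.19.93:6379") from outside the cluster, you have to expose your ray-head service via LoadBalancer or NodePort, depending on where the cluster is running. A ClusterIP address such as 10.100.19.93 is only reachable from inside the cluster.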

See the Kubernetes documentation on publishing services for more details.

So, run kubectl edit svc ray-head and change

type: ClusterIP

to

type: NodePort
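
The same change can also be made without opening an editor, using kubectl patch. Below is a minimal sketch, assuming kubectl is on the PATH and already configured for this cluster. The helper name nodeport_patch_command is hypothetical and only builds the command; it does not run it.

```python
import json
import shlex


def nodeport_patch_command(service="ray-head"):
    """Build a kubectl command that switches a Service to type NodePort.

    Hypothetical helper for illustration; it only returns the argument list.
    """
    patch = json.dumps({"spec": {"type": "NodePort"}})
    return ["kubectl", "patch", "svc", service, "-p", patch]


cmd = nodeport_patch_command()
# Print a shell-safe form of the command for copy and paste.
# To execute it from Python instead, pass `cmd` to subprocess.run.
print(" ".join(shlex.quote(part) for part in cmd))
```

After patching, kubectl get svc ray-head should list the service with TYPE NodePort and show the assigned node ports in the PORT(S) column.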

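Once that is done, try ray.init(redis_address="<node-ip-address>:<node-port>"), where <node-port> is the node port that Kubernetes assigned to port 6379 of ray-head.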
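
Before calling ray.init, it can help to confirm that the exposed address accepts TCP connections at all from where the notebook runs. The following is a rough sketch using only the standard library. The function name is_reachable and the address in the usage line are placeholders, not part of Ray or Kubernetes.

```python
import socket


def is_reachable(address, timeout=3.0):
    """Return True if a TCP connection to 'host:port' can be opened."""
    host, _, port = address.rpartition(":")
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder address: substitute the node IP and node port of ray-head.
    print(is_reachable("127.0.0.1:6379", timeout=1.0))
```

A True result only shows that the port is open. It does not prove that the Ray head's Redis instance is what answers on it, so ray.init can still fail for other reasons.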

answered 2019-07-05T10:21:13.517