See the diagram at https://spark.apache.org/docs/latest/cluster-overview.html.
The Spark cluster runs outside Kubernetes, but I want to run the driver program inside Kubernetes. The problem is how to make the Spark cluster aware of the driver's address.
My Kubernetes YAML file:
kind: List
apiVersion: v1
items:
- kind: Deployment
  apiVersion: extensions/v1beta1
  metadata:
    name: counter-uat
  spec:
    replicas: 1
    selector:
      matchLabels:
        name: spark-driver
    template:
      metadata:
        labels:
          name: spark-driver
      spec:
        containers:
        - name: counter-uat
          image: counter:0.1.0
          command: ["/opt/spark/bin/spark-submit", "--class", "Counter", "--master", "spark://spark.uat:7077", "/usr/src/counter.jar"]
- kind: Service
  apiVersion: v1
  metadata:
    name: spark-driver
    labels:
      name: spark-driver
  spec:
    type: NodePort
    ports:
    - name: port
      port: 4040
      targetPort: 4040
    selector:
      name: spark-driver
The error is:
Caused by: java.io.IOException: Failed to connect to /172.17.0.8:44117
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: Host is unreachable: /172.17.0.8:44117
The Spark cluster is trying to reach the driver at 172.17.0.8. That address is probably an internal pod IP inside Kubernetes, which the external workers cannot route to.
How can I solve this? How should I fix my YAML file? Thanks.
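For context, one approach I am considering (an untested sketch; the port number 45000 is my own assumption, not a value from my cluster, and 192.168.42.8 is the node IP I believe the workers can reach) is to pin the driver's RPC port with spark.driver.port, advertise a reachable address with spark.driver.host, and then expose that same port through the NodePort Service:

/opt/spark/bin/spark-submit \
  --class Counter \
  --master spark://spark.uat:7077 \
  --conf spark.driver.bindAddress=0.0.0.0 \
  --conf spark.driver.host=192.168.42.8 \
  --conf spark.driver.port=45000 \
  /usr/src/counter.jar

Here spark.driver.bindAddress is the address the driver listens on inside the pod, while spark.driver.host is the address it advertises to the cluster; the Service would then need port/targetPort 45000. I am not sure this split is correct, which is part of my question.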
UPDATE
I added the following two parameters: "--conf", "spark.driver.bindAddress=192.168.42.8", "--conf", "spark.driver.host=0.0.0.0".
But according to the logs, the cluster is still trying to reach 172.17.0.8, which is the internal pod IP inside Kubernetes.
UPDATE
kind: List
apiVersion: v1
items:
- kind: Deployment
  apiVersion: extensions/v1beta1
  metadata:
    name: counter-uat
  spec:
    replicas: 1
    selector:
      matchLabels:
        name: counter-driver
    template:
      metadata:
        labels:
          name: counter-driver
      spec:
        containers:
        - name: counter-uat
          image: counter:0.1.0
          command: ["/opt/spark/bin/spark-submit", "--class", "Counter", "--master", "spark://spark.uat:7077", "--conf", "spark.driver.bindAddress=192.168.42.8", "/usr/src/counter.jar"]
- kind: Service
  apiVersion: v1
  metadata:
    name: counter-driver
    labels:
      name: counter-driver
  spec:
    type: NodePort
    ports:
    - name: driverport
      port: 42761
      targetPort: 42761
      nodePort: 30002
    selector:
      name: counter-driver
Another error:
2017-06-23T20:00:07.487656154Z Exception in thread "main" java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries (starting from 31319)! Consider explicitly setting the appropriate port for the service 'sparkDriver' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
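My current guess (which may be wrong) is that the BindException happens because 192.168.42.8 is the node's IP, so the driver cannot bind a socket to it from inside the pod; the bind address would have to be one the pod actually owns, with the node address only advertised. A sketch of the configuration I think that implies (the placeholder address is hypothetical, not from my setup):

--conf spark.driver.bindAddress=0.0.0.0         # listen on the pod's own interfaces
--conf spark.driver.host=<address reachable from the Spark workers>
--conf spark.driver.port=42761                  # matches the Service port/targetPort above

Is this the right way to wire a driver inside Kubernetes to a standalone cluster outside it?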