Note: baking the keys into the image is the worst thing you can do. I am doing it here only so that Docker and Kubernetes see a binary-identical filesystem while I debug.
I'm trying to start a flink-jobmanager that persists its state in GCS, so I added the line high-availability.storageDir: gs://BUCKET/ha to my flink-conf.yaml, and I am building my Dockerfile as described here.
This is my Dockerfile:
FROM flink:1.5-hadoop28
ADD https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-latest-hadoop2.jar /opt/flink/lib/gcs-connector-latest-hadoop2.jar
RUN mkdir /opt/flink/etc-hadoop
COPY flink-conf.yaml /opt/flink/conf/flink-conf.yaml
COPY key.json /opt/flink/etc-hadoop/key.json
COPY core-site.xml /opt/flink/etc-hadoop/core-site.xml
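The contents of core-site.xml are not shown above. As a rough sanity check, assuming the file follows the usual Hadoop configuration/property layout, a small Python sketch like the following could report whether it maps the gs scheme to a FileSystem class. The property names fs.gs.impl and fs.AbstractFileSystem.gs.impl are the ones the GCS connector is normally configured with; the helper names are made up for illustration.

```python
import sys
import xml.etree.ElementTree as ET

# Properties normally used to register the GCS connector with Hadoop.
GS_PROPERTIES = ("fs.gs.impl", "fs.AbstractFileSystem.gs.impl")


def read_hadoop_properties(xml_text):
    """Parse Hadoop-style XML config (str or bytes) into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_gs_properties(xml_text):
    """Return the gs-related property names that are absent or empty."""
    props = read_hadoop_properties(xml_text)
    return [key for key in GS_PROPERTIES if not props.get(key)]


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_core_site.py core-site.xml
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, "missing:", missing_gs_properties(handle.read()))
```

This only inspects a copy of the file; it does not tell you whether Flink actually loaded it at runtime.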
Now, if I build this container via docker build -t flink:dev . and start an interactive shell in it with docker run -ti flink:dev /bin/bash, I can start the Flink jobmanager via:
flink-console.sh jobmanager --configDir=/opt/flink/conf/ --executionMode=cluster
Flink picks up the jars and starts normally. However, when I launch it on Kubernetes with the following YAML, based on the one here:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-jobmanager
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flink
      component: jobmanager
  template:
    metadata:
      labels:
        app: flink
        component: jobmanager
    spec:
      containers:
      - name: jobmanager
        image: flink:dev
        imagePullPolicy: Always
        resources:
          requests:
            memory: "1024Mi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "1000m"
        args: ["jobmanager"]
        ports:
        - containerPort: 6123
          name: rpc
        - containerPort: 6124
          name: blob
        - containerPort: 6125
          name: query
        - containerPort: 8081
          name: ui
        - containerPort: 46110
          name: ha
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /opt/flink/etc-hadoop/key.json
        - name: JOB_MANAGER_RPC_ADDRESS
          value: flink-jobmanager
Flink seems unable to register the filesystem:
2018-10-04 09:20:51,357 DEBUG org.apache.flink.runtime.util.HadoopUtils - Cannot find hdfs-default configuration-file path in Flink config.
2018-10-04 09:20:51,358 DEBUG org.apache.flink.runtime.util.HadoopUtils - Cannot find hdfs-site configuration-file path in Flink config.
2018-10-04 09:20:51,359 DEBUG org.apache.flink.runtime.util.HadoopUtils - Adding /opt/flink/etc-hadoop//core-site.xml to hadoop configuration
2018-10-04 09:20:51,767 DEBUG org.apache.hadoop.security.UserGroupInformation - PrivilegedActionException as:flink (auth:SIMPLE) cause:java.io.IOException: Could not create FileSystem for highly available storage (high-availability.storageDir)
2018-10-04 09:20:51,767 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Cluster initialization failed.
java.io.IOException: Could not create FileSystem for highly available storage (high-availability.storageDir)
    at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:122)
    at org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:95)
    at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:115)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:402)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:270)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:225)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:189)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:188)
    at org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:91)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'gs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:405)
    at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
    at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
    at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:119)
    ... 12 more
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop File System abstraction does not support scheme 'gs'. Either no file system implementation exists for that scheme, or the relevant classes are missing from the classpath.
    at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:102)
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:401)
    ... 15 more
Caused by: java.io.IOException: No FileSystem for scheme: gs
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2799)
    at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:99)
    ... 16 more
Since Kubernetes should be using the same image, I'm confused about how this is possible. Am I overlooking something here?
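One thing that can be checked outside the cluster is whether the connector jar itself advertises a FileSystem implementation. Besides fs.<scheme>.impl settings, Hadoop discovers FileSystem classes through the standard Java ServiceLoader descriptor META-INF/services/org.apache.hadoop.fs.FileSystem, so a sketch like the following lists what a given jar declares. The function name is hypothetical, and the script only inspects the jar file; it says nothing about the classpath the running process actually sees.

```python
import sys
import zipfile

# ServiceLoader descriptor that Hadoop reads to discover FileSystem classes.
SERVICE_ENTRY = "META-INF/services/org.apache.hadoop.fs.FileSystem"


def declared_filesystems(jar_path):
    """Return the FileSystem class names a jar declares via ServiceLoader."""
    with zipfile.ZipFile(jar_path) as jar:
        if SERVICE_ENTRY not in jar.namelist():
            return []
        text = jar.read(SERVICE_ENTRY).decode("utf-8")
    classes = []
    for line in text.splitlines():
        # Descriptor files allow '#' comments and blank lines.
        line = line.split("#", 1)[0].strip()
        if line:
            classes.append(line)
    return classes


if __name__ == "__main__":
    # Usage (hypothetical script name): python list_fs.py gcs-connector-latest-hadoop2.jar
    for path in sys.argv[1:]:
        print(path, declared_filesystems(path))
```

An empty list would mean the jar relies entirely on the fs.gs.impl entry in core-site.xml being picked up.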