I'm trying to start my OpenShift Origin cluster with the following command:
oc cluster up --public-hostname='openshift.xxx.xxx' --http-proxy='http://xx.xx.xx.xx:8080' --https-proxy='http://xx.xx.xx.xx:8080' --no-proxy='0.0.0.0,172.30.1.1,172.17.0.7' --host-data-dir /var/lib/origin/openshift.local.data
(I'm running it behind a corporate proxy, so I have to set the proxy and add some exceptions for Docker.)
This is the output I get:
Using nsenter mounter for OpenShift volumes
Using 127.0.0.1 as the server IP
Starting OpenShift using openshift/origin:v3.9.0 ...
-- Starting OpenShift container ...
Creating initial OpenShift configuration
Starting OpenShift using container 'origin'
Waiting for API server to start listening
OpenShift server started
-- Adding default OAuthClient redirect URIs ... OK
-- Installing registry ... OK
-- Installing router ... OK
-- Importing image streams ... OK
-- Importing templates ... OK
-- Importing internal templates ... OK
-- Installing web console ... FAIL
Error: failed to start the web console server: timed out waiting for the condition
I have only one webconsole pod:
oc get pods -n openshift-web-console
NAME READY STATUS RESTARTS AGE
webconsole-7dfbffd44d-v5lxw 0/1 CrashLoopBackOff 8 25m
This is the information I get for the pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 26m default-scheduler Successfully assigned webconsole-7dfbffd44d-v5lxw to localhost
Normal SuccessfulMountVolume 26m kubelet, localhost MountVolume.SetUp succeeded for volume "webconsole-config"
Normal SuccessfulMountVolume 26m kubelet, localhost MountVolume.SetUp succeeded for volume "serving-cert"
Normal SuccessfulMountVolume 26m kubelet, localhost MountVolume.SetUp succeeded for volume "webconsole-token-qljxd"
Normal Pulled 25m (x4 over 26m) kubelet, localhost Container image "openshift/origin-web-console:v3.9.0" already present on machine
Normal Created 25m (x4 over 26m) kubelet, localhost Created container
Normal Started 25m (x4 over 26m) kubelet, localhost Started container
Warning BackOff 24m (x10 over 26m) kubelet, localhost Back-off restarting failed container
Normal SuccessfulMountVolume 19m kubelet, localhost MountVolume.SetUp succeeded for volume "webconsole-config"
Normal SuccessfulMountVolume 19m kubelet, localhost MountVolume.SetUp succeeded for volume "webconsole-token-qljxd"
Normal SuccessfulMountVolume 19m kubelet, localhost MountVolume.SetUp succeeded for volume "serving-cert"
Normal Pulled 18m (x4 over 19m) kubelet, localhost Container image "openshift/origin-web-console:v3.9.0" already present on machine
Normal Created 18m (x4 over 19m) kubelet, localhost Created container
Normal Started 18m (x4 over 19m) kubelet, localhost Started container
Warning BackOff 4m (x74 over 19m) kubelet, localhost Back-off restarting failed container
I can't see any errors in the events, but the pod always fails.
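Since the events only report the back-off, the crashed container's own output is probably the next place to look. A minimal sketch, assuming the pod name from the listing above and that the `oc` client is on the PATH; `show_crash_logs` is a hypothetical helper name, not part of OpenShift:

```shell
#!/bin/sh
# Hypothetical helper: print the logs of the previous (terminated) container
# instance of a pod. The namespace defaults to openshift-web-console.
show_crash_logs() {
    pod="$1"
    ns="${2:-openshift-web-console}"
    # --previous selects the last terminated container instead of the current one.
    oc logs "$pod" -n "$ns" --previous
}

# Only query the cluster when the oc client is actually installed.
if command -v oc >/dev/null 2>&1; then
    show_crash_logs webconsole-7dfbffd44d-v5lxw || true
fi
```

The pod name changes whenever the deployment recreates the pod, so it may need to be taken from a fresh `oc get pods -n openshift-web-console`.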
This is the OpenShift Origin version:
oc version
oc v3.9.0+191fece
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://127.0.0.1:8443
openshift v3.9.0+191fece
kubernetes v1.9.1+a0ce1bc657
Red Hat version:
cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.3 (Maipo)
For more context: my /var filesystem was at 95%, so I extended the LV and tried to restart the cluster, and that is when my problem started. The /var filesystem is now at 58%:
df -h /var
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_rhel-lv_var 30G 18G 13G 58% /var
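To re-check the /var usage before each restart attempt, the Use% column can be pulled out of the `df` output. This is only a sketch: `use_percent` is a hypothetical helper name, and it assumes the single-line POSIX output format that `df -P` produces.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P`-style output on stdin and print the
# Use% value of the first filesystem row as a bare number.
use_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# -P forces the POSIX format, so long device names do not wrap onto a second line.
usage=$(df -P /var | use_percent)
echo "/var is ${usage}% full"
```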
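I have been reading around the web but haven't found anything conclusive. I tried adding the webconsole service account to the privileged SCC, restarting docker.service, restarting the virtual machine, and so on, but the OpenShift Origin web console still fails. Any ideas?
Thanks in advance.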