在安装过程中,我在主机 ip 192.168.240.14 的 Ubuntu 16.04 单节点上遇到此错误:
TASK [network : Ensuring that the calico.yaml file exist] **********************
changed: [localhost]
TASK [network : include] *******************************************************
TASK [network : include] *******************************************************
TASK [network : include] *******************************************************
included: /installer/playbook/roles/network/tasks/calico.yaml for localhost
TASK [network : Enabling calico] ***********************************************
changed: [localhost]
TASK [network : Waiting for configuring calico service] ************************
ok: [localhost -> 192.168.240.14] => (item=192.168.240.14)
TASK [network : Waiting for configuring calico node to node mesh] **************
FAILED - RETRYING: Waiting for configuring calico node to node mesh (100 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (99 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (98 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (97 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (96 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (95 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (94 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (93 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (92 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (91 retries left).
我读到可以禁用 calico 的节点到节点网格功能,但由于 calico 是通过 ICP 安装的,calicoctl
因此无法识别该命令。在 config.yaml 中,我也找不到可以禁用此设置的选项。
到目前为止,我已尝试通过单独下载和执行 calicoctl 来禁用它,但无法建立与集群的连接:
user@user:~/Desktop/calicoctl$ ./calicoctl config set nodeToNodeMesh off
Error executing command: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
我不确定是不是因为它尝试拨打环回 IP 地址而不是 192.168.240.14 或其他。而且我也不知道这是否真的可以解决安装过程中的问题。
我对此不是很有经验,并感谢您的帮助!
编辑:
我再次使用 ICP 2.1.0.1 运行安装并遇到相同的错误,但重试了 10 次并收到以下错误消息:
TASK [network : Enabling calico] ***********************************************
changed: [localhost]
TASK [network : Waiting for configuring calico service] ************************
ok: [localhost -> 192.168.240.14] => (item=192.168.240.14)
FAILED - RETRYING: Waiting for configuring calico node to node mesh (10 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (9 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (8 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (7 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (6 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (5 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (4 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (3 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (2 retries left).
FAILED - RETRYING: Waiting for configuring calico node to node mesh (1 retries left).
TASK [network : Waiting for configuring calico node to node mesh] **************
fatal: [localhost]: FAILED! => {"attempts": 10, "changed": true, "cmd": "kubectl get pods --show-all --namespace=kube-system |grep configure-calico-mesh", "delta": "0:00:01.343071", "end": "2018-06-20 08:12:28.433186", "failed": true, "rc": 0, "start": "2018-06-20 08:12:27.090115", "stderr": "", "stderr_lines": [], "stdout": "configure-calico-mesh-9f756 0/1 Pending 0 5m", "stdout_lines": ["configure-calico-mesh-9f756 0/1 Pending 0 5m"]}
PLAY RECAP *********************************************************************
192.168.240.14 : ok=168 changed=54 unreachable=0 failed=0
localhost : ok=81 changed=16 unreachable=0 failed=1
Playbook run took 0 days, 0 hours, 19 minutes, 8 seconds
user@user:/opt/ibm-cloud-private-ce-2.1.0.1/cluster$
我不明白为什么突然将 localhost 包含在设置步骤中,因为我只在 hosts 文件中指定了我的 IP 地址:
[master]
192.168.240.14 ansible_user="user" ansible_ssh_pass="6CEd29CN" ansible_become=true ansible_become_pass="6CEd29CN" ansible_port="22" ansible_ssh_common_args="-oPubkeyAuthentication=no"
[worker]
192.168.240.14 ansible_user="user" ansible_ssh_pass="6CEd29CN" ansible_become=true ansible_become_pass="6CEd29CN" ansible_port="22" ansible_ssh_common_args="-oPubkeyAuthentication=no"
[proxy]
192.168.240.14 ansible_user="user" ansible_ssh_pass="6CEd29CN" ansible_become=true ansible_become_pass="6CEd29CN" ansible_port="22" ansible_ssh_common_args="-oPubkeyAuthentication=no"
#[management]
#4.4.4.4
#[va]
#5.5.5.5
我的 config.yaml 文件如下所示:
# Licensed Materials - Property of IBM
# IBM Cloud private
# @ Copyright IBM Corp. 2017 All Rights Reserved
# US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
---
###### docker0: 172.17.0.1
###### eth0: 192.168.240.14
## Network Settings
#network_type: calico
# network_helm_chart_path: < helm chart path >
## Network in IPv4 CIDR format
network_cidr: 10.1.0.0/16
## Kubernetes Settings
service_cluster_ip_range: 10.0.0.1/24
## Makes the Kubelet start if swap is enabled on the node. Remove
## this if your production env want to disble swap.
kubelet_extra_args: ["--fail-swap-on=false"]
# cluster_domain: cluster.local
# cluster_name: mycluster
cluster_CA_domain: "mydomain.icp"
# cluster_zone: "myzone"
# cluster_region: "myregion"
## Etcd Settings
#etcd_extra_args: ["--grpc-keepalive-timeout=0", "--grpc-keepalive-interval=0", #"--snapshot-count=10000"]
## General Settings
# wait_for_timeout: 600
# docker_api_timeout: 100
## Advanced Settings
default_admin_user: user
default_admin_password: 6CEd29CN
# ansible_user: <username>
# ansible_become: true
# ansible_become_password: <password>
## Kubernetes Settings
# kube_apiserver_extra_args: []
# kube_controller_manager_extra_args: []
# kube_proxy_extra_args: []
# kube_scheduler_extra_args: []
## Enable Kubernetes Audit Log
# auditlog_enabled: false
## GlusterFS Settings
# glusterfs: false
## GlusterFS Storage Settings
# storage:
# - kind: glusterfs
# nodes:
# - ip: <worker_node_m_IP_address>
# device: <link path>/<symlink of device aaa>,<link path>/<symlink of device bbb>
# - ip: <worker_node_n_IP_address>
# device: <link path>/<symlink of device ccc>
# - ip: <worker_node_o_IP_address>
# device: <link path>/<symlink of device ddd>
# storage_class:
# name:
# default: false
# volumetype: replicate:3
## Network Settings
## Calico Network Settings
### calico_ipip_enabled: true
calico_ipip_enabled: false
calico_tunnel_mtu: 1430
calico_ip_autodetection_method: interface=eth0
## IPSec mesh Settings
## If user wants to configure IPSec mesh, the following parameters
## should be configured through config.yaml
ipsec_mesh:
enable: false
# interface: <interface name on which IPsec will be enabled>
# subnets: []
# exclude_ips: "<list of IP addresses separated by a comma>"
kube_apiserver_insecure_port: 8080
kube_apiserver_secure_port: 8001
## External loadbalancer IP or domain
## Or floating IP in OpenStack environment
# cluster_lb_address: none
## External loadbalancer IP or domain
## Or floating IP in OpenStack environment
# proxy_lb_address: none
## Install in firewall enabled mode
firewall_enabled: false
## Allow loopback dns server in cluster nodes
loopback_dns: true
## High Availability Settings
# vip_manager: etcd
## High Availability Settings for master nodes
# vip_iface: eth0
# cluster_vip: 127.0.1.1
## High Availability Settings for Proxy nodes
# proxy_vip_iface: eth0
# proxy_vip: 127.0.1.1
## Federation cluster Settings
# federation_enabled: false
# federation_cluster: federation-cluster
# federation_domain: cluster.federation
# federation_apiserver_extra_args: []
# federation_controllermanager_extra_args: []
# federation_external_policy_engine_enabled: false
## vSphere cloud provider Settings
## If user wants to configure vSphere as cloud provider, vsphere_conf
## parameters should be configured through config.yaml
# kubelet_nodename: hostname
# cloud_provider: vsphere
# vsphere_conf:
# user: <vCenter username for vSphere cloud provider>
# password: <password for vCenter user>
# server: <vCenter server IP or FQDN>
# port: [vCenter Server Port; default: 443]
# insecure_flag: [set to 1 if vCenter uses a self-signed certificate]
# datacenter: <datacenter name on which Node VMs are deployed>
# datastore: <default datastore to be used for provisioning volumes>
# working_dir: <vCenter VM folder path in which node VMs are located>
## Disabled Management Services Settings
## You can disable the following management services: ["service-catalog", "metering", "monitoring", "istio", "vulnerability-advisor", "custom-metrics-adapter"]
#disabled_management_services: ["istio", "vulnerability-advisor", "custom-metrics-adapter"]
disabled_management_services: ["service-catalog", "metering", "monitoring", "istio", "vulnerability-advisor", "custom-metrics-adapter"]
## Docker Settings
# docker_env: []
# docker_extra_args: []
## The maximum size of the log before it is rolled
# docker_log_max_size: 50m
## The maximum number of log files that can be present
# docker_log_max_file: 10
## Install/upgrade docker version
# docker_version: 17.12.1
## ICP install docker automatically
# install_docker: true
## Ingress Controller Settings
## You can add your ingress controller configuration, and the allowed configuration can refer to
## https://github.com/kubernetes/ingress-nginx/blob/nginx-0.9.0/docs/user-guide/configmap.md#configuration-options
# ingress_controller:
# disable-access-log: 'true'
## Clean metrics indices in Elasticsearch older than this number of days
# metrics_max_age: 1
## Clean application log indices in Elasticsearch older than this number of days
# logs_maxage: 1
## Uncomment the line below to install Kibana as a managed service.
kibana_install: true
# STARTING_CLOUDANT
# cloudant:
# namespace: kube-system
# pullPolicy: IfNotPresent
# pvPath: /opt/ibm/cfc/cloudant
# database:
# password: fdrreedfddfreeedffde
# federatorCommand: hostname
# federationIdentifier: "-0"
# readinessProbePeriodSeconds: 2
# readinessProbeInitialDelaySeconds: 90
# END_CLOUDANT