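Some of our workflow jobs are failing with connection timeout errors. We use the Argo Workflow manager to run the jobs. We have observed that Argo loses its connection to its workflows, and our interdependent workflow patterns fail with the errors below. When I checked the Kubernetes API server logs, I saw these errors. Is there any way to increase a timeout setting on the Kubernetes API server for non-running jobs?
Errors: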
status.go:71] apiserver received an error that is not an metav1.Status: &net.OpError{Op:"write", Net:"tcp", Source:(*net.TCPAddr)(0xc095d6de90), Addr:(*net.TCPAddr)(0xc095d6dec0), Err:(*os.SyscallError)(0xc0287c4d00)}
E0817 17:14:46.575773 1 status.go:71] apiserver received an error that is not an metav1.Status: &net.OpError{Op:"write", Net:"tcp", Source:(*net.TCPAddr)(0xc07d555530), Addr:(*net.TCPAddr)(0xc07d555560), Err:(*os.SyscallError)(0xc0be7c70e0)}
E0817 17:14:46.576855 1 status.go:71] apiserver received an error that is not an metav1.Status: &net.OpError{Op:"write", Net:"tcp", Source:(*net.TCPAddr)(0xc07c5a03f0), Addr:(*net.TCPAddr)(0xc07c5a0420), Err:(*os.SyscallError)(0xc07c7454e0)}
E0817 17:15:28.402137 1 status.go:71] apiserver received an error that is not an metav1.Status: &net.OpError{Op:"write", Net:"tcp", Source:(*net.TCPAddr)(0xc09013ea20), Addr:(*net.TCPAddr)(0xc09013ea50), Err:(*os.SyscallError)(0xc0cd00f520)}
E0817 17:23:48.779008 1 runtime.go:78] Observed a panic: &errors.errorString{s:"killing connection/stream because serving request timed out and response had been started"} (killing connection/stream because serving request timed out and response had been started)
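
Log lines like the ones above can be tallied per minute to check whether the TCP write errors cluster around the request-timeout panic. This is only a minimal sketch in Python, assuming klog-style line headers (severity letter, MMDD, time, PID, file:line). The function names `classify` and `tally` and the labels they return are hypothetical and not part of Kubernetes or Argo.

```python
import re
from collections import Counter

# klog-style header: severity letter, MMDD, HH:MM:SS.micros, PID, file:line]
# Only the HH:MM part is captured, so counts are grouped per minute.
HEADER = re.compile(r"^[IWEF]\d{4} (\d{2}:\d{2}):\d{2}\.\d+\s+\d+ \S+:\d+\]")


def classify(line):
    """Return a coarse, made-up label for one apiserver log line."""
    if "net.OpError" in line and 'Op:"write"' in line:
        return "tcp-write-error"
    if "serving request timed out" in line:
        return "request-timeout"
    return "other"


def tally(lines):
    """Count (minute, label) pairs; lines without a klog header get minute '?'."""
    counts = Counter()
    for line in lines:
        match = HEADER.match(line)
        minute = match.group(1) if match else "?"
        counts[(minute, classify(line))] += 1
    return counts
```

Calling `tally(open("apiserver.log"))` (the file name is an assumption) returns a `Counter` keyed by minute and label, which makes it easier to see whether the write errors precede the timeout panic.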
Client version: v1.17.2
Server version: v1.17.2
Host OS: CentOS 7.7
CNI: Weave
Thanks, CS