$ cat /proc/version
Linux version 5.0.1-1.el7.elrepo.x86_64 (mockbuild@Build64R7) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)) #1 SMP Sun Mar 10 10:09:55 EDT 2019
Dec 05 16:13:03 k8s-master kubelet[16088]: E1205 16:13:03.305530 16088 kubelet.go:2187] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Dec 05 16:13:07 k8s-master kubelet[16088]: W1205 16:13:07.190183 16088 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
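The "cni config uninitialized" warning normally clears once a CNI add-on is installed. A minimal sketch of installing flannel, assuming the upstream kube-flannel.yml manifest URL (use your own mirror if GitHub is unreachable):

```shell
# Apply the flannel manifest; it creates the RBAC objects, ConfigMap,
# and per-architecture DaemonSets shown in the output below
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
```

Note that flannel's default pod CIDR is 10.244.0.0/16, so kubeadm init must have been run with a matching --pod-network-cidr for pods to get routable addresses.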
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
W1203 10:29:38.745701 402 reset.go:96] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: Get https://10.58.12.180:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config: dial tcp 10.58.12.180:6443: connect: connection refused
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W1203 10:29:40.341747 402 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not reset or clean up iptables rules or IPVS tables. If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar) to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually. Please, check the contents of the $HOME/.kube/config file.
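The three notes above from kubeadm reset can be carried out with a few commands; a sketch, assuming a default iptables/IPVS setup and the standard $HOME/.kube/config location (run as root, and only on a node you intend to wipe):

```shell
# Flush iptables rules and delete non-default chains in the tables
# kube-proxy touches (filter, nat, mangle)
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

# Clear IPVS tables, if the cluster ran kube-proxy in IPVS mode
ipvsadm --clear

# Remove the stale kubeconfig left over from the previous cluster
rm -f "$HOME/.kube/config"
```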
[init] Using Kubernetes version: v1.16.3
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 19.03.5. Latest validated version: 18.09
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.58.12.180]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master localhost] and IPs [10.58.12.180 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master localhost] and IPs [10.58.12.180 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime. To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
Running systemctl status kubelet showed that the kubelet had failed to start because its configuration file /var/lib/kubelet/config.yaml was missing. That file is generated during the init phase, so we set this aside for now.
We then ran journalctl -xeu kubelet -f to inspect the detailed errors; most of the messages were node "xxx" not found:
$ journalctl -xeu kubelet -f
-- Logs begin at Mon 2019-12-02 22:02:45 CST. --
Dec 03 11:35:37 k8s-master kubelet[19995]: E1203 11:35:37.923670 19995 kuberuntime_manager.go:783] container start failed: RunContainerError: failed to start container "af7dfdc4b24593488a3d8cfff572ed4144b7b944bad8c6675e33e8c67f528439": Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "exec: \"kube-apiserver\": executable file not found in $PATH": unknown
Dec 03 11:35:37 k8s-master kubelet[19995]: E1203 11:35:37.923712 19995 pod_workers.go:191] Error syncing pod 8ff291bc69bce31e35b99927ce184468 ("kube-apiserver-k8s-master_kube-system(8ff291bc69bce31e35b99927ce184468)"), skipping: failed to "StartContainer" for "kube-apiserver" with RunContainerError: "failed to start container \"af7dfdc4b24593488a3d8cfff572ed4144b7b944bad8c6675e33e8c67f528439\": Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused \"exec: \\\"kube-apiserver\\\": executable file not found in $PATH\": unknown"
Dec 03 11:35:37 k8s-master kubelet[19995]: E1203 11:35:37.956583 19995 kubelet.go:2267] node "k8s-master" not found
Dec 03 11:35:38 k8s-master kubelet[19995]: E1203 11:35:38.031216 19995 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1beta1.CSIDriver: Get https://10.58.12.180:6443/apis/storage.k8s.io/v1beta1/csidrivers?limit=500&resourceVersion=0: dial tcp 10.58.12.180:6443: connect: connection refused
Dec 03 11:35:38 k8s-master kubelet[19995]: E1203 11:35:38.056859 19995 kubelet.go:2267] node "k8s-master" not found
Dec 03 11:35:38 k8s-master kubelet[19995]: E1203 11:35:38.157079 19995 kubelet.go:2267] node "k8s-master" not found
Dec 03 11:35:38 k8s-master kubelet[19995]: E1203 11:35:38.231060 19995 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1beta1.RuntimeClass: Get https://10.58.12.180:6443/apis/node.k8s.io/v1beta1/runtimeclasses?limit=500&resourceVersion=0: dial tcp 10.58.12.180:6443: connect: connection refused
Dec 03 11:35:38 k8s-master kubelet[19995]: E1203 11:35:38.257322 19995 kubelet.go:2267] node "k8s-master" not found
We tried many of the fixes suggested online, none of which helped. In the end the problem turned out to be a mistake in the Dockerfile we used to build our own images.
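For reference, the "executable file not found in $PATH" error above means the container's entrypoint could not locate the kube-apiserver binary inside the image. A hypothetical sketch of what a correct Dockerfile must guarantee (the base image and paths here are assumptions for illustration, not our actual file):

```dockerfile
# Hypothetical reconstruction -- not the actual Dockerfile from this post.
# The key requirement: the kube-apiserver binary must end up on $PATH
# (or be referenced by an explicit absolute path in the pod spec).
FROM centos:7
COPY kube-apiserver /usr/local/bin/kube-apiserver
RUN chmod +x /usr/local/bin/kube-apiserver
```

A typo in the COPY destination (or copying the binary to a directory not on $PATH) would reproduce exactly the RunContainerError shown in the journal.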
After installing the network add-on, journalctl -xeu kubelet -f showed the following errors:
Dec 05 16:15:40 k8s-master kubelet[16088]: E1205 16:15:40.504199 16088 cni.go:358] Error adding kube-system_coredns-5644d7b6d9-kwtkz/770279cf6076b08149cf73b176eb8af441a6ebb90e03b2ea5912e884d5343604 to network flannel/cbr0: failed to set bridge addr: could not add IP address to "cni0": file exists
Dec 05 16:15:40 k8s-master kubelet[16088]: E1205 16:15:40.681543 16088 remote_runtime.go:105] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to set up sandbox container "770279cf6076b08149cf73b176eb8af441a6ebb90e03b2ea5912e884d5343604" network for pod "coredns-5644d7b6d9-kwtkz": networkPlugin cni failed to set up pod "coredns-5644d7b6d9-kwtkz_kube-system" network: failed to set bridge addr: could not add IP address to "cni0": file exists
Dec 05 16:15:40 k8s-master kubelet[16088]: E1205 16:15:40.681632 16088 kuberuntime_sandbox.go:68] CreatePodSandbox for pod "coredns-5644d7b6d9-kwtkz_kube-system(a78220d8-35ed-4b87-8388-11fc2bdb29c0)" failed: rpc error: code = Unknown desc = failed to set up sandbox container "770279cf6076b08149cf73b176eb8af441a6ebb90e03b2ea5912e884d5343604" network for pod "coredns-5644d7b6d9-kwtkz": networkPlugin cni failed to set up pod "coredns-5644d7b6d9-kwtkz_kube-system" network: failed to set bridge addr: could not add IP address to "cni0": file exists
Dec 05 16:15:40 k8s-master kubelet[16088]: E1205 16:15:40.681651 16088 kuberuntime_manager.go:710] createPodSandbox for pod "coredns-5644d7b6d9-kwtkz_kube-system(a78220d8-35ed-4b87-8388-11fc2bdb29c0)" failed: rpc error: code = Unknown desc = failed to set up sandbox container "770279cf6076b08149cf73b176eb8af441a6ebb90e03b2ea5912e884d5343604" network for pod "coredns-5644d7b6d9-kwtkz": networkPlugin cni failed to set up pod "coredns-5644d7b6d9-kwtkz_kube-system" network: failed to set bridge addr: could not add IP address to "cni0": file exists
Dec 05 16:15:40 k8s-master kubelet[16088]: E1205 16:15:40.681836 16088 pod_workers.go:191] Error syncing pod a78220d8-35ed-4b87-8388-11fc2bdb29c0 ("coredns-5644d7b6d9-kwtkz_kube-system(a78220d8-35ed-4b87-8388-11fc2bdb29c0)"), skipping: failed to "CreatePodSandbox" for "coredns-5644d7b6d9-kwtkz_kube-system(a78220d8-35ed-4b87-8388-11fc2bdb29c0)" with CreatePodSandboxError: "CreatePodSandbox for pod \"coredns-5644d7b6d9-kwtkz_kube-system(a78220d8-35ed-4b87-8388-11fc2bdb29c0)\" failed: rpc error: code = Unknown desc = failed to set up sandbox container \"770279cf6076b08149cf73b176eb8af441a6ebb90e03b2ea5912e884d5343604\" network for pod \"coredns-5644d7b6d9-kwtkz\": networkPlugin cni failed to set up pod \"coredns-5644d7b6d9-kwtkz_kube-system\" network: failed to set bridge addr: could not add IP address to \"cni0\": file exists"
Dec 05 16:15:40 k8s-master kubelet[16088]: W1205 16:15:40.894386 16088 docker_sandbox.go:394] failed to read pod IP from plugin/docker: networkPlugin cni failed on the status hook for pod "coredns-5644d7b6d9-kwtkz_kube-system": CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "770279cf6076b08149cf73b176eb8af441a6ebb90e03b2ea5912e884d5343604"
Dec 05 16:15:40 k8s-master kubelet[16088]: W1205 16:15:40.895390 16088 pod_container_deletor.go:75] Container "770279cf6076b08149cf73b176eb8af441a6ebb90e03b2ea5912e884d5343604" not found in pod's containers
Dec 05 16:15:40 k8s-master kubelet[16088]: W1205 16:15:40.897212 16088 cni.go:328] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "770279cf6076b08149cf73b176eb8af441a6ebb90e03b2ea5912e884d5343604"
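The "could not add IP address to \"cni0\": file exists" error usually means a stale cni0 bridge survived an earlier cluster and still carries an address from a different pod CIDR. A common remedy, sketched here as a suggestion rather than an official fix, is to delete the stale bridge and let the CNI plugin recreate it (run as root on the affected node):

```shell
# Inspect the current address on the bridge; a mismatch with flannel's
# pod CIDR (10.244.x.x by default) confirms the stale-bridge diagnosis
ip addr show cni0

# Take the bridge down and delete it; the CNI plugin recreates it with
# the correct address the next time a pod sandbox is set up
ip link set cni0 down
ip link delete cni0
```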