This guide is for users who want to set up a complete K8s cluster.

1. Update the system

sudo apt update && sudo apt upgrade -y

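2. Disable swap

By default, the kubelet will not start while swap is enabled, so turn it off:

sudo swapoff -a # disable for the current boot
sudo sed -i '/swap/d' /etc/fstab # disable permanently: deletes every /etc/fstab line containing "swap"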
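Deleting every line that mentions "swap" can remove more than intended, for example a mount whose path happens to contain the word. A more conservative sketch comments out only the entries whose filesystem type is swap. The helper name comment_out_swap is illustrative, not part of any tool, and GNU sed is assumed:

```bash
#!/usr/bin/env bash
# Hypothetical helper: comment out swap entries in an fstab-style file
# instead of deleting them, so the change is easy to revert.
comment_out_swap() {
  local fstab="$1"
  # Skip lines that are already comments; prefix "# " to lines whose
  # third whitespace-separated field (the filesystem type) is "swap".
  sed -i -E '/^[[:space:]]*#/!s/^([^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/# \1/' "$fstab"
}

# On a real host, run from a root shell:
#   comment_out_swap /etc/fstab
```

Because the original line is kept behind a "#", re-enabling swap later only requires removing that prefix.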

3. Load the required kernel modules

sudo modprobe overlay
sudo modprobe br_netfilter
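modprobe only loads the modules for the current boot. On systemd-based distributions such as Ubuntu, modules listed in a file under /etc/modules-load.d/ are loaded at every boot. The sketch below generates that list and checks an lsmod listing for anything missing; the function names and the file name k8s.conf are illustrative choices:

```bash
#!/usr/bin/env bash
# Print the modules-load.d content: one module name per line.
k8s_modules_conf() {
  printf '%s\n' overlay br_netfilter
}

# Read an lsmod-style listing on stdin and print each requested
# module that does not appear in it.
missing_modules() {
  local loaded m
  # Skip the "Module Size Used by" header and keep the name column.
  loaded="$(awk 'NR > 1 { print $1 }')"
  for m in "$@"; do
    grep -qxF "$m" <<<"$loaded" || echo "$m"
  done
}

# On a real host:
#   k8s_modules_conf | sudo tee /etc/modules-load.d/k8s.conf
#   lsmod | missing_modules overlay br_netfilter
```

An empty result from missing_modules means both modules are loaded.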

4. Set sysctl parameters

Let iptables see bridged traffic and enable IP forwarding.

sudo tee /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF

# Apply the settings
sudo sysctl --system
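To confirm that the three settings actually took effect, the values can be read back from /proc/sys. This is a minimal sketch; check_k8s_sysctls is a hypothetical helper, and the optional root argument exists only so the check can be pointed at a different directory:

```bash
#!/usr/bin/env bash
# Check that each required key reads 1 under a /proc/sys-style root.
check_k8s_sysctls() {
  local root="${1:-/proc/sys}" key path rc=0
  for key in net.bridge.bridge-nf-call-iptables \
             net.bridge.bridge-nf-call-ip6tables \
             net.ipv4.ip_forward; do
    # sysctl keys map to paths by replacing dots with slashes.
    path="$root/${key//./\/}"
    if [ "$(cat "$path" 2>/dev/null)" = "1" ]; then
      echo "ok   $key"
    else
      echo "FAIL $key"
      rc=1
    fi
  done
  return "$rc"
}

# On a real host:
#   check_k8s_sysctls
```

If the two net.bridge keys fail, the usual cause is that br_netfilter is not loaded, since those entries only exist once the module is present.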

5. Install Docker

sudo apt install -y apt-transport-https ca-certificates curl gnupg
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io

Configure the containerd runtime

sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
# kubeadm defaults the kubelet to the systemd cgroup driver, so make containerd use it too
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd

Check the status

sudo systemctl status containerd

6. Install Kubernetes

Add the Aliyun Kubernetes package repository

sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list

# If curl fails, use wget instead
sudo wget -O /usr/share/keyrings/kubernetes-archive-keyring.gpg https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg

Install the Kubernetes components

Install kubelet, kubeadm, and kubectl:

sudo apt update
sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl # pin the versions so apt upgrade does not move them

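7. Initialize Kubernetes

# Confirm containerd is running. The kubelet restarts in a loop until kubeadm init gives it a configuration; that is expected.
systemctl status containerd
systemctl status kubelet

sudo kubeadm init \
--pod-network-cidr=10.244.0.0/16 \
--image-repository=registry.aliyuncs.com/google_containers --v=5 # Aliyun mirror for users in mainland China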

8. Install a Pod network

Use Flannel:

# You can also download kube-flannel.yml locally first
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

9. Verify the installation

kubectl get nodes
kubectl get namespace
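A freshly initialized node typically reports NotReady until the Pod network is up, so it can help to script the check rather than read the table by eye. The sketch below parses the STATUS column of kubectl get nodes; all_nodes_ready is an illustrative helper name:

```bash
#!/usr/bin/env bash
# Succeed only if at least one node exists and every node's STATUS
# starts with "Ready" (e.g. "Ready" or "Ready,SchedulingDisabled").
all_nodes_ready() {
  kubectl get nodes --no-headers | awk '
    { total++; if ($2 !~ /^Ready/) { print "not ready: " $1; bad++ } }
    END { exit (total == 0 || bad > 0) ? 1 : 0 }'
}

# Example:
#   all_nodes_ready && echo "cluster is up"
```

kubectl also offers kubectl wait --for=condition=Ready node --all --timeout=120s, which blocks until the nodes are ready instead of sampling once.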

Troubleshooting

1. Image pull failures

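  • Pull the image manually and retag it:
    # The kubelet uses containerd, so pull into containerd's k8s.io namespace; Kubernetes v1.28 looks for images under registry.k8s.io
    sudo ctr -n k8s.io images pull registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.0
    sudo ctr -n k8s.io images tag registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.0 registry.k8s.io/kube-apiserver:v1.28.0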

2. Running Pods on a single node

  • Allow the control-plane (master) node to schedule Pods by removing its NoSchedule taint (note the trailing "-"):
    kubectl taint nodes --all node-role.kubernetes.io/control-plane:NoSchedule-
    # Equivalent: remove the taint by key, whatever its effect
    # kubectl taint nodes --all node-role.kubernetes.io/control-plane-

3. Reset the cluster

sudo kubeadm reset
sudo rm -rf /etc/cni/net.d ~/.kube/config

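4. Note: Kubernetes v1.22.2 and later require the container runtime to use the systemd cgroup driver

Use docker info | grep -i cgroup to check Docker's cgroup driver. With containerd as the runtime, the relevant setting is SystemdCgroup in /etc/containerd/config.toml.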
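For containerd, the driver can be read from its configuration file. The following is a rough sketch that assumes the config layout produced by containerd config default, where the runc options contain a SystemdCgroup key; containerd_cgroup_driver is a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Report which cgroup driver a containerd config file selects for runc.
containerd_cgroup_driver() {
  local config="${1:-/etc/containerd/config.toml}"
  if grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$config"; then
    echo systemd
  else
    # SystemdCgroup = false (or absent) means runc manages cgroups directly.
    echo cgroupfs
  fi
}

# On a real host:
#   containerd_cgroup_driver
```

If this prints cgroupfs while the kubelet uses systemd, the mismatch is a common reason for control-plane containers restarting during kubeadm init.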

# Optionally pull the images ahead of time
kubeadm config images pull

If `kubeadm config images pull` fails, configure `kubeadm` to use a mirror registry in mainland China.

Rather than pulling images by hand, you can have `kubeadm` use `registry.aliyuncs.com` directly.

Generate the `kubeadm` configuration

kubeadm config print init-defaults > kubeadm-config.yaml

Modify imageRepository

Edit kubeadm-config.yaml, find imageRepository: registry.k8s.io, and change it to:

imageRepository: registry.aliyuncs.com/google_containers
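The same edit can be scripted instead of made by hand. This is a minimal sketch assuming GNU sed and a config file with a single imageRepository line; set_image_repository is an illustrative helper, and the repository value must not contain "|" or "&" because it is substituted into the sed expression:

```bash
#!/usr/bin/env bash
# Rewrite the imageRepository value in a kubeadm config file in place.
set_image_repository() {
  local config="$1" repo="$2"
  # Keep the key (and its indentation), replace everything after the colon.
  sed -i -E "s|^([[:space:]]*imageRepository:).*|\1 ${repo}|" "$config"
}

# Example:
#   set_image_repository kubeadm-config.yaml registry.aliyuncs.com/google_containers
```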

Pull the images using the modified configuration

kubeadm config images pull --config=kubeadm-config.yaml

5. crictl ps -a errors

dockershim was removed from Kubernetes starting with v1.24.

  • containerd listens on /run/containerd/containerd.sock, but crictl may still try dockershim.sock by default, so it cannot find the socket file.

Point crictl at the containerd endpoint manually

crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a

If this command works, crictl was simply defaulting to the wrong socket file.

Set containerd as crictl's default runtime endpoint so you do not have to specify it each time:

sudo crictl config runtime-endpoint unix:///run/containerd/containerd.sock
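crictl stores this default in /etc/crictl.yaml, so the file can also be written directly, which is convenient when provisioning several nodes. A small sketch, with write_crictl_config as an illustrative helper name; setting image-endpoint to the same socket is an addition beyond the command above:

```bash
#!/usr/bin/env bash
# Write a crictl config that points both endpoints at containerd.
write_crictl_config() {
  local target="${1:-/etc/crictl.yaml}"
  local sock="unix:///run/containerd/containerd.sock"
  cat > "$target" <<EOF
runtime-endpoint: ${sock}
image-endpoint: ${sock}
EOF
}

# On a real host, run from a root shell:
#   write_crictl_config
#   crictl ps -a
```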

systemctl status containerd shows "stop pulling image registry.k8s.io/pause:3.8"

stop pulling image registry.k8s.io/pause:3.8: active requests
trying next host error="failed to do request: Head \"https://<host>\""

Pull the image manually to test:

ctr image pull registry.k8s.io/pause:3.8
# Or pull from the Aliyun mirror and retag it
ctr -n k8s.io images pull registry.aliyuncs.com/google_containers/pause:3.8
ctr -n k8s.io images tag registry.aliyuncs.com/google_containers/pause:3.8 registry.k8s.io/pause:3.8
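When several images need the same treatment, the pull-and-retag pair can be wrapped in a function. This is a sketch under the assumption that the mirror uses the same image name and tag as upstream, which holds for pause above but not necessarily for every image (coredns, for instance, sits under a nested path upstream). mirror_image is a hypothetical helper:

```bash
#!/usr/bin/env bash
# Pull an image from the Aliyun mirror into containerd's k8s.io
# namespace and tag it with the upstream registry.k8s.io name.
mirror_image() {
  local image="$1"
  local mirror="registry.aliyuncs.com/google_containers"
  local upstream="registry.k8s.io"
  ctr -n k8s.io images pull "${mirror}/${image}" &&
    ctr -n k8s.io images tag "${mirror}/${image}" "${upstream}/${image}"
}

# On a real host, run from a root shell:
#   mirror_image pause:3.8
```

The tag step only runs if the pull succeeds, so a failed download does not leave a misleading tag behind.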

kubeadm init errors

# kubeadm init --config=kubeadm-config.yaml
kubeadm init --image-repository registry.aliyuncs.com/google_containers

It may fail with this error:

Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
timed out waiting for the condition
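The timeout message itself says little about the cause; the kubelet's journal usually does (a failed image pull, a cgroup driver mismatch, swap still enabled). The sketch below filters recent kubelet log lines for errors. recent_kubelet_errors is an illustrative helper, and the default of 200 lines is an arbitrary choice:

```bash
#!/usr/bin/env bash
# Show recent kubelet journal lines that mention an error or failure.
recent_kubelet_errors() {
  local lines="${1:-200}"
  journalctl -u kubelet --no-pager -n "$lines" | grep -iE 'error|failed'
}

# On a real host:
#   sudo bash -c "$(declare -f recent_kubelet_errors); recent_kubelet_errors 500"
# or simply, from a root shell:
#   recent_kubelet_errors 500
```

The function returns a non-zero status when nothing matches, so it can also be used in a condition.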

Reset the node, then rerun init with --v=5 for detailed logs

# Reset the state left by the failed init
sudo kubeadm reset -f
# Remove leftover Kubernetes configuration
sudo rm -rf /etc/kubernetes/*
rm -rf ~/.kube
sudo rm -rf /var/lib/etcd

sudo kubeadm init --config=kubeadm-config.yaml --v=5

# Or use the mirror registry
sudo kubeadm init \
--pod-network-cidr=10.244.0.0/16 \
--image-repository=registry.aliyuncs.com/google_containers --v=5 # Aliyun mirror for users in mainland China

Then run the commands that kubeadm init prints:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config