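Building a Kubernetes Cluster on PVE
I usually run Kubernetes on Alibaba Cloud ACK in production and on Docker/Minikube for development and testing. With PVE installed, building a local Kubernetes cluster becomes convenient, so this post records the process. The PVE host sits behind an upstream router that already takes care of network access, so network access problems are not covered here.
Versions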
- Proxmox Virtual Environment 7.4-3
- Ubuntu Server 22.04 LTS
- Kubernetes 1.26.3
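Build a VM template with Cloud-Init
Cloud-Init simplifies VM provisioning and avoids a manual installation. Here it is used to build an Ubuntu Server 22.04 template:
- Ubuntu 22.04 LTS download page: http://cloud-images.ubuntu.com/releases/22.04/release/
- Direct download link: http://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img
- Downloaded through PVE's ISO image management to /var/lib/vz/template/iso/ubuntu-22.04-server-cloudimg-amd64.img
Once the download finishes, create a VM:
- Do not add a hard disk during creation, because the image will be imported as the disk later
- After creation, add a Cloud-Init drive
- Select VirtIO as the network card model
After the VM is created, open the PVE Shell and import the image as a disk with the command below:
# 801 is the VM ID; the template deliberately gets a high ID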
qm importdisk 801 /var/lib/vz/template/iso/ubuntu-22.04-server-cloudimg-amd64.img local-lvm --format=qcow2
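- After the import, a new unused disk appears in the VM's hardware view; double-click it to enable it and attach it as SCSI, for example scsi0
- Open the boot order settings and move scsi0 to the first position
- In the Cloud-Init settings, configure the user name and login method (an SSH key is recommended) and enable DHCP, since a template does not need a fixed IP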
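The GUI steps above also have command-line counterparts in `qm`. The following is a minimal sketch rather than the procedure used in this post: the function name `template_cmds`, the VM name, the memory and core counts, the bridge `vmbr0`, and the SSH key path are all assumptions. It only prints the commands so they can be reviewed before being run on the PVE host.

```shell
# Print the qm commands that mirror the GUI steps above.
# Review the output, then pipe it to `sh` on the PVE host to execute it.
# Arguments: VM ID, path of the cloud image, target storage.
template_cmds() {
  vmid=$1 img=$2 storage=$3
  cat <<EOF
qm create ${vmid} --name ubuntu-2204-template --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
qm importdisk ${vmid} ${img} ${storage}
qm set ${vmid} --scsihw virtio-scsi-pci --scsi0 ${storage}:vm-${vmid}-disk-0
qm set ${vmid} --ide2 ${storage}:cloudinit
qm set ${vmid} --boot order=scsi0
qm set ${vmid} --ciuser ubuntu --sshkeys /root/.ssh/id_rsa.pub --ipconfig0 ip=dhcp
EOF
}

template_cmds 801 /var/lib/vz/template/iso/ubuntu-22.04-server-cloudimg-amd64.img local-lvm
```

Converting the VM to a template (`qm template 801`) is left out on purpose, because the guest agent is installed inside the VM first.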
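With these settings in place, start the VM, wait for it to boot, and log in over SSH. After confirming that the system works, shut it down and resize the disk in the hardware settings. Note that the increment is given in GiB and that the web UI can only grow a disk, not shrink it. If you accidentally add too much, you can shrink the disk with the commands below.
# List the logical volumes and locate the disk by VM ID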
lvdisplay
# Shrink the disk
lvreduce -L -7988G /dev/pve/vm-801-disk-0
# The PVE web UI does not pick up the new size after this; in my testing, adding another 1G in the UI triggers a refresh
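The value passed to `lvreduce` is simply the current size minus the intended size. The sketch below computes it and prints the commands; the helper name `lv_shrink_cmds` and the example sizes (8020 and 32) are hypothetical. It also prints `qm rescan`, which rescans storage and updates the disk sizes recorded in the VM configuration, so it may make the extra 1G step unnecessary. Shrinking is only safe while the guest filesystem has not yet been grown into the extra space.

```shell
# Print the commands that shrink an over-grown disk back to a target size.
# Arguments: VM ID, current size in GiB, desired size in GiB.
lv_shrink_cmds() {
  vmid=$1 current=$2 target=$3
  if [ "$target" -ge "$current" ]; then
    echo "target must be smaller than the current size" >&2
    return 1
  fi
  # lvreduce takes a relative size when the value starts with a minus sign
  echo "lvreduce -L -$((current - target))G /dev/pve/vm-${vmid}-disk-0"
  # Refresh the size stored in the VM configuration
  echo "qm rescan --vmid ${vmid}"
}

lv_shrink_cmds 801 8020 32
```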
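After resizing, log in to the test VM and install qemu-guest-agent with the commands below. Once it works, shut the VM down and enable QEMU Guest Agent in the VM options. Then convert the VM to a template so that later VMs can be created from it.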
sudo apt-get install qemu-guest-agent
sudo systemctl start qemu-guest-agent
References for this section
- https://foxi.buduanwang.vip/virtualization/pve/388.html/
- https://codingpackets.com/blog/proxmox-import-and-use-cloud-images/
- https://mayanpeng.cn/archives/158.html
- https://www.youtube.com/watch?v=MJgIm03Jxdo&t=1011s
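Create and configure the VMs
Make a full clone of the template for each new VM and apply the settings in the table below before the first boot. Static IPs are set in the Cloud-Init options, and memory ballooning is enabled to save memory.
ID | Name | IP | Spec |
---|---|---|---|
150 | k8s-c0 | 192.168.50.150 | 2 cores, 2-4 GB RAM (ballooning) |
160 | k8s-n0 | 192.168.50.160 | 2 cores, 2-4 GB RAM (ballooning) |
161 | k8s-n1 | 192.168.50.161 | 2 cores, 2-4 GB RAM (ballooning) |
162 | k8s-n2 | 192.168.50.162 | 2 cores, 2-4 GB RAM (ballooning) |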
If there are many worker nodes, consider configuring one worker first, converting it to a template, and creating the remaining workers from that template.
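Cloning and configuring the four VMs can also be scripted with `qm clone` and `qm set`. This is a sketch under stated assumptions: the function name `clone_cmds` is hypothetical, and the /24 prefix and the gateway 192.168.50.1 are guesses about the network, which the post does not state. The commands are printed rather than executed so they can be checked first.

```shell
# Print clone and configuration commands for every "vmid name ip" row on stdin.
# Arguments: template VM ID, gateway address (assumed, adjust to your network).
clone_cmds() {
  template=$1 gateway=$2
  while read -r vmid name ip; do
    # Full clone of the template
    echo "qm clone ${template} ${vmid} --name ${name} --full"
    # 4 GiB maximum memory, ballooning down to 2 GiB, static IP via Cloud-Init
    echo "qm set ${vmid} --cores 2 --memory 4096 --balloon 2048 --ipconfig0 ip=${ip}/24,gw=${gateway}"
  done
}

clone_cmds 801 192.168.50.1 <<EOF
150 k8s-c0 192.168.50.150
160 k8s-n0 192.168.50.160
161 k8s-n1 192.168.50.161
162 k8s-n2 192.168.50.162
EOF
```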
Configure VM networking
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Set the required sysctl parameters; they persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply the sysctl parameters without rebooting
sudo sysctl --system
# Confirm that the br_netfilter and overlay modules are loaded
lsmod | grep br_netfilter
lsmod | grep overlay
# Confirm that net.bridge.bridge-nf-call-iptables, net.bridge.bridge-nf-call-ip6tables and net.ipv4.ip_forward are set to 1
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
# If a firewall is in use, see this document
# https://kubernetes.io/docs/reference/networking/ports-and-protocols/
Configure the container runtime
See https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd
# Install containerd
sudo apt-get update && sudo apt-get install -y containerd
# Generate the default containerd configuration
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml
# Switch to the systemd cgroup driver and verify the change
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
grep SystemdCgroup /etc/containerd/config.toml
# Enable and restart the containerd service
sudo systemctl enable containerd
sudo systemctl restart containerd
sudo systemctl status containerd
Install kubelet/kubeadm/kubectl
See https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
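# Add the apt repository. The legacy apt.kubernetes.io repository used when this post was written was frozen in 2023 and has since been removed; pkgs.k8s.io replaces it, with one repository per minor version
sudo apt-get update && sudo apt-get install -y apt-transport-https ca-certificates curl gpg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.26/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.26/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list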
# Install kubelet/kubeadm/kubectl
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl # pin the installed versions
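Kubernetes requires /sys/class/dmi/id/product_uuid to be unique on every machine but places no requirement on /etc/machine-id. In my testing, VMs cloned from the same template had different /sys/class/dmi/id/product_uuid values but an identical /etc/machine-id, which can be regenerated manually so that each VM gets its own value.
# systemd-machine-id-setup leaves an existing valid ID untouched, so remove the old one first
sudo rm -f /etc/machine-id
sudo systemd-machine-id-setup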
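To confirm that the IDs really differ across nodes, the values can be collected over SSH and checked for duplicates. A minimal sketch: the helper names `collect_ids` and `dup_ids` are hypothetical, and it assumes the `ubuntu` user and the host names from the table above resolve from where the script runs. Reading product_uuid requires root, so the example targets /etc/machine-id.

```shell
# Read "host id" pairs on stdin and print every ID that appears more than once.
dup_ids() {
  awk '{ count[$2]++ } END { for (id in count) if (count[id] > 1) print id }' | sort
}

# Fetch one ID file from each host over SSH and print "host id" pairs.
# Arguments: file to read, then one or more host names.
collect_ids() {
  file=$1
  shift
  for host in "$@"; do
    printf '%s %s\n' "$host" "$(ssh "ubuntu@${host}" cat "$file")"
  done
}

# Example (prints nothing when all IDs are unique):
# collect_ids /etc/machine-id k8s-c0 k8s-n0 k8s-n1 k8s-n2 | dup_ids
```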
References for this section
- https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
- https://kubernetes.io/docs/setup/production-environment/container-runtimes/
- https://rmoff.net/2016/07/05/proxmox-4-containers-ssh-ssh_exchange_identification-read-connection-reset-by-peer/
Create the Kubernetes cluster
# Pull the images in advance
sudo kubeadm config images pull
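# control-plane-endpoint is the address of the control plane, node-name is the name of the current node, and pod-network-cidr is the pod network range (it must not overlap with any other network used by the cluster)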
sudo kubeadm init --control-plane-endpoint=192.168.50.150 --node-name k8s-c0 --pod-network-cidr=10.244.0.0/16
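The requirement that the pod CIDR must not overlap other networks can be checked mechanically before running kubeadm init. Below is a minimal POSIX shell sketch; the helper names are hypothetical. It compares 10.244.0.0/16 against 192.168.50.0/24 (an assumed prefix for the node network) and 10.96.0.0/12 (kubeadm's default service subnet, consistent with the 10.96.0.1 address in the init log).

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Print the first and last address of a CIDR block as integers.
cidr_range() {
  base=$(ip_to_int "${1%/*}")
  bits=${1#*/}
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  lo=$(( base & mask ))
  hi=$(( lo | (4294967295 ^ mask) ))
  echo "$lo $hi"
}

# Succeed (exit status 0) when the two CIDR blocks share at least one address.
cidr_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

for net in 192.168.50.0/24 10.96.0.0/12; do
  if cidr_overlap 10.244.0.0/16 "$net"; then
    echo "10.244.0.0/16 overlaps $net"
  else
    echo "10.244.0.0/16 does not overlap $net"
  fi
done
```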
If you hit the problem described at https://serverfault.com/questions/1118051/failed-to-run-kubelet-validate-service-connection-cri-v1-runtime-api-is-not-im/1127024, install a newer containerd as shown below.
sudo apt remove containerd
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
"deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
"$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install containerd.io
Below is the output of kubeadm init:
ubuntu@k8s-c0:~$ sudo kubeadm init --control-plane-endpoint=192.168.50.150 --node-name k8s-c0 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.26.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-c0 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.50.150]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-c0 localhost] and IPs [192.168.50.150 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-c0 localhost] and IPs [192.168.50.150 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 9.501816 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-c0 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node k8s-c0 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: hki676.*****
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join 192.168.50.150:6443 --token hki676.***** \
--discovery-token-ca-cert-hash sha256:***** \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.50.150:6443 --token hki676.***** \
--discovery-token-ca-cert-hash sha256:*****
Install the network plugin
# Install flannel
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
# Check the installation status
kubectl get pods --all-namespaces
Join the worker nodes to the cluster
ubuntu@k8s-n0:~$ sudo kubeadm join 192.168.50.150:6443 --token hki676.***** \
--discovery-token-ca-cert-hash sha256:*****
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
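If the kubeadm token has expired, the command below creates a new token and prints the matching join command.
# Create a new token and print the join command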
sudo kubeadm token create --print-join-command
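After the workers have joined, the join output suggests running kubectl get nodes on the control plane. A small filter over that output makes it easy to spot nodes that are not ready yet. This is a sketch; the function name `not_ready` is hypothetical, and it assumes the default column layout of `kubectl get nodes --no-headers` (name first, status second).

```shell
# Read `kubectl get nodes --no-headers` output on stdin and print the names
# of nodes whose STATUS column does not start with "Ready".
not_ready() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Example, run where kubectl is configured:
# kubectl get nodes --no-headers | not_ready
```

An empty result means every node reports Ready.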
Install Kubernetes Dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
Create a file named dashboard-adminuser.yml with the content below, then deploy it with kubectl apply -f dashboard-adminuser.yml.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kubernetes-dashboard
Then run the commands below on your local machine to open the Dashboard.
# Copy the control plane kubeconfig to the local machine
scp ubuntu@k8s-c0:/etc/kubernetes/admin.conf /Users/phyng/.kube/config_pve_k8s
export KUBECONFIG=/Users/phyng/.kube/config_pve_k8s
# Create a token
kubectl -n kubernetes-dashboard create token admin-user
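# Open the Dashboard and enter the token created above (kubectl proxy keeps running in the foreground, so run the open command from a second terminal)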
kubectl proxy
open http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
References
- https://github.com/kubernetes/dashboard
- https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/creating-sample-user.md
Deploy a test Deployment and Service
Create a file named nginx-deployment.yml with the content below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.23.4
          ports:
            - containerPort: 80
              name: 'nginx-http'
Create a file named nginx-service.yml with the content below:
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      nodePort: 30080
      targetPort: 'nginx-http'
  type: NodePort
Apply the manifests
kubectl apply -f nginx-deployment.yml -f nginx-service.yml
From the LAN, the service can be reached at any node's IP address plus the NodePort, which confirms that the deployment succeeded.
curl http://192.168.50.150:30080/
curl http://192.168.50.160:30080/
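Because a NodePort service is exposed on every node, the same check can be repeated for all four IPs from the table. A minimal sketch, with hypothetical helper names `nodeport_urls` and `check_urls`; the 5-second timeout is an arbitrary choice.

```shell
# Print the NodePort URL for each node IP.
# Arguments: port, then one or more IP addresses.
nodeport_urls() {
  port=$1
  shift
  for ip in "$@"; do
    echo "http://${ip}:${port}/"
  done
}

# Read URLs on stdin and print each one with the HTTP status code curl received.
check_urls() {
  while read -r url; do
    code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url")
    echo "$url $code"
  done
}

# Example:
# nodeport_urls 30080 192.168.50.150 192.168.50.160 192.168.50.161 192.168.50.162 | check_urls
```

A status of 200 on every line indicates that nginx answers through each node.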