Each CSP offers Kubernetes as a managed service, but here we install Kubernetes by hand to build a deeper understanding of how a cluster is put together, and to gauge whether the managed-service cost is justified along with its pros and cons.
1. Installing kubeadm
0. Prerequisites
0.1. EC2 - Launch an instance
- The Free Tier instance type (t2.micro) runs out of resources later on, so use t3.small instead.
- For the AMI, use Amazon Linux 2 rather than Amazon Linux 2023: with 2023, pods restarted endlessly and various other problems surfaced after installation. (Its EOS comes sooner, but Amazon Linux 2 is the more stable choice.) A CLI sketch of the launch follows below.
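For reference, launching this kind of instance can also be scripted with the AWS CLI. A minimal sketch — every ID below (AMI, security group, subnet) is a hypothetical placeholder, not a value from this setup:
# all IDs below are placeholders; look up a current Amazon Linux 2 AMI for ap-northeast-2 first
aws ec2 run-instances \
  --region ap-northeast-2 \
  --image-id ami-xxxxxxxxxxxxxxxxx \
  --instance-type t3.small \
  --key-name mango-kubeadm-rsa \
  --security-group-ids sg-xxxxxxxxxxxxxxxxx \
  --subnet-id subnet-xxxxxxxxxxxxxxxxx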
➜ mango ssh -i "mango-kubeadm-rsa.pem" ec2-user@ec2-54-180-88-250.ap-northeast-2.compute.amazonaws.com
..Amazon Linux 2023 login banner..
Last login: Tue Mar 5 04:40:23 2024 from 211.45.60.5
0.2. Installing containerd
[Reference] https://github.com/containerd/containerd/blob/main/docs/getting-started.md
[root@ip-10-180-16-29 tool]# tar Cxzvf /usr/local ./containerd-1.7.13-linux-amd64.tar.gz
bin/
bin/containerd-shim-runc-v2
bin/ctr
bin/containerd-shim
bin/containerd-shim-runc-v1
bin/containerd
bin/containerd-stress
To run the command below, you first need to save https://raw.githubusercontent.com/containerd/containerd/main/containerd.service as a file at the following path:
[root@ip-10-180-16-29 tool]# mkdir -p /usr/local/lib/systemd/system
[root@ip-10-180-16-29 tool]# vi /usr/local/lib/systemd/system/containerd.service
..create the file..
[root@ip-10-180-16-29 system]# systemctl enable --now containerd
Created symlink /etc/systemd/system/multi-user.target.wants/containerd.service → /usr/local/lib/systemd/system/containerd.service.
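As an alternative to pasting the unit file into vi by hand, the same file can be fetched directly with curl (a sketch using the URL referenced above):
sudo mkdir -p /usr/local/lib/systemd/system
sudo curl -fsSL -o /usr/local/lib/systemd/system/containerd.service \
  https://raw.githubusercontent.com/containerd/containerd/main/containerd.service
sudo systemctl daemon-reload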
0.3. Installing runc
[root@ip-10-180-16-29 tool]# install -m 755 runc.amd64 /usr/local/sbin/runc
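A quick sanity check that the binary landed in place and is executable:
/usr/local/sbin/runc --version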
0.4. Installing CNI plugins
[root@ip-10-180-16-29 tool]# mkdir -p /opt/cni/bin
[root@ip-10-180-16-29 tool]# tar Cxzvf /opt/cni/bin ./cni-plugins-linux-amd64-v1.4.0.tgz
./
./loopback
./bandwidth
./ptp
./vlan
./host-device
./tuning
./vrf
./sbr
./tap
./dhcp
./static
./firewall
./macvlan
./dummy
./bridge
./ipvlan
./portmap
./host-local
[root@ip-10-180-16-29 tool]# cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
overlay
br_netfilter
[root@ip-10-180-16-29 tool]# sudo modprobe overlay
[root@ip-10-180-16-29 tool]# sudo modprobe br_netfilter
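To confirm both modules actually loaded (the official kubeadm guide suggests the same check):
lsmod | grep overlay
lsmod | grep br_netfilter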
[root@ip-10-180-16-29 tool]# cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
[root@ip-10-180-16-29 tool]# sudo sysctl --system
* Applying /etc/sysctl.d/00-defaults.conf ...
kernel.printk = 8 4 1 7
kernel.panic = 5
..omitted..
* Applying /etc/sysctl.d/k8s.conf ...
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
* Applying /etc/sysctl.conf ...
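To double-check that the three settings took effect:
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward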
1. Installing kubeadm
Some of the links floating around are out of date; for your sanity, check the up-to-date English page linked below.
[Reference] https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
[root@ip-10-180-16-29 tool]# sudo setenforce 0
[root@ip-10-180-16-29 tool]# sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
[root@ip-10-180-16-29 tool]# cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
[root@ip-10-180-16-29 tool]# sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
Kubernetes 11 kB/s | 9.9 kB 00:00
..omitted..
Installed:
conntrack-tools-1.4.6-2.amzn2023.0.2.x86_64 cri-tools-1.29.0-150500.1.1.x86_64 iptables-libs-1.8.8-3.amzn2023.0.2.x86_64
iptables-nft-1.8.8-3.amzn2023.0.2.x86_64 kubeadm-1.29.2-150500.1.1.x86_64 kubectl-1.29.2-150500.1.1.x86_64
kubelet-1.29.2-150500.1.1.x86_64 kubernetes-cni-1.3.0-150500.1.1.x86_64 libnetfilter_conntrack-1.0.8-2.amzn2023.0.2.x86_64
libnetfilter_cthelper-1.0.0-21.amzn2023.0.2.x86_64 libnetfilter_cttimeout-1.0.0-19.amzn2023.0.2.x86_64 libnetfilter_queue-1.0.5-2.amzn2023.0.2.x86_64
libnfnetlink-1.0.1-19.amzn2023.0.2.x86_64 libnftnl-1.2.2-2.amzn2023.0.2.x86_64 socat-1.7.4.2-1.amzn2023.0.2.x86_64
Complete!
[root@ip-10-180-16-29 tool]# sudo systemctl enable --now kubelet
Created symlink /etc/systemd/system/multi-user.target.wants/kubelet.service → /usr/lib/systemd/system/kubelet.service.
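Before moving on, it's worth confirming that the three binaries report matching versions:
kubeadm version
kubectl version --client
kubelet --version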
2. Creating a cluster with kubeadm
[Reference] https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
If the prerequisite steps above were skipped or the instance is short on resources, plenty of errors surface in the step below, so take care.
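To shorten the init step itself, the control-plane images can be pre-pulled first — the preflight output below points at the same command:
kubeadm config images pull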
[root@ip-10-180-16-29 ~]# kubeadm init --control-plane-endpoint 10.180.16.29 --pod-network-cidr 10.80.16.0/24 --apiserver-advertise-address=10.180.16.29
[init] Using Kubernetes version: v1.29.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0307 08:00:50.597205 4024 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.8" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
..omitted..
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join 10.180.16.29:6443 --token ketes2.tyvjn83p2d8bqm8t \
--discovery-token-ca-cert-hash sha256:1605ad4a12a9b8790bdc3d73ae8d12cbe238526bbd1fc98dce6232b0a285120b \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.180.16.29:6443 --token ketes2.tyvjn83p2d8bqm8t \
--discovery-token-ca-cert-hash sha256:1605ad4a12a9b8790bdc3d73ae8d12cbe238526bbd1fc98dce6232b0a285120b
Once init has completed, set up access to the k8s cluster as follows:
[ec2-user@ip-10-180-16-34 /]$ mkdir -p $HOME/.kube
[ec2-user@ip-10-180-16-34 /]$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[ec2-user@ip-10-180-16-34 /]$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
[ec2-user@ip-10-180-16-34 /]$ kubectl get no
NAME STATUS ROLES AGE VERSION
ip-10-180-16-29 Ready control-plane 111s v1.29.2
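Incidentally, the sandbox-image warning in the init output above (pause:3.8 vs pause:3.9) comes from containerd's built-in default config. A sketch of aligning it with what kubeadm expects — note this writes out the full default config, so review it before restarting:
containerd config default | sudo tee /etc/containerd/config.toml
# point the CRI sandbox image at the version kubeadm expects
sudo sed -i 's#registry.k8s.io/pause:3.8#registry.k8s.io/pause:3.9#' /etc/containerd/config.toml
# kubeadm defaults the kubelet to the systemd cgroup driver, so match containerd to it
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd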
Troubleshooting
Below is a brief rundown of the problems encountered during installation and how they were resolved.
1. CNI
A quick look at the cluster right after installation shows the problems below. They could be fixed by writing a config file under /etc/cni/net.d (a minimal sketch follows after the list), but there are plenty of useful open-source CNIs, so let's solve it with one of those instead.
- The control-plane node reports a 'cni plugin not initialized' status error
- The 'coredns-76f75df574-*' pods sit in Pending
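For illustration only — a minimal bridge conflist that would satisfy the kubelet, reusing the pod CIDR passed to kubeadm init. It is not applied here, since the CNI installed below manages /etc/cni/net.d itself:
cat <<EOF | sudo tee /etc/cni/net.d/10-bridge.conflist
{
  "cniVersion": "1.0.0",
  "name": "podnet",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": { "type": "host-local", "subnet": "10.80.16.0/24" }
    },
    { "type": "portmap", "capabilities": { "portMappings": true } }
  ]
}
EOF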
[ec2-user@ip-10-180-16-34 /]$ k describe no ip-10-180-16-34.ap-northeast-2.compute.internal
..omitted..
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Tue, 05 Mar 2024 08:22:02 +0000 Tue, 05 Mar 2024 07:56:44 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 05 Mar 2024 08:22:02 +0000 Tue, 05 Mar 2024 07:56:44 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 05 Mar 2024 08:22:02 +0000 Tue, 05 Mar 2024 07:56:44 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready False Tue, 05 Mar 2024 08:22:02 +0000 Tue, 05 Mar 2024 07:56:44 +0000 KubeletNotReady container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
Install Cilium
Many blogs seem to install Calico; here the issue is resolved by installing Cilium instead, which is likewise a CNCF Graduated project.
https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/#install-the-cilium-cli
[root@ip-10-180-16-34 ~]# CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 40.1M 100 40.1M 0 0 33.1M 0 0:00:01 0:00:01 --:--:-- 282M
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 92 100 92 0 0 87 0 0:00:01 0:00:01 --:--:-- 0
cilium-linux-amd64.tar.gz: OK
cilium
rm: remove regular file 'cilium-linux-amd64.tar.gz'? y
rm: remove regular file 'cilium-linux-amd64.tar.gz.sha256sum'? y
[root@ip-10-180-16-34 ~]# cilium version --client
cilium-cli: compiled with go1.22.0 on linux/amd64
cilium image (default): v1.15.1
cilium image (stable): v1.15.1
Now install Cilium into the cluster and confirm that it comes up.
[root@ip-10-180-16-34 ~]$ export KUBECONFIG=/etc/kubernetes/admin.conf
[root@ip-10-180-16-34 ~]$ cilium install --version 1.15.1
ℹ️ Using Cilium version 1.15.1
🔮 Auto-detected cluster name: kubernetes
🔮 Auto-detected kube-proxy has been installed
[root@ip-10-180-16-34 ~]$ cilium status --wait
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
 \__/¯¯\__/    Hubble Relay:       disabled
    \__/       ClusterMesh:        disabled
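Cilium also ships an end-to-end check; it deploys test workloads into a cilium-test namespace and takes several minutes, so run it only if the instance has headroom:
cilium connectivity test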
Checking the cluster status
The node and all pods are stable: Ready and Running respectively.
[ec2-user@ip-10-180-16-34 ~]$ k get no
NAME STATUS ROLES AGE VERSION
ip-10-180-16-34.ap-northeast-2.compute.internal Ready control-plane 34m v1.29.2
[ec2-user@ip-10-180-16-34 ~]$ k get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system cilium-operator-6747b86d84-2wfhh 1/1 Running 3 (18m ago) 30m
kube-system cilium-v2gxq 1/1 Running 1 (21m ago) 30m
kube-system coredns-76f75df574-66vbc 1/1 Running 1 (21m ago) 34m
kube-system coredns-76f75df574-jlksv 1/1 Running 1 (21m ago) 34m
kube-system etcd-ip-10-180-16-34.ap-northeast-2.compute.internal 1/1 Running 3 (18m ago) 34m
kube-system kube-apiserver-ip-10-180-16-34.ap-northeast-2.compute.internal 1/1 Running 3 (18m ago) 34m
kube-system kube-controller-manager-ip-10-180-16-34.ap-northeast-2.compute.internal 1/1 Running 4 (18m ago) 34m
kube-system kube-proxy-jrxg6 1/1 Running 2 (18m ago) 34m
kube-system kube-scheduler-ip-10-180-16-34.ap-northeast-2.compute.internal 1/1 Running 4 (18m ago) 34m
Conclusion
Effort
The installation involves a fair amount of pain resolving minor issues, and stabilizing the cluster takes some effort on top of that. It is especially troublesome when OS-specific behavior is involved.
Cost
EKS costs about $73 a month, whereas a cluster installed manually on EC2 runs about $20. Being able to switch the instances on and off is a plus, but from an operations and maintenance standpoint, the managed service may still be the better choice.
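Rough math behind those numbers, assuming on-demand pricing in ap-northeast-2 at the time of writing (rates are approximate and change over time):
# EKS control plane: $0.10/hr x ~730 hr/month ≈ $73/month (before worker nodes)
# t3.small:         ~$0.026/hr x ~730 hr/month ≈ $19/month (plus a few dollars of EBS)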