
Manual kubernetes cluster installation (w/ AWS)

망고v 2024. 3. 8. 11:22

Every CSP offers Kubernetes as a managed service, but let's install Kubernetes by hand to build a better understanding of how it is put together, and to judge whether the managed-service price is justified and what the trade-offs are.

0. Prerequisites

0.1. EC2 - Launch an instance

  • The free-tier instance type t2.micro runs out of resources later on, so go with t3.small instead.
  • For the AMI, use Amazon Linux 2 rather than Amazon Linux 2023; with 2023, pods restarted endlessly and various other problems appeared after installation. (Its EOS is nearer, but Amazon Linux 2 feels more stable.)
➜  mango ssh -i "mango-kubeadm-rsa.pem" ec2-user@ec2-54-180-88-250.ap-northeast-2.compute.amazonaws.com
   ,     #_
   ~\_  ####_        Amazon Linux 2023
  ~~  \_#####\
  ~~     \###|
  ~~       \#/ ___   https://aws.amazon.com/linux/amazon-linux-2023
   ~~       V~' '->
    ~~~         /
      ~~._.   _/
         _/ _/
       _/m/'
Last login: Tue Mar  5 04:40:23 2024 from 211.45.60.5

 

0.2. Installing containerd

[Reference] https://github.com/containerd/containerd/blob/main/docs/getting-started.md

[root@ip-10-180-16-29 tool]# tar Cxzvf /usr/local ./containerd-1.7.13-linux-amd64.tar.gz 
bin/
bin/containerd-shim-runc-v2
bin/ctr
bin/containerd-shim
bin/containerd-shim-runc-v1
bin/containerd
bin/containerd-stress
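A quick aside on the `tar Cxzvf` form used above: the leading `C` makes tar change into the given directory before extracting, so it is equivalent to `tar -xzvf ARCHIVE -C DIR`. A throwaway local demonstration:

```shell
# build a tiny archive, then extract it with the `C <dir>` form;
# in "Cxzf", C consumes the first argument (the directory) and f the second (the archive)
mkdir -p /tmp/tar-demo/src /tmp/tar-demo/dst
echo "hello" > /tmp/tar-demo/src/hello.txt
tar -czf /tmp/tar-demo/demo.tar.gz -C /tmp/tar-demo/src hello.txt
tar Cxzf /tmp/tar-demo/dst /tmp/tar-demo/demo.tar.gz
cat /tmp/tar-demo/dst/hello.txt   # prints "hello"
```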

 

To run the commands below, you first need to save https://raw.githubusercontent.com/containerd/containerd/main/containerd.service to the location shown:

[root@ip-10-180-16-29 tool]# mkdir -p /usr/local/lib/systemd/system
[root@ip-10-180-16-29 tool]# vi /usr/local/lib/systemd/system/containerd.service
..create the file..
[root@ip-10-180-16-29 system]# systemctl enable --now containerd
Created symlink /etc/systemd/system/multi-user.target.wants/containerd.service → /usr/local/lib/systemd/system/containerd.service.

 

0.3. Installing runc

[root@ip-10-180-16-29 tool]# install -m 755 runc.amd64 /usr/local/sbin/runc
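Note that `install -m 755 SRC DST` copies the file and sets its permissions in a single step, i.e. it replaces a `cp` + `chmod 755` pair. A throwaway example:

```shell
# copy a file and mark it executable in one command
echo '#!/bin/sh' > /tmp/demo-script
install -m 755 /tmp/demo-script /tmp/demo-script-installed
stat -c '%a' /tmp/demo-script-installed   # prints "755"
```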

 

0.4. Installing CNI plugins

[root@ip-10-180-16-29 tool]# mkdir -p /opt/cni/bin
[root@ip-10-180-16-29 tool]# tar Cxzvf /opt/cni/bin ./cni-plugins-linux-amd64-v1.4.0.tgz
./
./loopback
./bandwidth
./ptp
./vlan
./host-device
./tuning
./vrf
./sbr
./tap
./dhcp
./static
./firewall
./macvlan
./dummy
./bridge
./ipvlan
./portmap
./host-local

 

[root@ip-10-180-16-29 tool]# cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
overlay
br_netfilter
[root@ip-10-180-16-29 tool]# sudo modprobe overlay
[root@ip-10-180-16-29 tool]# sudo modprobe br_netfilter
[root@ip-10-180-16-29 tool]# cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
[root@ip-10-180-16-29 tool]# sudo sysctl --system
* Applying /etc/sysctl.d/00-defaults.conf ...
kernel.printk = 8 4 1 7
kernel.panic = 5
..omitted..
* Applying /etc/sysctl.d/k8s.conf ...
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
* Applying /etc/sysctl.conf ...
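To double-check later (for example after a reboot) that these settings are still in effect, the values can be read straight out of /proc/sys, which works even where the `sysctl` binary is not on PATH:

```shell
# each of these should print "1" after `sysctl --system`;
# the bridge entries only appear once the br_netfilter module is loaded
cat /proc/sys/net/ipv4/ip_forward
for f in /proc/sys/net/bridge/bridge-nf-call-iptables \
         /proc/sys/net/bridge/bridge-nf-call-ip6tables; do
  [ -f "$f" ] && cat "$f" || echo "$f missing: br_netfilter not loaded"
done
```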


1. Installing kubeadm

Some of the links floating around are out of date; for your own sanity, follow the up-to-date English docs linked below.

[Reference] https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

 


[root@ip-10-180-16-29 tool]# sudo setenforce 0
[root@ip-10-180-16-29 tool]# sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
[root@ip-10-180-16-29 tool]# cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
[root@ip-10-180-16-29 tool]# sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
Kubernetes                                                                                                                       11 kB/s | 9.9 kB     00:00   
..omitted..
Installed:
  conntrack-tools-1.4.6-2.amzn2023.0.2.x86_64         cri-tools-1.29.0-150500.1.1.x86_64                   iptables-libs-1.8.8-3.amzn2023.0.2.x86_64          
  iptables-nft-1.8.8-3.amzn2023.0.2.x86_64            kubeadm-1.29.2-150500.1.1.x86_64                     kubectl-1.29.2-150500.1.1.x86_64                   
  kubelet-1.29.2-150500.1.1.x86_64                    kubernetes-cni-1.3.0-150500.1.1.x86_64               libnetfilter_conntrack-1.0.8-2.amzn2023.0.2.x86_64 
  libnetfilter_cthelper-1.0.0-21.amzn2023.0.2.x86_64  libnetfilter_cttimeout-1.0.0-19.amzn2023.0.2.x86_64  libnetfilter_queue-1.0.5-2.amzn2023.0.2.x86_64     
  libnfnetlink-1.0.1-19.amzn2023.0.2.x86_64           libnftnl-1.2.2-2.amzn2023.0.2.x86_64                 socat-1.7.4.2-1.amzn2023.0.2.x86_64                

Complete!

[root@ip-10-180-16-29 tool]# sudo systemctl enable --now kubelet
Created symlink /etc/systemd/system/multi-user.target.wants/kubelet.service → /usr/lib/systemd/system/kubelet.service.


2. Creating a cluster with kubeadm

[Reference] https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

 


 

If any prerequisite was skipped or the instance is short on resources, the steps below throw plenty of errors, so take care.

[root@ip-10-180-16-29 ~]# kubeadm init --control-plane-endpoint 10.180.16.29 --pod-network-cidr 10.80.16.0/24 --apiserver-advertise-address=10.180.16.29
[init] Using Kubernetes version: v1.29.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0307 08:00:50.597205    4024 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.8" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
..omitted..
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join 10.180.16.29:6443 --token ketes2.tyvjn83p2d8bqm8t \
        --discovery-token-ca-cert-hash sha256:1605ad4a12a9b8790bdc3d73ae8d12cbe238526bbd1fc98dce6232b0a285120b \
        --control-plane 

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.180.16.29:6443 --token ketes2.tyvjn83p2d8bqm8t \
        --discovery-token-ca-cert-hash sha256:1605ad4a12a9b8790bdc3d73ae8d12cbe238526bbd1fc98dce6232b0a285120b
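Two notes on the join command: the token expires after 24 hours by default, and `kubeadm token create --print-join-command` regenerates the whole line. The `--discovery-token-ca-cert-hash` value is just the SHA-256 of the cluster CA's public key, so it can be recomputed from the CA certificate at any time. A sketch (the function name is mine; the openssl pipeline follows the one in the kubeadm docs, using `openssl pkey` so it also works for non-RSA CAs):

```shell
# recompute the discovery hash from a CA certificate;
# on a kubeadm control plane the cert lives at /etc/kubernetes/pki/ca.crt
ca_cert_hash() {
  openssl x509 -pubkey -in "$1" \
    | openssl pkey -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | awk '{print "sha256:" $NF}'
}

[ -f /etc/kubernetes/pki/ca.crt ] && ca_cert_hash /etc/kubernetes/pki/ca.crt || true
```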

 

After the steps above, set up access to the k8s cluster as follows:

[ec2-user@ip-10-180-16-34 /]$ mkdir -p $HOME/.kube
[ec2-user@ip-10-180-16-34 /]$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[ec2-user@ip-10-180-16-34 /]$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
[ec2-user@ip-10-180-16-34 /]$ kubectl get no
NAME              STATUS   ROLES           AGE    VERSION
ip-10-180-16-29   Ready    control-plane   111s   v1.29.2
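As an aside, the `k` that shows up in the transcripts from here on is just a common shell alias for kubectl:

```shell
# `k` in the later transcripts is simply an alias for kubectl
# (add this line to ~/.bashrc to make it permanent)
alias k=kubectl
```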


Troubleshooting

Below is a brief summary of the problems hit during installation and how they were resolved.

1. CNI

A quick look at the cluster after installation shows the issues below. They could be fixed by hand-writing a config under /etc/cni/net.d, but there are plenty of useful open-source CNIs, so let's use one of those instead.

  • The control-plane node shows the status error 'cni plugin not initialized'.
  • The 'coredns-76f75df574-*' pods are stuck in Pending.
[ec2-user@ip-10-180-16-34 /]$ k describe no ip-10-180-16-34.ap-northeast-2.compute.internal

..omitted..

Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Tue, 05 Mar 2024 08:22:02 +0000   Tue, 05 Mar 2024 07:56:44 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Tue, 05 Mar 2024 08:22:02 +0000   Tue, 05 Mar 2024 07:56:44 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Tue, 05 Mar 2024 08:22:02 +0000   Tue, 05 Mar 2024 07:56:44 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Tue, 05 Mar 2024 08:22:02 +0000   Tue, 05 Mar 2024 07:56:44 +0000   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized

 

Installing Cilium

Many blogs seem to go with Calico, but I fixed this by installing Cilium, which is likewise a CNCF Graduated project.

https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/#install-the-cilium-cli

 


[root@ip-10-180-16-34 ~]# CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 40.1M  100 40.1M    0     0  33.1M      0  0:00:01  0:00:01 --:--:--  282M
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100    92  100    92    0     0     87      0  0:00:01  0:00:01 --:--:--     0
cilium-linux-amd64.tar.gz: OK
cilium
rm: remove regular file 'cilium-linux-amd64.tar.gz'? y
rm: remove regular file 'cilium-linux-amd64.tar.gz.sha256sum'? y

[root@ip-10-180-16-34 ~]# cilium version --client
cilium-cli:  compiled with go1.22.0 on linux/amd64
cilium image (default): v1.15.1
cilium image (stable): v1.15.1

Install Cilium into the cluster and confirm its status:

[root@ip-10-180-16-34 ~]$ export KUBECONFIG=/etc/kubernetes/admin.conf

[root@ip-10-180-16-34 ~]$ cilium install --version 1.15.1
ℹ️  Using Cilium version 1.15.1
🔮 Auto-detected cluster name: kubernetes
🔮 Auto-detected kube-proxy has been installed

[root@ip-10-180-16-34 ~]$ cilium status --wait
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
 \__/¯¯\__/    Hubble Relay:       disabled
    \__/       ClusterMesh:        disabled

 

Checking the cluster status

The node and all pods are stable, in Ready and Running states.

[ec2-user@ip-10-180-16-34 ~]$ k get no
NAME                                              STATUS   ROLES           AGE   VERSION
ip-10-180-16-34.ap-northeast-2.compute.internal   Ready    control-plane   34m   v1.29.2

[ec2-user@ip-10-180-16-34 ~]$ k get po -A
NAMESPACE     NAME                                                                      READY   STATUS    RESTARTS      AGE
kube-system   cilium-operator-6747b86d84-2wfhh                                          1/1     Running   3 (18m ago)   30m
kube-system   cilium-v2gxq                                                              1/1     Running   1 (21m ago)   30m
kube-system   coredns-76f75df574-66vbc                                                  1/1     Running   1 (21m ago)   34m
kube-system   coredns-76f75df574-jlksv                                                  1/1     Running   1 (21m ago)   34m
kube-system   etcd-ip-10-180-16-34.ap-northeast-2.compute.internal                      1/1     Running   3 (18m ago)   34m
kube-system   kube-apiserver-ip-10-180-16-34.ap-northeast-2.compute.internal            1/1     Running   3 (18m ago)   34m
kube-system   kube-controller-manager-ip-10-180-16-34.ap-northeast-2.compute.internal   1/1     Running   4 (18m ago)   34m
kube-system   kube-proxy-jrxg6                                                          1/1     Running   2 (18m ago)   34m
kube-system   kube-scheduler-ip-10-180-16-34.ap-northeast-2.compute.internal            1/1     Running   4 (18m ago)   34m


Conclusion

Effort

The installation comes with its share of pain resolving small issues, and stabilizing the cluster takes some effort as well. It is especially troublesome when the OS itself is the variable.

 

Cost

EKS costs about $73 a month (for the control plane alone), whereas installing the cluster manually on EC2 runs to roughly $20. Being able to switch it on and off is a plus, but from an operations and maintenance standpoint the managed service may still be the better deal.
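For reference, the rough arithmetic behind those numbers (the EKS control-plane fee is $0.10/hour; the t3.small on-demand rate below is an assumption, so check current Seoul-region pricing):

```shell
# compare an always-on month (24h x 30d) of EKS control-plane fee vs one t3.small
awk 'BEGIN {
  printf "EKS control plane: %.2f USD/month\n", 0.10  * 24 * 30
  printf "t3.small EC2:      %.2f USD/month\n", 0.026 * 24 * 30
}'
# EKS control plane: 72.00 USD/month
# t3.small EC2:      18.72 USD/month
```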
