
  • Install Prometheus from the Kubernetes cluster's Applications tab

Environment setup before installation

NFS server setup

  • Create the shared directory
    $ mkdir /home/data

  • Change the directory permissions
    $ chmod 777 /home/data

  • Install the NFS packages
    $ yum install nfs-utils (CentOS)
    $ apt-get install nfs-common nfs-kernel-server (Ubuntu)

  • Edit the NFS export configuration

    $ vi /etc/exports
    
    /home/data *(rw,sync,no_subtree_check)   ## ideally restrict this to an IP, hostname, or domain, but Kubernetes failed to mount with those restrictions, so access is opened to all hosts
  • Apply the export settings (a quick verification follows below)

    $ exportfs -a
    $ systemctl restart nfs-kernel-server   ## on CentOS the service name is nfs-server
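
  • Verify the export is visible (a quick check; showmount is provided by the NFS packages installed above)

    $ showmount -e localhost   ## the export list should include /home/data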

Deploy the service account and role bindings

  • Create the rbac.yaml file

    $ vi rbac.yaml
    
    kind: ServiceAccount
    apiVersion: v1
    metadata:
      name: nfs-client-provisioner
    ---
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: nfs-client-provisioner-runner
    rules:
      - apiGroups: [""]
        resources: ["persistentvolumes"]
        verbs: ["get", "list", "watch", "create", "delete"]
      - apiGroups: [""]
        resources: ["persistentvolumeclaims"]
        verbs: ["get", "list", "watch", "update"]
      - apiGroups: ["storage.k8s.io"]
        resources: ["storageclasses"]
        verbs: ["get", "list", "watch"]
      - apiGroups: [""]
        resources: ["events"]
        verbs: ["create", "update", "patch"]
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: run-nfs-client-provisioner
    subjects:
      - kind: ServiceAccount
        name: nfs-client-provisioner
        namespace: default
    roleRef:
      kind: ClusterRole
      name: nfs-client-provisioner-runner
      apiGroup: rbac.authorization.k8s.io
    ---
    kind: Role
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: leader-locking-nfs-client-provisioner
    rules:
      - apiGroups: [""]
        resources: ["endpoints"]
        verbs: ["get", "list", "watch", "create", "update", "patch"]
    ---
    kind: RoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: leader-locking-nfs-client-provisioner
    subjects:
      - kind: ServiceAccount
        name: nfs-client-provisioner
        # replace with namespace where provisioner is deployed
        namespace: default
    roleRef:
      kind: Role
      name: leader-locking-nfs-client-provisioner
      apiGroup: rbac.authorization.k8s.io
  • Apply the YAML
    $ kubectl create -f rbac.yaml

  • Verify that the ClusterRole and bindings were created

    $ kubectl get clusterrole,clusterrolebinding,role,rolebinding | grep nfs
    clusterrole.rbac.authorization.k8s.io/nfs-client-provisioner-runner 20m
    clusterrolebinding.rbac.authorization.k8s.io/run-nfs-client-provisioner 20m
    role.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner 20m
    rolebinding.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner 20m
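
  • The grep above only matches the role objects, so the ServiceAccount can be checked separately (a quick sanity check)

    $ kubectl get serviceaccount nfs-client-provisioner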

Deploy the StorageClass and NFS provisioner

  • Create the StorageClass

    $ vi class.yaml
    
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: managed-nfs-storage
    provisioner: nfs-gitlab
    reclaimPolicy: Retain
    allowVolumeExpansion: true
    parameters:
      archiveOnDelete: "false"
  • Apply the YAML
    $ kubectl create -f class.yaml

  • Verify the StorageClass was created

    $ kubectl get storageclass
    NAME                  PROVISIONER   RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
    managed-nfs-storage   nfs-gitlab    Retain          Immediate           true                   15m
  • Create the deployment.yaml file

    $ vi deployment.yaml
    
    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: nfs-client-provisioner
    spec:
      selector:
        matchLabels:
          app: nfs-client-provisioner
      replicas: 1
      strategy:
        type: Recreate
      template:
        metadata:
          labels:
            app: nfs-client-provisioner
        spec:
          serviceAccountName: nfs-client-provisioner
          containers:
            - name: nfs-client-provisioner
              image: quay.io/external_storage/nfs-client-provisioner:latest
              volumeMounts:
                - name: nfs-client-root
                  mountPath: /persistentvolumes
              env:
                - name: PROVISIONER_NAME
                  value: nfs-gitlab
                - name: NFS_SERVER
                  value: 14.36.48.220
                - name: NFS_PATH
                  value: /home/data
          volumes:
            - name: nfs-client-root
              nfs:
                server: 14.36.48.220
                path: /home/data
  • Apply the YAML
    $ kubectl create -f deployment.yaml

  • Verify the nfs-client-provisioner pod is running (a provisioning smoke test follows below)

    $ kubectl get all
    NAME                                          READY   STATUS    RESTARTS   AGE
    pod/nfs-client-provisioner-6d5d96fffb-5v6n7   1/1     Running   0          16m
    
    NAME                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
    service/default-http-backend   ClusterIP   10.107.231.71   <none>        80/TCP    3h45m
    service/kubernetes             ClusterIP   10.96.0.1       <none>        443/TCP   30h
    
    NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/nfs-client-provisioner   1/1     1            1           16m
    
    NAME                                                DESIRED   CURRENT   READY   AGE
    replicaset.apps/nfs-client-provisioner-6d5d96fffb   1         1         1       16m
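
  • Optionally, smoke-test dynamic provisioning before installing Prometheus (a minimal sketch; the claim name test-claim and the 1Mi size are only illustrative)

    $ vi test-claim.yaml
    
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: test-claim
    spec:
      storageClassName: managed-nfs-storage
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Mi
    
    $ kubectl create -f test-claim.yaml
    $ kubectl get pvc test-claim                       ## should reach Bound once the provisioner creates a PV
    $ kubectl logs deployment/nfs-client-provisioner   ## the provisioning event appears here
    $ kubectl delete -f test-claim.yaml                ## with reclaimPolicy Retain, the released PV must be removed manually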

Edit values.yaml

$ vi /opt/gitlab/embedded/service/gitlab-rails/vendor/prometheus/values.yaml

securityContext:
  fsGroup: 999
  runAsUser: 999

alertmanager:
  enabled: true
  persistentVolume:
    accessModes:
      - ReadWriteOnce
    annotations: {}
    existingClaim: ""
    mountPath: /home/data
    size: 20Gi
    storageClass: managed-nfs-storage
    subPath: ""

kubeStateMetrics:
  enabled: true

nodeExporter:
  enabled: false

pushgateway:
  enabled: false

server:
  fullnameOverride: "prometheus-prometheus-server"
  persistentVolume:
    accessModes:
      - ReadWriteOnce
    annotations: {}
    existingClaim: ""
    mountPath: /home/data
    size: 20Gi
    storageClass: managed-nfs-storage
    subPath: ""

...
  • Grant permissions needed to create the volumes

    # values.yaml contents
    securityContext:
      fsGroup: 999
      runAsUser: 999
  • alertmanager settings

    # values.yaml contents
    alertmanager:
      enabled: true
      persistentVolume:
        accessModes:
          - ReadWriteOnce
        annotations: {}
        existingClaim: ""
        mountPath: /home/data
        size: 20Gi
        storageClass: managed-nfs-storage
        subPath: ""
  • server settings

    # values.yaml contents
    server:
      fullnameOverride: "prometheus-prometheus-server"
      persistentVolume:
        accessModes:
          - ReadWriteOnce
        annotations: {}
        existingClaim: ""
        mountPath: /home/data
        size: 20Gi
        storageClass: managed-nfs-storage
        subPath: ""
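
  • Once Prometheus is installed from the Applications tab, the claims created from these settings can be checked (the exact PVC names depend on the Helm release)

    $ kubectl get pvc -n gitlab-managed-apps   ## the alertmanager and server claims should be Bound
    $ ls -ld /home/data/*                      ## on the NFS server: one subdirectory per provisioned volume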

Verify the Prometheus installation

  • Check the Kubernetes cluster's Applications tab

  • Check the Kubernetes cluster's Health tab

  • Check the Pod logs from a terminal on the master node

$ kubectl logs prometheus-prometheus-server-84b688d6b4-qc7b4 prometheus-server -n gitlab-managed-apps
level=info ts=2020-10-13T07:55:23.639Z caller=main.go:330 msg="Starting Prometheus" version="(version=2.15.2, branch=HEAD, revision=d9613e5c466c6e9de548c4dae1b9aabf9aaf7c57)"
level=info ts=2020-10-13T07:55:23.639Z caller=main.go:331 build_context="(go=go1.13.5, user=root@688433cf4ff7, date=20200106-14:50:51)"
level=info ts=2020-10-13T07:55:23.639Z caller=main.go:332 host_details="(Linux 5.4.0-48-generic #52~18.04.1-Ubuntu SMP Thu Sep 10 12:50:22 UTC 2020 x86_64 prometheus-prometheus-server-84b688d6b4-qc7b4 (none))"
level=info ts=2020-10-13T07:55:23.639Z caller=main.go:333 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2020-10-13T07:55:23.639Z caller=main.go:334 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2020-10-13T07:55:23.694Z caller=web.go:506 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2020-10-13T07:55:23.693Z caller=main.go:648 msg="Starting TSDB ..."
level=info ts=2020-10-13T07:55:23.789Z caller=head.go:584 component=tsdb msg="replaying WAL, this may take awhile"
level=info ts=2020-10-13T07:55:23.791Z caller=head.go:632 component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
level=info ts=2020-10-13T07:55:23.796Z caller=main.go:663 fs_type=NFS_SUPER_MAGIC
level=info ts=2020-10-13T07:55:23.796Z caller=main.go:664 msg="TSDB started"
level=info ts=2020-10-13T07:55:23.796Z caller=main.go:734 msg="Loading configuration file" filename=/etc/config/prometheus.yml
level=info ts=2020-10-13T07:55:23.805Z caller=kubernetes.go:190 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2020-10-13T07:55:23.808Z caller=kubernetes.go:190 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2020-10-13T07:55:23.810Z caller=kubernetes.go:190 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2020-10-13T07:55:23.813Z caller=kubernetes.go:190 component="discovery manager notify" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2020-10-13T07:55:23.816Z caller=main.go:762 msg="Completed loading of configuration file" filename=/etc/config/prometheus.yml
level=info ts=2020-10-13T07:55:23.816Z caller=main.go:617 msg="Server is ready to receive web requests."

$ kubectl logs prometheus-alertmanager-58b88c665b-b7sqt prometheus-alertmanager -n gitlab-managed-apps
level=info ts=2020-10-13T07:55:24.377Z caller=main.go:231 msg="Starting Alertmanager" version="(version=0.20.0, branch=HEAD, revision=f74be0400a6243d10bb53812d6fa408ad71ff32d)"
level=info ts=2020-10-13T07:55:24.377Z caller=main.go:232 build_context="(go=go1.13.5, user=root@00c3106655f8, date=20191211-14:13:14)"
level=info ts=2020-10-13T07:55:24.516Z caller=cluster.go:623 component=cluster msg="Waiting for gossip to settle..." interval=2s
level=info ts=2020-10-13T07:55:24.545Z caller=coordinator.go:119 component=configuration msg="Loading configuration file" file=/etc/config/alertmanager.yml
level=info ts=2020-10-13T07:55:24.545Z caller=coordinator.go:131 component=configuration msg="Completed loading of configuration file" file=/etc/config/alertmanager.yml
level=info ts=2020-10-13T07:55:24.548Z caller=main.go:497 msg=Listening address=:9093
level=info ts=2020-10-13T07:55:26.516Z caller=cluster.go:648 component=cluster msg="gossip not settled" polls=0 before=0 now=1 elapsed=2.00012541s
level=info ts=2020-10-13T07:55:34.517Z caller=cluster.go:640 component=cluster msg="gossip settled; proceeding" elapsed=10.000979807s
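
  • To reach the Prometheus UI directly, the server Service can be port-forwarded from the master node (an illustrative example; it assumes the chart's default service port of 80 behind the fullnameOverride set above)

    $ kubectl port-forward -n gitlab-managed-apps svc/prometheus-prometheus-server 9090:80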