1. Kubernetes and Microservices Overview
1.1 Introduction to Kubernetes
Kubernetes (K8s for short) is an open-source container orchestration platform developed by Google, drawing on its experience with the internal Borg system, and open-sourced in 2014. It automates the deployment, scaling, and management of containerized applications and provides declarative configuration and automation. Kubernetes has become the de facto standard for cloud-native application development and deployment: it manages container clusters across multiple hosts and offers fast deployment, scaling, maintenance, and failover of application containers.
1.2 Microservice Architecture Overview
Microservice architecture is an architectural style that designs an application as a collection of small, autonomous services. Each service implements a specific business capability, runs in its own process, and communicates over lightweight mechanisms (typically HTTP/REST APIs). Microservices can be deployed, scaled, and maintained independently, allowing each team to choose the technology stack best suited to its service.
1.3 Combining Kubernetes and Microservices
Kubernetes provides an ideal runtime environment for microservice architectures (a minimal service-discovery sketch follows this list):
• Service discovery and load balancing: built-in service discovery and load balancing simplify communication between microservices.
• Automatic scaling: the number of service instances scales automatically based on CPU/memory usage or custom metrics.
• Self-healing: failed containers are restarted, replaced, and rescheduled automatically to keep services highly available.
• Configuration management: application configuration is managed centrally and supports hot reloads.
• Storage orchestration: local or cloud storage systems are mounted automatically to give microservices persistent storage.
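As a minimal sketch of the service-discovery point above (all names here are hypothetical), a ClusterIP Service gives a microservice a stable virtual IP and DNS name (order-service.default.svc.cluster.local) that other Pods can call regardless of which Pods currently back it:

# Hypothetical Service: exposes Pods labeled app=order-service under one stable name
apiVersion: v1
kind: Service
metadata:
  name: order-service          # other services simply call http://order-service
  namespace: default
spec:
  selector:
    app: order-service         # traffic is load-balanced across matching Pods
  ports:
    - protocol: TCP
      port: 80                 # port exposed by the Service
      targetPort: 8080         # port the container listens on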
2. Kubernetes Core Architecture and Components
2.1 Architecture Overview
Kubernetes uses a master-node architecture, consisting mainly of a control plane and worker nodes:
• Control plane: manages the cluster; it includes the API server, scheduler, controller manager, and etcd.
• Worker nodes: run application containers; each node runs the kubelet, kube-proxy, and a container runtime.
2.2 Core Components
API server (kube-apiserver)
• The API server is the front end of the Kubernetes control plane; it handles REST operations and validates and updates the state of API objects.
• All components (users, controllers, nodes, and so on) interact through the API server.
• It scales horizontally; multiple instances are usually deployed for high availability.
- # API服务器配置示例
- apiVersion: v1
- kind: Pod
- metadata:
- name: kube-apiserver
- namespace: kube-system
- spec:
- containers:
- - command:
- - kube-apiserver
- - --advertise-address=192.168.1.100
- - --allow-privileged=true
- - --authorization-mode=Node,RBAC
- - --client-ca-file=/etc/kubernetes/pki/ca.crt
- - --enable-admission-plugins=NodeRestriction
- - --enable-bootstrap-token-auth=true
- image: k8s.gcr.io/kube-apiserver:v1.22.0
etcd
• etcd is a distributed key-value store that holds the cluster's configuration and state.
• It provides highly available, strongly consistent data storage.
• It should be backed up regularly to guard against data loss.
- # etcd备份示例
- ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
- --endpoints=https://127.0.0.1:2379 \
- --cacert=/etc/kubernetes/pki/etcd/ca.crt \
- --cert=/etc/kubernetes/pki/etcd/server.crt \
- --key=/etc/kubernetes/pki/etcd/server.key
- # etcd恢复示例
- ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
- --data-dir /var/lib/etcd-restore \
- --initial-cluster master-1=https://192.168.1.100:2380 \
- --initial-advertise-peer-urls https://192.168.1.100:2380
Scheduler (kube-scheduler)
• The scheduler assigns Pods to suitable nodes.
• Scheduling decisions consider resource requirements, hardware constraints, affinity and anti-affinity rules, data locality, and interference between workloads.
- # 自定义调度器配置示例
- apiVersion: kubescheduler.config.k8s.io/v1beta1
- kind: KubeSchedulerConfiguration
- profiles:
- - schedulerName: custom-scheduler
- plugins:
- score:
- disabled:
- - name: NodeResourcesBalancedAllocation
- enabled:
- - name: CustomScorer
Controller manager (kube-controller-manager)
• The controller manager runs the core controller loops, including the node controller, replication controller, and endpoints controller.
• It watches cluster state and acts whenever the actual state drifts from the desired state.
- # 控制器管理器配置示例
- apiVersion: v1
- kind: Pod
- metadata:
- name: kube-controller-manager
- namespace: kube-system
- spec:
- containers:
- - command:
- - kube-controller-manager
- - --allocate-node-cidrs=true
- - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
- - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
- - --cluster-cidr=10.244.0.0/16
- image: k8s.gcr.io/kube-controller-manager:v1.22.0
kubelet
• The kubelet is the agent running on every node; it makes sure containers are running in Pods.
• It communicates with the API server, receives Pod specifications, and keeps their containers healthy.
• It manages the container lifecycle on its node.
- # kubelet配置示例 (/etc/systemd/system/kubelet.service.d/10-kubeadm.conf)
- [Service]
- Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
- Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
- ExecStart=
- ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
kube-proxy
• kube-proxy maintains network rules on each node and implements the Kubernetes Service abstraction.
• It load-balances traffic to Services and supports several proxy modes: userspace (legacy), iptables, and IPVS.
- # kube-proxy配置示例
- apiVersion: kubeproxy.config.k8s.io/v1alpha1
- bindAddress: 0.0.0.0
- clientConnection:
- acceptContentTypes: ""
- burst: 10
- contentType: application/vnd.kubernetes.protobuf
- kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
- qps: 5
- clusterCIDR: "10.244.0.0/16"
- configSyncPeriod: 15m0s
- conntrack:
- maxPerCore: 32768
- min: 131072
- tcpCloseWaitTimeout: 1h0m0s
- tcpEstablishedTimeout: 24h0m0s
- mode: "ipvs"
Container runtime
• The container runtime is the software that actually runs containers, for example Docker, containerd, or CRI-O.
• Kubernetes talks to the runtime through the Container Runtime Interface (CRI).
- # containerd配置示例
- cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
- overlay
- br_netfilter
- EOF
- sudo modprobe overlay
- sudo modprobe br_netfilter
- cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
- net.bridge.bridge-nf-call-iptables = 1
- net.ipv4.ip_forward = 1
- net.bridge.bridge-nf-call-ip6tables = 1
- EOF
- sudo sysctl --system
- sudo mkdir -p /etc/containerd
- sudo containerd config default | sudo tee /etc/containerd/config.toml
- sudo systemctl restart containerd
3. Installing and Configuring Kubernetes
3.1 Choosing a Deployment Method
Kubernetes can be deployed in several ways; pick the one that matches your requirements:
• kubeadm: the official bootstrap tool; it provides a guided installation flow and is suitable for production clusters.
• Binary installation: download and configure every component manually; useful for learning how Kubernetes works internally.
• Managed Kubernetes services: GKE, EKS, AKS and similar cloud-provider offerings that take over most of the operational work.
• Third-party platforms: tools such as Rancher and OpenShift that add extra management features and a user interface.
3.2 Deploying a Cluster with kubeadm
- # 在所有节点上执行
- # 关闭防火墙
- sudo systemctl stop firewalld
- sudo systemctl disable firewalld
- # 禁用SELinux
- sudo setenforce 0
- sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
- # 禁用swap
- sudo swapoff -a
- sudo sed -i '/swap/d' /etc/fstab
- # 安装Docker
- sudo yum install -y yum-utils
- sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
- sudo yum install -y containerd.io docker-ce docker-ce-cli
- sudo mkdir -p /etc/docker
- cat <<EOF | sudo tee /etc/docker/daemon.json
- {
- "exec-opts": ["native.cgroupdriver=systemd"],
- "log-driver": "json-file",
- "log-opts": {
- "max-size": "100m"
- },
- "storage-driver": "overlay2"
- }
- EOF
- sudo systemctl enable docker
- sudo systemctl daemon-reload
- sudo systemctl restart docker
- # 安装kubeadm, kubelet, kubectl
- cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
- [kubernetes]
- name=Kubernetes
- baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
- enabled=1
- gpgcheck=1
- repo_gpgcheck=1
- gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
- exclude=kubelet kubeadm kubectl
- EOF
- sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
- sudo systemctl enable --now kubelet
- # 配置内核参数
- cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
- net.bridge.bridge-nf-call-ip6tables = 1
- net.bridge.bridge-nf-call-iptables = 1
- net.ipv4.ip_forward = 1
- EOF
- sudo sysctl --system
- # Run on the control-plane (master) node
- sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=<master-ip>
- # 配置kubectl
- mkdir -p $HOME/.kube
- sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
- sudo chown $(id -u):$(id -g) $HOME/.kube/config
- # 安装网络插件(以Flannel为例)
- kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
- # Run on each worker node (use the join command printed by kubeadm init)
- sudo kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash <hash>
3.3 High Availability Configuration
- # 配置负载均衡器(以HAProxy为例)
- cat <<EOF | sudo tee /etc/haproxy/haproxy.cfg
- frontend kubernetes-frontend
- bind *:6443
- mode tcp
- option tcplog
- default_backend kubernetes-backend
- backend kubernetes-backend
- mode tcp
- balance roundrobin
- option tcp-check
- server master1 192.168.1.100:6443 check
- server master2 192.168.1.101:6443 check
- server master3 192.168.1.102:6443 check
- EOF
- sudo systemctl restart haproxy
- # 初始化第一个控制平面节点
- sudo kubeadm init --control-plane-endpoint "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT" --upload-certs --pod-network-cidr=10.244.0.0/16
- # 加入其他控制平面节点
- sudo kubeadm join LOAD_BALANCER_DNS:LOAD_BALANCER_PORT --token <token> --discovery-token-ca-cert-hash <hash> --control-plane --certificate-key <key>
4. Core Concepts and Resource Objects
4.1 Pod
A Pod is the smallest deployable unit in Kubernetes and holds one or more tightly coupled containers. Containers in a Pod share the network namespace and storage volumes, so they can communicate with each other easily.
- # Pod示例
- apiVersion: v1
- kind: Pod
- metadata:
- name: myapp-pod
- labels:
- app: myapp
- spec:
- containers:
- - name: myapp-container
- image: busybox
- command: ['sh', '-c', 'echo Hello Kubernetes! && sleep 3600']
- - name: sidecar-container
- image: fluentd
- volumeMounts:
- - name: log-volume
- mountPath: /var/log
- volumes:
- - name: log-volume
- emptyDir: {}
4.2 Deployment
A Deployment manages the rollout and scaling of Pods and provides declarative updates and rollbacks. It uses a ReplicaSet to keep the specified number of Pod replicas running.
- # Deployment示例
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: nginx-deployment
- labels:
- app: nginx
- spec:
- replicas: 3
- selector:
- matchLabels:
- app: nginx
- template:
- metadata:
- labels:
- app: nginx
- spec:
- containers:
- - name: nginx
- image: nginx:1.14.2
- ports:
- - containerPort: 80
- resources:
- requests:
- memory: "64Mi"
- cpu: "250m"
- limits:
- memory: "128Mi"
- cpu: "500m"
- # 使用节点亲和性调度
- affinity:
- nodeAffinity:
- requiredDuringSchedulingIgnoredDuringExecution:
- nodeSelectorTerms:
- - matchExpressions:
- - key: disktype
- operator: In
- values:
- - ssd
4.3 Service
A Service gives a set of Pods a stable network endpoint and provides load balancing and service discovery. Kubernetes supports several Service types: ClusterIP, NodePort, LoadBalancer, and ExternalName (ClusterIP and NodePort are shown below; LoadBalancer and ExternalName are sketched after the NodePort example).
- # Service示例
- apiVersion: v1
- kind: Service
- metadata:
- name: my-service
- spec:
- selector:
- app: myapp
- ports:
- - protocol: TCP
- port: 80
- targetPort: 9376
- # ClusterIP类型(默认)
- type: ClusterIP
- # NodePort Service example
- apiVersion: v1
- kind: Service
- metadata:
- name: my-nodeport-service
- spec:
- type: NodePort
- selector:
- app: myapp
- ports:
- - port: 80
- targetPort: 80
- # 可选,默认自动分配
- nodePort: 30007
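The LoadBalancer and ExternalName types mentioned above are not covered by the original examples; the following is a minimal sketch with hypothetical names:

# LoadBalancer Service (requires a cloud provider or an external load-balancer controller)
apiVersion: v1
kind: Service
metadata:
  name: my-loadbalancer-service
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 8080
---
# ExternalName Service: a DNS alias for a service that lives outside the cluster
apiVersion: v1
kind: Service
metadata:
  name: external-db
spec:
  type: ExternalName
  externalName: db.example.com   # Pods resolving external-db get this CNAME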
4.4 ConfigMap and Secret
A ConfigMap stores non-sensitive configuration data; a Secret stores sensitive data such as passwords and API keys. Note that Secret values are only base64-encoded, not encrypted, so access to them should still be restricted.
- # ConfigMap示例
- apiVersion: v1
- kind: ConfigMap
- metadata:
- name: app-config
- data:
- database_url: "jdbc:mysql://db.example.com:3306/mydb"
- api_endpoint: "https://api.example.com/v1"
- log_level: "INFO"
- # Secret example
- apiVersion: v1
- kind: Secret
- metadata:
- name: db-secret
- type: Opaque
- data:
- # echo -n "admin" | base64
- username: YWRtaW4=
- # echo -n "password123" | base64
- password: cGFzc3dvcmQxMjM=
- # Using the ConfigMap and Secret in a Pod
- apiVersion: v1
- kind: Pod
- metadata:
- name: configmap-secret-pod
- spec:
- containers:
- - name: myapp
- image: myapp:1.0
- envFrom:
- - configMapRef:
- name: app-config
- env:
- - name: DB_PASSWORD
- valueFrom:
- secretKeyRef:
- name: db-secret
- key: password
- volumeMounts:
- - name: config-volume
- mountPath: /etc/config
- volumes:
- - name: config-volume
- configMap:
- name: app-config
4.5 Volume and PersistentVolume
Volumes provide storage to Pods; PersistentVolume (PV) and PersistentVolumeClaim (PVC) abstract the underlying storage resources so Pods can request storage without knowing how it is provided.
- # PersistentVolume示例
- apiVersion: v1
- kind: PersistentVolume
- metadata:
- name: pv-nfs
- spec:
- capacity:
- storage: 10Gi
- accessModes:
- - ReadWriteMany
- nfs:
- path: /data
- server: nfs-server.example.com
- # PersistentVolumeClaim example
- apiVersion: v1
- kind: PersistentVolumeClaim
- metadata:
- name: pvc-nfs
- spec:
- accessModes:
- - ReadWriteMany
- resources:
- requests:
- storage: 5Gi
- # Using the PVC in a Pod
- apiVersion: v1
- kind: Pod
- metadata:
- name: pod-with-pvc
- spec:
- containers:
- - name: myapp
- image: myapp:1.0
- volumeMounts:
- - name: nfs-storage
- mountPath: /data
- volumes:
- - name: nfs-storage
- persistentVolumeClaim:
- claimName: pvc-nfs
4.6 Namespace
Namespaces divide a cluster into multiple virtual clusters, providing resource isolation and multi-tenancy.
- # Namespace示例
- apiVersion: v1
- kind: Namespace
- metadata:
- name: development
- labels:
- name: development
- # Set the default namespace for the current context
- kubectl config set-context --current --namespace=development
4.7 Ingress
An Ingress manages external HTTP and HTTPS access to Services inside the cluster, providing load balancing, TLS termination, and name-based virtual hosting.
- # Ingress示例
- apiVersion: networking.k8s.io/v1
- kind: Ingress
- metadata:
- name: my-ingress
- annotations:
- nginx.ingress.kubernetes.io/rewrite-target: /
- spec:
- rules:
- - host: myapp.example.com
- http:
- paths:
- - path: /
- pathType: Prefix
- backend:
- service:
- name: my-service
- port:
- number: 80
5. Deploying Microservices in Practice
5.1 Microservice Design Principles
When deploying microservices on Kubernetes, follow these design principles:
• Single responsibility: each microservice focuses on one specific business capability.
• Loose coupling: services communicate through APIs and avoid sharing databases.
• High cohesion: related functionality is organized within the same service.
• Bounded context: service boundaries and responsibilities are explicit.
• Decentralized governance: each team can choose its own technology stack.
• Automated deployment: CI/CD pipelines support rapid iteration.
• Design for failure: services isolate faults and recover from them.
5.2 Deployment Patterns
Deploying each microservice as a single container is the most common pattern.
- # 单容器微服务部署示例
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: user-service
- spec:
- replicas: 3
- selector:
- matchLabels:
- app: user-service
- template:
- metadata:
- labels:
- app: user-service
- spec:
- containers:
- - name: user-service
- image: myregistry/user-service:1.0.0
- ports:
- - containerPort: 8080
- env:
- - name: DB_HOST
- value: "mysql-service"
- - name: DB_PORT
- value: "3306"
- livenessProbe:
- httpGet:
- path: /health
- port: 8080
- initialDelaySeconds: 30
- periodSeconds: 10
- readinessProbe:
- httpGet:
- path: /ready
- port: 8080
- initialDelaySeconds: 5
- periodSeconds: 5
- ---
- apiVersion: v1
- kind: Service
- metadata:
- name: user-service
- spec:
- selector:
- app: user-service
- ports:
- - protocol: TCP
- port: 80
- targetPort: 8080
Some microservices need several containers working together, for example an application container plus a log-collection container.
- # 多容器微服务部署示例
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: payment-service
- spec:
- replicas: 2
- selector:
- matchLabels:
- app: payment-service
- template:
- metadata:
- labels:
- app: payment-service
- spec:
- containers:
- - name: payment-app
- image: myregistry/payment-service:1.0.0
- ports:
- - containerPort: 8080
- volumeMounts:
- - name: log-volume
- mountPath: /var/log/payment
- - name: log-collector
- image: fluent/fluentd:v1.14-1
- volumeMounts:
- - name: log-volume
- mountPath: /var/log/payment
- - name: config-volume
- mountPath: /fluentd/etc
- volumes:
- - name: log-volume
- emptyDir: {}
- - name: config-volume
- configMap:
- name: fluentd-config
- ---
- apiVersion: v1
- kind: ConfigMap
- metadata:
- name: fluentd-config
- data:
- fluent.conf: |
- <source>
- @type tail
- path /var/log/payment/*.log
- pos_file /var/log/payment.log.pos
- tag payment.*
- format json
- </source>
-
- <match payment.**>
- @type elasticsearch
- host elasticsearch-service
- port 9200
- index_name payment
- type_name _doc
- </match>
5.3 Communication Patterns
Synchronous communication uses REST or gRPC, with service discovery provided by Kubernetes Services.
- # REST API服务示例
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: order-service
- spec:
- replicas: 2
- template:
- spec:
- containers:
- - name: order-service
- image: myregistry/order-service:1.0.0
- env:
- - name: USER_SERVICE_URL
- value: "http://user-service"
- - name: PRODUCT_SERVICE_URL
- value: "http://product-service"
- // Go example: calling another service
- func getUserInfo(userID string) (*User, error) {
- userServiceURL := os.Getenv("USER_SERVICE_URL")
- if userServiceURL == "" {
- userServiceURL = "http://user-service.default.svc.cluster.local"
- }
-
- resp, err := http.Get(fmt.Sprintf("%s/users/%s", userServiceURL, userID))
- if err != nil {
- return nil, fmt.Errorf("failed to call user service: %v", err)
- }
- defer resp.Body.Close()
-
- if resp.StatusCode != http.StatusOK {
- return nil, fmt.Errorf("user service returned non-200 status: %d", resp.StatusCode)
- }
-
- var user User
- if err := json.NewDecoder(resp.Body).Decode(&user); err != nil {
- return nil, fmt.Errorf("failed to decode user response: %v", err)
- }
-
- return &user, nil
- }
Asynchronous communication uses a message queue such as Kafka or RabbitMQ.
- # Kafka部署示例
- apiVersion: apps/v1
- kind: StatefulSet
- metadata:
- name: kafka
- spec:
- serviceName: kafka
- replicas: 3
- selector:
- matchLabels:
- app: kafka
- template:
- metadata:
- labels:
- app: kafka
- spec:
- containers:
- - name: kafka
- image: confluentinc/cp-kafka:latest
- ports:
- - containerPort: 9092
- env:
- - name: KAFKA_BROKER_ID
- valueFrom:
- fieldRef:
- fieldPath: metadata.name
- - name: KAFKA_ZOOKEEPER_CONNECT
- value: "zookeeper:2181"
- - name: KAFKA_ADVERTISED_LISTENERS
- value: "PLAINTEXT://kafka-0.kafka.default.svc.cluster.local:9092,kafka-1.kafka.default.svc.cluster.local:9092,kafka-2.kafka.default.svc.cluster.local:9092"
- volumeMounts:
- - name: data
- mountPath: /var/lib/kafka/data
- volumeClaimTemplates:
- - metadata:
- name: data
- spec:
- accessModes: [ "ReadWriteOnce" ]
- resources:
- requests:
- storage: 10Gi
- ---
- apiVersion: v1
- kind: Service
- metadata:
- name: kafka
- spec:
- ports:
- - port: 9092
- name: server
- clusterIP: None
- selector:
- app: kafka
- // Go example: Kafka producer
- func produceOrderEvent(order Order) error {
- broker := os.Getenv("KAFKA_BROKERS")
- if broker == "" {
- broker = "kafka:9092"
- }
-
- config := sarama.NewConfig()
- config.Producer.Return.Successes = true
-
- producer, err := sarama.NewSyncProducer([]string{broker}, config)
- if err != nil {
- return fmt.Errorf("failed to create Kafka producer: %v", err)
- }
- defer producer.Close()
-
- orderJSON, err := json.Marshal(order)
- if err != nil {
- return fmt.Errorf("failed to marshal order: %v", err)
- }
-
- msg := &sarama.ProducerMessage{
- Topic: "orders",
- Value: sarama.StringEncoder(orderJSON),
- }
-
- _, _, err = producer.SendMessage(msg)
- if err != nil {
- return fmt.Errorf("failed to send message: %v", err)
- }
-
- return nil
- }
5.4 Configuration Management
Use ConfigMaps and Secrets to manage configuration for each environment.
- # 开发环境ConfigMap
- apiVersion: v1
- kind: ConfigMap
- metadata:
- name: app-config-dev
- namespace: development
- data:
- log_level: "DEBUG"
- database_url: "jdbc:mysql://dev-db:3306/myapp_dev"
- feature_flags: "new-ui,experimental-api"
- # Production ConfigMap
- apiVersion: v1
- kind: ConfigMap
- metadata:
- name: app-config-prod
- namespace: production
- data:
- log_level: "INFO"
- database_url: "jdbc:mysql://prod-db.cluster.example.com:3306/myapp_prod"
- feature_flags: "new-ui"
- # Deployment that consumes the ConfigMap
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: myapp
- namespace: development
- spec:
- template:
- spec:
- containers:
- - name: myapp
- image: myregistry/myapp:1.0.0
- envFrom:
- - configMapRef:
- name: app-config-dev
Use a configuration center such as Spring Cloud Config, Consul, or Apollo for dynamic configuration updates.
- # Spring Cloud Config Server部署
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: config-server
- spec:
- replicas: 1
- selector:
- matchLabels:
- app: config-server
- template:
- metadata:
- labels:
- app: config-server
- spec:
- containers:
- - name: config-server
- image: myregistry/config-server:1.0.0
- ports:
- - containerPort: 8888
- env:
- - name: SPRING_PROFILES_ACTIVE
- value: "native"
- - name: SPRING_CLOUD_CONFIG_SERVER_NATIVE_SEARCH_LOCATIONS
- value: "file:/config"
- volumeMounts:
- - name: config-repo
- mountPath: /config
- volumes:
- - name: config-repo
- gitRepo:
- repository: https://github.com/myorg/config-repo.git
- revision: main
- ---
- apiVersion: v1
- kind: Service
- metadata:
- name: config-server
- spec:
- selector:
- app: config-server
- ports:
- - protocol: TCP
- port: 8888
- targetPort: 8888
- # Microservice that uses the config server
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: user-service
- spec:
- template:
- spec:
- containers:
- - name: user-service
- image: myregistry/user-service:1.0.0
- env:
- - name: SPRING_PROFILES_ACTIVE
- value: "kubernetes"
- - name: SPRING_CLOUD_CONFIG_URI
- value: "http://config-server:8888"
- - name: SPRING_CLOUD_CONFIG_NAME
- value: "user-service"
- livenessProbe:
- httpGet:
- path: /actuator/health
- port: 8080
- initialDelaySeconds: 60
- periodSeconds: 30
5.5 Monitoring and Logging
Deploy Prometheus and Grafana to monitor microservices.
- # Prometheus部署
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: prometheus
- spec:
- replicas: 1
- selector:
- matchLabels:
- app: prometheus
- template:
- metadata:
- labels:
- app: prometheus
- spec:
- containers:
- - name: prometheus
- image: prom/prometheus:v2.30.0
- ports:
- - containerPort: 9090
- volumeMounts:
- - name: prometheus-config
- mountPath: /etc/prometheus
- - name: prometheus-data
- mountPath: /prometheus
- volumes:
- - name: prometheus-config
- configMap:
- name: prometheus-config
- - name: prometheus-data
- emptyDir: {}
- ---
- apiVersion: v1
- kind: ConfigMap
- metadata:
- name: prometheus-config
- data:
- prometheus.yml: |
- global:
- scrape_interval: 15s
- scrape_configs:
- - job_name: 'kubernetes-pods'
- kubernetes_sd_configs:
- - role: pod
- relabel_configs:
- - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
- action: keep
- regex: true
- - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
- action: replace
- target_label: __metrics_path__
- regex: (.+)
- - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
- action: replace
- regex: ([^:]+)(?::\d+)?;(\d+)
- replacement: $1:$2
- target_label: __address__
- - action: labelmap
- regex: __meta_kubernetes_pod_label_(.+)
- - source_labels: [__meta_kubernetes_namespace]
- action: replace
- target_label: kubernetes_namespace
- - source_labels: [__meta_kubernetes_pod_name]
- action: replace
- target_label: kubernetes_pod_name
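The scrape configuration above only keeps Pods that opt in through annotations. As a minimal sketch (the Deployment name and metrics port are hypothetical), a microservice advertises its metrics endpoint to this Prometheus setup like so:

# Hypothetical Deployment pod template: annotations that match the relabel rules above
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
      annotations:
        prometheus.io/scrape: "true"    # keep this Pod as a scrape target
        prometheus.io/path: "/metrics"  # metrics endpoint path
        prometheus.io/port: "8080"      # port Prometheus should scrape
    spec:
      containers:
        - name: user-service
          image: myregistry/user-service:1.0.0
          ports:
            - containerPort: 8080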
- # Grafana deployment
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: grafana
- spec:
- replicas: 1
- selector:
- matchLabels:
- app: grafana
- template:
- metadata:
- labels:
- app: grafana
- spec:
- containers:
- - name: grafana
- image: grafana/grafana:8.2.0
- ports:
- - containerPort: 3000
- env:
- - name: GF_SECURITY_ADMIN_PASSWORD
- valueFrom:
- secretKeyRef:
- name: grafana-secret
- key: admin-password
- volumeMounts:
- - name: grafana-data
- mountPath: /var/lib/grafana
- volumes:
- - name: grafana-data
- emptyDir: {}
- ---
- apiVersion: v1
- kind: Service
- metadata:
- name: grafana
- spec:
- selector:
- app: grafana
- ports:
- - protocol: TCP
- port: 80
- targetPort: 3000
- type: NodePort
Deploy an EFK stack (Elasticsearch, Fluentd, Kibana) for log collection and analysis.
- # Elasticsearch部署
- apiVersion: apps/v1
- kind: StatefulSet
- metadata:
- name: elasticsearch
- spec:
- serviceName: elasticsearch
- replicas: 3
- selector:
- matchLabels:
- app: elasticsearch
- template:
- metadata:
- labels:
- app: elasticsearch
- spec:
- containers:
- - name: elasticsearch
- image: docker.elastic.co/elasticsearch/elasticsearch:7.15.0
- ports:
- - containerPort: 9200
- name: rest
- - containerPort: 9300
- name: inter-node
- env:
- - name: discovery.zen.minimum_master_nodes
- value: "2"
- - name: discovery.seed_hosts
- value: "elasticsearch-0.elasticsearch,elasticsearch-1.elasticsearch,elasticsearch-2.elasticsearch"
- - name: cluster.initial_master_nodes
- value: "elasticsearch-0,elasticsearch-1,elasticsearch-2"
- - name: ES_JAVA_OPTS
- value: "-Xms512m -Xmx512m"
- volumeMounts:
- - name: data
- mountPath: /usr/share/elasticsearch/data
- volumeClaimTemplates:
- - metadata:
- name: data
- spec:
- accessModes: [ "ReadWriteOnce" ]
- resources:
- requests:
- storage: 10Gi
- # Fluentd DaemonSet
- apiVersion: apps/v1
- kind: DaemonSet
- metadata:
- name: fluentd
- spec:
- selector:
- matchLabels:
- name: fluentd
- template:
- metadata:
- labels:
- name: fluentd
- spec:
- tolerations:
- - key: node-role.kubernetes.io/master
- effect: NoSchedule
- containers:
- - name: fluentd
- image: fluent/fluentd-kubernetes-daemonset:v1.14-debian-elasticsearch7-1
- env:
- - name: FLUENT_ELASTICSEARCH_HOST
- value: "elasticsearch"
- - name: FLUENT_ELASTICSEARCH_PORT
- value: "9200"
- - name: FLUENT_ELASTICSEARCH_SCHEME
- value: "http"
- resources:
- limits:
- memory: 512Mi
- requests:
- cpu: 100m
- memory: 200Mi
- volumeMounts:
- - name: varlog
- mountPath: /var/log
- - name: varlibdockercontainers
- mountPath: /var/lib/docker/containers
- readOnly: true
- terminationGracePeriodSeconds: 30
- volumes:
- - name: varlog
- hostPath:
- path: /var/log
- - name: varlibdockercontainers
- hostPath:
- path: /var/lib/docker/containers
- # Kibana deployment
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: kibana
- spec:
- replicas: 1
- selector:
- matchLabels:
- app: kibana
- template:
- metadata:
- labels:
- app: kibana
- spec:
- containers:
- - name: kibana
- image: docker.elastic.co/kibana/kibana:7.15.0
- ports:
- - containerPort: 5601
- env:
- - name: ELASTICSEARCH_HOSTS
- value: http://elasticsearch:9200
- ---
- apiVersion: v1
- kind: Service
- metadata:
- name: kibana
- spec:
- selector:
- app: kibana
- ports:
- - protocol: TCP
- port: 5601
- targetPort: 5601
- type: NodePort
6. Enterprise Kubernetes Techniques
6.1 Multi-Environment Management
Use namespaces to isolate environments, combined with network policies and resource quotas.
- # 开发环境命名空间
- apiVersion: v1
- kind: Namespace
- metadata:
- name: development
- labels:
- name: development
- env: dev
- ---
- # 生产环境命名空间
- apiVersion: v1
- kind: Namespace
- metadata:
- name: production
- labels:
- name: production
- env: prod
- # ResourceQuota example
- apiVersion: v1
- kind: ResourceQuota
- metadata:
- name: dev-quota
- namespace: development
- spec:
- hard:
- requests.cpu: "4"
- requests.memory: 8Gi
- limits.cpu: "10"
- limits.memory: 16Gi
- persistentvolumeclaims: "5"
- requests.storage: "50Gi"
- # NetworkPolicy example
- apiVersion: networking.k8s.io/v1
- kind: NetworkPolicy
- metadata:
- name: dev-network-policy
- namespace: development
- spec:
- podSelector: {}
- policyTypes:
- - Ingress
- - Egress
- ingress:
- - from:
- - namespaceSelector:
- matchLabels:
- name: development
- - namespaceSelector:
- matchLabels:
- name: shared-services
- egress:
- - to:
- - namespaceSelector:
- matchLabels:
- name: development
- - namespaceSelector:
- matchLabels:
- name: shared-services
Use Argo CD or Flux CD to implement a GitOps workflow for managing multi-environment deployments.
- # Argo CD Application示例
- apiVersion: argoproj.io/v1alpha1
- kind: Application
- metadata:
- name: myapp-dev
- namespace: argocd
- spec:
- project: default
- source:
- repoURL: https://github.com/myorg/myapp-k8s-manifests.git
- targetRevision: HEAD
- path: overlays/development
- destination:
- server: https://kubernetes.default.svc
- namespace: development
- syncPolicy:
- automated:
- prune: true
- selfHeal: true
- ---
- apiVersion: argoproj.io/v1alpha1
- kind: Application
- metadata:
- name: myapp-prod
- namespace: argocd
- spec:
- project: default
- source:
- repoURL: https://github.com/myorg/myapp-k8s-manifests.git
- targetRevision: HEAD
- path: overlays/production
- destination:
- server: https://kubernetes.default.svc
- namespace: production
- syncPolicy:
- automated:
- prune: true
- selfHeal: true
- syncOptions:
- - Validate=false
6.2 Security Best Practices
Use role-based access control (RBAC) to limit the permissions of users and service accounts (a service-account binding is sketched after the RBAC examples below).
- # 角色定义
- apiVersion: rbac.authorization.k8s.io/v1
- kind: Role
- metadata:
- namespace: development
- name: developer-role
- rules:
- - apiGroups: [""]
- resources: ["pods", "services", "configmaps", "secrets"]
- verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- - apiGroups: ["apps"]
- resources: ["deployments", "replicasets"]
- verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- # RoleBinding
- apiVersion: rbac.authorization.k8s.io/v1
- kind: RoleBinding
- metadata:
- name: developer-binding
- namespace: development
- subjects:
- - kind: User
- name: developer1
- apiGroup: rbac.authorization.k8s.io
- roleRef:
- kind: Role
- name: developer-role
- apiGroup: rbac.authorization.k8s.io
- # ClusterRole
- apiVersion: rbac.authorization.k8s.io/v1
- kind: ClusterRole
- metadata:
- name: cluster-admin-view
- rules:
- - apiGroups: [""]
- resources: ["nodes", "namespaces"]
- verbs: ["get", "list", "watch"]
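The bindings above grant permissions to a human user; for workloads the subject is usually a ServiceAccount. A minimal sketch (the account name is hypothetical, developer-role comes from the Role defined above) of binding the same Role to a service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: ci-deployer               # hypothetical service account used by a CI pipeline
  namespace: development
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-deployer-binding
  namespace: development
subjects:
  - kind: ServiceAccount
    name: ci-deployer
    namespace: development
roleRef:
  kind: Role
  name: developer-role            # the Role defined earlier in this section
  apiGroup: rbac.authorization.k8s.io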
Use Pod Security Policies or the Pod Security Admission controller to harden Pods. Note that PodSecurityPolicy is deprecated and was removed in Kubernetes 1.25, so new clusters should use Pod Security Admission.
- # Pod安全策略示例
- apiVersion: policy/v1beta1
- kind: PodSecurityPolicy
- metadata:
- name: restricted-psp
- spec:
- privileged: false
- allowPrivilegeEscalation: false
- requiredDropCapabilities:
- - ALL
- volumes:
- - 'configMap'
- - 'emptyDir'
- - 'projected'
- - 'secret'
- - 'downwardAPI'
- - 'persistentVolumeClaim'
- runAsUser:
- rule: 'MustRunAsNonRoot'
- seLinux:
- rule: 'RunAsAny'
- fsGroup:
- rule: 'RunAsAny'
- # Pod Security Standards (using Pod Security Admission)
- apiVersion: v1
- kind: Namespace
- metadata:
- name: secure-namespace
- labels:
- pod-security.kubernetes.io/enforce: restricted
- pod-security.kubernetes.io/enforce-version: v1.24
- pod-security.kubernetes.io/audit: restricted
- pod-security.kubernetes.io/audit-version: v1.24
- pod-security.kubernetes.io/warn: restricted
- pod-security.kubernetes.io/warn-version: v1.24
Use network policies to control Pod-to-Pod traffic and move toward a zero-trust network.
- # 默认拒绝所有入站和出站流量
- apiVersion: networking.k8s.io/v1
- kind: NetworkPolicy
- metadata:
- name: default-deny-all
- namespace: production
- spec:
- podSelector: {}
- policyTypes:
- - Ingress
- - Egress
- # Allow communication from specific namespaces and pods
- apiVersion: networking.k8s.io/v1
- kind: NetworkPolicy
- metadata:
- name: allow-api-to-db
- namespace: production
- spec:
- podSelector:
- matchLabels:
- app: database
- policyTypes:
- - Ingress
- ingress:
- - from:
- - namespaceSelector:
- matchLabels:
- name: api
- - podSelector:
- matchLabels:
- app: api-service
- ports:
- - protocol: TCP
- port: 3306
6.3 High Availability and Disaster Recovery
Deploy across multiple zones or regions for high availability and disaster recovery.
- # 多区域部署示例(使用反亲和性)
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: critical-service
- spec:
- replicas: 6
- template:
- spec:
- affinity:
- podAntiAffinity:
- requiredDuringSchedulingIgnoredDuringExecution:
- - labelSelector:
- matchExpressions:
- - key: app
- operator: In
- values:
- - critical-service
- topologyKey: "kubernetes.io/hostname"
- podAntiAffinity:
- preferredDuringSchedulingIgnoredDuringExecution:
- - weight: 100
- podAffinityTerm:
- labelSelector:
- matchExpressions:
- - key: app
- operator: In
- values:
- - critical-service
- topologyKey: "topology.kubernetes.io/zone"
- containers:
- - name: critical-service
- image: myregistry/critical-service:1.0.0
- ports:
- - containerPort: 8080
Use Velero for cluster backup and restore.
- # 安装Velero
- velero install \
- --provider aws \
- --bucket velero-backups \
- --secret-file ./credentials-velero \
- --use-volume-snapshots=true \
- --plugins velero/velero-plugin-for-aws:v1.3.0
- # 创建备份
- velero backup create my-backup --include-namespaces production
- # 计划备份
- velero schedule create daily-backup --schedule="0 2 * * *" --include-namespaces production
- # 恢复备份
- velero restore create --from-backup my-backup
6.4 Autoscaling
Use the Horizontal Pod Autoscaler (HPA) to adjust the number of Pods based on CPU, memory, or custom metrics.
- # HPA示例
- apiVersion: autoscaling/v2
- kind: HorizontalPodAutoscaler
- metadata:
- name: myapp-hpa
- spec:
- scaleTargetRef:
- apiVersion: apps/v1
- kind: Deployment
- name: myapp
- minReplicas: 2
- maxReplicas: 10
- metrics:
- - type: Resource
- resource:
- name: cpu
- target:
- type: Utilization
- averageUtilization: 50
- - type: Resource
- resource:
- name: memory
- target:
- type: Utilization
- averageUtilization: 70
- - type: Pods
- pods:
- metric:
- name: packets-per-second
- target:
- type: AverageValue
- averageValue: 1k
- - type: Object
- object:
- metric:
- name: requests-per-second
- describedObject:
- apiVersion: networking.k8s.io/v1beta1
- kind: Ingress
- name: myapp-ingress
- target:
- type: Value
- value: 10k
Use the Vertical Pod Autoscaler (VPA) to adjust Pod resource requests and limits automatically.
- # VPA示例
- apiVersion: autoscaling.k8s.io/v1
- kind: VerticalPodAutoscaler
- metadata:
- name: myapp-vpa
- spec:
- targetRef:
- apiVersion: "apps/v1"
- kind: Deployment
- name: myapp
- updatePolicy:
- updateMode: "Auto"
- resourcePolicy:
- containerPolicies:
- - containerName: "*"
- minAllowed:
- cpu: "100m"
- memory: "50Mi"
- maxAllowed:
- cpu: "1"
- memory: "500Mi"
- controlledResources: ["cpu", "memory"]
Use the Cluster Autoscaler to adjust the number of nodes to match resource demand (the example below uses the OpenShift ClusterAutoscaler CRD; on vanilla Kubernetes the Cluster Autoscaler is typically deployed per cloud provider).
- # 集群自动扩缩容配置示例
- apiVersion: "autoscaling.openshift.io/v1"
- kind: "ClusterAutoscaler"
- metadata:
- name: "default"
- spec:
- podPriorityThreshold: -10
- resourceLimits:
- maxNodesTotal: 50
- cores:
- min: 20
- max: 100
- memory:
- min: 80
- max: 400
- scaleDown:
- enabled: true
- delayAfterAdd: "10m"
- delayAfterDelete: "5m"
- delayAfterFailure: "3m"
- unneededTime: "30m"
7. Real-World Deployment Problems and Solutions
7.1 Resource Management
Problem: how do you set Pod resource requests and limits sensibly, avoiding both wasted resources and performance problems?
Solution:
1. Collect historical usage data with monitoring tools
2. Run load tests to determine actual resource needs
3. Use VPA to adjust resource settings automatically
4. Enforce resource quotas and limit ranges
- # LimitRange示例
- apiVersion: v1
- kind: LimitRange
- metadata:
- name: resource-limits
- namespace: development
- spec:
- limits:
- - default:
- cpu: "500m"
- memory: "512Mi"
- defaultRequest:
- cpu: "250m"
- memory: "256Mi"
- type: Container
- # Monitor resource usage with kubectl top
- kubectl top pods --all-namespaces
- kubectl top nodes
- # 使用Prometheus查询资源使用历史
- # 容器CPU使用率
- sum(rate(container_cpu_usage_seconds_total{image!="", container!="POD"}[5m])) by (namespace, pod)
- # 容器内存使用量
- sum(container_memory_working_set_bytes{image!="", container!="POD"}) by (namespace, pod)
Problem: how do you handle resource contention and protect the performance of critical services?
Solution:
1. Use the Kubernetes QoS classes (Guaranteed, Burstable, BestEffort)
2. Give critical services a higher priority
3. Use node affinity and anti-affinity to optimize resource placement
- # Guaranteed QoS示例(设置相等的requests和limits)
- apiVersion: v1
- kind: Pod
- metadata:
- name: guaranteed-pod
- spec:
- containers:
- - name: guaranteed-container
- image: nginx
- resources:
- requests:
- memory: "200Mi"
- cpu: "500m"
- limits:
- memory: "200Mi"
- cpu: "500m"
- # PriorityClass example
- apiVersion: scheduling.k8s.io/v1
- kind: PriorityClass
- metadata:
- name: high-priority
- value: 1000000
- globalDefault: false
- description: "This priority class should be used for critical service pods only."
- ---
- apiVersion: v1
- kind: Pod
- metadata:
- name: critical-pod
- spec:
- priorityClassName: high-priority
- containers:
- - name: critical-container
- image: myregistry/critical-service:1.0.0
7.2 Network Communication
Problem: how do microservices discover each other efficiently?
Solution:
1. Use the built-in Kubernetes DNS service
2. Adopt a service mesh (such as Istio or Linkerd)
3. Use a headless Service for stateful service discovery
- # Headless Service示例
- apiVersion: v1
- kind: Service
- metadata:
- name: stateful-service
- spec:
- clusterIP: None # Headless Service
- selector:
- app: stateful-app
- ports:
- - port: 80
- targetPort: 8080
- // Go example: discovering service endpoints
- func discoverServiceEndpoints(serviceName, namespace string) ([]string, error) {
- // 使用Kubernetes客户端库查询服务端点
- config, err := rest.InClusterConfig()
- if err != nil {
- return nil, fmt.Errorf("failed to get in-cluster config: %v", err)
- }
-
- clientset, err := kubernetes.NewForConfig(config)
- if err != nil {
- return nil, fmt.Errorf("failed to create clientset: %v", err)
- }
-
- endpoints, err := clientset.CoreV1().Endpoints(namespace).Get(context.TODO(), serviceName, metav1.GetOptions{})
- if err != nil {
- return nil, fmt.Errorf("failed to get endpoints: %v", err)
- }
-
- var addresses []string
- for _, subset := range endpoints.Subsets {
- for _, port := range subset.Ports {
- for _, addr := range subset.Addresses {
- addresses = append(addresses, fmt.Sprintf("%s:%d", addr.IP, port.Port))
- }
- }
- }
-
- return addresses, nil
- }
Problem: how do you optimize the performance of service-to-service communication?
Solution:
1. Use local caching to reduce network calls
2. Use connection pooling and HTTP/2
3. Use a service mesh to optimize communication paths
4. Apply network policies to optimize traffic routing
- # Istio服务网格配置示例
- apiVersion: networking.istio.io/v1alpha3
- kind: DestinationRule
- metadata:
- name: user-service
- spec:
- host: user-service
- trafficPolicy:
- connectionPool:
- tcp:
- maxConnections: 100
- connectTimeout: 30ms
- tcpKeepalive:
- time: 7200s
- interval: 75s
- http:
- http1MaxPendingRequests: 100
- http2MaxRequests: 1000
- maxRetries: 3
- idleTimeout: 90s
- h2UpgradePolicy: UPGRADE
- outlierDetection:
- consecutiveGatewayErrors: 5
- interval: 30s
- baseEjectionTime: 30s
- maxEjectionPercent: 50
- // Go example: HTTP/2 client
- func createHTTP2Client() *http.Client {
- transport := &http.Transport{
- ForceAttemptHTTP2: true,
- MaxIdleConns: 100,
- IdleConnTimeout: 90 * time.Second,
- TLSClientConfig: &tls.Config{
- InsecureSkipVerify: true, // 仅用于测试环境
- },
- }
-
- return &http.Client{
- Transport: transport,
- Timeout: 30 * time.Second,
- }
- }
- // 使用连接池示例
- var httpClient = &http.Client{
- Transport: &http.Transport{
- MaxIdleConns: 100,
- MaxIdleConnsPerHost: 10,
- IdleConnTimeout: 90 * time.Second,
- },
- Timeout: 30 * time.Second,
- }
7.3 Storage Management
Problem: how do you provide reliable persistent storage for stateful microservices?
Solution:
1. Manage stateful applications with StatefulSets
2. Configure persistent volumes and persistent volume claims
3. Define StorageClasses to enable dynamic volume provisioning
4. Use a distributed storage system (such as Ceph or GlusterFS)
- # StatefulSet示例
- apiVersion: apps/v1
- kind: StatefulSet
- metadata:
- name: cassandra
- spec:
- serviceName: cassandra
- replicas: 3
- selector:
- matchLabels:
- app: cassandra
- template:
- metadata:
- labels:
- app: cassandra
- spec:
- containers:
- - name: cassandra
- image: cassandra:3.11
- ports:
- - containerPort: 9042
- name: cql
- volumeMounts:
- - name: cassandra-data
- mountPath: /var/lib/cassandra
- volumeClaimTemplates:
- - metadata:
- name: cassandra-data
- spec:
- accessModes: [ "ReadWriteOnce" ]
- storageClassName: fast-ssd
- resources:
- requests:
- storage: 10Gi
- # StorageClass example
- apiVersion: storage.k8s.io/v1
- kind: StorageClass
- metadata:
- name: fast-ssd
- provisioner: kubernetes.io/gce-pd
- parameters:
- type: pd-ssd
- replication-type: regional-pd
- allowVolumeExpansion: true
- volumeBindingMode: WaitForFirstConsumer
Problem: how do you back up and restore microservice data reliably?
Solution:
1. Back up Kubernetes resources and persistent volumes with Velero
2. Use database-specific backup tools (pg_dump, mysqldump)
3. Test the restore procedure regularly
4. Keep off-site copies of backups
- # Velero备份配置示例
- apiVersion: velero.io/v1
- kind: Backup
- metadata:
- name: daily-backup
- namespace: velero
- spec:
- includedNamespaces:
- - production
- includedResources:
- - persistentvolumeclaims
- - persistentvolumes
- - deployments
- - statefulsets
- - configmaps
- - secrets
- ttl: 168h0m0s # 7天
- storageLocation: default
- volumeSnapshotLocations:
- - default
- # Database backup script example
- #!/bin/bash
- # MySQL备份
- mysqldump -h $DB_HOST -u $DB_USER -p$DB_PASSWORD --all-databases | gzip > /backups/mysql-$(date +%Y%m%d).sql.gz
- # PostgreSQL备份
- pg_dump -h $DB_HOST -U $DB_USER -d $DB_NAME | gzip > /backups/postgres-$(date +%Y%m%d).sql.gz
- # MongoDB备份
- mongodump --host $DB_HOST --username $DB_USER --password $DB_PASSWORD --out /backups/mongodb-$(date +%Y%m%d)
- tar -czf /backups/mongodb-$(date +%Y%m%d).tar.gz /backups/mongodb-$(date +%Y%m%d)
- rm -rf /backups/mongodb-$(date +%Y%m%d)
- # 上传到云存储
- aws s3 cp /backups/ s3://$S3_BUCKET/$(date +%Y%m%d)/ --recursive
7.4 Configuration Management
Problem: how do you keep configuration consistent across development, test, and production?
Solution:
1. Manage environment differences with Kustomize
2. Use configuration templates and parameterization
3. Synchronize configuration through a GitOps workflow
4. Version-control and audit configuration changes
- # Kustomize基础配置
- # base/deployment.yaml
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: myapp
- spec:
- replicas: 2
- selector:
- matchLabels:
- app: myapp
- template:
- metadata:
- labels:
- app: myapp
- spec:
- containers:
- - name: myapp
- image: myregistry/myapp:1.0.0
- ports:
- - containerPort: 8080
- env:
- - name: LOG_LEVEL
- value: "INFO"
- # Kustomize overlay (development)
- # overlays/development/kustomization.yaml
- apiVersion: kustomize.config.k8s.io/v1beta1
- kind: Kustomization
- resources:
- - ../../base
- patchesStrategicMerge:
- - |-
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: myapp
- spec:
- replicas: 1
- template:
- spec:
- containers:
- - name: myapp
- env:
- - name: LOG_LEVEL
- value: "DEBUG"
- - name: DATABASE_URL
- value: "jdbc:mysql://dev-db:3306/myapp_dev"
- # Kustomize overlay (production)
- # overlays/production/kustomization.yaml
- apiVersion: kustomize.config.k8s.io/v1beta1
- kind: Kustomization
- resources:
- - ../../base
- patchesStrategicMerge:
- - |-
- apiVersion: apps/v1
- kind: Deployment
- metadata:
- name: myapp
- spec:
- replicas: 5
- template:
- spec:
- containers:
- - name: myapp
- env:
- - name: LOG_LEVEL
- value: "WARN"
- - name: DATABASE_URL
- value: "jdbc:mysql://prod-db.cluster.example.com:3306/myapp_prod"
- resources:
- requests:
- memory: "512Mi"
- cpu: "500m"
- limits:
- memory: "1Gi"
- cpu: "1000m"
Problem: how do you manage sensitive data (passwords, API keys) in microservices securely?
Solution:
1. Use Kubernetes Secret objects
2. Integrate an external secret manager (such as HashiCorp Vault)
3. Implement a key-rotation policy
4. Use tooling such as Sealed Secrets to keep encrypted secrets safely in Git
- # 使用外部密钥管理系统的示例
- apiVersion: v1
- kind: Secret
- metadata:
- name: db-credentials
- annotations:
- vault.security.banzaicloud.io/vault-role: "database"
- vault.security.banzaicloud.io/vault-path: "database/creds/myapp"
- type: Opaque
- # Sealed Secret example
- # 1. 创建普通Secret
- apiVersion: v1
- kind: Secret
- metadata:
- name: my-secret
- type: Opaque
- data:
- username: YWRtaW4=
- password: MWYyZDFlMmU2N2Rm
- # 2. 使用kubeseal工具加密
- kubeseal --format yaml --cert mycert.pem < my-secret.yaml > my-sealed-secret.yaml
- # 3. 生成的SealedSecret可以安全地提交到代码库
- apiVersion: bitnami.com/v1alpha1
- kind: SealedSecret
- metadata:
- creationTimestamp: null
- name: my-secret
- namespace: my-namespace
- spec:
- encryptedData:
- password: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEq4KuP1j4oFQm6S8Qw7JF5Yg1x8uyRVlMG6yBhU6G9XuVqYzY6j5Z5J5O5i9j7k5l5m5n5o5p5q5r5s5t5u5v5w==
- username: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEq4KuP1j4oFQm6S8Qw7JF5Yg1x8uyRVlMG6yBhU6G9XuVqYzY6j5Z5J5O5i9j7k5l5m5n5o5p5q5r5s5t5u5v5w==
- template:
- metadata:
- creationTimestamp: null
- name: my-secret
- namespace: my-namespace
- type: Opaque
8. Improving Kubernetes Operations Efficiency
8.1 Automation
Use Jenkins, GitLab CI, or GitHub Actions to build automated CI/CD pipelines.
- # GitHub Actions示例
- name: Build and Deploy to Kubernetes
- on:
- push:
- branches: [ main ]
- env:
- REGISTRY: ghcr.io
- IMAGE_NAME: ${{ github.repository }}
- jobs:
- build-and-push:
- runs-on: ubuntu-latest
- permissions:
- contents: read
- packages: write
- steps:
- - name: Checkout repository
- uses: actions/checkout@v2
- - name: Log in to the Container registry
- uses: docker/login-action@f054a8b539a109f9f41c372932f1ae047eff08c9
- with:
- registry: ${{ env.REGISTRY }}
- username: ${{ github.actor }}
- password: ${{ secrets.GITHUB_TOKEN }}
- - name: Extract metadata
- id: meta
- uses: docker/metadata-action@98669ae865ea3cffbcbaa878cf57c20bbf1c6c38
- with:
- images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
- - name: Build and push Docker image
- uses: docker/build-push-action@ad44023a93711e3deb337508980b4b5e9bcdc5dc
- with:
- context: .
- push: true
- tags: ${{ steps.meta.outputs.tags }}
- labels: ${{ steps.meta.outputs.labels }}
- deploy:
- needs: build-and-push
- runs-on: ubuntu-latest
- steps:
- - name: Checkout repository
- uses: actions/checkout@v2
- - name: Setup kubectl
- uses: azure/setup-kubectl@v1
- with:
- version: '1.22.0'
- - name: Configure kubeconfig
- run: |
- mkdir -p $HOME/.kube
- echo "${{ secrets.KUBECONFIG }}" | base64 -d > $HOME/.kube/config
- chmod 600 $HOME/.kube/config
- - name: Deploy to Kubernetes
- run: |
- # 更新镜像标签
- sed -i "s|IMAGE_TAG|${{ github.sha }}|g" k8s/deployment.yaml
-
- # 应用配置
- kubectl apply -f k8s/
-
- # 验证部署
- kubectl rollout status deployment/myapp
Write Shell or Python scripts to automate routine operational tasks.
- #!/bin/bash
- # Kubernetes集群健康检查脚本
- # 检查节点状态
- echo "=== 检查节点状态 ==="
- kubectl get nodes
- # 检查系统Pod状态
- echo -e "\n=== 检查系统Pod状态 ==="
- kubectl get pods -n kube-system
- # 检查所有命名空间中的异常Pod
- echo -e "\n=== 检查异常Pod ==="
- kubectl get pods --all-namespaces --field-selector=status.phase!=Running,status.phase!=Succeeded
- # 检查未就绪的Deployment
- echo -e "\n=== 检查未就绪的Deployment ==="
- kubectl get deployments --all-namespaces -o custom-columns=NAME:.metadata.namespace/.metadata.name,READY:.status.readyReplicas,UP_TO_DATE:.status.updatedReplicas,AVAILABLE:.status.availableReplicas | awk '$2!=$3 || $2!=$4'
- # 检查资源使用情况
- echo -e "\n=== 检查资源使用情况 ==="
- kubectl top nodes
- # 检查证书过期情况
- echo -e "\n=== 检查证书过期情况 ==="
- kubectl get secrets -n kube-system -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.data.ca\.crt}{"\n"}{end}' | while read -r secret_name ca_crt_base64; do
- if [ -n "$ca_crt_base64" ]; then
- echo "$ca_crt_base64" | base64 -d | openssl x509 -noout -dates | grep notAfter | awk -F'=' '{print $2}' | xargs -I {} date -d {} -D "%b %d %H:%M:%S %Y %Z" +"%Y-%m-%d" | while read -r expiry_date; do
- current_date=$(date +"%Y-%m-%d")
- days_left=$(( ($(date -d "$expiry_date" +%s) - $(date -d "$current_date" +%s)) / 86400 ))
- if [ "$days_left" -lt 30 ]; then
- echo "警告: 证书 $secret_name 将在 $days_left 天后过期 ($expiry_date)"
- fi
- done
- fi
- done
- #!/usr/bin/env python3
- # Kubernetes资源清理脚本
- from kubernetes import client, config
- import datetime
- import argparse
- def cleanup_completed_jobs(namespace, older_than_days):
- config.load_kube_config()
- batch_v1 = client.BatchV1Api()
-
- now = datetime.datetime.now()
- cutoff_date = now - datetime.timedelta(days=older_than_days)
-
- print(f"清理 {namespace} 命名空间中早于 {older_than_days} 天的已完成Job...")
-
- jobs = batch_v1.list_namespaced_job(namespace)
- count = 0
-
- for job in jobs.items:
- if job.status.completion_time and job.status.completion_time < cutoff_date:
- print(f"删除Job: {job.metadata.name} (完成时间: {job.status.completion_time})")
- batch_v1.delete_namespaced_job(job.metadata.name, namespace)
- count += 1
-
- print(f"已删除 {count} 个Job")
- def cleanup_failed_pods(namespace, older_than_days):
- config.load_kube_config()
- core_v1 = client.CoreV1Api()
-
- now = datetime.datetime.now()
- cutoff_date = now - datetime.timedelta(days=older_than_days)
-
- print(f"清理 {namespace} 命名空间中早于 {older_than_days} 天的失败Pod...")
-
- pods = core_v1.list_namespaced_pod(namespace)
- count = 0
-
- for pod in pods.items:
- if pod.status.phase == "Failed" and pod.status.start_time and pod.status.start_time < cutoff_date:
- print(f"删除Pod: {pod.metadata.name} (状态: {pod.status.phase}, 开始时间: {pod.status.start_time})")
- core_v1.delete_namespaced_pod(pod.metadata.name, namespace)
- count += 1
-
- print(f"已删除 {count} 个Pod")
- def main():
- parser = argparse.ArgumentParser(description="Kubernetes资源清理工具")
- parser.add_argument("--namespace", required=True, help="命名空间")
- parser.add_argument("--older-than", type=int, default=7, help="清理早于N天的资源")
- parser.add_argument("--jobs", action="store_true", help="清理已完成的Job")
- parser.add_argument("--pods", action="store_true", help="清理失败的Pod")
-
- args = parser.parse_args()
-
- if args.jobs:
- cleanup_completed_jobs(args.namespace, args.older_than)
-
- if args.pods:
- cleanup_failed_pods(args.namespace, args.older_than)
- if __name__ == "__main__":
- main()
8.2 Monitoring and Alerting
Build a complete monitoring stack with Prometheus, Grafana, and Alertmanager.
- # AlertManager配置示例
- apiVersion: v1
- kind: ConfigMap
- metadata:
- name: alertmanager-config
- namespace: monitoring
- data:
- config.yml: |
- global:
- resolve_timeout: 5m
- smtp_smarthost: 'smtp.example.com:587'
- smtp_from: 'alerts@example.com'
- smtp_auth_username: 'alerts@example.com'
- smtp_auth_password: 'password'
-
- route:
- group_by: ['alertname', 'cluster', 'service']
- group_wait: 30s
- group_interval: 5m
- repeat_interval: 12h
- receiver: 'web.hook'
- routes:
- - match:
- alertname: DeadMansSwitch
- receiver: 'deadmansswitch'
- - match:
- service: database
- receiver: 'database-pager'
- - match_re:
- service: ^(api|web).*
- receiver: 'webhook'
-
- receivers:
- - name: 'deadmansswitch'
- - name: 'web.hook'
- webhook_configs:
- - url: 'http://127.0.0.1:5001/'
- - name: 'database-pager'
- email_configs:
- - to: 'database-team@example.com'
- - name: 'webhook'
- webhook_configs:
- - url: 'http://webhook.example.com/alerts'
- # Prometheus alerting rules example
- apiVersion: v1
- kind: ConfigMap
- metadata:
- name: prometheus-alerts
- namespace: monitoring
- data:
- alerts.yml: |
- groups:
- - name: kubernetes-apps
- rules:
- - alert: PodCrashLooping
- expr: rate(kube_pod_container_status_restarts_total[15m]) * 60 > 0
- for: 15m
- labels:
- severity: critical
- annotations:
- summary: "Pod {{ $labels.pod }} is crash looping"
- description: "Pod {{ $labels.pod }} ({{ $labels.namespace }}) is in crash loop back-off state."
-
- - alert: PodNotReady
- expr: sum by (namespace, pod) (kube_pod_status_ready{condition="false"}) == 1
- for: 10m
- labels:
- severity: warning
- annotations:
- summary: "Pod {{ $labels.pod }} is not ready"
- description: "Pod {{ $labels.pod }} ({{ $labels.namespace }}) has been in a non-ready state for more than 10 minutes."
-
- - name: kubernetes-resources
- rules:
- - alert: NodeMemoryUsage
- expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 85
- for: 5m
- labels:
- severity: warning
- annotations:
- summary: "Node memory usage is high"
- description: "Node {{ $labels.instance }} memory usage is above 85% (current value: {{ $value }}%)."
-
- - alert: NodeDiskUsage
- expr: (1 - (node_filesystem_avail_bytes{fstype!="tmpfs"} / node_filesystem_size_bytes{fstype!="tmpfs"})) * 100 > 85
- for: 5m
- labels:
- severity: warning
- annotations:
- summary: "Node disk usage is high"
- description: "Node {{ $labels.instance }} disk usage is above 85% (current value: {{ $value }}%)."
Use an ELK or EFK stack for log analysis and alerting.
- # Elasticsearch告警规则示例(使用ElastAlert)
- apiVersion: v1
- kind: ConfigMap
- metadata:
- name: elastalert-config
- namespace: logging
- data:
- config.yaml: |
- rules_folder: /etc/elastalert/rules
- run_every:
- seconds: 10
- buffer_time:
- minutes: 15
- es_host: elasticsearch
- es_port: 9200
- writeback_index: elastalert_status
- writeback_alias: elastalert_alerts
- alert_time_limit:
- days: 2
- # Alert rule example
- apiVersion: v1
- kind: ConfigMap
- metadata:
- name: elastalert-rules
- namespace: logging
- data:
- error_frequency.yaml: |
- name: High Error Frequency
- type: frequency
- index: kubernetes-*
- num_events: 50
- timeframe:
- minutes: 5
- filter:
- - query:
- query_string:
- query: "log_level: ERROR"
- alert:
- - "email"
- email: "devops@example.com"
- alert_text: "High error frequency detected in service {0}"
- alert_text_args: ["kubernetes.pod_name"]
- alert_subject: "High Error Frequency Alert"
8.3 Operations Tooling
Use Rancher, Lens, or kubectl plugins to manage multiple clusters.
- # 安装kubectl插件kubectx和kubens
- git clone https://github.com/ahmetb/kubectx /opt/kubectx
- ln -sf /opt/kubectx/kubectx /usr/local/bin/kubectx
- ln -sf /opt/kubectx/kubens /usr/local/bin/kubens
- # 配置多集群访问
- kubectl config view --flatten > ~/.kube/config
- # 使用kubectx切换集群
- kubectx prod-cluster
- kubectx dev-cluster
- # 使用kubens切换命名空间
- kubens production
- kubens development
- # Rancher cluster configuration example
- apiVersion: provisioning.cattle.io/v1
- kind: Cluster
- metadata:
- name: production-cluster
- namespace: fleet-default
- spec:
- rkeConfig:
- machinePools:
- - name: control-plane
- controlPlaneRole: true
- etcdRole: true
- quantity: 3
- machineConfigRef:
- name: control-plane-config
- namespace: fleet-default
- - name: worker
- workerRole: true
- quantity: 5
- machineConfigRef:
- name: worker-config
- namespace: fleet-default
Use Helm, Kustomize, or the Operator SDK to simplify resource management.
- # Helm Chart示例
- # Chart.yaml
- apiVersion: v2
- name: myapp
- description: A Helm chart for my application
- version: 0.1.0
- appVersion: 1.0.0
- # values.yaml
- replicaCount: 3
- image:
- repository: myregistry/myapp
- pullPolicy: IfNotPresent
- tag: "1.0.0"
- service:
- type: ClusterIP
- port: 80
- ingress:
- enabled: true
- annotations:
- kubernetes.io/ingress.class: nginx
- hosts:
- - host: myapp.example.com
- paths: ["/"]
- resources:
- limits:
- cpu: 100m
- memory: 128Mi
- requests:
- cpu: 100m
- memory: 128Mi
- autoscaling:
- enabled: true
- minReplicas: 3
- maxReplicas: 10
- targetCPUUtilizationPercentage: 80
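The Chart.yaml and values.yaml above only declare configuration; the manifests a chart actually installs live under templates/. A minimal sketch of a hypothetical templates/deployment.yaml that consumes those values:

# templates/deployment.yaml (hypothetical) -- renders values.yaml into a Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Chart.Name }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      app: {{ .Chart.Name }}
  template:
    metadata:
      labels:
        app: {{ .Chart.Name }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - containerPort: {{ .Values.service.port }}   # assumes the container listens on the service port
          resources:
            {{- toYaml .Values.resources | nindent 12 }}

Rendered with a command such as helm install myapp ./myapp, this template turns values.yaml into a concrete Deployment, so per-environment changes stay in values files rather than in manifests.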
- # Operator example (using the Operator SDK)
- # api/v1alpha1/myapp_types.go
- package v1alpha1
- import (
- metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
- )
- // MyAppSpec defines the desired state of MyApp
- type MyAppSpec struct {
- // Size is the size of the myapp deployment
- Size int32 `json:"size"`
- // Image is the container image to use
- Image string `json:"image"`
- // Port is the port the container should expose
- Port int32 `json:"port"`
- }
- // MyAppStatus defines the observed state of MyApp
- type MyAppStatus struct {
- // Nodes represents the current nodes in the myapp cluster
- Nodes []string `json:"nodes"`
- }
- // +kubebuilder:object:root=true
- // +kubebuilder:subresource:status
- // MyApp is the Schema for the myapps API
- type MyApp struct {
- metav1.TypeMeta `json:",inline"`
- metav1.ObjectMeta `json:"metadata,omitempty"`
- Spec MyAppSpec `json:"spec,omitempty"`
- Status MyAppStatus `json:"status,omitempty"`
- }
- // +kubebuilder:object:root=true
- // MyAppList contains a list of MyApp
- type MyAppList struct {
- metav1.TypeMeta `json:",inline"`
- metav1.ListMeta `json:"metadata,omitempty"`
- Items []MyApp `json:"items"`
- }
- func init() {
- SchemeBuilder.Register(&MyApp{}, &MyAppList{})
- }
- // controllers/myapp_controller.go
- package controllers
- import (
- "context"
- "fmt"
- "time"
- appsv1 "k8s.io/api/apps/v1"
- corev1 "k8s.io/api/core/v1"
- "k8s.io/apimachinery/pkg/api/errors"
- "k8s.io/apimachinery/pkg/api/resource"
- metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
- "k8s.io/apimachinery/pkg/types"
- "k8s.io/apimachinery/pkg/runtime"
- ctrl "sigs.k8s.io/controller-runtime"
- "sigs.k8s.io/controller-runtime/pkg/client"
- myappv1alpha1 "myapp.domain.com/api/v1alpha1"
- )
- // MyAppReconciler reconciles a MyApp object
- type MyAppReconciler struct {
- client.Client
- Scheme *runtime.Scheme
- }
- // +kubebuilder:rbac:groups=myapp.domain.com,resources=myapps,verbs=get;list;watch;create;update;patch;delete
- // +kubebuilder:rbac:groups=myapp.domain.com,resources=myapps/status,verbs=get;update;patch
- // +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
- // +kubebuilder:rbac:groups=core,resources=services,verbs=get;list;watch;create;update;patch;delete
- func (r *MyAppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
- reqLogger := log.WithValues("Request.Namespace", req.Namespace, "Request.Name", req.Name)
- reqLogger.Info("Reconciling MyApp")
- // Fetch the MyApp instance
- myapp := &myappv1alpha1.MyApp{}
- err := r.Get(ctx, req.NamespacedName, myapp)
- if err != nil {
- if errors.IsNotFound(err) {
- // Request object not found, could have been deleted after reconcile request.
- // Owned objects are automatically garbage collected. For additional cleanup logic use finalizers.
- // Return and don't requeue
- reqLogger.Info("MyApp resource not found. Ignoring since object must be deleted.")
- return ctrl.Result{}, nil
- }
- // Error reading the object - requeue the request.
- reqLogger.Error(err, "Failed to get MyApp")
- return ctrl.Result{}, err
- }
- // Check if the deployment already exists, if not create a new one
- found := &appsv1.Deployment{}
- err = r.Get(ctx, types.NamespacedName{Name: myapp.Name, Namespace: myapp.Namespace}, found)
- if err != nil && errors.IsNotFound(err) {
- // Define a new deployment
- dep := r.deploymentForMyApp(myapp)
- reqLogger.Info("Creating a new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
- err = r.Create(ctx, dep)
- if err != nil {
- reqLogger.Error(err, "Failed to create new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
- return ctrl.Result{}, err
- }
- // Deployment created successfully - return and requeue
- return ctrl.Result{Requeue: true}, nil
- } else if err != nil {
- reqLogger.Error(err, "Failed to get Deployment")
- return ctrl.Result{}, err
- }
- // Ensure the deployment size is the same as the spec
- size := myapp.Spec.Size
- if *found.Spec.Replicas != size {
- found.Spec.Replicas = &size
- err = r.Update(ctx, found)
- if err != nil {
- reqLogger.Error(err, "Failed to update Deployment", "Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)
- return ctrl.Result{}, err
- }
- // Spec updated - return and requeue
- return ctrl.Result{Requeue: true}, nil
- }
- // Update the MyApp status with the pod names
- // List the pods for this myapp's deployment
- podList := &corev1.PodList{}
- listOpts := []client.ListOption{
- client.InNamespace(myapp.Namespace),
- client.MatchingLabels(labelsForMyApp(myapp.Name)),
- }
- if err = r.List(ctx, podList, listOpts...); err != nil {
- reqLogger.Error(err, "Failed to list pods", "MyApp.Namespace", myapp.Namespace, "MyApp.Name", myapp.Name)
- return ctrl.Result{}, err
- }
- podNames := getPodNames(podList.Items)
- // Update status.Nodes if needed
- if !reflect.DeepEqual(podNames, myapp.Status.Nodes) {
- myapp.Status.Nodes = podNames
- err := r.Status().Update(ctx, myapp)
- if err != nil {
- reqLogger.Error(err, "Failed to update MyApp status")
- return ctrl.Result{}, err
- }
- }
- return ctrl.Result{RequeueAfter: time.Minute * 5}, nil
- }
- func (r *MyAppReconciler) deploymentForMyApp(m *myappv1alpha1.MyApp) *appsv1.Deployment {
- labels := labelsForMyApp(m.Name)
- replicas := m.Spec.Size
- dep := &appsv1.Deployment{
- ObjectMeta: metav1.ObjectMeta{
- Name: m.Name,
- Namespace: m.Namespace,
- },
- Spec: appsv1.DeploymentSpec{
- Replicas: &replicas,
- Selector: &metav1.LabelSelector{
- MatchLabels: labels,
- },
- Template: corev1.PodTemplateSpec{
- ObjectMeta: metav1.ObjectMeta{
- Labels: labels,
- },
- Spec: corev1.PodSpec{
- Containers: []corev1.Container{{
- Image: m.Spec.Image,
- Name: m.Name,
- Ports: []corev1.ContainerPort{{
- ContainerPort: m.Spec.Port,
- Name: "http",
- }},
- Resources: corev1.ResourceRequirements{
- Limits: corev1.ResourceList{
- corev1.ResourceCPU: resource.MustParse("100m"),
- corev1.ResourceMemory: resource.MustParse("100Mi"),
- },
- Requests: corev1.ResourceList{
- corev1.ResourceCPU: resource.MustParse("50m"),
- corev1.ResourceMemory: resource.MustParse("50Mi"),
- },
- },
- }},
- },
- },
- },
- }
- // Set MyApp instance as the owner and controller
- ctrl.SetControllerReference(m, dep, r.Scheme)
- return dep
- }
- func labelsForMyApp(name string) map[string]string {
- return map[string]string{"app": "myapp", "myapp_cr": name}
- }
- func getPodNames(pods []corev1.Pod) []string {
- var podNames []string
- for _, pod := range pods {
- podNames = append(podNames, pod.Name)
- }
- return podNames
- }
- func (r *MyAppReconciler) SetupWithManager(mgr ctrl.Manager) error {
- return ctrl.NewControllerManagedBy(mgr).
- For(&myappv1alpha1.MyApp{}).
- Owns(&appsv1.Deployment{}).
- Complete(r)
- }
9. Advanced Topics and Best Practices
9.1 Service Mesh
Deploy and use the Istio service mesh to gain richer control over microservice communication.
- # Istio Operator部署
- apiVersion: install.istio.io/v1alpha1
- kind: IstioOperator
- metadata:
- namespace: istio-system
- name: istio-controlplane
- spec:
- profile: default
- components:
- pilot:
- k8s:
- env:
- - name: PILOT_TRACE_SAMPLING
- value: "100"
- ingressGateways:
- - name: istio-ingressgateway
- enabled: true
- k8s:
- serviceAnnotations:
- service.beta.kubernetes.io/aws-load-balancer-type: nlb
- service:
- type: LoadBalancer
- ports:
- - port: 80
- targetPort: 8080
- name: http2
- - port: 443
- targetPort: 8443
- name: https
- egressGateways:
- - name: istio-egressgateway
- enabled: true
- values:
- global:
- meshID: mesh1
- network: network1
复制代码
- # Istio VirtualService示例
- apiVersion: networking.istio.io/v1alpha3
- kind: VirtualService
- metadata:
- name: reviews
- spec:
- hosts:
- - reviews
- http:
- - match:
- - headers:
- end-user:
- exact: jason
- fault:
- delay:
- percentage:
- value: 100
- fixedDelay: 7s
- route:
- - destination:
- host: reviews
- subset: v2
- - route:
- - destination:
- host: reviews
- subset: v3
复制代码
- # Istio DestinationRule示例
- apiVersion: networking.istio.io/v1alpha3
- kind: DestinationRule
- metadata:
- name: reviews
- spec:
- host: reviews
- trafficPolicy:
- connectionPool:
- tcp:
- maxConnections: 100
- connectTimeout: 30ms
- tcpKeepalive:
- time: 7200s
- interval: 75s
- http:
- http1MaxPendingRequests: 100
- http2MaxRequests: 1000
- maxRetries: 3
- idleTimeout: 90s
- h2UpgradePolicy: UPGRADE
- outlierDetection:
- consecutiveGatewayErrors: 5
- interval: 30s
- baseEjectionTime: 30s
- maxEjectionPercent: 50
- subsets:
- - name: v1
- labels:
- version: v1
- - name: v2
- labels:
- version: v2
- - name: v3
- labels:
- version: v3
复制代码
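上述Istio配置要生效,目标工作负载所在的命名空间通常需要开启Sidecar自动注入,并将VirtualService/DestinationRule应用到集群。下面是一个简要的操作示例(命名空间与文件名仅为示意):
- # 为default命名空间开启Istio Sidecar自动注入
- kubectl label namespace default istio-injection=enabled --overwrite
- # 应用流量路由与目标规则(文件名仅为示意)
- kubectl apply -f reviews-virtualservice.yaml
- kubectl apply -f reviews-destinationrule.yaml
- # 查看各工作负载的Sidecar配置同步状态
- istioctl proxy-status
复制代码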
部署和使用Linkerd,作为一种轻量级的服务网格解决方案。
- # 安装Linkerd CLI
- curl -sL https://run.linkerd.io/install | sh
- # 验证Kubernetes集群配置
- linkerd check --pre
- # 安装Linkerd控制平面
- linkerd install | kubectl apply -f -
- # 验证安装
- linkerd check
- # 注入Linkerd到命名空间
- kubectl get namespaces -l linkerd.io/control-plane-ns=linkerd
- kubectl annotate namespace default linkerd.io/inject=enabled
复制代码
- # Linkerd流量拆分(SMI TrafficSplit)示例
- apiVersion: split.smi-spec.io/v1alpha1
- kind: TrafficSplit
- metadata:
- name: reviews-split
- spec:
- service: reviews
- backends:
- - service: reviews-v1
- weight: 50
- - service: reviews-v2
- weight: 50
复制代码
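需要注意,为命名空间添加注入注解后,已存在的Pod不会自动获得Linkerd代理,需要滚动重启工作负载。下面是一个简要示例(Deployment名称仅为示意):
- # 滚动重启以便新Pod注入linkerd-proxy(Deployment名称仅为示意)
- kubectl rollout restart deployment/reviews-v1 -n default
- kubectl rollout status deployment/reviews-v1 -n default
- # 确认Pod中已包含linkerd-proxy容器
- kubectl get pods -n default -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}'
复制代码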
9.2 Serverless架构
使用Knative在Kubernetes上构建Serverless应用。
- # Knative Service示例
- apiVersion: serving.knative.dev/v1
- kind: Service
- metadata:
- name: hello
- spec:
- template:
- spec:
- containers:
- - image: gcr.io/knative-samples/helloworld-go
- ports:
- - containerPort: 8080
- env:
- - name: TARGET
- value: "Knative"
复制代码
- # Knative事件示例
- apiVersion: sources.knative.dev/v1
- kind: ContainerSource
- metadata:
- name: heartbeat-source
- spec:
- template:
- spec:
- containers:
- - image: gcr.io/knative-releases/knative.dev/eventing/cmd/heartbeats
- name: heartbeats
- args:
- - --period=1
- env:
- - name: POD_NAME
- value: "heartbeats"
- - name: POD_NAMESPACE
- value: "default"
- sink:
- ref:
- apiVersion: serving.knative.dev/v1
- kind: Service
- name: event-display
复制代码
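部署Knative Service后,可以用下面的命令查看其就绪状态与访问URL,并观察请求触发的自动扩缩(URL与域名以实际输出为准):
- # 查看Knative Service状态与访问URL(ksvc为其短名称)
- kubectl get ksvc hello
- # 使用返回的URL访问服务(此处URL仅为示意)
- curl http://hello.default.example.com
- # 观察请求到来时Pod从0自动扩容
- kubectl get pods -l serving.knative.dev/service=hello -w
复制代码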
使用OpenFaaS构建Serverless函数。
- # OpenFaaS函数示例
- apiVersion: openfaas.com/v1
- kind: Function
- metadata:
- name: echo
- namespace: openfaas-fn
- spec:
- name: echo
- image: functions/echo:latest
- labels:
- com.openfaas.scale.min: "2"
- com.openfaas.scale.max: "10"
- com.openfaas.scale.factor: "20"
- environment:
- write_debug: "true"
复制代码
- # OpenFaaS异步函数示例
- apiVersion: openfaas.com/v1
- kind: Function
- metadata:
- name: async-echo
- namespace: openfaas-fn
- spec:
- name: async-echo
- image: functions/async-echo:latest
- labels:
- com.openfaas.scale.min: "1"
- com.openfaas.scale.max: "5"
- annotations:
- com.openfaas.async: "true"
- com.openfaas.topic: "echo-topic"
复制代码
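除了直接创建Function自定义资源,也可以通过faas-cli基于stack.yml定义并部署函数。下面是一个简要示例,网关地址、密码与函数名均为示意:
- # 登录OpenFaaS网关(地址与密码仅为示意)
- faas-cli login --gateway http://127.0.0.1:8080 --password <admin-password>
- # 根据stack.yml构建、推送并部署函数
- faas-cli up -f stack.yml
- # 同步调用与异步调用
- echo "hello" | faas-cli invoke echo
- echo "hello" | faas-cli invoke async-echo --async
复制代码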
9.3 多集群管理
使用Kubernetes Federation v2(KubeFed)实现多集群管理。
- # KubeFed成员集群注册示例(KubeFedCluster)
- apiVersion: types.kubefed.io/v1beta1
- kind: KubeFedCluster
- metadata:
- name: cluster2
- namespace: kube-federation-system
- spec:
- apiEndpoint: https://cluster2.example.com
- secretRef:
- name: cluster2-secret
- caBundle: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0t...
复制代码
- # 联邦资源示例
- apiVersion: types.kubefed.io/v1beta1
- kind: FederatedDeployment
- metadata:
- name: myapp
- namespace: default
- spec:
- template:
- spec:
- replicas: 3
- selector:
- matchLabels:
- app: myapp
- template:
- metadata:
- labels:
- app: myapp
- spec:
- containers:
- - name: myapp
- image: myregistry/myapp:1.0.0
- ports:
- - containerPort: 8080
- placement:
- clusters:
- - name: cluster1
- - name: cluster2
- overrides:
- - clusterName: cluster2
- clusterOverrides:
- - path: "spec.replicas"
- value: 5
复制代码
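KubeFedCluster对象通常由kubefedctl join命令自动创建。下面是将成员集群注册到宿主集群、并为Deployment启用联邦的简要示例(kubeconfig context名称仅为示意):
- # 将cluster2注册到宿主集群cluster1(context名称仅为示意)
- kubefedctl join cluster2 --cluster-context cluster2 --host-cluster-context cluster1 --v=2
- # 为Deployment资源类型启用联邦
- kubefedctl enable deployments.apps
- # 查看成员集群的就绪状态
- kubectl -n kube-federation-system get kubefedclusters
复制代码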
使用Rancher实现多集群统一管理。
- # Rancher集群配置
- apiVersion: provisioning.cattle.io/v1
- kind: Cluster
- metadata:
- name: cluster1
- namespace: fleet-default
- spec:
- rkeConfig:
- machinePools:
- - name: control-plane
- controlPlaneRole: true
- etcdRole: true
- quantity: 3
- machineConfigRef:
- name: control-plane-config
- namespace: fleet-default
- - name: worker
- workerRole: true
- quantity: 5
- machineConfigRef:
- name: worker-config
- namespace: fleet-default
- ---
- apiVersion: provisioning.cattle.io/v1
- kind: Cluster
- metadata:
- name: cluster2
- namespace: fleet-default
- spec:
- rkeConfig:
- machinePools:
- - name: control-plane
- controlPlaneRole: true
- etcdRole: true
- quantity: 3
- machineConfigRef:
- name: control-plane-config
- namespace: fleet-default
- - name: worker
- workerRole: true
- quantity: 5
- machineConfigRef:
- name: worker-config
- namespace: fleet-default
复制代码
- # Rancher多集群项目
- apiVersion: management.cattle.io/v3
- kind: Project
- metadata:
- name: multi-cluster-project
- namespace: fleet-default
- spec:
- clusterName: cluster1
- description: "Multi-cluster project"
- displayName: "Multi-cluster Project"
- projectTemplateId: local:p-xxxxx
复制代码
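在Rancher中,跨集群分发应用通常借助内置的Fleet(GitOps)实现。下面是一个简要的GitRepo示例,将Git仓库中的清单下发到fleet-default工作区内匹配的所有集群(仓库地址与路径仅为示意):
- # 通过Fleet GitRepo分发清单(仓库地址与路径仅为示意)
- cat <<EOF | kubectl apply -f -
- apiVersion: fleet.cattle.io/v1alpha1
- kind: GitRepo
- metadata:
-   name: myapp-fleet
-   namespace: fleet-default
- spec:
-   repo: https://github.com/myorg/myapp-fleet-manifests.git
-   branch: main
-   paths:
-   - manifests
-   targets:
-   - clusterSelector: {}
- EOF
复制代码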
9.4 GitOps实践
使用Argo CD实现GitOps工作流。
- # Argo CD Application示例
- apiVersion: argoproj.io/v1alpha1
- kind: Application
- metadata:
- name: myapp
- namespace: argocd
- spec:
- project: default
- source:
- repoURL: https://github.com/myorg/myapp-k8s-manifests.git
- targetRevision: HEAD
- path: overlays/production
- destination:
- server: https://kubernetes.default.svc
- namespace: production
- syncPolicy:
- automated:
- prune: true
- selfHeal: true
- syncOptions:
- - Validate=false
复制代码
- # Argo CD App of Apps模式
- apiVersion: argoproj.io/v1alpha1
- kind: Application
- metadata:
- name: root-app
- namespace: argocd
- spec:
- project: default
- source:
- repoURL: https://github.com/myorg/myapp-k8s-manifests.git
- targetRevision: HEAD
- path: apps
- destination:
- server: https://kubernetes.default.svc
- namespace: argocd
- syncPolicy:
- automated:
- prune: true
- selfHeal: true
复制代码
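Application创建后,可以使用argocd CLI查看同步状态或手动触发同步。下面是一个简要示例(服务器地址与账号仅为示意):
- # 登录Argo CD(服务器地址与账号仅为示意)
- argocd login argocd.example.com --username admin
- # 查看应用状态并手动触发同步
- argocd app get myapp
- argocd app sync myapp
- # 查看历史同步记录
- argocd app history myapp
复制代码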
使用Flux CD实现GitOps工作流。
- # Flux HelmRelease示例
- apiVersion: helm.toolkit.fluxcd.io/v2beta1
- kind: HelmRelease
- metadata:
- name: myapp
- namespace: production
- spec:
- interval: 5m
- chart:
- spec:
- chart: myapp
- version: "1.0.0"
- sourceRef:
- kind: HelmRepository
- name: myapp-charts
- namespace: flux-system
- values:
- image:
- repository: myregistry/myapp
- tag: "1.0.0"
- replicaCount: 3
- service:
- type: ClusterIP
- port: 80
- ingress:
- enabled: true
- annotations:
- kubernetes.io/ingress.class: nginx
- hosts:
- - host: myapp.example.com
- paths: ["/"]
复制代码
- # Flux Kustomization示例
- apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
- kind: Kustomization
- metadata:
- name: myapp
- namespace: production
- spec:
- interval: 5m
- path: "./overlays/production"
- prune: true
- sourceRef:
- kind: GitRepository
- name: myapp-repo
- namespace: flux-system
- validation: client
- healthChecks:
- - kind: Deployment
- name: myapp
- namespace: production
复制代码
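Flux的各类资源同样可以通过flux CLI进行引导与排查。下面是引导Flux、查看同步状态并手动触发调和的简要示例(GitHub组织、仓库与路径仅为示意):
- # 引导Flux到集群(GitHub组织、仓库与路径仅为示意)
- flux bootstrap github --owner=myorg --repository=fleet-config --path=clusters/production
- # 查看Git源与Kustomization的同步状态
- flux get sources git -A
- flux get kustomizations -n production
- # 手动触发一次调和(同时拉取最新源)
- flux reconcile kustomization myapp -n production --with-source
复制代码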
10. 总结与展望
10.1 Kubernetes最佳实践总结
在本文中,我们全面探讨了Kubernetes微服务部署的各个方面,从基础概念到高级应用。以下是关键最佳实践的总结:
1. 资源管理:
• 合理设置Pod的资源请求和限制
• 使用LimitRange和ResourceQuota控制资源使用
• 实施HPA和VPA实现自动扩缩容
2. 网络与通信:
• 使用Service实现服务发现和负载均衡
• 实施NetworkPolicy控制网络流量
• 考虑使用服务网格(如Istio或Linkerd)增强微服务通信
3. 存储管理:
• 为有状态应用使用StatefulSet和持久化存储
• 实现定期备份和恢复策略
• 使用StorageClass支持动态卷供应
4. 配置管理:
• 使用ConfigMap和Secret管理配置
• 实现多环境配置管理(如Kustomize)
• 安全管理敏感信息(如Sealed Secrets)
5. 安全实践:
• 实施RBAC控制访问权限
• 使用Pod安全策略或Pod安全准入控制器
• 定期更新和扫描容器镜像
6. 监控与日志:
• 建立全面的监控体系(如Prometheus和Grafana)
• 实现集中式日志收集和分析(如EFK栈)
• 设置有效的告警机制
7. 自动化运维:
• 实现CI/CD流水线自动化部署
• 使用GitOps工作流管理配置
• 开发自动化运维脚本和工具
8. 高可用与灾备:
• 实现多区域部署提高可用性
• 定期测试备份和恢复流程
• 建立灾难恢复计划
10.2 Kubernetes生态系统发展趋势
Kubernetes生态系统正在快速发展,以下是一些值得关注的趋势:
1. 边缘计算:
• K3s等轻量级Kubernetes发行版在边缘场景的应用
• KubeEdge等边缘计算框架的成熟
• 5G与Kubernetes结合的边缘应用
2. Serverless:
• Knative、OpenFaaS等Serverless框架的普及
• 事件驱动架构在Kubernetes上的应用
• FaaS(函数即服务)与微服务的结合
3. 服务网格:
• 服务网格技术的标准化(如SMI规范)
• 服务网格与无代理模式的演进
• 服务网格在多集群环境中的应用
4. GitOps:
• GitOps成为云原生应用部署的主流模式
• GitOps工具链的成熟和多样化
• GitOps与DevSecOps的融合
5. AI/ML与Kubernetes:
• Kubeflow等机器学习平台的发展
• GPU资源调度和管理的优化
• AI驱动的智能运维(AIOps)
6. 安全增强:
• 零信任网络架构的普及
• 机密计算在Kubernetes上的应用
• 安全策略即代码(Policy as Code)的实践
10.3 持续学习与提升
Kubernetes技术栈庞大且不断演进,持续学习至关重要。以下是一些建议:
1. 官方资源:
• 定期阅读Kubernetes官方文档和博客
• 参与Kubernetes社区活动和讨论
• 关注CNCF(云原生计算基金会)的项目和活动
2. 实践项目:
• 在个人项目中应用Kubernetes技术
• 参与开源项目贡献
• 搭建实验环境测试新功能
3. 认证与培训:
• 考取CKA、CKAD、CKS等Kubernetes认证
• 参加专业培训课程
• 参与技术会议和研讨会
4. 社区参与:
• 加入本地Kubernetes用户组
• 参与线上技术分享和讨论
• 关注Kubernetes专家的博客和社交媒体
通过持续学习和实践,您可以不断提升Kubernetes技能,更好地应对企业级容器编排平台的挑战,提高运维效率,为业务发展提供强有力的技术支持。
版权声明
1、转载或引用本网站内容(全面掌握Kubernetes微服务部署实践从入门到精通详解企业级容器编排平台核心技术与应用技巧解决实际项目部署难题提升运维效率)须注明原网址及作者(威震华夏关云长),并标明本网站网址(https://www.pixtech.cc/)。
2、对于不当转载或引用本网站内容而引起的民事纷争、行政处理或其他损失,本网站不承担责任。
3、对不遵守本声明或其他违法、恶意使用本网站内容者,本网站保留追究其法律责任的权利。
本文地址: https://www.pixtech.cc/thread-36850-1-1.html