Building a Rocky Linux Cluster Environment in Practice: An Efficient and Stable Enterprise Server Architecture

威震华夏关云长 · Posted 2025-10-3 20:00:01


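Introduction

In the current wave of digital transformation, enterprises demand ever more stability, efficiency, and reliability from their server architecture. Rocky Linux, a replacement for CentOS, has become one of the leading enterprise Linux distributions. This article walks through building an efficient and stable cluster environment on Rocky Linux as the foundation for an enterprise server architecture.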

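1. Rocky Linux Overview

Rocky Linux is a community-supported enterprise operating system designed to be 100% binary-compatible with Red Hat Enterprise Linux (RHEL). It was started by Gregory Kurtzer, a co-founder of CentOS, to fill the gap left when CentOS shifted to CentOS Stream.

1.1 Advantages of Rocky Linux

• Stability: long-term support releases suited to business-critical applications
• Security: timely security updates and patches
• Compatibility: designed for full RHEL compatibility, so software built for RHEL is expected to run unchanged
• Community support: an active community provides technical help and ongoing development
• Free to use: no license fees, which lowers IT costs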

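2. Cluster Fundamentals

2.1 What Is a Cluster

A cluster is a group of interconnected computers that work as a single computing resource to provide high availability, load balancing, and parallel processing.

2.2 Types of Clusters

• High-availability (HA) cluster: keeps critical applications available and reduces downtime
• Load-balancing cluster: distributes workload to make better use of resources
• High-performance computing (HPC) cluster: handles complex computational tasks
• Storage cluster: provides centralized, highly available storage

2.3 Key Components of a Cluster Architecture

• Node: a single server in the cluster
• Load balancer: distributes requests across nodes
• Cluster management software: for example Pacemaker and Corosync
• Shared storage: for example NFS, iSCSI, or a distributed storage system
• Heartbeat: monitors node health
• Failover: moves services automatically when a node fails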

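3. Planning a Rocky Linux Cluster

3.1 Requirements Analysis

Before building the cluster, pin down the following requirements:

• Application type: web services, databases, file services, and so on
• Performance requirements: CPU, memory, storage, network bandwidth
• Availability requirements: the expected uptime percentage
• Scalability requirements: how far the cluster may need to grow
• Budget limits: hardware, software, and maintenance costs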
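
An uptime percentage is easier to reason about once it is turned into a downtime budget. The sketch below is illustrative only: the function name is hypothetical and it assumes a 365-day year (525,600 minutes).

```shell
#!/bin/sh
# Convert an availability target (percent) into the downtime it permits per year.
# Assumption: a 365-day year, i.e. 525600 minutes.
downtime_minutes_per_year() {
    awk -v a="$1" 'BEGIN { printf "%.1f\n", (100 - a) / 100 * 525600 }'
}

for target in 99 99.9 99.99; do
    printf '%s%% availability allows about %s minutes of downtime per year\n' \
        "$target" "$(downtime_minutes_per_year "$target")"
done
```

A 99.9% target, for instance, leaves roughly 525.6 minutes (under nine hours) per year for all planned and unplanned outages combined.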

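3.2 Sizing the Cluster

Choose the cluster size according to the requirements:

• Small cluster: 2-3 nodes, suitable for small businesses or department-level applications
• Medium cluster: 4-8 nodes, suitable for mid-sized enterprise applications
• Large cluster: 9 or more nodes, suitable for large enterprises or cloud service providers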
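
Node count also determines how many failures the cluster can survive. The sketch below assumes the common majority-quorum model with one vote per node (the function names are hypothetical); two-node clusters usually depend on a special Corosync two-node setting instead, which this arithmetic does not model.

```shell
#!/bin/sh
# Votes needed for quorum in an n-node cluster: floor(n/2) + 1.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# Number of nodes that can fail while the remaining nodes still hold quorum.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3 4 5; do
    echo "nodes=$n quorum=$(quorum_votes "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Under this model a four-node cluster tolerates no more failures than a three-node one, which is one reason odd node counts are often preferred.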

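3.3 Topology Design

Common cluster topologies:

• Active/passive: one node serves requests while another stands by as a backup
• Active/active: all nodes serve requests at the same time
• N-tier: front-end web servers, middle-tier application servers, and back-end database servers in separate layers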

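4. Hardware and Network Preparation

4.1 Hardware Requirements

• CPU: 64-bit processor; Intel Xeon or AMD EPYC series recommended
• Memory: at least 16 GB RAM, adjusted to the application's needs
• Storage: SSD for the system disk; HDD or high-performance SSD for data
• Network interfaces: at least two NICs; 10 Gigabit NICs recommended

Shared storage options:

• SAN storage: attached over Fibre Channel or iSCSI
• NAS storage: accessed over NFS or SMB
• Distributed storage: for example Ceph or GlusterFS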

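4.2 Network Configuration

A typical network layout:

[Internet] -> [Firewall] -> [Load balancer] -> [Cluster nodes]
                                                     |
                                                     v
                                            [Management network]

Assign several IP addresses to each node:

• Public IP: the address that serves external traffic
• Private IP: the address for internal communication
• Heartbeat IP: an address dedicated to cluster heartbeat traffic
• Management IP: the address used for system administration

Example static configuration for one interface:

# Edit the interface configuration file
# (Rocky Linux 9 still reads this legacy ifcfg format, but it is deprecated in favor of NetworkManager keyfiles)
vi /etc/sysconfig/network-scripts/ifcfg-eth0
# Example configuration
TYPE=Ethernet
BOOTPROTO=static
DEFROUTE=yes
NAME=eth0
DEVICE=eth0
ONBOOT=yes
IPADDR=192.168.1.10
PREFIX=24
GATEWAY=192.168.1.1
DNS1=8.8.8.8
DNS2=8.8.4.4
# Reload the profiles and re-activate the connection
# (the legacy "network" service is not available on current Rocky Linux releases)
nmcli connection reload
nmcli connection up eth0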

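5. Installing and Configuring Rocky Linux

5.1 System Installation

Download a Rocky Linux ISO image from the official site (adjust the version in the file name to the release you intend to install):

wget https://download.rockylinux.org/pub/rocky/9/isos/x86_64/Rocky-9.1-x86_64-dvd.iso

Create a bootable USB drive with dd:

# Identify the USB device name
lsblk
# Write the image (assuming the USB device is /dev/sdb; this erases the device)
dd if=Rocky-9.1-x86_64-dvd.iso of=/dev/sdb bs=4M status=progress

Installation steps:

1. Boot the machine from the USB drive
2. Select "Install Rocky Linux"
3. Configure the language, keyboard, and time zone
4. Configure the network and host name
5. Configure disk partitioning (LVM is recommended because it is easy to extend)
6. Set the root password and create a user
7. Start the installation and wait for it to finish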

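5.2 Basic System Configuration

Update the system and install common tools:

# Update all packages
dnf update -y
# Install common tools
dnf install -y vim wget curl net-tools telnet

Set the host name and name resolution:

# Set the host name
hostnamectl set-hostname node1.example.com
# Edit the hosts file
vi /etc/hosts
# Add the following entries
192.168.1.10 node1.example.com node1
192.168.1.11 node2.example.com node2
192.168.1.12 node3.example.com node3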
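
When the hosts entries are scripted rather than edited by hand, repeated runs should not append duplicates. The following is a minimal sketch, not part of the original procedure: add_host_entry and the HOSTS_FILE variable are hypothetical names, and the demonstration writes to a scratch file rather than /etc/hosts.

```shell
#!/bin/sh
# Append "ip fqdn shortname" to a hosts file only if the FQDN is not already listed.
# HOSTS_FILE defaults to /etc/hosts; point it at a scratch file to try the function safely.
add_host_entry() {
    ip="$1"; fqdn="$2"; short="$3"
    file="${HOSTS_FILE:-/etc/hosts}"
    # Look for an exact match of the FQDN among the name fields of each line.
    if ! awk -v h="$fqdn" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                           END { exit !found }' "$file" 2>/dev/null; then
        printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$file"
    fi
}

HOSTS_FILE="$(mktemp)"
add_host_entry 192.168.1.10 node1.example.com node1
add_host_entry 192.168.1.10 node1.example.com node1   # second call adds nothing
cat "$HOSTS_FILE"
```

Unsetting HOSTS_FILE and running the same calls as root would apply the entries to /etc/hosts.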

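Configure time synchronization:

# Install the chrony time synchronization service
dnf install -y chrony
# Start it and enable it at boot
systemctl start chronyd
systemctl enable chronyd
# Check synchronization status
chronyc sources

Configure the firewall:

# Start the firewall and enable it at boot
systemctl start firewalld
systemctl enable firewalld
# Open the required ports (web service as an example)
firewall-cmd --permanent --add-service=http
firewall-cmd --permanent --add-service=https
firewall-cmd --permanent --add-service=ssh
# Reload the firewall configuration
firewall-cmd --reload

Configure SELinux. Disabling SELinux removes a layer of protection; on production systems it is generally preferable to leave it enforcing and adjust policy where needed, so treat the steps below as a troubleshooting aid:

# Check SELinux status
sestatus
# Temporarily switch to permissive mode (until the next reboot)
setenforce 0
# To disable SELinux permanently, edit the configuration file
vi /etc/selinux/config
# Change SELINUX=enforcing to SELINUX=disabled
SELINUX=disabled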

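6. Installing and Configuring Cluster Software

6.1 High-Availability Cluster Software

Run the following on every node:

# Enable the HighAvailability repository (repository id as used on Rocky Linux 9)
dnf config-manager --set-enabled highavailability
# Install the high-availability cluster packages
dnf install -y pcs pacemaker corosync fence-agents-all
# Set the password of the hacluster user (replace "password" with a strong one)
echo "password" | passwd --stdin hacluster
# Start pcsd and enable it at boot
systemctl start pcsd
systemctl enable pcsd

Authenticate the nodes:

# Authenticate all cluster nodes to each other (run on one node only)
pcs host auth node1.example.com node2.example.com node3.example.com -u hacluster -p password

Create and start the cluster:

# Create the cluster (run on one node only)
# Current pcs releases take the cluster name as a positional argument; the older --name option is no longer accepted
pcs cluster setup mycluster node1.example.com node2.example.com node3.example.com
# Start the cluster
pcs cluster start --all
# Enable the cluster at boot
pcs cluster enable --all
# Check cluster status
pcs status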

6.2 Load-Balancing Software

HAProxy:

# Install HAProxy
dnf install -y haproxy
# Configure HAProxy
vi /etc/haproxy/haproxy.cfg
# Basic example configuration
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000

frontend http-in
    bind *:80
    default_backend servers

backend servers
    balance roundrobin
    server node1 192.168.1.10:80 check
    server node2 192.168.1.11:80 check
    server node3 192.168.1.12:80 check
# Start HAProxy and enable it at boot
systemctl start haproxy
systemctl enable haproxy
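
The backend server lines follow a fixed pattern, so they can be generated from a node list instead of being typed by hand. This is a small sketch under that assumption; render_backend_servers is a hypothetical helper, not an HAProxy tool.

```shell
#!/bin/sh
# Print one HAProxy "server" line per name=address pair, all on the same port.
render_backend_servers() {
    port="$1"; shift
    for pair in "$@"; do
        name="${pair%%=*}"    # text before the first "="
        addr="${pair#*=}"     # text after the first "="
        printf '    server %s %s:%s check\n' "$name" "$addr" "$port"
    done
}

render_backend_servers 80 node1=192.168.1.10 node2=192.168.1.11 node3=192.168.1.12
```

After editing the real configuration, `haproxy -c -f /etc/haproxy/haproxy.cfg` checks the file for syntax errors before the service is restarted.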

Alternatively, Nginx can act as the load balancer:

# Install Nginx
dnf install -y nginx
# Configure Nginx as a load balancer
vi /etc/nginx/nginx.conf
# Place the upstream and server blocks inside the file's existing http { } block
http {
    upstream backend {
        server 192.168.1.10:80;
        server 192.168.1.11:80;
        server 192.168.1.12:80;
    }
    server {
        listen 80;
        location / {
            proxy_pass http://backend;
        }
    }
}
# Start Nginx and enable it at boot
systemctl start nginx
systemctl enable nginx

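7. High-Availability Configuration

7.1 Configuring a Floating IP

# Create the floating IP resource; Pacemaker brings it up on one node once the cluster is running
pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.1.100 cidr_netmask=24 op monitor interval=30s

7.2 Making the Web Service Highly Available

Prepare Apache on every node:

# Install Apache on all nodes
dnf install -y httpd
# Create a test page
echo "<h1>Node $(hostname)</h1>" > /var/www/html/index.html
# Enable the status page that the cluster resource agent polls (see statusurl below)
vi /etc/httpd/conf.d/status.conf
<Location /server-status>
    SetHandler server-status
    Require local
</Location>
# Do not start or enable httpd through systemd here; Pacemaker starts it as a cluster resource

Create the cluster resource:

# Create the Apache resource
pcs resource create WebServer ocf:heartbeat:apache configfile=/etc/httpd/conf/httpd.conf statusurl="http://localhost/server-status" op monitor interval=30s
# Keep the web server on the node that holds the floating IP, and start the IP first
pcs constraint colocation add WebServer with ClusterIP INFINITY
pcs constraint order ClusterIP then WebServer

7.3 Configuring a STONITH Device

STONITH (Shoot The Other Node In The Head) protects data integrity: when a node fails, the cluster forcibly reboots or powers off that node so that it cannot keep writing to shared resources.

# Configure a fence_xvm device (example for virtual machines)
pcs stonith create vm-fence fence_xvm pcmk_host_map="node1.example.com:node1;node2.example.com:node2;node3.example.com:node3" op monitor interval=60s
# Enable STONITH
pcs property set stonith-enabled=true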

8. Load-Balancing Setup

8.1 Making HAProxy Highly Available

# Install HAProxy (if not already installed)
dnf install -y haproxy
# Configure HAProxy
vi /etc/haproxy/haproxy.cfg
# Add the following configuration
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000

listen stats
    bind *:9000
    stats enable
    stats uri /stats
    stats refresh 30s
    stats show-node
    stats auth admin:password

frontend http-in
    bind *:80
    default_backend servers

backend servers
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server node1 192.168.1.10:80 check cookie node1
    server node2 192.168.1.11:80 check cookie node2
    server node3 192.168.1.12:80 check cookie node3
# Create the HAProxy cluster resource
pcs resource create HAProxy systemd:haproxy op monitor interval=20s
# Keep HAProxy on the node that holds the floating IP, and start the IP first
pcs constraint colocation add HAProxy with ClusterIP
pcs constraint order ClusterIP then HAProxy

8.2 Using Keepalived for a Highly Available VIP

Keepalived is an alternative to the Pacemaker-managed floating IP from section 7.1. Use one mechanism or the other for a given address, not both.

# Install Keepalived
dnf install -y keepalived
# Configure Keepalived
vi /etc/keepalived/keepalived.conf
# Example configuration for the master node
vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass password
    }
    virtual_ipaddress {
        192.168.1.100/24 dev eth0
    }
    track_script {
        chk_haproxy
    }
}
# Example configuration for the backup node (the vrrp_script block is the same; the priority is lower)
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass password
    }
    virtual_ipaddress {
        192.168.1.100/24 dev eth0
    }
    track_script {
        chk_haproxy
    }
}
# Start Keepalived and enable it at boot
systemctl start keepalived
systemctl enable keepalived
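
The priorities 101 and 100 combined with weight 2 are what make the VIP move when HAProxy dies on the master. The sketch below works through that arithmetic, assuming Keepalived's documented behavior for a positive script weight (the weight is added while the check succeeds); effective_priority is a hypothetical helper used only for illustration.

```shell
#!/bin/sh
# Effective VRRP priority for a tracked script with a positive weight:
# the weight is added while the check succeeds and left out while it fails.
effective_priority() {
    base="$1"; weight="$2"; check_ok="$3"   # check_ok: 1 = script succeeds, 0 = script fails
    if [ "$check_ok" -eq 1 ]; then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

echo "both healthy:        master=$(effective_priority 101 2 1) backup=$(effective_priority 100 2 1)"
echo "master HAProxy down: master=$(effective_priority 101 2 0) backup=$(effective_priority 100 2 1)"
```

With HAProxy down on the master, its priority falls to 101 while the backup stays at 102, so the backup takes over the VIP. The weight therefore has to exceed the gap between the two base priorities for failover to happen.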

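9. Storage Solutions

9.1 Configuring NFS Shared Storage

On the NFS server:

# Install the NFS server
dnf install -y nfs-utils
# Create the shared directory
# (mode 777 and no_root_squash below are convenient for testing but too permissive for production)
mkdir -p /data/shared
chmod 777 /data/shared
# Configure the NFS export
vi /etc/exports
# Add the following line
/data/shared 192.168.1.0/24(rw,sync,no_root_squash)
# Start the NFS service and enable it at boot
systemctl start nfs-server
systemctl enable nfs-server
# Export the shared directory
exportfs -a

On each client node:

# Install the NFS client
dnf install -y nfs-utils
# Create the mount point
mkdir -p /mnt/nfs
# Mount the NFS share
mount 192.168.1.100:/data/shared /mnt/nfs
# Add it to fstab so it is mounted at boot
echo "192.168.1.100:/data/shared /mnt/nfs nfs defaults,_netdev 0 0" >> /etc/fstab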

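9.2 Configuring iSCSI Shared Storage

On the iSCSI target (storage server):

# Install the iSCSI target software
dnf install -y targetcli
# Start the service and enable it at boot
systemctl start target
systemctl enable target
# Configure the iSCSI target (the lines below are entered inside the targetcli shell)
targetcli
# Create the backing store
/backstores/block create disk1 /dev/sdb1
# Create the iSCSI target
/iscsi create iqn.2023-01.com.example:storage.disk1
# Create the LUN
/iscsi/iqn.2023-01.com.example:storage.disk1/tpg1/luns create /backstores/block/disk1
# Set the ACL
/iscsi/iqn.2023-01.com.example:storage.disk1/tpg1/acls create iqn.2023-01.com.example:client
# Save the configuration
saveconfig
exit

On the iSCSI initiator (client node):

# Install the iSCSI initiator software
dnf install -y iscsi-initiator-utils
# Configure the initiator name
vi /etc/iscsi/initiatorname.iscsi
# Set it to the name that matches the ACL on the target
InitiatorName=iqn.2023-01.com.example:client
# Start the service and enable it at boot
systemctl start iscsid
systemctl enable iscsid
# Discover the target
iscsiadm -m discovery -t st -p 192.168.1.100
# Log in to the target
iscsiadm -m node -l
# Find the newly attached disk (the examples below assume it appears as /dev/sdb)
lsblk
# Partition and format the new disk
fdisk /dev/sdb
mkfs.ext4 /dev/sdb1
# Mount the new disk
mkdir -p /mnt/iscsi
mount /dev/sdb1 /mnt/iscsi
# Add it to fstab so it is mounted at boot
# (device names can change between boots; a UUID= entry from blkid is more robust)
echo "/dev/sdb1 /mnt/iscsi ext4 defaults,_netdev 0 0" >> /etc/fstab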

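9.3 Configuring Ceph Distributed Storage

Install the Ceph tooling:

# Install the Ceph deployment tool (requires a repository that provides cephadm)
dnf install -y cephadm
# Configure the Ceph repository
# ("pacific" is the release named in the original; confirm that the chosen release publishes packages for your Rocky Linux version)
cephadm add-repo --release pacific
# Install the Ceph client tools
dnf install -y ceph-common

Bootstrap the cluster and create a file system:

# Bootstrap the Ceph cluster
cephadm bootstrap --mon-ip 192.168.1.10
# Install the Ceph CLI tools
cephadm install ceph-common
# Add the other nodes to the cluster
ceph orch host add node2 192.168.1.11
ceph orch host add node3 192.168.1.12
# Deploy OSDs (assuming /dev/sdb is the OSD disk on each node)
ceph orch daemon add osd node1:/dev/sdb
ceph orch daemon add osd node2:/dev/sdb
ceph orch daemon add osd node3:/dev/sdb
# Create a general-purpose pool
ceph osd pool create mypool 64 64
# Create the pools for the file system, then the file system and its metadata servers
ceph osd pool create myfs_metadata 32
ceph osd pool create myfs_data 32
ceph fs new myfs myfs_metadata myfs_data
ceph orch apply mds myfs
# Save the admin key to the secret file referenced by the mount options
ceph auth get-key client.admin > /etc/ceph/secret.key
chmod 600 /etc/ceph/secret.key
# Mount the Ceph file system
mkdir -p /mnt/cephfs
mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/secret.key
# Add it to fstab so it is mounted at boot
echo "192.168.1.10:6789:/ /mnt/cephfs ceph name=admin,secretfile=/etc/ceph/secret.key,noatime,_netdev 0 0" >> /etc/fstab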

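10. Monitoring and Maintenance

10.1 Installing and Configuring Zabbix

On the Zabbix server:

# Install the Zabbix repository
# (check repo.zabbix.com for the release package that matches your OS version; server packages for EL9 may require a newer Zabbix release than 5.0)
rpm -Uvh https://repo.zabbix.com/zabbix/5.0/rhel/9/x86_64/zabbix-release-5.0-1.el9.noarch.rpm
dnf clean all
# Install the Zabbix server, web front end, and agent
dnf install -y zabbix-server-mysql zabbix-web-mysql zabbix-apache-conf zabbix-sql-scripts zabbix-agent
# Install the MariaDB database
dnf install -y mariadb-server mariadb
# Start MariaDB and enable it at boot
systemctl start mariadb
systemctl enable mariadb
# Secure MariaDB
mysql_secure_installation
# Create the Zabbix database and user (the SQL statements are entered in the mysql shell)
mysql -u root -p
create database zabbix character set utf8 collate utf8_bin;
create user zabbix@localhost identified by 'password';
grant all privileges on zabbix.* to zabbix@localhost;
quit;
# Import the Zabbix database schema (the script path differs between Zabbix releases)
zcat /usr/share/doc/zabbix-sql-scripts/mysql/create.sql.gz | mysql -uzabbix -p zabbix
# Configure the Zabbix server
vi /etc/zabbix/zabbix_server.conf
# Set the database password
DBPassword=password
# Start the Zabbix server, agent, and web stack, and enable them at boot
systemctl restart zabbix-server zabbix-agent httpd php-fpm
systemctl enable zabbix-server zabbix-agent httpd php-fpm

Complete the web front end setup:

1. Open http://zabbix-server-ip/zabbix
2. Follow the installation wizard to finish the front end configuration
3. Default user name: Admin, password: zabbix (change it after the first login)

On every cluster node:

# Install the Zabbix agent on all cluster nodes
rpm -Uvh https://repo.zabbix.com/zabbix/5.0/rhel/9/x86_64/zabbix-release-5.0-1.el9.noarch.rpm
dnf clean all
dnf install -y zabbix-agent
# Configure the Zabbix agent
vi /etc/zabbix/zabbix_agentd.conf
# Set the server address and this node's host name
Server=192.168.1.100
ServerActive=192.168.1.100
Hostname=node1.example.com
# Start the Zabbix agent and enable it at boot
systemctl start zabbix-agent
systemctl enable zabbix-agent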

10.2 Installing and Configuring Prometheus and Grafana

Install Prometheus on the monitoring server:

# Create the Prometheus user
useradd --no-create-home --shell /bin/false prometheus
# Download Prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.36.2/prometheus-2.36.2.linux-amd64.tar.gz
tar -xvzf prometheus-2.36.2.linux-amd64.tar.gz
# Move the files into place
mkdir -p /etc/prometheus /var/lib/prometheus
cp prometheus-2.36.2.linux-amd64/prometheus /usr/local/bin/
cp prometheus-2.36.2.linux-amd64/promtool /usr/local/bin/
cp -r prometheus-2.36.2.linux-amd64/console* /etc/prometheus/
cp -r prometheus-2.36.2.linux-amd64/prometheus.yml /etc/prometheus/
# Set ownership
chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus
chown prometheus:prometheus /usr/local/bin/prometheus
chown prometheus:prometheus /usr/local/bin/promtool
# Create the systemd unit file
vi /etc/systemd/system/prometheus.service
# Add the following content
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries

[Install]
WantedBy=multi-user.target
# Reload systemd so it picks up the new unit, then start Prometheus and enable it at boot
systemctl daemon-reload
systemctl start prometheus
systemctl enable prometheus

Install Node Exporter on every cluster node:

# Install Node Exporter on all cluster nodes
useradd --no-create-home --shell /bin/false node_exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
tar -xvzf node_exporter-1.3.1.linux-amd64.tar.gz
cp node_exporter-1.3.1.linux-amd64/node_exporter /usr/local/bin/
chown -R node_exporter:node_exporter /usr/local/bin/node_exporter
# Create the systemd unit file
vi /etc/systemd/system/node_exporter.service
# Add the following content
[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
# Reload systemd, then start Node Exporter and enable it at boot
systemctl daemon-reload
systemctl start node_exporter
systemctl enable node_exporter

Add the nodes as scrape targets:

# Edit the Prometheus configuration file
vi /etc/prometheus/prometheus.yml
# Add the following under the scrape_configs section (YAML indentation is significant)
  - job_name: 'cluster_nodes'
    static_configs:
      - targets: ['node1.example.com:9100']
      - targets: ['node2.example.com:9100']
      - targets: ['node3.example.com:9100']
# Restart Prometheus
systemctl restart prometheus

Install Grafana:

# Install Grafana
dnf install -y grafana
# Start Grafana and enable it at boot
systemctl start grafana-server
systemctl enable grafana-server
# Open the Grafana port in the firewall
firewall-cmd --permanent --add-port=3000/tcp
firewall-cmd --reload

Open the Grafana web interface at http://grafana-server-ip:3000. The default user name and password are both admin.

Add the Prometheus data source:

1. Log in to Grafana
2. Go to Configuration > Data Sources
3. Click Add data source
4. Select Prometheus
5. Set the URL to http://prometheus-server-ip:9090
6. Click Save & Test

Import the Node Exporter dashboard:

1. Go to Dashboards > Import
2. Enter dashboard ID 1860
3. Click Load
4. Select the Prometheus data source
5. Click Import

11. Security Considerations

11.1 System Hardening

Harden SSH:

# Edit the SSH daemon configuration
vi /etc/ssh/sshd_config
# Change the following settings
PermitRootLogin no
PasswordAuthentication no
Port 2222
AllowUsers adminuser
# If SELinux is enforcing, allow sshd to bind the new port (semanage is in policycoreutils-python-utils)
semanage port -a -t ssh_port_t -p tcp 2222
# Validate the configuration, then restart SSH
sshd -t
systemctl restart sshd

Restrict access with firewall rules:

# Allow SSH (on the new port), web, and cluster traffic from the cluster subnet only
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port protocol="tcp" port="2222" accept'
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" service name="http" accept'
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" service name="https" accept'
# The high-availability service covers the pcsd, Pacemaker, and Corosync ports (Corosync uses UDP 5404-5405)
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" service name="high-availability" accept'
firewall-cmd --reload

Block brute-force attempts with Fail2ban:

# Install Fail2ban (packaged in EPEL)
dnf install -y epel-release
dnf install -y fail2ban
# Create a local override file; settings here take precedence over jail.conf
vi /etc/fail2ban/jail.local
# Add the following content
[sshd]
enabled = true
port = 2222
filter = sshd
logpath = /var/log/secure
maxretry = 3
bantime = 3600
# Start Fail2ban and enable it at boot
systemctl start fail2ban
systemctl enable fail2ban

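11.2 Cluster Security Configuration

Corosync authentication key. A cluster created with pcs cluster setup normally has this key generated and distributed for it; the manual steps apply when Corosync is configured by hand:

# Generate the Corosync key
corosync-keygen
# Copy the key to the other nodes
scp /etc/corosync/authkey node2.example.com:/etc/corosync/
scp /etc/corosync/authkey node3.example.com:/etc/corosync/
# Set the correct permissions (on every node)
chmod 400 /etc/corosync/authkey

Pacemaker properties:

# Set Pacemaker properties
pcs property set stonith-enabled=true
pcs property set no-quorum-policy=stop
pcs property set symmetric-cluster=true
# Resource stickiness is a resource default in current pcs releases rather than a cluster property
pcs resource defaults update resource-stickiness=100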

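12. Worked Examples

12.1 Web Server Cluster

Build a highly available web server cluster with three nodes, using Apache as the web server, HAProxy as the load balancer, and NFS as shared storage.

[Internet] -> [Firewall] -> [HAProxy (VIP: 192.168.1.100)] -> [Web server cluster]
                                                                      |
                                                                      v
                                                            [NFS shared storage]

1. Prepare the systems

# Update the system on all nodes
dnf update -y
# Install the required packages
dnf install -y vim wget curl net-tools telnet
# Configure the host name (use each node's own name) and the hosts file
hostnamectl set-hostname node1.example.com
echo "192.168.1.10 node1.example.com node1" >> /etc/hosts
echo "192.168.1.11 node2.example.com node2" >> /etc/hosts
echo "192.168.1.12 node3.example.com node3" >> /etc/hosts
# Configure time synchronization
dnf install -y chrony
systemctl start chronyd
systemctl enable chronyd

2. Install and configure the cluster software

# Enable the HighAvailability repository and install the cluster software on all nodes
dnf config-manager --set-enabled highavailability
dnf install -y pcs pacemaker corosync fence-agents-all
# Set the password of the hacluster user
echo "password" | passwd --stdin hacluster
# Start pcsd and enable it at boot
systemctl start pcsd
systemctl enable pcsd
# Authenticate the cluster nodes (on node1)
pcs host auth node1.example.com node2.example.com node3.example.com -u hacluster -p password
# Create the cluster (current pcs releases take the name as a positional argument)
pcs cluster setup webcluster node1.example.com node2.example.com node3.example.com
# Start the cluster and enable it at boot
pcs cluster start --all
pcs cluster enable --all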

3. Configure the floating IP

# Create the floating IP resource
pcs resource create WebVIP ocf:heartbeat:IPaddr2 ip=192.168.1.100 cidr_netmask=24 op monitor interval=30s

4. Install and configure the web server

# Install Apache on all nodes
dnf install -y httpd
# Create a test page
mkdir -p /var/www/html
echo "<h1>Web Server Cluster</h1>" > /var/www/html/index.html
# Start Apache and enable it at boot
systemctl start httpd
systemctl enable httpd

5. Configure Apache as a cluster resource

# Create the Apache resource
pcs resource create WebServer ocf:heartbeat:apache configfile=/etc/httpd/conf/httpd.conf statusurl="http://localhost/server-status" op monitor interval=30s
# Keep the web server with the floating IP, and start the IP first
pcs constraint colocation add WebServer with WebVIP INFINITY
pcs constraint order WebVIP then WebServer

6. Install and configure NFS shared storage

# Install NFS on the dedicated storage server
dnf install -y nfs-utils
# Create the shared directory
mkdir -p /data/web
chmod 777 /data/web
# Configure the NFS export
echo "/data/web 192.168.1.0/24(rw,sync,no_root_squash)" >> /etc/exports
# Start the NFS service and enable it at boot
systemctl start nfs-server
systemctl enable nfs-server
exportfs -a

7. Mount the NFS share on the web server nodes

# Install the NFS client on all web server nodes
dnf install -y nfs-utils
# Create the mount point
mkdir -p /var/www/html
# Mount the NFS share
# (192.168.1.100 is the address given in the original; it is also used as the VIP, so substitute the storage server's own address)
mount 192.168.1.100:/data/web /var/www/html
# Add it to fstab
echo "192.168.1.100:/data/web /var/www/html nfs defaults,_netdev 0 0" >> /etc/fstab

8. Install and configure HAProxy

# Install HAProxy on the dedicated load balancer node
dnf install -y haproxy
# Configure HAProxy
vi /etc/haproxy/haproxy.cfg
# Add the following configuration
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000

listen stats
    bind *:9000
    stats enable
    stats uri /stats
    stats refresh 30s
    stats show-node
    stats auth admin:password

frontend http-in
    bind *:80
    default_backend servers

backend servers
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server node1 192.168.1.10:80 check cookie node1
    server node2 192.168.1.11:80 check cookie node2
    server node3 192.168.1.12:80 check cookie node3
# Start HAProxy and enable it at boot
systemctl start haproxy
systemctl enable haproxy

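9. Test the cluster

# Check cluster status
pcs status
# Test the web service
curl http://192.168.1.100
# Simulate a node failure by putting node1 into standby
pcs node standby node1.example.com
# Test the web service again
curl http://192.168.1.100
# Bring the node back
pcs node unstandby node1.example.com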
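
A single curl before and after the standby command does not show how long the service was unreachable during failover. The sketch below repeats a probe and counts failures; count_failures is a hypothetical helper, and the attempt count, delay, and curl timeout in the usage comment are arbitrary assumptions.

```shell
#!/bin/sh
# Run a probe command a fixed number of times and print how many attempts failed.
# Usage: count_failures ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
count_failures() {
    attempts="$1"; delay="$2"; shift 2
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # The probe's exit status decides success; its output is discarded.
        "$@" >/dev/null 2>&1 || failures=$(( failures + 1 ))
        i=$(( i + 1 ))
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
    done
    echo "$failures"
}

# Example: probe the VIP once per second for 30 seconds while another terminal
# runs "pcs node standby node1.example.com":
#   count_failures 30 1 curl -fsS --max-time 2 http://192.168.1.100/
```

The printed number multiplied by the delay gives a rough estimate of the outage window seen by clients.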

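12.2 Database Cluster

Build a highly available MySQL-compatible (MariaDB) database cluster with three nodes, using Galera Cluster for multi-primary replication and HAProxy as the load balancer.

[Application servers] -> [HAProxy (VIP: 192.168.1.100)] -> [MariaDB Galera cluster]
                                                              (node1, node2, node3)

1. Prepare the systems

# Update the system on all nodes
dnf update -y
# Install the required packages
dnf install -y vim wget curl net-tools telnet
# Configure the host name (use each node's own name) and the hosts file
hostnamectl set-hostname dbnode1.example.com
echo "192.168.1.10 dbnode1.example.com dbnode1" >> /etc/hosts
echo "192.168.1.11 dbnode2.example.com dbnode2" >> /etc/hosts
echo "192.168.1.12 dbnode3.example.com dbnode3" >> /etc/hosts
# Configure time synchronization
dnf install -y chrony
systemctl start chronyd
systemctl enable chronyd

2. Install MariaDB and Galera

# Install MariaDB and Galera on all nodes
# (on Rocky Linux the client package is named mariadb, and Galera support ships in mariadb-server-galera)
dnf install -y mariadb-server mariadb mariadb-server-galera
# Start MariaDB and enable it at boot
systemctl start mariadb
systemctl enable mariadb
# Run the security script
mysql_secure_installation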

3. Configure the Galera cluster

# Create the Galera configuration file on all nodes
vi /etc/my.cnf.d/galera.cnf
# Add the following content
[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
# Galera provider configuration
wsrep_on=ON
# The library path depends on the package; confirm it with: rpm -ql galera
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
# Galera cluster configuration
wsrep_cluster_name="my_galera_cluster"
wsrep_cluster_address="gcomm://dbnode1.example.com,dbnode2.example.com,dbnode3.example.com"
# Galera synchronization configuration
wsrep_sst_method=rsync
# Galera node configuration
# Use each node's own host name
wsrep_node_address="dbnode1.example.com"
# Use each node's own node name
wsrep_node_name="dbnode1"
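
The firewall set up in section 5.2 opens only HTTP, HTTPS, and SSH, so the Galera nodes cannot reach each other until the replication ports are opened. The sketch below prints the firewall-cmd calls for the ports Galera commonly uses, so they can be reviewed before being applied; galera_firewall_commands is a hypothetical helper, and the port list is the usual default set rather than something stated in the original.

```shell
#!/bin/sh
# Print the firewall-cmd invocations that open the ports Galera commonly uses:
# 3306 (client connections), 4567 (replication traffic, TCP and UDP),
# 4568 (incremental state transfer), 4444 (state snapshot transfer).
galera_firewall_commands() {
    for rule in 3306/tcp 4567/tcp 4567/udp 4568/tcp 4444/tcp; do
        echo "firewall-cmd --permanent --add-port=$rule"
    done
    echo "firewall-cmd --reload"
}

galera_firewall_commands          # review the output first
# galera_firewall_commands | sh   # then apply it as root on every database node
```

Printing the commands rather than running them keeps the example safe on a machine without firewalld and leaves a record of exactly what was opened.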

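4. Start the cluster

# Bootstrap the cluster on the first node
systemctl stop mariadb
galera_new_cluster
# On the other nodes, restart MariaDB so it loads the Galera configuration and joins the cluster
systemctl restart mariadb
# Check the cluster size
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size'"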
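
The cluster-size check is easy to automate once the status output is parsed. The following sketch reads the tab-separated output that the mysql client produces in batch mode; galera_size_matches is a hypothetical helper, and the expected count of 3 comes from the three-node layout of this example.

```shell
#!/bin/sh
# Read "SHOW STATUS LIKE 'wsrep_cluster_size'" output on stdin (variable name and
# value separated by whitespace) and succeed only if the reported size equals
# the expected node count.
galera_size_matches() {
    expected="$1"
    actual="$(awk '$1 == "wsrep_cluster_size" { print $2 }')"
    [ "$actual" = "$expected" ]
}

# Example (needs a running node and valid credentials):
#   mysql -u root -p --batch -e "SHOW STATUS LIKE 'wsrep_cluster_size'" \
#       | galera_size_matches 3 && echo "all nodes joined"
```

A check like this can run after each node restart to confirm the node rejoined before the next one is touched.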

5. Install and configure HAProxy

# Install HAProxy on the dedicated load balancer node
dnf install -y haproxy
# Configure HAProxy
vi /etc/haproxy/haproxy.cfg
# Add the following configuration
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon

defaults
    mode tcp
    log global
    # tcplog replaces httplog because the default mode here is tcp
    option tcplog
    option dontlognull
    option redispatch
    retries 3
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout check 10s
    maxconn 3000

listen stats
    bind *:9000
    # The statistics page is served over HTTP, so override the tcp default
    mode http
    timeout http-request 10s
    stats enable
    stats uri /stats
    stats refresh 30s
    stats show-node
    stats auth admin:password

listen mysql-cluster
    bind *:3306
    mode tcp
    balance roundrobin
    option mysql-check user haproxy_check
    server dbnode1 192.168.1.10:3306 check
    server dbnode2 192.168.1.11:3306 check
    server dbnode3 192.168.1.12:3306 check
# Start HAProxy and enable it at boot
systemctl start haproxy
systemctl enable haproxy

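6. Create the health-check user

# Create the user that HAProxy's mysql-check logs in as
# (run on one node only; Galera replicates the statement to the others)
mysql -u root -p
CREATE USER 'haproxy_check'@'%';
FLUSH PRIVILEGES;
quit;

7. Test the cluster

# Test the database connection through the VIP
# (this needs an account that is allowed to connect remotely)
mysql -h 192.168.1.100 -u root -p
# Create a test database and table
CREATE DATABASE testdb;
USE testdb;
CREATE TABLE testtable (id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(50));
INSERT INTO testtable (name) VALUES ('test');
# Verify replication on another node
mysql -h 192.168.1.11 -u root -p
USE testdb;
SELECT * FROM testtable;
# Simulate a node failure (run on one database node, for example dbnode1)
systemctl stop mariadb
# Test availability through the VIP
mysql -h 192.168.1.100 -u root -p
USE testdb;
INSERT INTO testtable (name) VALUES ('test2');
# Bring the node back
systemctl start mariadb
# Verify that the recovered node has caught up
mysql -h 192.168.1.10 -u root -p
USE testdb;
SELECT * FROM testtable;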

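13. Summary and Best Practices

13.1 Summary

This article covered how to build an efficient and stable cluster environment on Rocky Linux, from system installation and cluster configuration through high availability, load balancing, and storage. The two worked examples showed the concrete steps for a web server cluster and a database cluster.

13.2 Best Practices

1. Plan first: understand the requirements and plan hardware, network, software, and storage in detail before building the cluster.
2. Security first: keep security a priority, including system hardening, network security configuration, and data encryption.
3. Monitor thoroughly: set up comprehensive monitoring that tracks cluster state in real time so problems are found and fixed early.
4. Document: record cluster configuration, changes, and incident handling to support later maintenance and troubleshooting.
5. Back up regularly: define a backup strategy and back up critical data and configuration files on a schedule.
6. Test and verify: test the cluster configuration thoroughly, including functional, performance, and failure-recovery tests.
7. Use version control: keep configuration files under version control so changes can be tracked and rolled back.
8. Optimize continuously: tune the cluster based on how it actually runs to improve performance and stability.
9. Train the team: make sure team members know the cluster architecture and management procedures.
10. Engage with the community: take part in the Rocky Linux and related software communities for current information and support.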

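Following these practices helps an enterprise build a Rocky Linux cluster environment that is efficient, stable, secure, and reliable, and that gives the business a solid IT infrastructure.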

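Copyright Notice

1. Reproducing or quoting this site's content (Building a Rocky Linux Cluster Environment in Practice: An Efficient and Stable Enterprise Server Architecture) requires citing the original URL and the author (威震华夏关云长), and stating this site's address (https://www.pixtech.cc/).

2. This site accepts no responsibility for civil disputes, administrative actions, or other losses arising from improper reproduction or quotation of its content.

3. This site reserves the right to pursue legal liability against anyone who disregards this notice or otherwise uses its content unlawfully or maliciously.

Article URL: https://www.pixtech.cc/thread-40932-1-1.html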
