PostgreSQL高可用之repmgr(1主2从+1witness)+Pgpool-II实现主从切换+读写分离

PostgreSQL高可用之repmgr(1主2从+1witness)+Pgpool-II实现主从切换+读写分离

一、总体架构

二、环境申请

三、配置4台主机之间的互信

四、配置repmgr

4.1、主库修改pg_hba.conf参数文件

4.2、主库修改postgresql.conf参数文件

4.3、主库创建相关用户和数据库

4.4、4个节点分别修改repmgr.conf

4.5、在主库注册主库服务

4.6、克隆备库1和备库2

4.6.1、4个节点都配~/.pgpass密码文件

4.6.2、克隆备库1

八、总结

一、总体架构

本文最终需要实现的目标:

1、1主2从流复制

2、读写分离+负载均衡(pgpool-II)

3、主从自动切换(repmgr)

所有的节点:安装操作系统、创建PG用户目录、安装PG软件、安装repmgr

主库:只初始化主库、启动主库归档

二、环境申请

先申请4台机器,6.66节点最后再申请,反正docker容器都是即开即用的。

-- 主库

docker rm -f lhrrepmgr64361

docker run -d --name lhrrepmgr64361 -h lhrrepmgr64361 \

--net=pg-network --ip 172.72.6.61 \

-p 64361:5432 \

-v /sys/fs/cgroup:/sys/fs/cgroup \

--privileged=true lhrbest/lhrpgall:1.0 \

/usr/sbin/init

docker network connect bridge lhrrepmgr64361

-- 从库1

docker rm -f lhrrepmgr64362

docker run -d --name lhrrepmgr64362 -h lhrrepmgr64362 \

--net=pg-network --ip 172.72.6.62 \

-p 64362:5432 \

-v /sys/fs/cgroup:/sys/fs/cgroup \

--privileged=true lhrbest/lhrpgall:1.0 \

/usr/sbin/init

docker network connect bridge lhrrepmgr64362

-- 从库2

docker rm -f lhrrepmgr64363

docker run -d --name lhrrepmgr64363 -h lhrrepmgr64363 \

--net=pg-network --ip 172.72.6.63 \

-p 64363:5432 \

-v /sys/fs/cgroup:/sys/fs/cgroup \

--privileged=true lhrbest/lhrpgall:1.0 \

/usr/sbin/init

docker network connect bridge lhrrepmgr64363

-- Witness库

docker rm -f lhrrepmgr64364

docker run -d --name lhrrepmgr64364 -h lhrrepmgr64364 \

--net=pg-network --ip 172.72.6.64 \

-p 64364:5432 \

-v /sys/fs/cgroup:/sys/fs/cgroup \

--privileged=true lhrbest/lhrpgall:1.0 \

/usr/sbin/init

docker network connect bridge lhrrepmgr64364

-- 重启4台机器

docker restart lhrrepmgr64361 lhrrepmgr64362 lhrrepmgr64363 lhrrepmgr64364

[root@docker35 ~]# docker ps | grep repmgr

31d3d31c1073 lhrbest/lhrpgall:1.0 "/usr/sbin/init" 41 hours ago Up 41 hours 5433-5435/tcp, 0.0.0.0:64364->5432/tcp lhrrepmgr64364

572d4ea2c072 lhrbest/lhrpgall:1.0 "/usr/sbin/init" 41 hours ago Up 41 hours 5433-5435/tcp, 0.0.0.0:64363->5432/tcp lhrrepmgr64363

6ded416b2016 lhrbest/lhrpgall:1.0 "/usr/sbin/init" 41 hours ago Up 41 hours 5433-5435/tcp, 0.0.0.0:64362->5432/tcp lhrrepmgr64362

fa38b58b8f3d lhrbest/lhrpgall:1.0 "/usr/sbin/init" 41 hours ago Up 41 hours 5433-5435/tcp, 0.0.0.0:64361->5432/tcp lhrrepmgr64361

[root@docker35 ~]#

👉 注意:

1、镜像lhrbest/lhrpgall:1.0中,申请的4台主机均已安装好PG 13和repmgr软件

2、PG安装方式为编译安装,数据库已初始化完成,用户为pg13

3、该镜像里已安装了PG13、PG12、PG11、PG10和PG9.6这几个版本,均为编译安装,本文使用PG13来操作

三、配置4台主机之间的互信

-- 只在主库61节点上,以root用户执行:

chmod +x sshUserSetup.sh

./sshUserSetup.sh -user pg13 -hosts "172.72.6.61 172.72.6.62 172.72.6.63 172.72.6.64" -advanced exverify -confirm

-- 在每台主机修改权限,否则报错:Bad owner or permissions on /home/pg13/.ssh/config

chmod 600 /home/pg13/.ssh/config

依次输入:yes,回车和密码即可。

脚本下载:https://share.weiyun.com/57HUxNi

验证:

ssh pg13@lhrrepmgr64361 date

ssh pg13@lhrrepmgr64362 date

ssh pg13@lhrrepmgr64363 date

ssh pg13@lhrrepmgr64364 date

第二次执行时不再提示输入yes,并且可以成功执行命令,则表示SSH对等性配置成功。

执行过程:

[pg13@lhrrepmgr64361 /]# ssh pg13@lhrrepmgr64362 date

Tue Apr 27 17:15:04 CST 2021

[root@lhrrepmgr64361 /]#

[pg13@lhrrepmgr64361 /]# ssh pg13@lhrrepmgr64363 date

Tue Apr 27 17:15:08 CST 2021

[root@lhrrepmgr64361 /]#

[pg13@lhrrepmgr64361 /]# ssh pg13@lhrrepmgr64364 date

Tue Apr 27 17:15:13 CST 2021

[pg13@lhrrepmgr64361 /]#

四、配置repmgr

4.1、主库修改pg_hba.conf参数文件

-- 进入主库

docker exec -it lhrrepmgr64361 bash

su - pg13

cat >> /pg13/pgdata/pg_hba.conf <<"EOF"

local repmgr repmgr md5

host repmgr repmgr 127.0.0.1/32 md5

host repmgr repmgr 172.72.6.0/24 md5

local replication repmgr md5

host replication repmgr 127.0.0.1/32 md5

host replication repmgr 172.72.6.0/24 md5

EOF

4.2、主库修改postgresql.conf参数文件

-- 修改参数

cat >> /pg13/pgdata/postgresql.conf <<"EOF"

# 归档参数

wal_level='replica'

archive_mode='on'

archive_command='test ! -f /pg13/archive/%f && cp %p /pg13/archive/%f'

restore_command='cp /pg13/archive/%f %p'

# 主从流复制

hot_standby=on

max_wal_senders=10

wal_sender_timeout=60s

wal_keep_size=16MB

# 主从切换参数,启用PG数据库的复制槽,PG12不需要"use_replication_slots=true"这个参数了。

max_replication_slots=10

wal_log_hints=on

# 自动切换

shared_preload_libraries ='repmgr'

EOF

-- 重启

pg_ctl start

-- 查询

psql -U postgres -h 192.168.66.35 -p 64361

select * from pg_settings where name in ('wal_level','archive_mode','archive_command');

-- 切换归档

select pg_switch_wal();

4.3、主库创建相关用户和数据库

-- 创建相关用户和数据库

su - pg13

createuser -s repmgr

createdb repmgr -O repmgr

psql -h 127.0.0.1 -c "alter user repmgr with password 'lhr';"

psql -h 127.0.0.1 -c "alter user repmgr set search_path to repmgr, \"\$user\",public;"

结果:

[root@lhrrepmgr64361 /]# systemctl start pg13

[root@lhrrepmgr64361 /]# systemctl status pg13

● pg13.service - PostgreSQL database server

Loaded: loaded (/etc/systemd/system/pg13.service; disabled; vendor preset: disabled)

Active: active (running) since Tue 2021-04-27 16:25:24 CST; 6s ago

Docs: man:postgres(1)

Process: 769 ExecStart=/pg13/pg13/bin/pg_ctl start -D ${PGDATA} -s -o -p ${PGPORT} -w -t 300 (code=exited, status=0/SUCCESS)

Main PID: 771 (postgres)

CGroup: /docker/a777ef12d5ff83a9d47be51a98531bd45b42d2b008f7a25f894f3244ce9cc0d4/system.slice/pg13.service

├─771 /pg13/pg13/bin/postgres -D /pg13/pgdata -p 5432

├─772 postgres: logger

├─774 postgres: checkpointer

├─775 postgres: background writer

├─776 postgres: walwriter

├─777 postgres: autovacuum launcher

├─778 postgres: stats collector

└─779 postgres: logical replication launcher

Apr 27 16:25:23 lhrrepmgr64361 systemd[1]: Starting PostgreSQL database server...

Apr 27 16:25:24 lhrrepmgr64361 pg_ctl[769]: 2021-04-27 16:25:24.024 CST [771] LOG: redirecting log output to logging collector process

Apr 27 16:25:24 lhrrepmgr64361 pg_ctl[769]: 2021-04-27 16:25:24.024 CST [771] HINT: Future log output will appear in directory "pg_log".

Apr 27 16:25:24 lhrrepmgr64361 systemd[1]: Started PostgreSQL database server.

[root@lhrrepmgr64361 /]# su - pg13

Last login: Tue Apr 27 16:24:50 CST 2021 on pts/0

[pg13@lhrrepmgr64361 ~]$

[pg13@lhrrepmgr64361 ~]$ createuser -s repmgr

[pg13@lhrrepmgr64361 ~]$

[pg13@lhrrepmgr64361 ~]$ createdb repmgr -O repmgr

[pg13@lhrrepmgr64361 ~]$

[pg13@lhrrepmgr64361 ~]$ psql -h 127.0.0.1 -c "alter user repmgr with password 'lhr'"

ALTER ROLE

[pg13@lhrrepmgr64361 ~]$

[pg13@lhrrepmgr64361 ~]$ psql -h 127.0.0.1 -c "alter user repmgr set search_path to repmgr, \"\$user\",public";

ALTER ROLE

C:\Users\lhrxxt>psql -U postgres -h 192.168.66.35 -p 64361

Password for user postgres:

psql (13.2)

Type "help" for help.

postgres=# \du

List of roles

Role name | Attributes | Member of

-----------+------------------------------------------------------------+-----------

postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

repmgr | Superuser, Create role, Create DB | {}

postgres=# \l

List of databases

Name | Owner | Encoding | Collate | Ctype | Access privileges

-----------+----------+----------+------------+------------+-----------------------

postgres | postgres | UTF8 | en_US.utf8 | en_US.utf8 |

repmgr | repmgr | UTF8 | en_US.utf8 | en_US.utf8 |

template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +

| | | | | postgres=CTc/postgres

template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +

| | | | | postgres=CTc/postgres

(4 rows)

4.4、4个节点分别修改repmgr.conf

-- 以pg13用户修改

su - pg13

-- 主库

cat > /pg13/pg13/repmgr.conf << "EOF"

node_id=1

node_name=lhrrepmgr64361

conninfo='host=172.72.6.61 user=repmgr password=lhr dbname=repmgr connect_timeout=2'

data_directory='/pg13/pgdata'

pg_bindir='/pg13/pg13/bin'

EOF

-- 从库1

cat > /pg13/pg13/repmgr.conf << "EOF"

node_id=2

node_name=lhrrepmgr64362

conninfo='host=172.72.6.62 user=repmgr password=lhr dbname=repmgr connect_timeout=2'

data_directory='/pg13/pgdata'

pg_bindir='/pg13/pg13/bin'

EOF

-- 从库2

cat > /pg13/pg13/repmgr.conf << "EOF"

node_id=3

node_name=lhrrepmgr64363

conninfo='host=172.72.6.63 user=repmgr password=lhr dbname=repmgr connect_timeout=2'

data_directory='/pg13/pgdata'

pg_bindir='/pg13/pg13/bin'

EOF

-- witness节点

cat > /pg13/pg13/repmgr.conf << "EOF"

node_id=4

node_name=lhrrepmgr64364

conninfo='host=172.72.6.64 user=repmgr password=lhr dbname=repmgr connect_timeout=2'

data_directory='/pg13/pgdata'

pg_bindir='/pg13/pg13/bin'

EOF

-- 测试

psql 'host=172.72.6.61 user=repmgr password=lhr dbname=repmgr connect_timeout=2'

psql 'host=172.72.6.62 user=repmgr password=lhr dbname=repmgr connect_timeout=2'

psql 'host=172.72.6.63 user=repmgr password=lhr dbname=repmgr connect_timeout=2'

psql 'host=172.72.6.64 user=repmgr password=lhr dbname=repmgr connect_timeout=2'

4.5、在主库注册主库服务

-- 注册服务

repmgr -f /pg13/pg13/repmgr.conf primary register

-- 查询

repmgr -f /pg13/pg13/repmgr.conf cluster show

psql -U repmgr -h 192.168.66.35 -p 64361 -d repmgr

执行过程:

[pg13@lhrrepmgr64361 pg13]$ repmgr -f /pg13/pg13/repmgr.conf primary register

INFO: connecting to primary database...

NOTICE: attempting to install extension "repmgr"

NOTICE: "repmgr" extension successfully installed

NOTICE: primary node record (ID: 1) registered

[pg13@lhrrepmgr64361 pg13]$

[pg13@lhrrepmgr64361 pg13]$ repmgr -f /pg13/pg13/repmgr.conf cluster show

ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string

----+----------------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------

1 | lhrrepmgr64361 | primary | * running | | default | 100 | 1 | host=172.72.6.61 user=repmgr password=lhr dbname=repmgr connect_timeout=2

C:\Users\lhrxxt>psql -U repmgr -h 192.168.66.35 -p 64361 -d repmgr

Password for user repmgr:

psql (13.2)

Type "help" for help.

repmgr=# select * from repmgr.nodes;

node_id | upstream_node_id | active | node_name | type | location | priority | conninfo | repluser | slot_name | config_file

---------+------------------+--------+----------------+---------+----------+----------+---------------------------------------------------------------------------+----------+-----------+------------------------

1 | | t | lhrrepmgr64361 | primary | default | 100 | host=172.72.6.61 user=repmgr password=lhr dbname=repmgr connect_timeout=2 | repmgr | | /pg13/pg13/repmgr.conf

(1 row)

4.6、克隆备库1和备库2

4.6.1、4个节点都配~/.pgpass密码文件

su - pg13

echo "*:*:*:repmgr:lhr" > ~/.pgpass

chmod 0600 ~/.pgpass

-- 测试,若不需要密码,那就证明配置正确

psql 'host=172.72.6.61 user=repmgr dbname=repmgr connect_timeout=2'

psql 'host=172.72.6.62 user=repmgr dbname=repmgr connect_timeout=2'

psql 'host=172.72.6.63 user=repmgr dbname=repmgr connect_timeout=2'

psql 'host=172.72.6.64 user=repmgr dbname=repmgr connect_timeout=2'

4.6.2、克隆备库1

-- 其中--dry-run表示不真实执行克隆过程,只是检查是否有问题

repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone --force --dry-run

-- 执行克隆命令,其实还是调用pg_basebackup

repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone --force

-- 启动

pg_ctl -D /pg13/pgdata start

-- 查询

psql

select * from pg_stat_wal_receiver;

执行过程:

[pg13@lhrrepmgr64362 ~]$ repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone --force --dry-run

NOTICE: destination directory "/pg13/pgdata" provided

INFO: connecting to source node

DETAIL: connection string is: host=172.72.6.61 user=repmgr dbname=repmgr

DETAIL: current installation size is 29 MB

INFO: "repmgr" extension is installed in database "repmgr"

WARNING: target data directory appears to be a PostgreSQL data directory

DETAIL: target data directory is "/pg13/pgdata"

HINT: use -F/--force to overwrite the existing data directory

INFO: replication slot usage not requested; no replication slot will be set up for this standby

INFO: parameter "max_wal_senders" set to 10

NOTICE: checking for available walsenders on the source node (2 required)

INFO: sufficient walsenders available on the source node

DETAIL: 2 required, 10 available

NOTICE: checking replication connections can be made to the source server (2 required)

INFO: required number of replication connections could be made to the source server

DETAIL: 2 replication connections required

WARNING: data checksums are not enabled and "wal_log_hints" is "off"

DETAIL: pg_rewind requires "wal_log_hints" to be enabled

NOTICE: standby will attach to upstream node 1

HINT: consider using the -c/--fast-checkpoint option

INFO: all prerequisites for "standby clone" are met

[pg13@lhrrepmgr64362 ~]$ repmgrrepmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone^C

[pg13@lhrrepmgr64362 ~]$ repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone

NOTICE: destination directory "/pg13/pgdata" provided

INFO: connecting to source node

DETAIL: connection string is: host=172.72.6.61 user=repmgr dbname=repmgr

DETAIL: current installation size is 29 MB

ERROR: target data directory appears to be a PostgreSQL data directory

DETAIL: target data directory is "/pg13/pgdata"

HINT: use -F/--force to overwrite the existing data directory

[pg13@lhrrepmgr64362 ~]$ repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone --force

NOTICE: destination directory "/pg13/pgdata" provided

INFO: connecting to source node

DETAIL: connection string is: host=172.72.6.61 user=repmgr dbname=repmgr

DETAIL: current installation size is 29 MB

INFO: replication slot usage not requested; no replication slot will be set up for this standby

NOTICE: checking for available walsenders on the source node (2 required)

NOTICE: checking replication connections can be made to the source server (2 required)

WARNING: data checksums are not enabled and "wal_log_hints" is "off"

DETAIL: pg_rewind requires "wal_log_hints" to be enabled

WARNING: directory "/pg13/pgdata" exists but is not empty

NOTICE: -F/--force provided - deleting existing data directory "/pg13/pgdata"

NOTICE: starting backup (using pg_basebackup)...

HINT: this may take some time; consider using the -c/--fast-checkpoint option

INFO: executing:

/pg13/pg13/bin/pg_basebackup -l "repmgr base backup" -D /pg13/pgdata -h 172.72.6.61 -p 5432 -U repmgr -X stream

Password:

WARNING: skipping special file "./.s.PGSQL.5432"

WARNING: skipping special file "./.s.PGSQL.5432"

NOTICE: standby clone (using pg_basebackup) complete

NOTICE: you can now start your PostgreSQL server

HINT: for example: pg_ctl -D /pg13/pgdata start

HINT: after starting the server, you need to register this standby with "repmgr standby register"

[pg13@lhrrepmgr64362 ~]$ pg_ctl -D /pg13/pgdata start

waiting for server to start....2021-04-27 19:14:08.750 CST [9298] LOG: redirecting log output to logging collector process

2021-04-27 19:14:08.750 CST [9298] HINT: Future log output will appear in directory "pg_log".

done

server started

[pg13@lhrrepmgr64362 ~]$ psql

psql (13.2)

Type "help" for help.

postgres=# select * from pg_stat_wal_receiver;

pid | status | receive_start_lsn | receive_start_tli | written_lsn | flushed_lsn | received_tli | last_msg_send_time | last_msg_receipt_time | latest_end_lsn | latest_end_time | slot_name | sender_host | sender_port | conninfo

------+-----------+-------------------+-------------------+-------------+-------------+--------------+-------------------------------+-------------------------------+----------------+-------------------------------+-----------+-------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

9304 | streaming | 0/3000000 | 1 | 0/30001F0 | 0/30001F0 | 1 | 2021-04-27 19:15:08.862599+08 | 2021-04-27 19:15:08.862719+08 | 0/30001F0 | 2021-04-27 19:14:08.830865+08 | | 172.72.6.61 | 5432 | user=repmgr password=******** channel_binding=disable connect_timeout=2 dbname=replication host=172.72.6.61 port=5432 application_name=lhrrepmgr64362 fallback_application_name=walreceiver sslmode=disable sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any

(1 row)

信息“NOTICE: standby clone (using pg_basebackup) complete”表示测试完成。

八、总结

1、由于主从切换需要启动数据库,所以,在整个环境中,所有的PG数据库均不能使用systemctl来管理,否则切换会出现问题。

2、若要清除节点的注册信息,也可以进入主库中,操作repmgr.nodes表信息。

相关推荐

情侣冷战多少天可以分手了 情侣冷战了怎么和好
365彩票数据最专业

情侣冷战多少天可以分手了 情侣冷战了怎么和好

📅 07-07 👁️ 7427
定位球安天下
365彩票数据最专业

定位球安天下

📅 07-30 👁️ 6075
如何通过 7 种方法在电脑上运行安卓APP?
365账号限制投注怎么办

如何通过 7 种方法在电脑上运行安卓APP?

📅 07-30 👁️ 6333
计算机上的酷狗音乐被锁定,Win10系统酷狗音乐锁定、解锁歌词图文教程
华为mate40怎么抢购最新
365彩票数据最专业

华为mate40怎么抢购最新

📅 07-07 👁️ 6584
帅邦品牌介绍
365账号限制投注怎么办

帅邦品牌介绍

📅 08-25 👁️ 5960
准确预测6场世界杯足球赛 北京故宫流浪猫病逝 – 法国国际广播电台
快手本地音乐怎么才能添加上去?快手直播本地音乐怎么才能添加上去
Word查找和替换通配符(完全版)
s365 2.2.3

Word查找和替换通配符(完全版)

📅 07-15 👁️ 6486