pgpool ii 的ha方案 watch dog -m6米乐安卓版下载

原创 大表哥 2022-06-30
1838

image.png

大家好,这次大表哥带来技术分享是 pgpool ii 这个连接池自身的ha方案: watch dog.

本文参考m6米乐安卓版下载官网的 configuration examples:

https://tatsuo-ishii.github.io/pgpool-ii/current/example-cluster.html

watch dog 是pgpool 的内置的监控进程不需要独立安装,打开相关的参数开关即可。

具体的配置步骤如下:

1)官方建议配置至少3台(奇数个)几点配置watch dog 进程。

image.png

ip pg pool role pg instance role
10.67.38.50 leader primary database
10.67.39.149 standby standby database
10.67.39.49 standby standy database

2)手动创建 pgpool_node_id 文件, 在路径下:${pgpool_home}/etc
这一步是必须的,否则启动会报错:if watchdog is enable, pgpool_node_id file is required

infra [root@wqdcsrv3352 etc]# /opt/postgresql/pgpool4.3/bin/pgpool 2022-06-30 14:53:25.867: main pid 69061: 2022-06-30 14:53:25fatal: pgpool node id file /opt/postgresql/pgpool4.3/etc/pgpool_node_id does not exist 2022-06-30 14:53:25.867: main pid 69061: 2022-06-30 14:53:25detail: if watchdog is enable, pgpool_node_id file is required
infra [postgres@wqdcsrv3352 pgpool4.3]# vi pgpool_node_id infra [postgres@wqdcsrv3352 pgpool4.3]# cat pgpool_node_id 0 infra [postgres@wqdcsrv3354 pgpool]# vi pgpool_node_id infra [postgres@wqdcsrv3354 pgpool]# cat pgpool_node_id 1 infra [postgres@wqdcsrv3353 pgpool]# vi pgpool_node_id infra [postgres@wqdcsrv3353 pgpool]# cat pgpool_node_id 2

3)关于watch dog 核心参数的配置: 配置文件 ${pgpool_home}/etc/pgpool.conf

#------------------------------------------------------------------------------ # watchdog #------------------------------------------------------------------------------ ## watch dog的开关 use_watchdog = on ## watch 的节点 ip,port 信息 hostname0 = '10.67.38.50' wd_port0 = 9000 pgpool_port0 = 9999 hostname1 = '10.67.39.149' wd_port1 = 9000 pgpool_port1 = 9999 hostname2 = '10.67.39.49' wd_port2 = 9000 pgpool_port2 = 9999 ### 优先级的设置 这里我们设置成一致的 为1 , ### 没有设置加密方式 ### 进程文件设置在了 /tmp下面 wd_priority = 1 wd_authkey = '' wd_ipc_socket_dir = '/tmp' ###vip相关的设置 需要根据实际的 网卡设备名称 eth0 ### 这个我们pgpool 是通过 pg pool 启动的,所以 不需要 sudo 可以直接执行, 如果是 postgres等普通权限的用户,需要添加 /sbin/sudo delegate_ip = '10.67.39.200' if_up_cmd = '/sbin/ip addr add $_ip_$/24 dev eth0 label eth0:0' if_down_cmd = '/sbin/ip addr del $_ip_$/24 dev eth0' arping_cmd = '/usr/sbin/arping -u $_ip_$ -w 1 -i eth0' wd_escalation_command = '/opt/postgresql/pgpool4.3/etc/escalation.sh' ### 这个脚本的执行发生在挂上vip之前,需要把所有的节点vip 卸载一遍,以防止脑裂的发生 ###watch dog 的心跳方式配置: ###通过网卡端口监控, 10秒为超时 wd_lifecheck_method = 'heartbeat' wd_interval = 10 heartbeat_hostname0 = '10.67.38.50' heartbeat_port0 = 9694 heartbeat_device0 = 'eth0' heartbeat_hostname1 = '10.67.39.149' heartbeat_port1 = 9694 heartbeat_device1 = 'eth0' heartbeat_hostname2 = '10.67.39.49' heartbeat_port2 = 9694 heartbeat_device2 = 'eth0' wd_heartbeat_keepalive = 2 wd_heartbeat_deadtime = 30
  1. 需要修改一下 escalation.sh 这个shell 文件

修改成实际节点的ip和vip

#!/bin/bash # this script is run by wd_escalation_command to bring down the virtual ip on other pgpool nodes # before bringing up the virtual ip on the new active pgpool node. set -o xtrace pgpools=(10.67.38.50 10.67.39.149 10.67.39.49) hostname=`hostname -i` vip=10.67.39.200 device=eth0 for pgpool in "${pgpools[@]}"; do [ "$hostname" = "$pgpool" ] && continue ssh -t -o stricthostkeychecking=no -o userknownhostsfile=/dev/null postgres@$pgpool -i ~/.ssh/id_rsa_pgpool " /usr/bin/sudo /sbin/ip addr del $vip/24 dev $device " done exit 0

5)依次启动3个节点的 watch dog (这里我们用大权限的账户 root ), 观察 watch dog的 节点状态

infra [root@wqdcsrv3352 etc]# /opt/postgresql/pgpool4.3/bin/pgpool infra [root@wqdcsrv3353 etc]# /opt/postgresql/pgpool4.3/bin/pgpool infra [root@wqdcsrv3354 etc]# /opt/postgresql/pgpool4.3/bin/pgpool infra [postgres@wqdcsrv3352 etc]# /opt/postgresql/pgpool4.3/bin/pcp_watchdog_info -h localhost -p 9898 -u postgres password: 3 3 yes 10.67.38.50:9999 linux wqdcsrv3352.cn.infra 10.67.38.50 10.67.38.50:9999 linux wqdcsrv3352.cn.infra 10.67.38.50 9999 9000 4 leader 0 member 10.67.39.149:9999 linux wqdcsrv3354.cn.infra 10.67.39.149 9999 9000 7 standby 0 member 10.67.39.49:9999 linux wqdcsrv3353.cn.infra 10.67.39.49 9999 9000 7 standby 0 member

6)手动测试vip 漂移
我们尝试关闭 leader 节点上的pgpool : 10.67.38.50

infra [root@wqdcsrv3352 ~]# /opt/postgresql/pgpool4.3/bin/pgpool -m fast stop 2022-06-29 17:34:33.552: main pid 76158: 2022-06-29 17:34:33log: stop request sent to pgpool (pid: 73720). waiting for termination... .done.

我们可以从pgpool 的日志中,看到vip已经被移除了

2022-06-29 17:34:33.552: main pid 73720: 2022-06-29 17:34:33log: terminating all child processes 2022-06-29 17:34:33.556: watchdog pid 73723: 2022-06-29 17:34:33log: watchdog is shutting down 2022-06-29 17:34:33.556: watchdog_utility pid 76159: 2022-06-29 17:34:33log: watchdog: de-escalation started 2022-06-29 17:34:33.558: watchdog_utility pid 76159: 2022-06-29 17:34:33log: successfully released the delegate ip:"10.67.39.200" 2022-06-29 17:34:33.558: watchdog_utility pid 76159: 2022-06-29 17:34:33detail: 'if_down_cmd' returned with success 2022-06-29 17:34:33.559: main pid 73720: 2022-06-29 17:34:33log: pgpool-ii system is shutdown
infra [root@wqdcsrv3352 ~]# ip addr | grep 10.67.39.200

我们这个时候发现vip 10.67.39.200 已经漂移到了节点 149:

infra [root@wqdcsrv3354 ~]# ip addr |grep 10.67.39.200 inet 10.67.39.200/24 scope global eth0:0

此时我们再次观察pgpool 集群的状态:

infra [root@wqdcsrv3354 ~]# /opt/postgresql/pgpool4.3/bin/pcp_watchdog_info -h localhost -p 9898 -u postgres password: 3 3 yes 10.67.39.149:9999 linux wqdcsrv3354.cn.infra 10.67.39.149 10.67.39.149:9999 linux wqdcsrv3354.cn.infra 10.67.39.149 9999 9000 4 leader 0 member 10.67.38.50:9999 linux wqdcsrv3352.cn.infra 10.67.38.50 9999 9000 10 shutdown 0 member 10.67.39.49:9999 linux wqdcsrv3353.cn.infra 10.67.39.49 9999 9000 7 standby 0 member

我们手动起来 10.67.38.50 节点上的 pgpool:

infra [root@wqdcsrv3352 ~]# /opt/postgresql/pgpool4.3/bin/pgpool -f /opt/postgresql/pgpool4.3/etc/pgpool.conf -f /opt/postgresql/pgpool4.3/etc/pcp.conf > /tmp/pgpool.log

这个时候10.67.38.50 的节点 恢复成了 standby 的状态:

3 3 yes 10.67.39.149:9999 linux wqdcsrv3354.cn.infra 10.67.39.149 10.67.39.149:9999 linux wqdcsrv3354.cn.infra 10.67.39.149 9999 9000 4 leader 0 member 10.67.38.50:9999 linux wqdcsrv3352.cn.infra 10.67.38.50 9999 9000 7 standby 0 member 10.67.39.49:9999 linux wqdcsrv3353.cn.infra 10.67.39.49 9999 9000 7 standby 0 member

下面我们来用springboot的小程序测试一下 pgpool 的ha :

我们的程序的数据库配置是 这次是连接到 vip 10.67.39.200 上面
连接池的配置是 最大允许 200 个连接,最小空闲的连接数是10

//数据库的地址以及端口号 spring.datasource.url=jdbc:postgresql://10.67.39.200:9999/postgres spring.datasource.username=app_ha_user spring.datasource.password=app_ha_user spring.datasource.driverclassname=org.postgresql.driver spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.postgresqldialect spring.jpa.properties.hibernate.hbm2ddl.auto=update spring.datasource.maximum-pool-size=200 spring.datasource.min-idle=10

测试生成很简单,就是网页输出一下数据库的ip地址:

string sql = "select inet_server_addr()"; // 通过jdbctemplate查询数据库 string hostname = (string)jdbctemplate.queryforobject( sql, string.class); return "hello ,you are connecting to " hostname;

启动测试程序后,观察初始化的连接池:

postgres=> show pool_processes; pool_pid | start_time | client_connection_count | database | username | backend_connection_time | pool_counter | status ---------- ------------------------------------------------------ ------------------------- ---------- ------------- ------------------------- -------------- --------------------- 80665 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80666 | 2022-06-29 17:30:21 | 0 | postgres | app_ha_user | 2022-06-30 11:29:50 | 1 | idle 64166 | 2022-06-30 11:11:51 | 2 | | | | | wait for connection 80668 | 2022-06-29 17:30:21 | 0 | postgres | app_ha_user | 2022-06-30 11:29:51 | 1 | idle 80669 | 2022-06-29 17:30:21 | 1 | | | | | wait for connection 65854 | 2022-06-30 11:27:54 | 0 | | | | | wait for connection 80671 | 2022-06-29 17:30:21 | 0 | postgres | app_ha_user | 2022-06-30 11:29:50 | 1 | idle 80672 | 2022-06-29 17:30:21 | 0 | postgres | app_ha_user | 2022-06-30 11:29:50 | 1 | idle 80673 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80674 | 2022-06-29 17:30:21 | 0 | postgres | app_ha_user | 2022-06-30 11:29:51 | 1 | idle 80675 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80676 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80677 | 2022-06-29 17:30:21 | 0 | postgres | app_ha_user | 2022-06-30 11:29:51 | 1 | idle 80678 | 2022-06-29 17:30:21 | 1 | postgres | app_ha_user | 2022-06-30 11:29:50 | 1 | idle 80679 | 2022-06-29 17:30:21 | 0 | postgres | app_ha_user | 2022-06-30 11:29:50 | 1 | idle 80680 | 2022-06-29 17:30:21 (4:15 before process restarting) | 1 | | | | | wait for connection 80681 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80682 | 2022-06-29 17:30:21 | 0 | postgres | app_ha_user | 2022-06-30 11:29:51 | 1 | idle 80683 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80684 | 2022-06-29 17:30:21 | 2 | | | | | wait for connection 80685 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80686 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80687 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80688 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80689 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80690 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80691 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80692 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80693 | 2022-06-29 17:30:21 | 0 | postgres | app_ha_user | 2022-06-30 11:29:51 | 1 | idle 80694 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection 80695 | 2022-06-29 17:30:21 | 0 | postgres | app_ha_user | 2022-06-30 11:30:30 | 1 | execute command 80696 | 2022-06-29 17:30:21 | 0 | | | | | wait for connection

尝试访问web界面: http://127.0.0.1:8066/ 我们可以看到 50这个节点是主库

image.png

关闭vip所在节点的pgpool :

infra [root@wqdcsrv3354 ~]# ip addr 1: lo: mtu 65536 qdisc noqueue state unknown group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0: mtu 1500 qdisc mq state up group default qlen 1000 link/ether 00:50:56:ae:f7:e7 brd ff:ff:ff:ff:ff:ff inet 10.67.39.149/22 brd 10.67.39.255 scope global eth0 valid_lft forever preferred_lft forever inet 10.67.39.200/24 scope global eth0:0 valid_lft forever preferred_lft forever infra [root@wqdcsrv3354 ~]# /opt/postgresql/pgpool4.3/bin/pgpool -m fast stop 2022-06-30 11:34:08.821: main pid 66716: 2022-06-30 11:34:08log: stop request sent to pgpool (pid: 80655). waiting for termination... .done.

这个时候我们看到vip 漂移到了 49这个节点:

infra [postgres@wqdcsrv3353 ~]# ip addr 1: lo: mtu 65536 qdisc noqueue state unknown group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0: mtu 1500 qdisc mq state up group default qlen 1000 link/ether 00:50:56:ae:99:07 brd ff:ff:ff:ff:ff:ff inet 10.67.39.49/22 brd 10.67.39.255 scope global eth0 valid_lft forever preferred_lft forever inet 10.67.39.200/24 scope global eth0:0 valid_lft forever preferred_lft forever

再次访问网页: http://127.0.0.1:8066/ 我们可以看到通过vip 依然可以连接到 主库 50 这个节点.

我们可以观察到 测试程序的连接池进行了重连

2022-06-30 11:39:01,161 warn (poolbase.java:184)- hikaripool-1 - failed to validate connection org.postgresql.jdbc.pgconnection@51849c83 (this connection has been closed.). possibly consider using a shorter maxlifetime value. 2022-06-30 11:39:01,164 warn (poolbase.java:184)- hikaripool-1 - failed to validate connection org.postgresql.jdbc.pgconnection@46432d85 (this connection has been closed.). possibly consider using a shorter maxlifetime value. 2022-06-30 11:39:01,168 warn (poolbase.java:184)- hikaripool-1 - failed to validate connection org.postgresql.jdbc.pgconnection@33d6c14a (this connection has been closed.). possibly consider using a shorter maxlifetime value. 2022-06-30 11:39:01,169 warn (poolbase.java:184)- hikaripool-1 - failed to validate connection org.postgresql.jdbc.pgconnection@64116a2c (this connection has been closed.). possibly consider using a shorter maxlifetime value. 2022-06-30 11:39:01,169 warn (poolbase.java:184)- hikaripool-1 - failed to validate connection org.postgresql.jdbc.pgconnection@6641e508 (this connection has been closed.). possibly consider using a shorter maxlifetime value. 2022-06-30 11:39:01,170 warn (poolbase.java:184)- hikaripool-1 - failed to validate connection org.postgresql.jdbc.pgconnection@3931c340 (this connection has been closed.). possibly consider using a shorter maxlifetime value. 2022-06-30 11:39:01,170 warn (poolbase.java:184)- hikaripool-1 - failed to validate connection org.postgresql.jdbc.pgconnection@3ce889bd (this connection has been closed.). possibly consider using a shorter maxlifetime value. 2022-06-30 11:39:01,171 warn (poolbase.java:184)- hikaripool-1 - failed to validate connection org.postgresql.jdbc.pgconnection@5347c406 (this connection has been closed.). possibly consider using a shorter maxlifetime value. 2022-06-30 11:39:01,172 warn (poolbase.java:184)- hikaripool-1 - failed to validate connection org.postgresql.jdbc.pgconnection@3a4712c3 (this connection has been closed.). possibly consider using a shorter maxlifetime value. 2022-06-30 11:39:01,172 warn (poolbase.java:184)- hikaripool-1 - failed to validate connection org.postgresql.jdbc.pgconnection@ddbed08 (this connection has been closed.). possibly consider using a shorter maxlifetime value.
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【米乐app官网下载的版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

网站地图