
pg_auto_failover: verifying automatic failover across multiple scenarios

 

An earlier post summarized the pg_auto_failover parameters that govern automatic failover for PostgreSQL high availability. Here we test the various failover scenarios in practice and measure how long each one takes.

 

1. Scenario 1: auto failover when the primary's database goes down but the OS stays up

At the OS level, manually stop the pg_autoctl service on the primary. The monitor then drives the failover through the following steps:
1. The monitor detects the primary failure
  1649|2026-04-13 16:36:36.460084 +0800
2. The monitor starts retrying against the primary
  1650|2026-04-13 16:36:48.646945 +0800
  This phase is governed by the following parameters:
  (a) health_check_period
  (b) health_check_retry_delay
  (c) health_check_max_retries
  (d) health_check_timeout
  The ~12 s elapsed here ≈ health_check_max_retries (2) × health_check_timeout (5 s) + health_check_retry_delay (2 s)
3. Once the retries confirm the primary is down, the secondary stops replication
  1655|2026-04-13 16:36:49.968158 +0800
4. After primary_demote_timeout (30 s), the secondary moves to wait_primary
  1656|2026-04-13 16:37:18.966193 +0800
  1657|2026-04-13 16:37:18.966193 +0800
5. The secondary's new state is confirmed
  1658|2026-04-13 16:37:19.088174 +0800
Total failover time: about 43 seconds.
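The timing arithmetic above can be sketched as follows. This is a rough model, not pg_autoctl code; the parameter values (2 retries, 5 s timeout, 2 s retry delay, 30 s demote timeout) are the ones in effect in this test, and the ~1 s of overhead is simply what the event log shows for state-machine round trips:

```python
# Rough failover-time model for pg_auto_failover, using the parameter
# values from this test (hypothetical helper, not part of pg_autoctl).
def detection_seconds(max_retries=2, timeout=5, retry_delay=2):
    # Each failed probe waits up to health_check_timeout; retries are
    # spaced by health_check_retry_delay.
    return max_retries * timeout + retry_delay

def total_failover_seconds(primary_demote_timeout=30, overhead=1):
    # detection + primary_demote_timeout + state-machine overhead
    return detection_seconds() + primary_demote_timeout + overhead

print(detection_seconds())       # 12, matching the ~12 s observed in step 2
print(total_failover_seconds())  # 43, matching the total observed above
```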

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  34: 0/2D9169D0 |   read-write |             primary |             primary
ubuntu13 |   123 | ubuntu13:9300 |  34: 0/2D9169D0 |    read-only |           secondary |           secondary


root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  34: 0/2D9169D0 | read-write ! |             primary |             primary
ubuntu13 |   123 | ubuntu13:9300 |  34: 0/2D9169D0 |    read-only |           secondary |           secondary



root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  34: 0/2D9169D0 | read-write ! |             primary |      demote_timeout
ubuntu13 |   123 | ubuntu13:9300 |  35: 0/2D916B98 |    read-only |    stop_replication |    stop_replication



root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  34: 0/2D9169D0 | read-write ! |             primary |             demoted
ubuntu13 |   123 | ubuntu13:9300 |  35: 0/2D916B98 |   read-write |        wait_primary |        wait_primary



Log: the following is a complete failover event log. I have only added annotations and made no other changes.
select * from event;

eventid|eventtime                       |formationid|nodeid|groupid|nodename|nodehost|nodeport|reportedstate    |goalstate        |reportedrepstate|reportedtli|reportedlsn|candidatepriority|replicationquorum|description                                                                                                                                                                                                      |
-------+--------------------------------+-----------+------+-------+--------+--------+--------+-----------------+-----------------+----------------+-----------+-----------+-----------------+-----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   1649|2026-04-13 16:36:36.460084 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|primary          |primary          |quorum          |         34|0/2D9169D0 |               50|true             |Node node 113 "ubuntu12" (ubuntu12:9300) is marked as unhealthy by the monitor                                                                                                                                   |
   1650|2026-04-13 16:36:48.646945 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|secondary        |prepare_promotion|unknown         |         34|0/2D9169D0 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to draining and node 123 "ubuntu13" (ubuntu13:9300) to prepare_promotion after node 113 "ubuntu12" (ubuntu12:9300) became unhealthy.                   |
   1651|2026-04-13 16:36:48.646945 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|primary          |draining         |quorum          |         34|0/2D9169D0 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to draining and node 123 "ubuntu13" (ubuntu13:9300) to prepare_promotion after node 113 "ubuntu12" (ubuntu12:9300) became unhealthy.                   |
   1652|2026-04-13 16:36:48.747582 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|prepare_promotion|prepare_promotion|unknown         |         34|0/2D9169D0 |               50|true             |New state is reported by node 123 "ubuntu13" (ubuntu13:9300): "prepare_promotion"                                                                                                                                |
   1653|2026-04-13 16:36:48.747582 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|prepare_promotion|stop_replication |unknown         |         34|0/2D9169D0 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to demote_timeout and node 123 "ubuntu13" (ubuntu13:9300) to stop_replication after node 123 "ubuntu13" (ubuntu13:9300) converged to prepare_promotion.|
   1654|2026-04-13 16:36:48.747582 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|primary          |demote_timeout   |quorum          |         34|0/2D9169D0 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to demote_timeout and node 123 "ubuntu13" (ubuntu13:9300) to stop_replication after node 123 "ubuntu13" (ubuntu13:9300) converged to prepare_promotion.|
   1655|2026-04-13 16:36:49.968158 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|stop_replication |stop_replication |unknown         |         34|0/2D916B60 |               50|true             |New state is reported by node 123 "ubuntu13" (ubuntu13:9300): "stop_replication"                                                                                                                                 |					   		
   1656|2026-04-13 16:37:18.966193 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|stop_replication |wait_primary     |unknown         |         35|0/2D916B98 |               50|true             |Setting goal state of node 123 "ubuntu13" (ubuntu13:9300) to wait_primary and node 113 "ubuntu12" (ubuntu12:9300) to demoted after the demote timeout expired.                                                   |
   1657|2026-04-13 16:37:18.966193 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|primary          |demoted          |quorum          |         34|0/2D9169D0 |               50|true             |Setting goal state of node 123 "ubuntu13" (ubuntu13:9300) to wait_primary and node 113 "ubuntu12" (ubuntu12:9300) to demoted after the demote timeout expired.                                                   |				   		
   1658|2026-04-13 16:37:19.088174 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|wait_primary     |wait_primary     |unknown         |         35|0/2D916B98 |               50|true             |New state is reported by node 123 "ubuntu13" (ubuntu13:9300): "wait_primary"                                                                                                                                     |
   

 

2. Scenario 2: auto failover when the primary's OS goes down

This case behaves much like Scenario 1; this failover took 39 seconds in total.

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  35: 0/2D916C80 |    read-only |           secondary |           secondary
ubuntu13 |   123 | ubuntu13:9300 |  35: 0/2D916C80 |   read-write |             primary |             primary

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  35: 0/2D916C80 |    read-only |           secondary |           secondary
ubuntu13 |   123 | ubuntu13:9300 |  35: 0/2D916C80 | read-write ! |             primary |             primary

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  35: 0/2D916C80 |    read-only |           secondary |           secondary
ubuntu13 |   123 | ubuntu13:9300 |  35: 0/2D916C80 | read-write ! |             primary |             primary

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  36: 0/2D916E48 |    read-only |    stop_replication |    stop_replication
ubuntu13 |   123 | ubuntu13:9300 |  35: 0/2D916C80 | read-write ! |             primary |      demote_timeout

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  36: 0/2D916E80 |    read-only |    stop_replication |    stop_replication
ubuntu13 |   123 | ubuntu13:9300 |  35: 0/2D916C80 | read-write ! |             primary |      demote_timeout

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  36: 0/2D916E80 |   read-write |        wait_primary |        wait_primary
ubuntu13 |   123 | ubuntu13:9300 |  35: 0/2D916C80 | read-write ! |             primary |             demoted

root@ubuntu11:/usr/local/redis/sentinel18000/bin#

The earlier events were cleared first, so the following is purely the failover log.
select * from event;
eventid|eventtime                       |formationid|nodeid|groupid|nodename|nodehost|nodeport|reportedstate    |goalstate        |reportedrepstate|reportedtli|reportedlsn|candidatepriority|replicationquorum|description                                                                                                                                                                                                      |
-------+--------------------------------+-----------+------+-------+--------+--------+--------+-----------------+-----------------+----------------+-----------+-----------+-----------------+-----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   1667|2026-04-13 17:07:43.991924 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|primary          |primary          |quorum          |         35|0/2D916C80 |               50|true             |Node node 123 "ubuntu13" (ubuntu13:9300) is marked as unhealthy by the monitor                                                                                                                                   |
   1668|2026-04-13 17:07:51.683497 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|secondary        |prepare_promotion|unknown         |         35|0/2D916C80 |               50|true             |Setting goal state of node 123 "ubuntu13" (ubuntu13:9300) to draining and node 113 "ubuntu12" (ubuntu12:9300) to prepare_promotion after node 123 "ubuntu13" (ubuntu13:9300) became unhealthy.                   |
   1669|2026-04-13 17:07:51.683497 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|primary          |draining         |quorum          |         35|0/2D916C80 |               50|true             |Setting goal state of node 123 "ubuntu13" (ubuntu13:9300) to draining and node 113 "ubuntu12" (ubuntu12:9300) to prepare_promotion after node 123 "ubuntu13" (ubuntu13:9300) became unhealthy.                   |
   1670|2026-04-13 17:07:51.738728 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|prepare_promotion|prepare_promotion|unknown         |         35|0/2D916C80 |               50|true             |New state is reported by node 113 "ubuntu12" (ubuntu12:9300): "prepare_promotion"                                                                                                                                |
   1671|2026-04-13 17:07:51.738728 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|prepare_promotion|stop_replication |unknown         |         35|0/2D916C80 |               50|true             |Setting goal state of node 123 "ubuntu13" (ubuntu13:9300) to demote_timeout and node 113 "ubuntu12" (ubuntu12:9300) to stop_replication after node 113 "ubuntu12" (ubuntu12:9300) converged to prepare_promotion.|
   1672|2026-04-13 17:07:51.738728 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|primary          |demote_timeout   |quorum          |         35|0/2D916C80 |               50|true             |Setting goal state of node 123 "ubuntu13" (ubuntu13:9300) to demote_timeout and node 113 "ubuntu12" (ubuntu12:9300) to stop_replication after node 113 "ubuntu12" (ubuntu12:9300) converged to prepare_promotion.|
   1673|2026-04-13 17:07:52.898181 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|stop_replication |stop_replication |unknown         |         35|0/2D916E48 |               50|true             |New state is reported by node 113 "ubuntu12" (ubuntu12:9300): "stop_replication"                                                                                                                                 |
   1674|2026-04-13 17:08:21.923866 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|stop_replication |wait_primary     |unknown         |         36|0/2D916E80 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to wait_primary and node 123 "ubuntu13" (ubuntu13:9300) to demoted after the demote timeout expired.                                                   |
   1675|2026-04-13 17:08:21.923866 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|primary          |demoted          |quorum          |         35|0/2D916C80 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to wait_primary and node 123 "ubuntu13" (ubuntu13:9300) to demoted after the demote timeout expired.                                                   |
   1676|2026-04-13 17:08:22.037573 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|wait_primary     |wait_primary     |unknown         |         36|0/2D916E80 |               50|true             |New state is reported by node 113 "ubuntu12" (ubuntu12:9300): "wait_primary"                                                                                                                                     |
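The total can be cross-checked by differencing the event timestamps above. A small sketch, with the timestamps copied from event 1667 (primary marked unhealthy) and event 1676 (new primary reports wait_primary):

```python
from datetime import datetime

# Timestamps copied verbatim from events 1667 and 1676 in the log above.
fmt = "%Y-%m-%d %H:%M:%S.%f %z"
unhealthy    = datetime.strptime("2026-04-13 17:07:43.991924 +0800", fmt)
wait_primary = datetime.strptime("2026-04-13 17:08:22.037573 +0800", fmt)

elapsed = (wait_primary - unhealthy).total_seconds()
print(round(elapsed, 1))  # 38.0 -- close to the ~39 s measured above
```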

 

3. Scenario 3: auto failover when the pg_autoctl service on the primary is restarted

Restarting the pg_autoctl service on the primary triggers a failover; the failover plus rejoining the original primary to the cluster (running as a secondary) took only about 6 seconds.
This was beyond my expectations. Compared with the previous two failover scenarios, there are two differences:
1. If primary-failure detection really takes health_check_max_retries (2) × health_check_timeout (5 s) + health_check_retry_delay (2 s), a failover should not have been triggered this quickly.
2. The new primary did not wait for primary_demote_timeout.
I do not yet fully understand these two points. One plausible explanation, judging from events 1690, 1691, and 1695 in the log below, is that the restarted keeper on node 113 actively reports its own draining/demote_timeout/demoted transitions, so the monitor can confirm the demotion directly instead of waiting out the health-check retries and the full demote timeout.

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  36: 0/2D916F68 |   read-write |             primary |             primary
ubuntu13 |   123 | ubuntu13:9300 |  36: 0/2D916F68 |    read-only |           secondary |           secondary

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  36: 0/2D916FE0 |         none |             demoted |          catchingup
ubuntu13 |   123 | ubuntu13:9300 |  37: 0/2D917130 |   read-write |        wait_primary |        wait_primary


root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  36: 0/2D916FE0 |         none |             demoted |          catchingup
ubuntu13 |   123 | ubuntu13:9300 |  37: 0/2D917130 |   read-write |        wait_primary |        wait_primary


root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  37: 0/2D9171E0 |    read-only |           secondary |           secondary
ubuntu13 |   123 | ubuntu13:9300 |  37: 0/2D9171E0 |   read-write |             primary |             primary



Detailed failover event log
select * from event;

1685|2026-04-13 17:19:53.322751 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|secondary        |prepare_promotion|unknown         |         36|0/2D916F68 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to draining and node 123 "ubuntu13" (ubuntu13:9300) to prepare_promotion after node 113 "ubuntu12" (ubuntu12:9300) became unhealthy.                   |
1686|2026-04-13 17:19:53.322751 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|primary          |draining         |unknown         |         36|0/2D916F68 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to draining and node 123 "ubuntu13" (ubuntu13:9300) to prepare_promotion after node 113 "ubuntu12" (ubuntu12:9300) became unhealthy.                   |
1687|2026-04-13 17:19:53.390311 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|prepare_promotion|prepare_promotion|unknown         |         36|0/2D916F68 |               50|true             |New state is reported by node 123 "ubuntu13" (ubuntu13:9300): "prepare_promotion"                                                                                                                                |
1688|2026-04-13 17:19:53.390311 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|prepare_promotion|stop_replication |unknown         |         36|0/2D916F68 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to demote_timeout and node 123 "ubuntu13" (ubuntu13:9300) to stop_replication after node 123 "ubuntu13" (ubuntu13:9300) converged to prepare_promotion.|
1689|2026-04-13 17:19:53.390311 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|primary          |demote_timeout   |unknown         |         36|0/2D916FE0 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to demote_timeout and node 123 "ubuntu13" (ubuntu13:9300) to stop_replication after node 123 "ubuntu13" (ubuntu13:9300) converged to prepare_promotion.|
1690|2026-04-13 17:19:53.517438 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|draining         |demote_timeout   |unknown         |         36|0/0        |               50|true             |New state is reported by node 113 "ubuntu12" (ubuntu12:9300): "draining"                                                                                                                                         |
1691|2026-04-13 17:19:53.582463 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|demote_timeout   |demote_timeout   |unknown         |         36|0/0        |               50|true             |New state is reported by node 113 "ubuntu12" (ubuntu12:9300): "demote_timeout"                                                                                                                                   |
1692|2026-04-13 17:19:54.542813 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|stop_replication |stop_replication |unknown         |         36|0/2D917130 |               50|true             |New state is reported by node 123 "ubuntu13" (ubuntu13:9300): "stop_replication"                                                                                                                                 |
1693|2026-04-13 17:19:54.542813 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|stop_replication |wait_primary     |unknown         |         36|0/2D917130 |               50|true             |Setting goal state of node 123 "ubuntu13" (ubuntu13:9300) to wait_primary and node 113 "ubuntu12" (ubuntu12:9300) to demoted after the demote timeout expired.                                                   |
1694|2026-04-13 17:19:54.542813 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|demote_timeout   |demoted          |unknown         |         36|0/2D916FE0 |               50|true             |Setting goal state of node 123 "ubuntu13" (ubuntu13:9300) to wait_primary and node 113 "ubuntu12" (ubuntu12:9300) to demoted after the demote timeout expired.                                                   |
1695|2026-04-13 17:19:54.634092 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|demoted          |demoted          |unknown         |         36|0/0        |               50|true             |New state is reported by node 113 "ubuntu12" (ubuntu12:9300): "demoted"                                                                                                                                          |
1696|2026-04-13 17:19:54.654242 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|wait_primary     |wait_primary     |unknown         |         37|0/2D917130 |               50|true             |New state is reported by node 123 "ubuntu13" (ubuntu13:9300): "wait_primary"                                                                                                                                     |
1697|2026-04-13 17:19:54.699548 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|demoted          |catchingup       |unknown         |         36|0/2D916FE0 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to catchingup after it converged to demotion and node 123 "ubuntu13" (ubuntu13:9300) converged to wait_primary.                                        |
Author's note: at this point the failover is complete and the secondary has taken over as primary. The events below show the restarted original primary being brought back into the cluster as a secondary.
1698|2026-04-13 17:20:00.259754 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|catchingup       |catchingup       |unknown         |         36|0/2D000000 |               50|true             |New state is reported by node 113 "ubuntu12" (ubuntu12:9300): "catchingup"                                                                                                                                       |
1699|2026-04-13 17:20:00.375329 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|catchingup       |catchingup       |unknown         |         37|0/2D000000 |               50|true             |Node node 113 "ubuntu12" (ubuntu12:9300) is marked as unhealthy by the monitor                                                                                                                                   |
1700|2026-04-13 17:20:00.389549 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|catchingup       |catchingup       |unknown         |         37|0/2D000000 |               50|true             |Node node 113 "ubuntu12" (ubuntu12:9300) is marked as healthy by the monitor                                                                                                                                     |
1701|2026-04-13 17:20:00.412720 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|catchingup       |secondary        |unknown         |         37|0/2D000000 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to secondary after it caught up.                                                                                                                       |
1702|2026-04-13 17:20:00.518669 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|secondary        |secondary        |unknown         |         37|0/2D000000 |               50|true             |New state is reported by node 113 "ubuntu12" (ubuntu12:9300): "secondary"                                                                                                                                        |
1703|2026-04-13 17:20:00.552868 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|wait_primary     |primary          |async           |         37|0/2D917130 |               50|true             |Setting goal state of node 123 "ubuntu13" (ubuntu13:9300) to primary now that we have 1 healthy  secondary nodes in the quorum.                                                                                  |
1704|2026-04-13 17:20:00.657925 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|primary          |primary          |quorum          |         37|0/2D9171E0 |               50|true             |New state is reported by node 123 "ubuntu13" (ubuntu13:9300): "primary"                                                                                                                                          |
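The elapsed times for this scenario can be read off the event log above in the same way. A small sketch, with timestamps copied from event 1685 (first failover event), event 1696 (new primary reports wait_primary), and event 1704 (new primary reaches primary with the old primary back as a healthy secondary):

```python
from datetime import datetime

# Timestamps copied verbatim from events 1685, 1696, and 1704 above.
fmt = "%Y-%m-%d %H:%M:%S.%f %z"
first     = datetime.strptime("2026-04-13 17:19:53.322751 +0800", fmt)  # event 1685
wait_prim = datetime.strptime("2026-04-13 17:19:54.654242 +0800", fmt)  # event 1696
back_full = datetime.strptime("2026-04-13 17:20:00.657925 +0800", fmt)  # event 1704

print(round((wait_prim - first).total_seconds(), 1))  # 1.3: the promotion itself
print(round((back_full - first).total_seconds(), 1))  # 7.3: cluster fully healthy again
```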

 

4. Scenario 4: auto failover under a network partition, with the primary's OS and database both healthy

With replication healthy, disable the primary's network interface to simulate a network partition. After detection similar to Scenarios 1 and 2, the secondary is promoted to primary. The original primary inside the partition has its PostgreSQL service shut down, governed by the network_partition_timeout parameter:
pg_auto_failover stops the PostgreSQL service to prevent split-brain, but the pg_autoctl process itself keeps running, and once the network recovers the node rejoins the cluster as a secondary on its own.
From the monitor's log:
1. The monitor marks the primary unhealthy only after a period of retries.
2. From stop_replication to "demote timeout expired" is roughly 30 seconds, i.e. governed by primary_demote_timeout.
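The keeper-side split-brain guard described above can be sketched as follows. This is a minimal illustration of the idea, not pg_autoctl's actual code, and the 20 s value for network_partition_timeout is an assumption about this test setup:

```python
# Minimal sketch (NOT pg_autoctl's implementation) of the guard described
# above: a primary that stays cut off from the monitor longer than
# network_partition_timeout shuts down its own PostgreSQL to avoid
# split-brain, while the keeper itself keeps running.
NETWORK_PARTITION_TIMEOUT = 20  # assumed value for this test setup

def should_stop_postgres(last_monitor_contact, now, is_primary=True,
                         timeout=NETWORK_PARTITION_TIMEOUT):
    """True when a partitioned primary should stop its PostgreSQL service."""
    return is_primary and (now - last_monitor_contact) > timeout

# Monitor last seen 25 s ago: stop Postgres; 10 s ago: keep serving.
print(should_stop_postgres(last_monitor_contact=0, now=25))  # True
print(should_stop_postgres(last_monitor_contact=0, now=10))  # False
```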

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  40: 0/2D917A28 |   read-write |             primary |             primary
ubuntu13 |   123 | ubuntu13:9300 |  40: 0/2D917A28 |    read-only |           secondary |           secondary

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  40: 0/2D917A28 |   read-write |             primary |             primary
ubuntu13 |   123 | ubuntu13:9300 |  40: 0/2D917A28 |    read-only |           secondary |           secondary

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  40: 0/2D917A28 |   read-write |             primary |             primary
ubuntu13 |   123 | ubuntu13:9300 |  40: 0/2D917A28 |    read-only |           secondary |           secondary

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  40: 0/2D917A28 | read-write ! |             primary |      demote_timeout
ubuntu13 |   123 | ubuntu13:9300 |  41: 0/2D917BF0 |    read-only |    stop_replication |    stop_replication

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  40: 0/2D917A28 | read-write ! |             primary |      demote_timeout
ubuntu13 |   123 | ubuntu13:9300 |  41: 0/2D917BF0 |    read-only |    stop_replication |    stop_replication

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  40: 0/2D917A28 | read-write ! |             primary |      demote_timeout
ubuntu13 |   123 | ubuntu13:9300 |  41: 0/2D917BF0 |    read-only |    stop_replication |    stop_replication

root@ubuntu11:/usr/local/redis/sentinel18000/bin# pg_autoctl show state
    Name |  Node |     Host:Port |        TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+-----------------+--------------+---------------------+--------------------
ubuntu12 |   113 | ubuntu12:9300 |  40: 0/2D917A28 | read-write ! |             primary |             demoted
ubuntu13 |   123 | ubuntu13:9300 |  41: 0/2D917BF0 |   read-write |        wait_primary |        wait_primary



Detailed failover event log
select * from event;

eventid|eventtime                       |formationid|nodeid|groupid|nodename|nodehost|nodeport|reportedstate    |goalstate        |reportedrepstate|reportedtli|reportedlsn|candidatepriority|replicationquorum|description                                                                                                                                                                                                      |
-------+--------------------------------+-----------+------+-------+--------+--------+--------+-----------------+-----------------+----------------+-----------+-----------+-----------------+-----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   1764|2026-04-13 17:45:00.508678 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|primary          |primary          |quorum          |         40|0/2D9179F0 |               50|true             |New state is reported by node 113 "ubuntu12" (ubuntu12:9300): "primary"                                                                                                                                          |
   1765|2026-04-13 17:48:56.839615 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|primary          |primary          |quorum          |         40|0/2D917A28 |               50|true             |Node node 113 "ubuntu12" (ubuntu12:9300) is marked as unhealthy by the monitor                                                                                                                                   |
   1766|2026-04-13 17:48:56.877554 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|secondary        |prepare_promotion|unknown         |         40|0/2D917A28 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to draining and node 123 "ubuntu13" (ubuntu13:9300) to prepare_promotion after node 113 "ubuntu12" (ubuntu12:9300) became unhealthy.                   |
   1767|2026-04-13 17:48:56.877554 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|primary          |draining         |quorum          |         40|0/2D917A28 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to draining and node 123 "ubuntu13" (ubuntu13:9300) to prepare_promotion after node 113 "ubuntu12" (ubuntu12:9300) became unhealthy.                   |
   1768|2026-04-13 17:48:56.928691 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|prepare_promotion|prepare_promotion|unknown         |         40|0/2D917A28 |               50|true             |New state is reported by node 123 "ubuntu13" (ubuntu13:9300): "prepare_promotion"                                                                                                                                |
   1769|2026-04-13 17:48:56.928691 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|prepare_promotion|stop_replication |unknown         |         40|0/2D917A28 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to demote_timeout and node 123 "ubuntu13" (ubuntu13:9300) to stop_replication after node 123 "ubuntu13" (ubuntu13:9300) converged to prepare_promotion.|
   1770|2026-04-13 17:48:56.928691 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|primary          |demote_timeout   |quorum          |         40|0/2D917A28 |               50|true             |Setting goal state of node 113 "ubuntu12" (ubuntu12:9300) to demote_timeout and node 123 "ubuntu13" (ubuntu13:9300) to stop_replication after node 123 "ubuntu13" (ubuntu13:9300) converged to prepare_promotion.|
   1771|2026-04-13 17:48:58.085859 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|stop_replication |stop_replication |unknown         |         40|0/2D917BF0 |               50|true             |New state is reported by node 123 "ubuntu13" (ubuntu13:9300): "stop_replication"                                                                                                                                 |
   1772|2026-04-13 17:49:27.240530 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|stop_replication |wait_primary     |unknown         |         41|0/2D917BF0 |               50|true             |Setting goal state of node 123 "ubuntu13" (ubuntu13:9300) to wait_primary and node 113 "ubuntu12" (ubuntu12:9300) to demoted after the demote timeout expired.                                                   |
   1773|2026-04-13 17:49:27.240530 +0800|default    |   113|      0|ubuntu12|ubuntu12|    9300|primary          |demoted          |quorum          |         40|0/2D917A28 |               50|true             |Setting goal state of node 123 "ubuntu13" (ubuntu13:9300) to wait_primary and node 113 "ubuntu12" (ubuntu12:9300) to demoted after the demote timeout expired.                                                   |
   1774|2026-04-13 17:49:27.310817 +0800|default    |   123|      0|ubuntu13|ubuntu13|    9300|wait_primary     |wait_primary     |unknown         |         41|0/2D917BF0 |               50|true             |New state is reported by node 123 "ubuntu13" (ubuntu13:9300): "wait_primary"                                                                                                                                     |
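The event timestamps above can be used to verify the phase durations against the configured timeouts. A minimal sketch (the timestamps are copied from the event log above; the roughly 30-second gap before wait_primary matches primary_demote_timeout):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
# Timestamps copied from the monitor's event log above
unhealthy    = datetime.strptime("2026-04-13 17:48:56.839615", FMT)  # primary marked unhealthy
stop_repl    = datetime.strptime("2026-04-13 17:48:58.085859", FMT)  # secondary reports stop_replication
wait_primary = datetime.strptime("2026-04-13 17:49:27.310817", FMT)  # secondary reports wait_primary

print("detect -> stop_replication:", (stop_repl - unhealthy).total_seconds(), "s")   # ~1.2 s
print("stop_replication -> wait_primary:", (wait_primary - stop_repl).total_seconds(), "s")  # ~29.2 s
print("total:", (wait_primary - unhealthy).total_seconds(), "s")                     # ~30.5 s
```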

2, Scenario 2: auto failover when the original primary node is network-partitioned

1. Run ip link set ens33 down to take the NIC down, simulating a network partition of the cluster's original primary node. pg_auto_failover then stops the local PostgreSQL process to prevent a multi-primary split-brain; after the partition, the pg_autoctl service itself keeps running rather than shutting down entirely.
2. Run ip link set ens33 up to bring the NIC back, simulating recovery from the partition. As shown below, once the network recovers, pg_auto_failover starts the PostgreSQL process and then rejoins it to the cluster as a secondary.
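The promote-then-rejoin cycle can be spotted mechanically by scanning the PostgreSQL log for its marker lines. A minimal sketch (the sample lines are copied from the logs below; "selected new timeline ID" marks each promotion, "entering standby mode" marks a rejoin as secondary):

```python
import re

# Sample lines copied from the PostgreSQL logs shown below
log = """\
2026-04-13 10:06:14.877 UTC [3008552] LOG:  received promote request
2026-04-13 10:06:14.885 UTC [3008552] LOG:  selected new timeline ID: 44
2026-04-13 10:06:20.392 UTC [9609] LOG:  entering standby mode
2026-04-13 10:07:09.158 UTC [9609] LOG:  selected new timeline ID: 45
"""

# Every promotion bumps the timeline; rejoining the cluster re-enters standby mode
timelines = [int(m) for m in re.findall(r"selected new timeline ID: (\d+)", log)]
rejoins = log.count("entering standby mode")
print(timelines, rejoins)  # [44, 45] 1
```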


Logs on the original primary node after the network partition occurred, and after the network recovered:

2026-04-13 10:06:13.549 UTC [3008553] FATAL:  could not receive data from WAL stream: FATAL:  terminating connection due to administrator command
2026-04-13 10:06:13.549 UTC [3008552] LOG:  invalid record length at 0/2D918288: expected at least 24, got 0
2026-04-13 10:06:13.561 UTC [3008920] FATAL:  could not connect to the primary server: connection to server at "ubuntu13" (192.168.152.123), port 9300 failed: server closed the connection unexpectedly
		This probably means the server terminated abnormally
		before or while processing the request.
2026-04-13 10:06:13.561 UTC [3008552] LOG:  waiting for WAL to become available at 0/2D9182A0
2026-04-13 10:06:14.859 UTC [3008548] LOG:  received SIGHUP, reloading configuration files
2026-04-13 10:06:14.861 UTC [3008548] LOG:  parameter "default_transaction_read_only" changed to "on"
2026-04-13 10:06:14.877 UTC [3008552] LOG:  received promote request
2026-04-13 10:06:14.878 UTC [3008928] FATAL:  terminating walreceiver process due to administrator command
2026-04-13 10:06:14.878 UTC [3008552] LOG:  waiting for WAL to become available at 0/2D9182A0
2026-04-13 10:06:14.878 UTC [3008552] LOG:  redo done at 0/2D918250 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 98.28 s
2026-04-13 10:06:14.885 UTC [3008552] LOG:  selected new timeline ID: 44
2026-04-13 10:06:14.940 UTC [3008552] LOG:  archive recovery complete
2026-04-13 10:06:14.946 UTC [3008550] LOG:  checkpoint starting: force
2026-04-13 10:06:14.949 UTC [3008548] LOG:  database system is ready to accept connections
2026-04-13 10:06:14.957 UTC [3008550] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.004 s, sync=0.001 s, total=0.012 s; sync files=2, longest=0.001 s, average=0.001 s; distance=1 kB, estimate=1 kB; lsn=0/2D9182F0, redo lsn=0/2D9182B8
2026-04-13 10:06:15.889 UTC [3008550] LOG:  checkpoint starting: immediate force wait
2026-04-13 10:06:15.893 UTC [3008550] LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.004 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB; lsn=0/2D9183D8, redo lsn=0/2D9183A0
2026-04-13 10:06:15.917 UTC [3008548] LOG:  received SIGHUP, reloading configuration files
2026-04-13 10:06:15.917 UTC [3008548] LOG:  parameter "recovery_target_timeline" cannot be changed without restarting the server
2026-04-13 10:06:15.917 UTC [3008548] LOG:  parameter "primary_conninfo" removed from configuration file, reset to default
2026-04-13 10:06:15.917 UTC [3008548] LOG:  parameter "primary_slot_name" removed from configuration file, reset to default
2026-04-13 10:06:15.918 UTC [3008548] LOG:  configuration file "/usr/local/pgsql16/pg9300/data/postgresql.conf" contains errors; unaffected changes were applied
2026-04-13 10:06:16.010 UTC [3008548] LOG:  received SIGHUP, reloading configuration files
2026-04-13 10:06:16.011 UTC [3008548] LOG:  parameter "recovery_target_timeline" cannot be changed without restarting the server
2026-04-13 10:06:16.011 UTC [3008548] LOG:  parameter "default_transaction_read_only" changed to "off"
2026-04-13 10:06:16.011 UTC [3008548] LOG:  configuration file "/usr/local/pgsql16/pg9300/data/postgresql.conf" contains errors; unaffected changes were applied
2026-04-13 10:06:16.030 UTC [3008548] LOG:  received SIGHUP, reloading configuration files
2026-04-13 10:06:16.030 UTC [3008548] LOG:  parameter "recovery_target_timeline" cannot be changed without restarting the server
2026-04-13 10:06:16.030 UTC [3008548] LOG:  configuration file "/usr/local/pgsql16/pg9300/data/postgresql.conf" contains errors; unaffected changes were applied


-- pg_auto_failover rotates the PostgreSQL log automatically; the next log file follows
2026-04-13 10:06:20.355 UTC [9605] LOG:  starting PostgreSQL 16.4 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, 64-bit
2026-04-13 10:06:20.364 UTC [9605] LOG:  listening on IPv4 address "0.0.0.0", port 9300
2026-04-13 10:06:20.364 UTC [9605] LOG:  listening on IPv6 address "::", port 9300
2026-04-13 10:06:20.371 UTC [9605] LOG:  listening on Unix socket "/tmp/.s.PGSQL.9300"
2026-04-13 10:06:20.376 UTC [9609] LOG:  database system was interrupted while in recovery at log time 2026-04-13 10:06:15 UTC
2026-04-13 10:06:20.376 UTC [9609] HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2026-04-13 10:06:20.392 UTC [9609] LOG:  entering standby mode
2026-04-13 10:06:20.392 UTC [9609] LOG:  starting backup recovery with redo LSN 0/2D9181A0, checkpoint LSN 0/2D9181D8, on timeline ID 43
2026-04-13 10:06:20.394 UTC [9609] LOG:  redo starts at 0/2D9181A0
2026-04-13 10:06:20.398 UTC [9609] LOG:  completed backup recovery with redo LSN 0/2D9181A0 and end LSN 0/2D918450
2026-04-13 10:06:20.398 UTC [9609] LOG:  consistent recovery state reached at 0/2D918450
2026-04-13 10:06:20.398 UTC [9609] LOG:  invalid record length at 0/2D918450: expected at least 24, got 0
2026-04-13 10:06:20.398 UTC [9605] LOG:  database system is ready to accept read-only connections
2026-04-13 10:06:20.419 UTC [9610] LOG:  started streaming WAL from primary at 0/2D000000 on timeline 44
2026-04-13 10:07:09.147 UTC [9605] LOG:  received SIGHUP, reloading configuration files
2026-04-13 10:07:09.147 UTC [9605] LOG:  parameter "default_transaction_read_only" changed to "on"
2026-04-13 10:07:09.155 UTC [9609] LOG:  received promote request
2026-04-13 10:07:09.156 UTC [9610] FATAL:  terminating walreceiver process due to administrator command
2026-04-13 10:07:09.156 UTC [9609] LOG:  invalid record length at 0/2D918538: expected at least 24, got 0
2026-04-13 10:07:09.156 UTC [9609] LOG:  redo done at 0/2D918500 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 48.76 s
2026-04-13 10:07:09.158 UTC [9609] LOG:  selected new timeline ID: 45
2026-04-13 10:07:09.355 UTC [9609] LOG:  archive recovery complete
2026-04-13 10:07:09.359 UTC [9607] LOG:  checkpoint starting: force
2026-04-13 10:07:09.360 UTC [9605] LOG:  database system is ready to accept connections
2026-04-13 10:07:09.364 UTC [9607] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.005 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB; lsn=0/2D9185A0, redo lsn=0/2D918568
2026-04-13 10:07:10.200 UTC [9607] LOG:  checkpoint starting: immediate force wait
2026-04-13 10:07:10.205 UTC [9607] LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.005 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB; lsn=0/2D918688, redo lsn=0/2D918650
2026-04-13 10:07:10.216 UTC [9605] LOG:  received SIGHUP, reloading configuration files
2026-04-13 10:07:10.217 UTC [9605] LOG:  parameter "recovery_target_timeline" cannot be changed without restarting the server
2026-04-13 10:07:10.217 UTC [9605] LOG:  parameter "primary_conninfo" removed from configuration file, reset to default
2026-04-13 10:07:10.217 UTC [9605] LOG:  parameter "primary_slot_name" removed from configuration file, reset to default
2026-04-13 10:07:10.217 UTC [9605] LOG:  configuration file "/usr/local/pgsql16/pg9300/data/postgresql.conf" contains errors; unaffected changes were applied
2026-04-13 10:07:39.274 UTC [9605] LOG:  received SIGHUP, reloading configuration files
2026-04-13 10:07:39.275 UTC [9605] LOG:  parameter "recovery_target_timeline" cannot be changed without restarting the server
2026-04-13 10:07:39.275 UTC [9605] LOG:  parameter "default_transaction_read_only" changed to "off"
2026-04-13 10:07:39.275 UTC [9605] LOG:  configuration file "/usr/local/pgsql16/pg9300/data/postgresql.conf" contains errors; unaffected changes were applied
2026-04-13 10:07:39.300 UTC [9605] LOG:  received SIGHUP, reloading configuration files
2026-04-13 10:07:39.300 UTC [9605] LOG:  parameter "recovery_target_timeline" cannot be changed without restarting the server
2026-04-13 10:07:39.301 UTC [9605] LOG:  configuration file "/usr/local/pgsql16/pg9300/data/postgresql.conf" contains errors; unaffected changes were applied


-- pg_auto_failover rotates the PostgreSQL log automatically; the next log file follows
2026-04-13 10:08:07.482 UTC [3009134] LOG:  starting PostgreSQL 16.4 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, 64-bit
2026-04-13 10:08:07.482 UTC [3009134] LOG:  listening on IPv4 address "0.0.0.0", port 9300
2026-04-13 10:08:07.482 UTC [3009134] LOG:  listening on IPv6 address "::", port 9300
2026-04-13 10:08:07.486 UTC [3009134] LOG:  listening on Unix socket "/tmp/.s.PGSQL.9300"
2026-04-13 10:08:07.494 UTC [3009140] FATAL:  the database system is starting up
2026-04-13 10:08:07.494 UTC [3009139] LOG:  database system was interrupted while in recovery at log time 2026-04-13 10:07:10 UTC
2026-04-13 10:08:07.494 UTC [3009139] HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2026-04-13 10:08:07.510 UTC [3009139] LOG:  entering standby mode
2026-04-13 10:08:07.510 UTC [3009139] LOG:  starting backup recovery with redo LSN 0/2D918450, checkpoint LSN 0/2D918488, on timeline ID 44
2026-04-13 10:08:07.522 UTC [3009139] LOG:  redo starts at 0/2D918450
2026-04-13 10:08:07.526 UTC [3009139] LOG:  completed backup recovery with redo LSN 0/2D918450 and end LSN 0/2D918738
2026-04-13 10:08:07.526 UTC [3009139] LOG:  consistent recovery state reached at 0/2D918738
2026-04-13 10:08:07.526 UTC [3009139] LOG:  invalid record length at 0/2D918738: expected at least 24, got 0
2026-04-13 10:08:07.526 UTC [3009134] LOG:  database system is ready to accept read-only connections
2026-04-13 10:08:07.540 UTC [3009141] LOG:  started streaming WAL from primary at 0/2D000000 on timeline 45

 

Summary:
1. pg_auto_failover performs automatic failover in all of the scenarios tested above.
2. After a failover, the original primary automatically rejoins the cluster as a secondary, which is very convenient.
3. pg_auto_failover fully handles network partitions and prevents split-brain: after a partition it stops the PostgreSQL service while keeping the pg_autoctl service itself running (this may explain why the pg_auto_failover extension wraps PostgreSQL in its own service instead of relying on PostgreSQL's own service).
4. The failover sensitivity can be tuned to your needs via the relevant parameters.
5. pg_auto_failover is simple, clear, complete, reliable, trustworthy, and practical; I like it.
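For point 4, the worst-case failure-detection delay follows from the health-check parameters described in scenario 1. A rough back-of-the-envelope sketch using the values from this test (detection ≈ health_check_max_retries × health_check_timeout + health_check_retry_delay; the actual timing also depends on health_check_period):

```python
# Values from scenario 1 above, expressed in seconds for readability
health_check_timeout = 5      # per-probe timeout
health_check_retry_delay = 2  # delay between retries
health_check_max_retries = 2
primary_demote_timeout = 30   # wait before declaring the old primary demoted

detection = health_check_max_retries * health_check_timeout + health_check_retry_delay
print("approx detection delay:", detection, "s")                          # 12 s
print("approx total failover:", detection + primary_demote_timeout, "s")  # 42 s
```

Lowering these values makes failover faster but more sensitive to transient network hiccups; raising them does the opposite.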

posted on 2026-04-13 18:22 by MSSQL123