PostgreSQL High Availability: Patroni REST API [Translation]

Patroni has a rich REST API. Patroni itself uses it during the leader race, the patronictl tool uses it to perform failovers, switchovers, reinitializations, restarts, and reloads, HAProxy or any other kind of load balancer uses it for HTTP health checks, and it can also be used for monitoring. The Patroni REST API endpoints are listed below.

Health check endpoints

For all health check GET requests, Patroni returns a JSON document with the status of the node, along with the HTTP status code. If you do not want or need the JSON document, consider using the OPTIONS method instead of GET.
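
As a minimal sketch of the OPTIONS variant, the Python snippet below builds such a request with the standard library and reduces the answer to its status code. The helper names `build_health_request` and `probe`, the base URL, and the convention of returning 0 for an unreachable node are assumptions made for illustration; they are not part of Patroni.

```python
import urllib.error
import urllib.request


def build_health_request(base_url: str, endpoint: str) -> urllib.request.Request:
    """Build an OPTIONS request, so that only the status code matters."""
    url = base_url.rstrip("/") + "/" + endpoint.lstrip("/")
    return urllib.request.Request(url, method="OPTIONS")


def probe(base_url: str, endpoint: str, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code (0 if unreachable)."""
    request = build_health_request(base_url, endpoint)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # Non-2xx answers are raised as exceptions; the code is still the result.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Hypothetical convention for this sketch: 0 means "no answer at all".
        return 0
```

Against a running node, `probe("http://localhost:8008", "/readiness")` would return the status code without the caller having to parse a JSON body.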

GET /standby-leader?tag_key1=value1&tag_key2=value2

Both the readiness and liveness endpoints are very lightweight and do not execute any SQL. Probes should be configured so that they start failing at about the time the leader key expires. With the default ttl of 30s, example probes would look like this:

readinessProbe:
  httpGet:
    scheme: HTTP
    path: /readiness
    port: 8008
  initialDelaySeconds: 3
  periodSeconds: 10
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 3
livenessProbe:
  httpGet:
    scheme: HTTP
    path: /liveness
    port: 8008
  initialDelaySeconds: 3
  periodSeconds: 10
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 3
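
One rough way to sanity-check these probe settings against ttl is to estimate how long a node can keep failing before the probe is considered failed. The sketch below uses the approximation periodSeconds × failureThreshold, which is a simplification assumed for this example rather than an exact Kubernetes guarantee; the function name is hypothetical.

```python
def failure_window_seconds(period_seconds: int, failure_threshold: int) -> int:
    """Approximate duration of consecutive failures before the probe fails."""
    return period_seconds * failure_threshold


TTL = 30  # default Patroni ttl in seconds, as stated in the text

# Values taken from the example probes: periodSeconds=10, failureThreshold=3.
window = failure_window_seconds(period_seconds=10, failure_threshold=3)
print(window, window <= TTL)
```

With the values from the example probes, the estimated window matches the default ttl, which is the alignment the text asks for.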

Monitoring endpoint

Patroni uses GET /patroni during the leader race. Your monitoring system can use it as well. The JSON document produced by this endpoint has the same structure as the JSON produced by the health check endpoints.

$ curl -s http://localhost:8008/patroni | jq .
{
  "state": "running",
  "postmaster_start_time": "2019-09-24 09:22:32.555 CEST",
  "role": "master",
  "server_version": 110005,
  "cluster_unlocked": false,
  "xlog": {
    "location": 25624640
  },
  "timeline": 3,
  "database_system_identifier": "6739877027151648096",
  "patroni": {
    "version": "1.6.0",
    "scope": "batman"
  }
}
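
For monitoring, it is often enough to pull a handful of fields out of this document. The following sketch shows one way to do that; the function name `summarize_node` and the selection of fields are assumptions of this example, and the sample is a trimmed copy of the output shown above.

```python
import json


def summarize_node(document: dict) -> dict:
    """Pick the fields a monitoring check is likely to care about."""
    return {
        "scope": document.get("patroni", {}).get("scope"),
        "role": document.get("role"),
        "state": document.get("state"),
        "timeline": document.get("timeline"),
        "wal_location": document.get("xlog", {}).get("location"),
        # The flag is absent from the sample below, so default to False.
        "pending_restart": document.get("pending_restart", False),
    }


sample = json.loads("""
{
  "state": "running",
  "role": "master",
  "server_version": 110005,
  "xlog": {"location": 25624640},
  "timeline": 3,
  "patroni": {"version": "1.6.0", "scope": "batman"}
}
""")

print(summarize_node(sample))
```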

Cluster status endpoints

The GET /cluster endpoint generates a JSON document describing the current cluster topology and state:

$ curl -s http://localhost:8008/cluster | jq .
{
  "members": [
    {
      "name": "postgresql0",
      "host": "127.0.0.1",
      "port": 5432,
      "role": "leader",
      "state": "running",
      "api_url": "http://127.0.0.1:8008/patroni",
      "timeline": 5,
      "tags": {
        "clonefrom": true
      }
    },
    {
      "name": "postgresql1",
      "host": "127.0.0.1",
      "port": 5433,
      "role": "replica",
      "state": "running",
      "api_url": "http://127.0.0.1:8009/patroni",
      "timeline": 5,
      "tags": {
        "clonefrom": true
      },
      "lag": 0
    }
  ],
  "scheduled_switchover": {
    "at": "2019-09-24T10:36:00+02:00",
    "from": "postgresql0"
  }
}
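
A consumer of this document typically wants to know who the leader is and whether any replica is falling behind. The sketch below illustrates both lookups on a trimmed copy of the sample; the function names and the threshold are hypothetical, and treating "lag" as a number of bytes is an assumption of this example.

```python
def find_leader(cluster: dict):
    """Return the name of the member whose role is "leader", or None."""
    for member in cluster.get("members", []):
        if member.get("role") == "leader":
            return member.get("name")
    return None


def lagging_replicas(cluster: dict, max_lag: int) -> list:
    """Return names of non-leader members whose numeric lag exceeds max_lag."""
    names = []
    for member in cluster.get("members", []):
        lag = member.get("lag")
        # Skip members without a numeric lag value instead of failing on them.
        if member.get("role") != "leader" and isinstance(lag, int) and lag > max_lag:
            names.append(member["name"])
    return names


cluster = {
    "members": [
        {"name": "postgresql0", "role": "leader", "state": "running", "timeline": 5},
        {"name": "postgresql1", "role": "replica", "state": "running", "timeline": 5, "lag": 0},
    ]
}

print(find_leader(cluster))
print(lagging_replicas(cluster, max_lag=1048576))
```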

The GET /history endpoint provides a view of the history of cluster switchovers and failovers. The format is very similar to the content of the history files in the pg_wal directory. The only difference is the timestamp field, which shows when the new timeline was created.

$ curl -s http://localhost:8008/history | jq .
[
  [
    1,
    25623960,
    "no recovery target specified",
    "2019-09-23T16:57:57+02:00"
  ],
  [
    2,
    25624344,
    "no recovery target specified",
    "2019-09-24T09:22:33+02:00"
  ],
  [
    3,
    25624752,
    "no recovery target specified",
    "2019-09-24T09:26:15+02:00"
  ],
  [
    4,
    50331856,
    "no recovery target specified",
    "2019-09-24T09:35:52+02:00"
  ]
]
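
The rows are positional, so a small parser makes them easier to work with. The sketch below assumes each row starts with the four values shown in the sample and labels them `timeline`, `lsn`, `reason`, and `created_at`; those labels and the function names are choices made for this example, not names defined by Patroni. The integer WAL position is rendered in PostgreSQL's usual hexadecimal high/low notation.

```python
from datetime import datetime


def format_lsn(lsn: int) -> str:
    """Render an integer WAL position as high/low 32-bit halves in hex."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


def parse_history(rows: list) -> list:
    """Turn [timeline, lsn, reason, timestamp] rows into dictionaries."""
    parsed = []
    for row in rows:
        # Assumption: the first four positions match the sample layout.
        timeline, lsn, reason, timestamp = row[:4]
        parsed.append({
            "timeline": timeline,
            "lsn": format_lsn(lsn),
            "reason": reason,
            "created_at": datetime.fromisoformat(timestamp),
        })
    return parsed


rows = [
    [1, 25623960, "no recovery target specified", "2019-09-23T16:57:57+02:00"],
    [2, 25624344, "no recovery target specified", "2019-09-24T09:22:33+02:00"],
]

for entry in parse_history(rows):
    print(entry["timeline"], entry["lsn"], entry["created_at"].isoformat())
```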

Config endpoint

GET /config: get the current version of the dynamic configuration:

$ curl -s localhost:8008/config | jq .
{
  "ttl": 30,
  "loop_wait": 10,
  "retry_timeout": 10,
  "maximum_lag_on_failover": 1048576,
  "postgresql": {
    "use_slots": true,
    "use_pg_rewind": true,
    "parameters": {
      "hot_standby": "on",
      "wal_log_hints": "on",
      "wal_level": "hot_standby",
      "max_wal_senders": 5,
      "max_replication_slots": 5,
      "max_connections": "100"
    }
  }
}

PATCH /config: change the existing configuration:

$ curl -s -XPATCH -d \
        '{"loop_wait":5,"ttl":20,"postgresql":{"parameters":{"max_connections":"101"}}}' \
        http://localhost:8008/config | jq .
{
  "ttl": 20,
  "loop_wait": 5,
  "maximum_lag_on_failover": 1048576,
  "retry_timeout": 10,
  "postgresql": {
    "use_slots": true,
    "use_pg_rewind": true,
    "parameters": {
      "hot_standby": "on",
      "wal_log_hints": "on",
      "wal_level": "hot_standby",
      "max_wal_senders": 5,
      "max_replication_slots": 5,
      "max_connections": "101"
    }
  }
}

The above REST API call patches the existing configuration and returns the new configuration.

Let's check that the node processed this configuration. First of all, it should start printing log lines every 5 seconds (loop_wait=5). The change of "max_connections" requires a restart, so the "pending_restart" flag should be exposed:

$ curl -s http://localhost:8008/patroni | jq .
{
  "pending_restart": true,
  "database_system_identifier": "6287881213849985952",
  "postmaster_start_time": "2016-06-13 13:13:05.211 CEST",
  "xlog": {
    "location": 2197818976
  },
  "patroni": {
    "scope": "batman",
    "version": "1.0"
  },
  "state": "running",
  "role": "master",
  "server_version": 90503
}

Removing parameters:
If you want to remove (reset) some setting, just patch it with null:

$ curl -s -XPATCH -d \
        '{"postgresql":{"parameters":{"max_connections":null}}}' \
        http://localhost:8008/config | jq .
{
  "ttl": 20,
  "loop_wait": 5,
  "retry_timeout": 10,
  "maximum_lag_on_failover": 1048576,
  "postgresql": {
    "use_slots": true,
    "use_pg_rewind": true,
    "parameters": {
      "hot_standby": "on",
      "unix_socket_directories": ".",
      "wal_level": "hot_standby",
      "wal_log_hints": "on",
      "max_wal_senders": 5,
      "max_replication_slots": 5
    }
  }
}

The above call removes postgresql.parameters.max_connections from the dynamic configuration.
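
The two PATCH examples suggest a recursive merge in which a null value deletes a key. The sketch below mimics that observed behaviour locally, which can be handy for previewing a patch before sending it. It is an illustration under that assumption, not Patroni's own implementation, and `apply_patch` is a hypothetical name.

```python
import copy


def apply_patch(config: dict, patch: dict) -> dict:
    """Return a new config with patch merged in; None values delete keys."""
    result = copy.deepcopy(config)
    for key, value in patch.items():
        if value is None:
            # null in the JSON patch removes (resets) the setting.
            result.pop(key, None)
        elif isinstance(value, dict):
            base = result.get(key)
            result[key] = apply_patch(base if isinstance(base, dict) else {}, value)
        else:
            result[key] = copy.deepcopy(value)
    return result


config = {
    "ttl": 30,
    "loop_wait": 10,
    "postgresql": {"parameters": {"max_connections": "100", "max_wal_senders": 5}},
}

patched = apply_patch(
    config,
    {"loop_wait": 5, "ttl": 20, "postgresql": {"parameters": {"max_connections": "101"}}},
)
reset = apply_patch(patched, {"postgresql": {"parameters": {"max_connections": None}}})

print(patched["postgresql"]["parameters"])
print(reset["postgresql"]["parameters"])
```

Because the function copies its input, the original `config` dictionary is left untouched and each step can be compared with the previous one.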

PUT /config: it is also possible to perform a full, unconditional rewrite of the existing dynamic configuration:

$ curl -s -XPUT -d \
        '{"maximum_lag_on_failover":1048576,"retry_timeout":10,"postgresql":{"use_slots":true,"use_pg_rewind":true,"parameters":{"hot_standby":"on","wal_log_hints":"on","wal_level":"hot_standby","unix_socket_directories":".","max_wal_senders":5}},"loop_wait":3,"ttl":20}' \
        http://localhost:8008/config | jq .
{
  "ttl": 20,
  "maximum_lag_on_failover": 1048576,
  "retry_timeout": 10,
  "postgresql": {
    "use_slots": true,
    "parameters": {
      "hot_standby": "on",
      "unix_socket_directories": ".",
      "wal_level": "hot_standby",
      "wal_log_hints": "on",
      "max_wal_senders": 5
    },
    "use_pg_rewind": true
  },
  "loop_wait": 3
}

Switchover and failover endpoints

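POST /switchover or POST /failover. These endpoints are very similar to each other, but there are a couple of minor differences. The failover endpoint allows a manual failover when there are no healthy nodes, but it does not allow you to schedule a switchover. The switchover endpoint is the opposite: it works only when the cluster is healthy (there is a leader) and allows a switchover to be scheduled at a given time.

Example: perform a failover to a specific node: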

$ curl -s http://localhost:8009/failover -XPOST -d '{"candidate":"postgresql1"}'
Successfully failed over to "postgresql1"

Example: schedule a switchover from the leader to any other healthy replica in the cluster at a specific time:

$ curl -s http://localhost:8008/switchover -XPOST -d \
        '{"leader":"postgresql0","scheduled_at":"2019-09-24T12:00+00"}'
Switchover scheduled

Depending on the situation, the request may finish with a different HTTP status code and body. Status code 200 is returned when the switchover or failover completes successfully. If the switchover was successfully scheduled, Patroni returns HTTP status code 202. If something went wrong, an error status code (one of 400, 412, or 503) is returned with some details in the response body. For more information, see the source code of the patroni/api.py:do_POST_failover() method.
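
A client script can translate these status codes into outcomes before deciding what to do next. The sketch below covers only the codes named in the text; the function name, the outcome strings, and the sample body are hypothetical.

```python
ERROR_CODES = {400, 412, 503}


def describe_result(status_code: int, body: str = "") -> str:
    """Map the documented status codes to a short outcome string."""
    if status_code == 200:
        return "completed"
    if status_code == 202:
        return "scheduled"
    if status_code in ERROR_CODES:
        # The response body carries the details of what went wrong.
        detail = f": {body}" if body else ""
        return f"failed ({status_code}){detail}"
    return f"unexpected status {status_code}"


print(describe_result(200))
print(describe_result(202))
print(describe_result(412, "example details"))
```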

The POST /switchover and POST /failover endpoints are used by patronictl switchover and patronictl failover, respectively. DELETE /switchover is used by patronictl flush switchover.

Restart endpoint

POST /restart: you can restart Postgres on a specific node by performing the POST /restart call. The JSON body of the POST request can optionally specify some restart conditions.

DELETE /restart: removes the scheduled restart.

The POST /restart and DELETE /restart endpoints are used by patronictl restart and patronictl flush restart, respectively.

Reload endpoint

The POST /reload call orders Patroni to re-read and apply the configuration file. This is equivalent to sending the SIGHUP signal to the Patroni process. If you changed Postgres parameters that require a restart (such as shared_buffers), you still have to restart Postgres explicitly, either by calling the POST /restart endpoint or with patronictl restart.

The reload endpoint is used by patronictl reload.

Reinitialize endpoint

POST /reinitialize: reinitialize the PostgreSQL data directory on the specified node. It can be executed only on replicas. Once called, it removes the data directory and starts pg_basebackup or an alternative replica creation method.

The call might fail if Patroni is in a loop trying to recover (restart) a failed Postgres. To overcome this problem, specify {"force":true} in the request body.

The reinitialize endpoint is used by patronictl reinit.
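
For scripted use, the request with the optional force flag can be prepared as in the sketch below. It only builds the request object and deliberately does not send it, because a real call removes the data directory on the target replica. The function name, the base URL, and the Content-Type header are assumptions of this example.

```python
import json
import urllib.request


def build_reinitialize_request(base_url: str, force: bool = False) -> urllib.request.Request:
    """Prepare (but do not send) a POST /reinitialize request."""
    body = {"force": True} if force else {}
    return urllib.request.Request(
        base_url.rstrip("/") + "/reinitialize",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed, not required by the text
        method="POST",
    )


request = build_reinitialize_request("http://localhost:8009", force=True)
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Passing the prepared object to `urllib.request.urlopen` would perform the call, so that step is best kept behind an explicit confirmation.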

Source: https://mp.weixin.qq.com/s/TaKXTeNSuyIw1RTS7mONJw