Notes on Text Processing with sed and awk

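I. Background

We are migrating our monitoring system from a Thanos architecture to VictoriaMetrics (VM below), so the existing alerting rules have to be applied to the new VM cluster. The difference is that all 559 alerting rules must first be extracted from the original configuration files, split into individual rules, and stored in etcd as key-value pairs. confd then generates the YAML files automatically, and these are applied to the VM cluster. The reason is that the company has an in-house monitoring front end that talks to etcd to create, delete, update, and query alerting rules. The two setups compare as follows: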

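Under the Thanos architecture, alerting rules are added, removed, or changed by editing the YAML files directly and then applying them.

Under the VM architecture, each rule in those YAML files is split out and converted to JSON, its labels field is rewritten, and the result is put into etcd.

Finally, confd generates the YAML files, which are applied to the VM cluster.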

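II. Requirements

Given the background above, the requirements are:

1) Merge the rules from the original YAML files, each of which holds several alerting rules, into a single YAML file;

2) Split that YAML file into separate files by alert name, that is, one YAML file per alerting rule;

3) Convert each single-rule YAML file into a corresponding JSON file;

4) Rewrite the content of the JSON files so that it matches the format expected by the confd template;

5) PUT the modified JSON files into etcd in bulk and verify that the rules take effect.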

**Step 1:** Simple but tedious. Skim the YAML files, manually normalize Chinese and special characters, and make every alert name unique:

# sed -i '/name:/d' *
# sed -i '/rules:/d' *
# sed -i -e 's/- alert:/  alertname:/g' -e '/annotations:/d' -e 's/  description:/description:/g' *.yaml
# sed -i '/summary:/d' *.yaml
# sed -i '/runbook_url:/d' *.yaml
# sed -i 's/  message:/message:/g' *.yaml
# sed -i 's/message:/description:/g' *.yaml

Finally, simply cat the YAML files together into one. The main work is in the three middle steps (Steps 2 to 4).
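For readers who prefer to see the Step 1 cleanup spelled out, here is a minimal Python sketch of the same per-line deletions and substitutions. The function name `clean_rule_lines` and the sample input are illustrative assumptions, not part of the original workflow, and the sketch works on a list of lines rather than editing files in place.

```python
# Lines containing any of these keys are dropped, mirroring the sed '/.../d' commands.
DROP_PATTERNS = ("name:", "rules:", "annotations:", "summary:", "runbook_url:")


def clean_rule_lines(lines):
    """Apply the Step 1 deletions and substitutions to a list of YAML lines."""
    cleaned = []
    for line in lines:
        # The deletions are checked on the original line, before any renaming,
        # so the new 'alertname:' lines are never removed by the 'name:' filter.
        if any(pattern in line for pattern in DROP_PATTERNS):
            continue
        line = line.replace("- alert:", "  alertname:")
        line = line.replace("  description:", "description:")
        line = line.replace("  message:", "message:")
        line = line.replace("message:", "description:")
        cleaned.append(line)
    return cleaned


if __name__ == "__main__":
    sample = [
        "- name: xpu.rules",
        "  rules:",
        "  - alert: xpu-node-oss-ceph-mount-state",
        "    annotations:",
        "      message: mount failure",
        "    expr: up == 0",
    ]
    print("\n".join(clean_rule_lines(sample)))
```

As with the sed commands, the `name:` filter is a plain substring match, so it also drops any other line that happens to contain `name:`.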

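III. Implementation

_Step 2: Split the merged YAML file into separate files by alert name, that is, one YAML file per alerting rule:_

This step is also fairly simple; awk handles it in a single pass: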

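// Generate one single-rule YAML file per alert, named after the alert
# awk '/alert:/{close(f); f=$3".yaml"; print $0 > f;next} {print $0 > f}' *.yaml
// Or generate random file names (srand() only seeds the generator; rand() returns the random number)
# awk 'BEGIN{srand()} /alert:/{close(f); f=int(rand()*1000000000)".yaml"; print $0 > f;next} {print $0 > f}' *.yaml
// Inspect one of the generated rules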
# cat xpu-node-oss-ceph-mount-state.yaml 
        alertname: xpu-node-oss-ceph-mount-state
        description: XPU Cluster：{{ $labels.cluster }} node：{{ $labels.host }} mount
            folder：{{ $labels.path }} failure，Please check！
        expr: xpu_mount_ossorceph_dir{host=~"orinci.*"} == 0
        for: 1m
        labels:
          level: P1
          service: xpu
          severity: error

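Explanation: awk reads the input line by line. When a line matches alert:, it takes the alert name as the file name and writes that line and every following line to that file, until the next match starts a new file. Note that $3 is the alert name only while the line still reads `- alert: <name>`; if Step 1 has already rewritten it to `alertname: <name>`, the pattern becomes /alertname:/ and the name is $2. Random file names work as well.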
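The same splitting logic can be sketched in Python. This is only an illustration under the assumption that each rule starts with a line of the form `- alert: <name>`; the helper names `split_rules` and `write_rules` are hypothetical. Unlike the awk one-liner, lines that appear before the first alert are skipped rather than written to an unnamed file.

```python
import os


def split_rules(lines):
    """Group lines into one chunk per rule, keyed by alert name.

    Assumes each rule starts with '- alert: <name>', so the name is the
    third whitespace-separated field (the equivalent of awk's $3).
    """
    rules = {}
    current = None
    for line in lines:
        fields = line.split()
        if "alert:" in line and len(fields) >= 3:
            current = fields[2]
            rules[current] = []
        if current is not None:
            rules[current].append(line)
    return rules


def write_rules(rules, directory="."):
    """Write each chunk to '<directory>/<alert name>.yaml'."""
    for name, chunk in rules.items():
        path = os.path.join(directory, name + ".yaml")
        with open(path, "w", encoding="utf-8") as f:
            f.write("\n".join(chunk) + "\n")
```

Keeping the grouping separate from the file writing makes it easy to check for duplicate alert names (a later rule with the same name replaces the earlier one in the dictionary) before anything touches the disk.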

_Step 3: Convert each single-rule YAML file into a corresponding JSON file:_

For this step I found a Python program online that converts between YAML and JSON, and used it as is:

Program source: https://blog.csdn.net/aidijava/article/details/125630629

#!/usr/bin/python
# -*- coding: UTF-8 -*-
import yaml
import json
import os
from pathlib import Path
from fnmatch import fnmatchcase
class Yaml_Interconversion_Json:
    def __init__(self):
        self.filePathList = []
    # Convert the content of a YAML file to a JSON string
    def yaml_to_json(self, yamlPath):
        with open(yamlPath, encoding="utf-8") as f:
            datas = yaml.load(f,Loader=yaml.FullLoader)  
        jsonDatas = json.dumps(datas, indent=5)
        # print(jsonDatas)
        return jsonDatas
    # Convert the content of a JSON file to a YAML string
    def json_to_yaml(self, jsonPath):
        with open(jsonPath, encoding="utf-8") as f:
            datas = json.load(f)
        yamlDatas = yaml.dump(datas, indent=5)
        # print(yamlDatas)
        return yamlDatas
    # Write the data to a file, replacing any existing file
    def generate_file(self, filePath, datas):
        if os.path.exists(filePath):
            os.remove(filePath)  
        with open(filePath,'w') as f:
            f.write(datas)
    # Clear the list of collected file paths
    def clear_list(self):
        self.filePathList.clear()
    # Return the path with its file suffix replaced
    def modify_file_suffix(self, filePath, suffix):
        dirPath = os.path.dirname(filePath)
        fileName = Path(filePath).stem + suffix
        newPath = dirPath + '/' + fileName
        # print('{}_path：{}'.format(suffix, newPath))
        return newPath
    # Generate a JSON file in the same directory as the source YAML file
    def generate_json_file(self, yamlPath, suffix ='.json'):
        jsonDatas = self.yaml_to_json(yamlPath)
        jsonPath = self.modify_file_suffix(yamlPath, suffix)
        # print('jsonPath：{}'.format(jsonPath))
        self.generate_file(jsonPath, jsonDatas)
    # Generate a YAML file in the same directory as the source JSON file
    def generate_yaml_file(self, jsonPath, suffix ='.yaml'):
        yamlDatas = self.json_to_yaml(jsonPath)
        yamlPath = self.modify_file_suffix(jsonPath, suffix)
        # print('yamlPath：{}'.format(yamlPath))
        self.generate_file(yamlPath, yamlDatas)
    # Recursively find all files with the given name under a directory
    def search_file(self, dirPath, fileName):
        dirs = os.listdir(dirPath) 
        for currentFile in dirs: 
            absPath = dirPath + '/' + currentFile 
            if os.path.isdir(absPath): 
                self.search_file(absPath, fileName)
            elif currentFile == fileName:
                self.filePathList.append(absPath)
    # Recursively find all files with the given suffix under a directory (hidden directories are skipped)
    def search_file_suffix(self, dirPath, suffix):
        dirs = os.listdir(dirPath) 
        for currentFile in dirs: 
            absPath = dirPath + '/' + currentFile 
            if os.path.isdir(absPath):
                if fnmatchcase(currentFile,'.*'): 
                    pass
                else:
                    self.search_file_suffix(absPath, suffix)
            elif currentFile.split('.')[-1] == suffix: 
                self.filePathList.append(absPath)
    # Delete all files with the given name under a directory
    def batch_remove_file(self, dirPath, fileName):
        self.search_file(dirPath, fileName)
        print('The following files are deleted：{}'.format(self.filePathList))
        for filePath in self.filePathList:
            if os.path.exists(filePath):
                os.remove(filePath)  
        self.clear_list()
    # Delete all files with the given suffix under a directory
    def batch_remove_file_suffix(self, dirPath, suffix):
        self.search_file_suffix(dirPath, suffix)
        print('The following files are deleted：{}'.format(self.filePathList))
        for filePath in self.filePathList:
            if os.path.exists(filePath):
                os.remove(filePath)  
        self.clear_list()
    # Convert every YAML file under a directory to JSON
    def batch_yaml_to_json(self, dirPath):
        self.search_file_suffix(dirPath, 'yaml')
        print('The converted yaml file is as follows：{}'.format(self.filePathList))
        for yamPath in self.filePathList:
            try:
                self.generate_json_file(yamPath)
            except Exception as e:
                print('YAML parsing error：{}'.format(e))         
        self.clear_list()
    # Convert every JSON file under a directory to YAML
    def batch_json_to_yaml(self, dirPath):
        self.search_file_suffix(dirPath, 'json')
        print('The converted json file is as follows：{}'.format(self.filePathList))
        for jsonPath in self.filePathList:
            try:
                self.generate_yaml_file(jsonPath)
            except Exception as e:
                print('JSON parsing error：{}'.format(jsonPath))
                print(e)
        self.clear_list()
if __name__ == "__main__":
    dirPath = '/mnt/vm-operator/vm-monitoring-install-and-configure/vmalert/alert_rules/per_rules_all'
    fileName = 'rules_json.yaml'
    suffix = 'yaml'
    filePath = dirPath + '/' + fileName
    yaml_interconversion_json = Yaml_Interconversion_Json()
    yaml_interconversion_json.batch_yaml_to_json(dirPath)
    # yaml_interconversion_json.batch_json_to_yaml(dirPath)
    # yaml_interconversion_json.batch_remove_file_suffix(dirPath, suffix)

The conversion looks like this:

// Convert YAML to JSON
# python3 yaml_to_json.py
# cat xpu-node-oss-ceph-mount-state.json 
{
     "alertname": "xpu-node-oss-ceph-mount-state",
     "description": "XPU Cluster\uff1a{{ $labels.cluster }} node\uff1a{{ $labels.host }} mount folder\uff1a{{ $labels.path }} failure\uff0cPlease check\uff01",
     "expr": "xpu_mount_ossorceph_dir{host=~\"orinci.*\"} == 0",
     "for": "1m",
     "labels": {
          "level": "P1",
          "service": "xpu",
          "severity": "error"
     }
}
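The `\uff1a`-style sequences in the JSON output are the fullwidth punctuation marks from the description: `json.dumps` escapes non-ASCII characters by default. The short sketch below, using a trimmed-down rule as an assumed example, shows that the escaped and unescaped forms carry the same data, and that `ensure_ascii=False` keeps the characters readable if that is preferred.

```python
import json

# A shortened, illustrative rule; the real description is longer.
rule = {
    "alertname": "xpu-node-oss-ceph-mount-state",
    "description": "XPU Cluster：{{ $labels.cluster }} failure，Please check！",
}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(rule, indent=5)

# ensure_ascii=False keeps the characters as they are.
readable = json.dumps(rule, indent=5, ensure_ascii=False)

# Both strings decode to exactly the same data.
assert json.loads(escaped) == json.loads(readable) == rule
```

If the readable form is written to disk, the file should be opened with `encoding="utf-8"` so the result does not depend on the platform default.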

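_Step 4: Rewrite the content of the JSON files so that it matches the format expected by the confd template:_

This is the last and also the hardest part of the text processing. The JSON content: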

{
     "alertname": "xpu-node-oss-ceph-mount-state",
     "description": "XPU Cluster\uff1a{{ $labels.cluster }} node\uff1a{{ $labels.host }} mount folder\uff1a{{ $labels.path }} failure\uff0cPlease check\uff01",
     "expr": "xpu_mount_ossorceph_dir{host=~\"orinci.*\"} == 0",
     "for": "1m",
     "labels": {
          "level": "P1",
          "service": "xpu",
          "severity": "error"
     }
}

must be converted into:

{
     "alertname": "xpu-node-oss-ceph-mount-state",
     "description": "XPU Cluster\uff1a{{ $labels.cluster }} node\uff1a{{ $labels.host }} mount folder\uff1a{{ $labels.path }} failure\uff0cPlease check\uff01",
     "expr": "xpu_mount_ossorceph_dir{host=~\"orinci.*\"} == 0",
     "for": "1m",
     "labels": [
          {"key": "level","val": "P1"},
          {"key": "service","val": "xpu"},
          {"key": "severity","val": "error"}
     ]
}

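Note that the number of labels can differ from rule to rule, and so can their contents, so the replacement cannot assume a fixed set of label lines.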
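Because the set of labels varies, it can help to see the target transformation stated on the parsed data rather than on text. The following is a minimal Python sketch, not the method used in this article (which stays with sed); the function name `labels_to_pairs` is an assumption for illustration.

```python
import json


def labels_to_pairs(rule):
    """Return a copy of the rule whose 'labels' mapping becomes a list of key/val objects."""
    converted = dict(rule)
    labels = rule.get("labels")
    if isinstance(labels, dict):
        # Works for any number of labels; dict order is preserved.
        converted["labels"] = [{"key": k, "val": v} for k, v in labels.items()]
    return converted


if __name__ == "__main__":
    rule = {
        "alertname": "xpu-node-oss-ceph-mount-state",
        "for": "1m",
        "labels": {"level": "P1", "service": "xpu", "severity": "error"},
    }
    print(json.dumps(labels_to_pairs(rule), indent=5))
```

A rule without a `labels` mapping is returned unchanged, which is one case the line-oriented approach has to handle implicitly.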

The batch replacement is done with sed:

// Apply the sed replacement to all JSON files
# sed -i -n '
{
H
:loop
n;H
/^}/{
x;p
}
/labels":\s*{/!b loop
x
s/labels":\(.*\){/labels":\1\[/g;H
:loop1
n
s/"\(.*\): \(.*\)"\(.*\)/{"key": "\1,"val": \2"}\3/
H
/     }/!b loop1
x
s/     }/     \]/g;H
b loop
} 
' *.json
// Inspect a generated file
# cat xpu-node-oss-ceph-mount-state.json
     }
     "labels": {

{
     "alertname": "xpu-node-oss-ceph-mount-state",
     "description": "XPU Cluster\uff1a{{ $labels.cluster }} node\uff1a{{ $labels.host }} mount folder\uff1a{{ $labels.path }} failure\uff0cPlease check\uff01",
     "expr": "xpu_mount_ossorceph_dir{host=~\"orinci.*\"} == 0",
     "for": "1m",
     "labels": [
          {"key": "level","val": "P1"},
          {"key": "service","val": "xpu"},
          {"key": "severity","val": "error"}
     ]
}
// Delete the 3 extra lines at the top of each file
# sed -i '1,3d' xpu-node-oss-ceph-mount-state.json
{
     "alertname": "xpu-node-oss-ceph-mount-state",
     "description": "XPU Cluster\uff1a{{ $labels.cluster }} node\uff1a{{ $labels.host }} mount folder\uff1a{{ $labels.path }} failure\uff0cPlease check\uff01",
     "expr": "xpu_mount_ossorceph_dir{host=~\"orinci.*\"} == 0",
     "for": "1m",
     "labels": [
          {"key": "level","val": "P1"},
          {"key": "service","val": "xpu"},
          {"key": "severity","val": "error"}
     ]
}

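That completes the format conversion. As an aside, here is a brief look at how sed works and how this script uses it:

sed has two buffers: the pattern space and the hold space.

Pattern space: the buffer that holds the current line. Every input line is read into it, and commands operate on it.
Hold space: an auxiliary buffer that exchanges data with the pattern space (via h, H, g, G, x). Other commands cannot act on it directly; it serves as a scratch area during processing.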

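Execution cycle: sed reads one input line into the pattern space, runs the script's commands against it, prints the pattern space at the end of the cycle unless -n is given, and then moves on to the next line. The hold space keeps its contents between cycles.

Commands used with the hold space:

h: overwrite the hold space with the pattern space

H: append the pattern space to the hold space (after a newline)

p/P: print the pattern space (P prints only up to the first newline)

x: exchange the contents of the hold space and the pattern space

g: overwrite the pattern space with the hold space

G: append the hold space to the pattern space (after a newline)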
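To make the buffer commands concrete, here is a toy Python model of the two spaces. It is only a sketch for building intuition: the class name `SedBuffers` is made up, and it models just h, H, g, G and x, not sed's input handling or printing.

```python
class SedBuffers:
    """Toy model of sed's pattern space and hold space."""

    def __init__(self):
        self.pattern = ""
        self.hold = ""

    def h(self):
        # Overwrite the hold space with the pattern space.
        self.hold = self.pattern

    def H(self):
        # Append a newline and then the pattern space to the hold space.
        self.hold = self.hold + "\n" + self.pattern

    def g(self):
        # Overwrite the pattern space with the hold space.
        self.pattern = self.hold

    def G(self):
        # Append a newline and then the hold space to the pattern space.
        self.pattern = self.pattern + "\n" + self.hold

    def x(self):
        # Exchange the two buffers.
        self.pattern, self.hold = self.hold, self.pattern


buf = SedBuffers()
buf.pattern = "{"
buf.H()                      # hold space was empty, so it now starts with a newline
buf.pattern = '     "for": "1m",'
buf.H()
buf.x()                      # accumulated text moves to the pattern space
print(repr(buf.pattern))
print(repr(buf.hold))
```

One detail worth noticing is that H always inserts a newline first, so using H on an empty hold space produces a leading empty line, whereas h does not.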

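Annotated version (the // annotations are explanatory only and are not valid sed syntax):

# sed -i -n '
{
H                          // append the first line to the hold space
:loop                      // start of the outer loop
n;H                        // read the next line and append it to the hold space
/^}/{                      // the closing } of the file: all input has been read
x;p                        // swap the hold space into the pattern space and print it
}
/labels":\s*{/!b loop      // not the labels line: repeat the outer loop; otherwise fall through to the inner loop
x                          // swap: the accumulated text is now in the pattern space, the labels line in the hold space
s/labels":\(.*\){/labels":\1\[/g;H       // replace { with [ and append the result to the hold space
:loop1                     // start of the inner loop
n                          // read the next line
s/"\(.*\): \(.*\)"\(.*\)/{"key": "\1,"val": \2"}\3/    // rewrite "k": "v" as {"key": "k","val": "v"}
H                          // append to the hold space
/     }/!b loop1           // repeat the inner loop until the closing } of labels has been read
x                          // swap again: the closing } line ends up in the hold space
s/     }/     \]/g;H       // replace } with ] and append the result to the hold space
b loop                     // go back to the outer loop
} 
' *.json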
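The replacement template in the inner loop looks as if its quotes are unbalanced. They are not: the first group is greedy and runs up to the `: ` separator, so it captures the key together with its closing quote, and the second group captures the value together with its opening quote. The Python sketch below reproduces the same substitution with the `re` module to show what each group holds; the names `LABEL_LINE`, `TEMPLATE` and `convert_label_line` are illustrative.

```python
import re

# Same pattern as the sed expression, written in Python regex syntax.
LABEL_LINE = re.compile(r'"(.*): (.*)"(.*)')

# Group 1 already ends with a quote and group 2 already starts with one,
# so the template adds only the missing halves.
TEMPLATE = r'{"key": "\1,"val": \2"}\3'


def convert_label_line(line):
    """Rewrite one '"k": "v"' line as '{"key": "k","val": "v"}'."""
    return LABEL_LINE.sub(TEMPLATE, line, count=1)


if __name__ == "__main__":
    line = '          "level": "P1",'
    match = LABEL_LINE.search(line)
    print(match.groups())
    print(convert_label_line(line))
```

A limitation of this pattern is that a label value containing `: ` would be split in the wrong place, since the first group extends to the last `: ` on the line.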

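The x swaps just before lines 11 and 18 of the script (counting the sed command line as line 1) are what leave the extra lines at the top of each generated file: each swap parks the current input line in the hold space, and the H that follows appends to it instead of replacing it. Changing the H at the end of lines 11 and 18 to h overwrites the hold space, so the stale lines disappear. The initial H is changed to h as well, because H on an empty hold space would otherwise leave one empty line at the top of the output:

# sed -n '
> {
> h
> :loop
> n;H
> /^}/{
> x;p
> }
> /labels":\s*{/!b loop
> x
> s/labels":\(.*\){/labels":\1\[/g;h
> :loop1
> n
> s/"\(.*\): \(.*\)"\(.*\)/{"key": "\1,"val": \2"}\3/
> H
> /     }/!b loop1
> x
> s/     }/     \]/g;h
> b loop
> } 
> ' abc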
{
     "alertname": "xpu-node-oss-ceph-mount-state",
     "description": "XPU Cluster\uff1a{{ $labels.cluster }} node\uff1a{{ $labels.host }} mount folder\uff1a{{ $labels.path }} failure\uff0cPlease check\uff01",
     "expr": "xpu_mount_ossorceph_dir{host=~\"orinci.*\"} == 0",
     "for": "1m",
     "labels": [
          {"key": "level","val": "P1"},
          {"key": "service","val": "xpu"},
          {"key": "severity","val": "error"}
     ]
}

_Step 5: PUT the modified JSON files into etcd in bulk and verify that the rules take effect_

// Bulk-import the alerting rules into etcd
# for i in *.json
do
  ETCDCTL_API=3 etcdctl --endpoints=xxx.xxx.xxx.xxx:2379 --user=user:xxx put "/prom/alertmanager/promethuesrule/$i" "$(cat "$i")"
done

With that, the alerting rules have been migrated, and rule changes are applied automatically.
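If the import needs to be driven from a script, the same etcdctl call can be wrapped in Python. This is a hedged sketch that only reuses the flags shown in the loop above; the function names `build_put_command` and `put_rule` are hypothetical, and the endpoint and credentials are placeholders to be supplied by the caller.

```python
import os
import subprocess

# Key prefix copied from the shell loop above.
KEY_PREFIX = "/prom/alertmanager/promethuesrule/"


def build_put_command(json_path, endpoint, user):
    """Build the etcdctl argument list for one rule file."""
    with open(json_path, encoding="utf-8") as f:
        # Like $(cat file) in the shell, drop trailing newlines from the value.
        value = f.read().rstrip("\n")
    key = KEY_PREFIX + os.path.basename(json_path)
    return ["etcdctl", "--endpoints=" + endpoint, "--user=" + user, "put", key, value]


def put_rule(json_path, endpoint, user):
    """Run etcdctl with ETCDCTL_API=3; raises CalledProcessError on failure."""
    env = dict(os.environ, ETCDCTL_API="3")
    subprocess.run(build_put_command(json_path, endpoint, user), env=env, check=True)
```

Passing the arguments as a list avoids shell quoting problems with the JSON value, and separating command construction from execution makes it possible to log or review the commands before anything is written to etcd.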

This article was AMP-transcoded by Readfog; copyright belongs to the original author.
Source: https://mp.weixin.qq.com/s/Sv7hNXGWXuYna1-bdZ4jUw
