golang pprof label 使用

trace vs pprof

go tool trace 和 go tool pprof 兩個工具的使用方法類似，但是兩者的原理和側重點不同：

1，go tool trace 更側重於記錄分析採樣時間內運行時系統具體幹了什麼。原理是監聽 golang 調度器的事件，記錄操作耗時等相關信息。Tracer 爲每個 event 打的時間戳都精確到納秒（nanosecond）級精度。性能損耗大。

與 goroutine 調度有關的事件信息：goroutine 的創建、啓動和結束；goroutine 在同步原語（包括 mutex、channel 收發操作）上的阻塞與解鎖。
與網絡有關的事件：goroutine 在網絡 I/O 上的阻塞和解鎖；
與系統調用有關的事件：goroutine 進入系統調用與從系統調用返回；
與垃圾回收器有關的事件：GC 的開始 / 停止，併發標記、清掃的開始 / 停止。

2，go tool pprof 便是一個基於採樣（sampling）的性能剖析 (profiing) 輔助工具。它基於定時器對運行的 go 程序進行各種採樣，包括諸如 CPU 時間、內存分配等方面。

在 Go 運行時內部，CPU 分析使用操作系統計時器來定期（每秒約 100 次，即 10ms 一次）中斷執行。在每個中斷（也稱爲樣本）上，它同時收集當時的調用堆棧。當爲了實現更高頻度採樣時（比如微秒級別的採樣），目前的 go profile 無法支持（爲此 uber 工程師提了一個名爲 pprof++ 的高精度、更精確並支持硬件監控的提案）。

trace 性能差；pprof 採樣粒度太大，基於統計意義上的分析，能夠定位到熱點函數，但是缺少上下文信息，如果想讓 pprof 帶上上下文信息，pprof label 就誕生了

pprof label

https://rakyll.org/profiler-labels/

可以給我們的 pprof 加上上下文信息，幫助定位同一個函數在不同調用場景下，看看具體哪個場景會出現熱點。使用也非常簡單

labels := pprof.Labels("worker", "purge")
pprof.Do(ctx, labels, func(ctx context.Context) {
    // Do some work...
go update(ctx) // propagates labels in ctx.
})

func Do(ctx context.Context, labels LabelSet, f func(context.Context))

在調用 f 時, 將給定的標籤添加到父上下文的標籤映射中。在執行 f 時產生的 Goroutine 將繼承增強的標籤集。標籤中的每一個鍵 / 值對都會按照給定的順序插入到標籤圖中, 並覆蓋之前相同鍵的任何值。增強的標籤圖將在調用 f 的期間被設置, 並在 f 返回後恢復。

需要注意的是定義 Labels 的時候一定要傳偶數個 label 否則會 panic

 //panic: uneven number of arguments to pprof.Labels

func ForLabels(ctx context.Context, f func(key, value string) bool)

ForLabels對上下文上設置的每個標籤調用f。函數f應該返回true來繼續迭代,或者返回false來提前停止迭代。

func Label(ctx context.Context, key string) (string, bool)

Label 返回ctx上帶有給定鍵的標籤的值,以及一個表示該標籤是否存在的布爾值。

func SetGoroutineLabels(ctx context.Context)

SetGoroutineLabels 設置當前 goroutine 的標籤以匹配 ctx。新的 goroutine 會繼承創建它的 goroutine 的標籤。這是一個比 Do 更低級的 API, 在可能的情況下應該使用 Do。

具體使用

package main
import (
  "context"
  "fmt"
  "log"
  "os"
  "runtime/pprof"
  "strings"
  "github.com/google/uuid"
)
func main() {
  // Setup CPU profiling.
  f, err := os.Create("cpu.prof")
  if err != nil {
    log.Fatal("could not create CPU profile: ", err)
  }
  defer f.Close() // error handling omitted for example
  if err := pprof.StartCPUProfile(f); err != nil {
    log.Fatal("could not start CPU profile: ", err)
  }
  defer pprof.StopCPUProfile()
  ctx := context.Background()
  uuidWithHyphen := uuid.New()
  uuid := strings.Replace(uuidWithHyphen.String(), "-", "", -1)
  // Run workload for tenant1
  iteratePerTenant(ctx, "tenant1", uuid)
  // Run workload for tenant2
  iteratePerTenant(ctx, "tenant2", uuidWithHyphen.String())
}
var (
  iterationsPerTenant = map[string]int{
    "tenant1": 10_000_000_000,
    "tenant2": 1_000_000_000,
  }
)
func iteratePerTenant(ctx context.Context, tenant, uuid string) {
  pprof.Do(ctx, pprof.Labels("tenant", tenant, "uuid", uuid), func(ctx context.Context) {
    iterate(iterationsPerTenant[tenant])
  })
}
func iterate(iterations int) {
  for i := 0; i < iterations; i++ {
  }
}

執行

go run main.go
go tool pprof -http=:8080 cpu.prof
http://127.0.0.1:8000/debug/pprof/allocs?debug=1

可以看到下面的結果

可以看到，整體的耗時中，區分了每一個不同 guid 的耗時（對應了立體框）

當然也可以配合 bentch mark 來做

func Add(str string) string {
  data := []byte(str)
  sData := string(data)
  datas = append(datas, sData)
  return sData
}

針對上面的函數，加 label 之前

package main
import "testing"
const url = "Hello World!"
func TestAdd(t *testing.T) {
  s := Add(url)
  if s == "" {
    t.Errorf("Test.Add error!")
  }
}
func BenchmarkAdd(b *testing.B) {
  for i := 0; i < b.N; i++ {
    Add(url)
  }
}

運行

go test -bench=. -cpuprofile=cpu.prof
go tool pprof -http=:8080 cpu.prof

加 label

package main
import (
  "context"
  "fmt"
  "runtime/pprof"
  "testing"
)
const url = "Hello World!"
func TestAdd(t *testing.T) {
  s := Add(context.Background(), url)
  if s == "" {
    t.Errorf("Test.Add error!")
  }
}
func BenchmarkAdd(b *testing.B) {
  ctx := context.Background()
  for i := 0; i < b.N; i++ {
    labels := pprof.Labels("worker", "purge", "number", fmt.Sprint(i%10))
    pprof.Do(ctx, labels, func(ctx context.Context) {
      Add(ctx, url)
    })
  }
}

我們可以通過 pprof label 做更細粒度的分析，方便我們能更具體的定位熱點，label 的數據可以根據業務場景自己決定，比如 traceId，比如請求 url 等等。

本文由 Readfog 進行 AMP 轉碼，版權歸原作者所有。
來源：https://mp.weixin.qq.com/s/N7hVRy8fgL93D5CQ50yZhw

猜你喜歡