Go benchmark performance testing

Go benchmarks: performance testing, benchmark testing, unit testing, and coverage testing

Writing a benchmark

func BenchmarkSprintf(b *testing.B) {
	num := 10
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		fmt.Sprintf("%d", num)
	}
}

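// Add the -bench= flag, which takes a regular expression as its argument; . runs all benchmarks
// -run= is given a pattern that matches no unit test, which filters out unit test output; here we use none
// You can also use -run=^$, which matches only an empty name and therefore no test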
go test -bench=. -run=none
go test -bench=. -run=^$
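To make the role of b.N more concrete, here is a minimal sketch that runs the same loop through testing.Benchmark instead of go test. The names benchmarkSprintf and runOnce are made up for this illustration; the framework, not the author, picks b.N and grows it until the timing is stable.

```go
package main

import (
	"fmt"
	"testing"
)

// benchmarkSprintf mirrors BenchmarkSprintf from the article.
// The lowercase name is an assumption made for this sketch.
func benchmarkSprintf(b *testing.B) {
	num := 10
	b.ResetTimer() // discard any setup time spent before the loop
	for i := 0; i < b.N; i++ {
		_ = fmt.Sprintf("%d", num)
	}
}

// runOnce (hypothetical helper) runs the benchmark without "go test".
func runOnce() testing.BenchmarkResult {
	testing.Init() // registers the testing flags; only needed outside "go test"
	return testing.Benchmark(benchmarkSprintf)
}

func main() {
	r := runOnce()
	// r.N is the iteration count the framework settled on.
	fmt.Printf("iterations: %d, %d ns/op\n", r.N, r.NsPerOp())
}
```

The printed numbers vary by machine and Go version, so no expected output is shown.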

Parallel benchmarks

func BenchmarkCombinationParallel(b *testing.B) {
    // Exercise an object or function from multiple goroutines running in parallel
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            m := rand.Intn(100) + 1
            n := rand.Intn(m)
            combination(m, n)
        }
    })
}
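The article does not show the combination function that this benchmark calls. As a stand-in so the benchmark can compile, here is one possible sketch that computes the binomial coefficient C(m, n). The signature and the use of math/big are assumptions; big integers are used because m can reach 100 here, which overflows int64.

```go
package main

import (
	"fmt"
	"math/big"
)

// combination is a hypothetical stand-in for the function benchmarked above.
// It returns the binomial coefficient C(m, n) as a big integer.
func combination(m, n int) *big.Int {
	return new(big.Int).Binomial(int64(m), int64(n))
}

func main() {
	fmt.Println(combination(5, 2)) // prints 10
}
```

The function holds no shared state, so calling it from many goroutines at once is safe.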

Performance comparison

func BenchmarkSprintf(b *testing.B) {
	num := 10
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		fmt.Sprintf("%d", num)
	}
}

func BenchmarkFormat(b *testing.B) {
	num := int64(10)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		strconv.FormatInt(num, 10)
	}
}

func BenchmarkItoa(b *testing.B) {
	num := 10
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		strconv.Itoa(num)
	}
}

➜  hello go test -bench=. -run=none              
BenchmarkSprintf-8      20000000               117 ns/op
BenchmarkFormat-8       50000000                33.3 ns/op
BenchmarkItoa-8         50000000                34.9 ns/op
PASS
ok      flysnow.org/hello       5.951s

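Judging from the results, strconv.FormatInt is the fastest, followed closely by strconv.Itoa, and fmt.Sprintf is the slowest; the first two are more than 3 times as fast as the last. So why is the last one so slow? We can use -benchmem to look for the root cause.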

➜  hello go test -bench=. -benchmem -run=none
BenchmarkSprintf-8      20000000               110 ns/op              16 B/op          2 allocs/op
BenchmarkFormat-8       50000000                31.0 ns/op             2 B/op          1 allocs/op
BenchmarkItoa-8         50000000                33.1 ns/op             2 B/op          1 allocs/op
PASS
ok      flysnow.org/hello       5.610s

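-benchmem reports the number of memory allocations per operation and the number of bytes allocated per operation. The results show that the two faster functions each perform 1 allocation per operation, while the slowest performs 2; the faster ones allocate 2 bytes per operation, while the slow one allocates 16 bytes. This points to why fmt.Sprintf is slower: it allocates more often and allocates more memory.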

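When developing code, writing benchmarks for the performance-critical parts is very important, because it helps us produce faster code. Still, performance has to be weighed against usability, reusability, and similar concerns; do not over-optimize for the sake of performance alone.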
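The allocation counts can also be checked programmatically. The sketch below uses testing.AllocsPerRun on the three conversions; the helper names viaSprintf, viaFormatInt, viaItoa, and allocs are made up for this illustration. It uses a larger number than the article's 10, on the assumption that some Go releases handle small integers through a fast path that would hide the allocation.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// sink keeps each result alive so the compiler cannot drop the call.
var sink string

// Hypothetical wrappers around the three conversions compared above.
func viaSprintf(n int) string   { return fmt.Sprintf("%d", n) }
func viaFormatInt(n int) string { return strconv.FormatInt(int64(n), 10) }
func viaItoa(n int) string      { return strconv.Itoa(n) }

// allocs reports the average number of heap allocations per call of f(n).
func allocs(f func(int) string, n int) float64 {
	return testing.AllocsPerRun(1000, func() { sink = f(n) })
}

func main() {
	const n = 1234567 // assumption: large enough to avoid small-integer fast paths
	fmt.Printf("Sprintf   %.0f allocs/op\n", allocs(viaSprintf, n))
	fmt.Printf("FormatInt %.0f allocs/op\n", allocs(viaFormatInt, n))
	fmt.Printf("Itoa      %.0f allocs/op\n", allocs(viaItoa, n))
}
```

The exact counts depend on the Go version and may differ from the figures quoted in the article.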

Combining with pprof

package bench
import "testing"
func Fib(n int) int {
    if n < 2 {
      return n
    }
    return Fib(n-1) + Fib(n-2)
}
func BenchmarkFib10(b *testing.B) {
    // run the Fib function b.N times
    for n := 0; n < b.N; n++ {
      Fib(10)
    }
}

go test -bench=. -benchmem -cpuprofile profile.out

You can also collect a memory profile at the same time

go test -bench=. -benchmem -memprofile memprofile.out -cpuprofile profile.out

Then you can open the output files with pprof

go tool pprof profile.out
File: bench.test
Type: cpu
Time: Apr 5, 2018 at 4:27pm (EDT)
Duration: 2s, Total samples = 1.85s (92.40%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 1.85s, 100% of 1.85s total
      flat  flat%   sum%        cum   cum%
     1.85s   100%   100%      1.85s   100%  bench.Fib
         0     0%   100%      1.85s   100%  bench.BenchmarkFib10
         0     0%   100%      1.85s   100%  testing.(*B).launch
         0     0%   100%      1.85s   100%  testing.(*B).runN

This uses the CPU profile; the memory profile can be opened the same way

You can also use the list command to see the time spent on each line of a function

(pprof) list Fib
     1.84s      2.75s (flat, cum) 148.65% of Total
         .          .      1:package bench
         .          .      2:
         .          .      3:import "testing"
         .          .      4:
     530ms      530ms      5:func Fib(n int) int {
     260ms      260ms      6:   if n < 2 {
     130ms      130ms      7:           return n
         .          .      8:   }
     920ms      1.83s      9:   return Fib(n-1) + Fib(n-2)
         .          .     10:}

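Or use the web command to generate an image (png, pdf, ...). This requires Graphviz, which on macOS can be installed with Homebrew: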

brew install graphviz

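Flame graphs

A flame graph is a performance analysis chart created by Brendan Gregg, named for its resemblance to flames.

A flame graph SVG file can be opened in a browser. Its main advantage over a call graph is that it is interactive: click any block to zoom in and analyze what sits above it.

The call order in a flame graph runs from bottom to top. Each block represents a function, and the layer above it shows the functions it calls. The width of a block represents how much CPU time it accounts for. The colors carry no special meaning; the default red and yellow palette is just there to look more like flames.

When a project is profiled with runtime/pprof, the profile is exported as a file in the current directory. To analyze it with a flame graph you then pass that file instead of a host name.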
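A minimal sketch of exporting a CPU profile with runtime/pprof might look like this. The helper writeCPUProfile and the file name cpu.prof are assumptions for illustration, and Fib is reused from the benchmark earlier in the article as the workload.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// Fib is the same recursive function used in the benchmark above.
func Fib(n int) int {
	if n < 2 {
		return n
	}
	return Fib(n-1) + Fib(n-2)
}

// writeCPUProfile (hypothetical helper) records a CPU profile of work into
// path and returns the size of the written file in bytes.
func writeCPUProfile(path string, work func()) (int64, error) {
	f, err := os.Create(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		return 0, err
	}
	work()
	pprof.StopCPUProfile() // flushes the profile data to f

	info, err := f.Stat()
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	size, err := writeCPUProfile("cpu.prof", func() { Fib(30) })
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Printf("wrote cpu.prof (%d bytes)\n", size)
}
```

The resulting file can then be passed to go tool pprof in place of a URL.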

go-torch

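Most write-ups online use an open-source tool from Uber

go-torch is an open-source tool from Uber that reads Go profiling data directly and generates a flame graph SVG file.

go-torch is very simple to use. With no arguments, it tries to fetch profiling data from http://localhost:8080/debug/pprof/profile. Three commonly used options can be adjusted:

-u --url: the URL to access; only the host and port part
-s --suffix: the path of the pprof profile, default /debug/pprof/profile
--seconds: how long to profile for, default 30s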

Native support

Since Go 1.11, flame graphs have been built into the official Go pprof tool.

# This will listen on :8081 and open a browser.
# Change :8081 to a port of your choice.
$ go tool pprof -http=":8081" [binary] [profile]

A small web example

package main

import (
	"fmt"
	"log"
	"net/http"
	_ "net/http/pprof"
	"time"
)

func sayHelloHandler(w http.ResponseWriter, r *http.Request) {
	hellowold(10000)
	fmt.Println("path", r.URL.Path)
	fmt.Println("scheme", r.URL.Scheme)
	fmt.Fprintf(w, "Hello world!\n") // whatever is written to w is sent to the client
}

func main() {
	http.HandleFunc("/", sayHelloHandler) // register the route
	log.Fatal(http.ListenAndServe(":8080", nil))
}

func hellowold(times int) {
	time.Sleep(time.Second)
	var counter int
	for i := 0; i < times; i++ {
		for j := 0; j < times; j++ {
			counter++
		}
	}
}

Start profiling with the command below, then visit localhost:8080 a few times

go tool pprof -http=":8081" http://localhost:8080/debug/pprof/profile

After a while a browser window opens; choose VIEW -> Flame Graph to get the flame graph

http://localhost:8081/ui/flamegraph

Testing flags
Which flags can follow go test


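Common flags

-bench regexp: run benchmarks, selecting benchmark functions by regular expression. -bench . runs all benchmark functions
-benchmem: print memory allocation statistics for benchmark functions
-count n: run tests and benchmarks n times, default 1
-run regexp: run only the matching test functions; for example, -run ABC runs only test functions whose names contain ABC
-timeout t: panic if the tests run longer than t, default 10 minutes
-v: print verbose test output, including messages from the Log and Logf methods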

go test -v -bench=. -benchmem main_test.go
go test -v -bench=BenchmarkTrie -benchmem -run=none ./
go test -bench=. -benchmem -memprofile memprofile.out -cpuprofile profile.out example_test.go 
go tool pprof -http=":8081" profile.out

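Coverage testing

Test coverage measures how much of the code under test is actually executed by the tests. Statement coverage is the simplest and most widely used form: the proportion of statements that are executed at least once during testing.
Test a whole package: go test -cover=true pkg_name
Test a single test function: go test -cover=true pkg_name -run TestSwap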

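Generating an HTML report

go test -cover=true pkg_name -coverprofile=out.out writes the coverage data to out.out in the current directory.
Then go tool cover -html=out.out opens the HTML report in a browser.
Alternatively, go tool cover -html=out.out -o=out.html writes the report to an HTML file.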

Collecting coverage data for multiple Go packages in bulk

#!/bin/bash
    
set -e
    
profile="cover.out"
htmlfile="cover.html"
mergecover="merge_cover"
mode="set"
    
for package in $(go list ./... | grep -v src); do
    coverfile="$(echo $package | tr / -).cover"
    go test -covermode="$mode" -coverprofile="$coverfile" -coverpkg="$package" "$package"
done
    
# merge all profiles
grep -h -v "^mode:" *.cover | sort > $mergecover
    
# aggregate duplicated code-block data
echo "mode: $mode" > $profile
current=""
count=0
while read line; do
    block=$(echo $line | cut -d ' ' -f1-2)
    num=$(echo $line | cut -d ' ' -f3)
    if [ "$current" == "" ]; then
        current=$block
        count=$num
    elif [ "$block" == "$current" ]; then
        count=$(($count + $num))
    else
        echo $current $count >> $profile
        current=$block
        count=$num
    fi
done < $mergecover
    
if [ "$current" != "" ]; then
    echo $current $count >> $profile
fi
    
# save result
go tool cover -html=$profile -o $htmlfile
go tool cover -func=$profile
rm $mergecover
rm $profile
rm *.cover
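The aggregation step in the middle of the script may be easier to follow in Go. This sketch assumes each profile line has the form "file:start,end numStatements count", as the script's cut calls imply. The function name mergeCoverage is made up, and it sums counts with a map instead of the script's sort-and-compare-neighbors approach.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// mergeCoverage (hypothetical helper) sums the hit counts of identical code
// blocks. The block key is everything before the last space on a line.
func mergeCoverage(lines []string) ([]string, error) {
	counts := make(map[string]int)
	for _, line := range lines {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "mode:") {
			continue // skip blanks and per-file mode headers
		}
		i := strings.LastIndex(line, " ")
		if i < 0 {
			return nil, fmt.Errorf("malformed line %q", line)
		}
		n, err := strconv.Atoi(line[i+1:])
		if err != nil {
			return nil, fmt.Errorf("bad count in %q: %w", line, err)
		}
		counts[line[:i]] += n
	}

	blocks := make([]string, 0, len(counts))
	for b := range counts {
		blocks = append(blocks, b)
	}
	sort.Strings(blocks) // deterministic output order

	out := make([]string, 0, len(blocks))
	for _, b := range blocks {
		out = append(out, fmt.Sprintf("%s %d", b, counts[b]))
	}
	return out, nil
}

func main() {
	merged, err := mergeCoverage([]string{
		"mode: set",
		"a.go:3.10,5.2 1 0",
		"a.go:3.10,5.2 1 1",
		"b.go:7.1,9.2 2 1",
	})
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, l := range merged {
		fmt.Println(l)
	}
	// prints:
	// a.go:3.10,5.2 1 1
	// b.go:7.1,9.2 2 1
}
```

A block that appears in several package profiles ends up with a single line whose count is the sum, so it is reported as covered if any package's tests reached it.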

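Author: 百里求一
Source: http://www.cnblogs.com/bergus/
My Yuque: https://www.yuque.com/barry.bai
Copyright of this article is shared by the author and cnblogs (博客園). Reposting is welcome, but unless the author agrees otherwise this notice must be kept and a link to the original must be shown prominently on the article page; otherwise the author reserves the right to pursue legal liability.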

Source: https://www.cnblogs.com/bergus/articles/go-benchmark-xing-neng-ce-shi.html