How Go Slice Append Works

In the readers' discussion group, someone posted the following example and asked for a reasonable explanation.

package main

func main() {
    s := []int{1, 2}
    s = append(s, 3, 4, 5)
    println(cap(s))
}

// output: 6

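Why is the result 6, rather than 5 or 8? My description of slice growth in an earlier article was not precise enough, and it left readers confused. This article takes the opportunity to analyze the append function and the growth mechanism behind it in detail.

append is a function that can be called directly without importing any package. It is a built-in function, declared in builtin.go of the builtin package in the Go source tree.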

// The append built-in function appends elements to the end of a slice. If
// it has sufficient capacity, the destination is resliced to accommodate the
// new elements. If it does not, a new underlying array will be allocated.
// Append returns the updated slice. It is therefore necessary to store the
// result of append, often in the variable holding the slice itself:
//    slice = append(slice, elem1, elem2)
//    slice = append(slice, anotherSlice...)
// As a special case, it is legal to append a string to a byte slice, like this:
//    slice = append([]byte("hello "), "world"...)
func append(slice []Type, elems ...Type) []Type

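append adds one or more values to a slice, and those values are stored in the slice's underlying array. The length of that array is fixed. If its remaining space can hold the appended values, they are simply written into it. Once the total length after appending exceeds the length of the original array, the original array can no longer hold the data. What happens then?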
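The two outcomes described above can be observed directly. The following is a minimal sketch, not taken from the original article; sharesArray is a hypothetical helper that compares the address of the first element of two slices.

```go
package main

import "fmt"

// sharesArray reports whether two non-empty slices start at the same
// element of the same backing array. (Hypothetical helper for illustration.)
func sharesArray(a, b []int) bool {
	return &a[0] == &b[0]
}

func main() {
	// Spare capacity: append reslices the existing array.
	roomy := make([]int, 2, 4) // len 2, cap 4
	r := append(roomy, 10)
	fmt.Println(sharesArray(roomy, r)) // true

	// No spare capacity: append allocates a new array and copies.
	full := []int{1, 2} // len 2, cap 2
	f := append(full, 3)
	fmt.Println(sharesArray(full, f)) // false
}
```

This is also why the result of append has to be stored: the returned slice may or may not refer to the same array as its argument.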

Note also that builtin.go only declares the function signature and contains no implementation at all. So how is append actually implemented?

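Compilation process

To answer these questions, let's start from compilation. Go compilation can be divided into four phases: lexical and syntax analysis, type checking and abstract syntax tree (AST) transformation, intermediate code generation, and final machine code generation.

We mainly need to look at the code for the second and third phases. The type-checking logic is in src/cmd/compile/internal/gc/typecheck.go: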

func typecheck1(n *Node, top int) (res *Node) {
    ...
    switch n.Op {
    case OAPPEND:
    ...
}

The AST transformation logic is in src/cmd/compile/internal/gc/walk.go:

func walkexpr(n *Node, init *Nodes) *Node {
    ...
    case OAPPEND:
        // x = append(...)
        r := n.Right
        if r.Type.Elem().NotInHeap() {
            yyerror("%v can't be allocated in Go; it is incomplete (or unallocatable)", r.Type.Elem())
        }
        switch {
        case isAppendOfMake(r):
            // x = append(y, make([]T, y)...)
            r = extendslice(r, init)
        case r.IsDDD():
            r = appendslice(r, init) // also works for append(slice, string).
        default:
            r = walkappend(r, init, n)
        }
    ...
}

The intermediate code generation logic is in src/cmd/compile/internal/gc/ssa.go:

// append converts an OAPPEND node to SSA.
// If inplace is false, it converts the OAPPEND expression n to an ssa.Value,
// adds it to s, and returns the Value.
// If inplace is true, it writes the result of the OAPPEND expression n
// back to the slice being appended to, and returns nil.
// inplace MUST be set to false if the slice can be SSA'd.
func (s *state) append(n *Node, inplace bool) *ssa.Value {
    ...
}
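The switch in the walkexpr excerpt distinguishes three ways of writing an append call. As a rough illustration at the source level, the sketch below shows one call of each shape. The helper names are hypothetical, and whether a given call actually takes the isAppendOfMake path also depends on compiler conditions not shown in the excerpt.

```go
package main

import "fmt"

// extendWithMake has the shape append(y, make([]T, n)...).
func extendWithMake(s []int, n int) []int {
	return append(s, make([]int, n)...)
}

// appendSlice has the shape append(s, t...), a variadic call.
func appendSlice(s, t []int) []int {
	return append(s, t...)
}

// appendElems has the shape append(s, e1, e2, e3).
func appendElems(s []int) []int {
	return append(s, 7, 8, 9)
}

func main() {
	s := []int{1, 2}
	fmt.Println(extendWithMake(s, 3))        // [1 2 0 0 0]
	fmt.Println(appendSlice(s, []int{3, 4})) // [1 2 3 4]
	fmt.Println(appendElems(s))              // [1 2 7 8 9]

	// The variadic shape also covers appending a string to a byte slice.
	fmt.Println(string(append([]byte("hello "), "world"...))) // hello world
}
```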

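The state.append method in the intermediate code generation phase is the part we care about most. Its inplace parameter indicates whether the result overwrites the original variable. When it is false, the call expands as follows (note: this is pseudocode to aid understanding, not the actual code in state.append). I also noticed that writing append(s, e1, e2, e3) as a bare statement, without using the result, does not compile, so I have not yet worked out exactly which scenario this branch serves. Presumably it covers cases where the result is used as a value, such as t := append(s, e1, e2, e3).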

// If inplace is false, process as expression "append(s, e1, e2, e3)":
ptr, len, cap := s
newlen := len + 3
if newlen > cap {
    ptr, len, cap = growslice(s, newlen)
    newlen = len + 3 // recalculate to avoid a spill
}
// with write barriers, if needed:
*(ptr+len) = e1
*(ptr+len+1) = e2
*(ptr+len+2) = e3
return makeslice(ptr, newlen, cap)

When inplace is true, for example in the statement slice = append(slice, 1, 2, 3), the result overwrites the original variable. The expansion is as follows:

// If inplace is true, process as statement "s = append(s, e1, e2, e3)":

a := &s
ptr, len, cap := s
newlen := len + 3
if uint(newlen) > uint(cap) {
    newptr, len, newcap = growslice(ptr, len, cap, newlen)
    vardef(a)       // if necessary, advise liveness we are writing a new a
    *a.cap = newcap // write before ptr to avoid a spill
    *a.ptr = newptr // with write barrier
}
newlen = len + 3 // recalculate to avoid a spill
*a.len = newlen
// with write barriers, if needed:
*(ptr+len) = e1
*(ptr+len+1) = e2
*(ptr+len+2) = e3

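Whether or not inplace is true, we first obtain the slice's array pointer, length, and capacity. If the new length after appending exceeds the original capacity, runtime.growslice is called to grow the slice, and the new elements are then stored into it one by one.

So appending the element 1 to a slice of int that already contains 1, 2, 3, that is, slice = append(slice, 1), falls into one of two cases.

Case 1: the underlying array still has room for the appended element.

Case 2: the underlying array has no room left for the appended element, so the growth function must be called.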
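The two cases can be written out in ordinary Go. The sketch below is a simplified stand-in, not the compiler's or the runtime's code: appendInts is a hypothetical function that works only on []int and, as an assumption made for brevity, grows by plain doubling instead of calling the real growslice.

```go
package main

import "fmt"

// appendInts is a hypothetical, simplified model of append for []int.
// Growth here is plain doubling (an assumption); growslice does more.
func appendInts(s []int, elems ...int) []int {
	newLen := len(s) + len(elems)
	if newLen > cap(s) {
		// Case 2: not enough room, so allocate a larger array and copy.
		newCap := 2 * cap(s)
		if newCap < newLen {
			newCap = newLen
		}
		grown := make([]int, len(s), newCap)
		copy(grown, s)
		s = grown
	}
	// Case 1, and the tail of case 2: reslice and store the new elements.
	s = s[:newLen]
	copy(s[newLen-len(elems):], elems)
	return s
}

func main() {
	s := make([]int, 3, 4)
	copy(s, []int{1, 2, 3})

	s = appendInts(s, 1)           // fits in the existing array
	fmt.Println(s, len(s), cap(s)) // [1 2 3 1] 4 4

	s = appendInts(s, 5)           // forces growth
	fmt.Println(s, len(s), cap(s)) // [1 2 3 1 5] 5 8
}
```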

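The growth function

As mentioned above, when the remaining space in the underlying array cannot hold the appended elements, growslice is called. Its cap parameter is the total length the slice will have after the elements are appended.

growslice is fairly long; we can split its logic into three parts.

Determining the preliminary capacity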

func growslice(et *_type, old slice, cap int) slice {
    ...
    newcap := old.cap
    doublecap := newcap + newcap
    if cap > doublecap {
        newcap = cap
    } else {
        if old.len < 1024 {
            newcap = doublecap
        } else {
            // Check 0 < newcap to detect overflow
            // and prevent an infinite loop.
            for 0 < newcap && newcap < cap {
                newcap += newcap / 4
            }
            // Set newcap to the requested cap when
            // the newcap calculation overflowed.
            if newcap <= 0 {
                newcap = cap
            }
        }
    }
    ...
}

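In this step, if the required capacity cap exceeds twice the original capacity (doublecap), the required capacity is used directly as the new capacity newcap. Otherwise, when the original slice's length is less than 1024, the capacity is simply doubled. When the original length is 1024 or more, newcap is increased by 25% repeatedly until it is at least the required capacity.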
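To make this first stage easy to experiment with, here is a standalone sketch that mirrors the quoted logic. The function name initialCap and its parameter names are hypothetical; needed plays the role of the cap parameter in the excerpt. It reflects only the version of growslice quoted here, and other Go releases may use different thresholds.

```go
package main

import "fmt"

// initialCap mirrors the first stage of the growslice excerpt above.
// needed is the requested total length after the append.
func initialCap(oldLen, oldCap, needed int) int {
	newcap := oldCap
	doublecap := newcap + newcap
	if needed > doublecap {
		return needed
	}
	if oldLen < 1024 {
		return doublecap
	}
	// Grow by 25% at a time; 0 < newcap guards against overflow.
	for 0 < newcap && newcap < needed {
		newcap += newcap / 4
	}
	if newcap <= 0 {
		return needed
	}
	return newcap
}

func main() {
	fmt.Println(initialCap(2, 2, 5))          // 5
	fmt.Println(initialCap(3, 3, 4))          // 6
	fmt.Println(initialCap(1024, 1024, 1025)) // 1280
}
```

Note that this is only the preliminary value; the next stage may round it up further.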

Computing the memory required for the capacity

var overflow bool
var lenmem, newlenmem, capmem uintptr

switch {
case et.size == 1:
    lenmem = uintptr(old.len)
    newlenmem = uintptr(cap)
    capmem = roundupsize(uintptr(newcap))
    overflow = uintptr(newcap) > maxAlloc
    newcap = int(capmem)
case et.size == sys.PtrSize:
    lenmem = uintptr(old.len) * sys.PtrSize
    newlenmem = uintptr(cap) * sys.PtrSize
    capmem = roundupsize(uintptr(newcap) * sys.PtrSize)
    overflow = uintptr(newcap) > maxAlloc/sys.PtrSize
    newcap = int(capmem / sys.PtrSize)
case isPowerOfTwo(et.size):
    var shift uintptr
    if sys.PtrSize == 8 {
        // Mask shift for better code generation.
        shift = uintptr(sys.Ctz64(uint64(et.size))) & 63
    } else {
        shift = uintptr(sys.Ctz32(uint32(et.size))) & 31
    }
    lenmem = uintptr(old.len) << shift
    newlenmem = uintptr(cap) << shift
    capmem = roundupsize(uintptr(newcap) << shift)
    overflow = uintptr(newcap) > (maxAlloc >> shift)
    newcap = int(capmem >> shift)
default:
    lenmem = uintptr(old.len) * et.size
    newlenmem = uintptr(cap) * et.size
    capmem, overflow = math.MulUintptr(et.size, uintptr(newcap))
    capmem = roundupsize(capmem)
    newcap = int(capmem / et.size)
}

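In this step, the code picks a branch for computing the required memory depending on whether the element size in bytes is 1, equal to the platform pointer size (4 on 32-bit, 8 on 64-bit), or a power of two. Any other size falls through to the default branch.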
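The power-of-two branch replaces multiplication and division by shifts. The sketch below illustrates that idea outside the runtime. It uses math/bits.TrailingZeros64 in place of the runtime-internal sys.Ctz64, bytesFor is a hypothetical helper, and isPowerOfTwo here also rejects zero, which is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"math/bits"
)

// isPowerOfTwo reports whether x has exactly one bit set.
func isPowerOfTwo(x uintptr) bool {
	return x != 0 && x&(x-1) == 0
}

// bytesFor returns n*size, using a shift when size is a power of two.
// (Hypothetical helper for illustration.)
func bytesFor(n, size uintptr) uintptr {
	if isPowerOfTwo(size) {
		shift := uint(bits.TrailingZeros64(uint64(size)))
		return n << shift
	}
	return n * size
}

func main() {
	fmt.Println(isPowerOfTwo(16), isPowerOfTwo(24)) // true false
	fmt.Println(bytesFor(5, 16))                    // 80
	fmt.Println(bytesFor(5, 24))                    // 120
}
```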

The function to pay attention to here is roundupsize. Given a requested size, it returns the size of the memory block that mallocgc will actually allocate.

func roundupsize(size uintptr) uintptr {
    if size < _MaxSmallSize {
        if size <= smallSizeMax-8 {
            return uintptr(class_to_size[size_to_class8[divRoundUp(size, smallSizeDiv)]])
        } else {
            return uintptr(class_to_size[size_to_class128[divRoundUp(size-smallSizeMax, largeSizeDiv)]])
        }
    }

    // Go's memory manager uses a page size of 8 KB (_PageSize).
    // If adding a page to size would overflow, skip the rounding
    // and return the requested size as is.
    if size+_PageSize < size {
        return size
    }
    return alignUp(size, _PageSize)
}

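Go's allocator treats small and large objects differently. If the requested size is not a large object (less than _MaxSmallSize, that is, less than 32 KB), it is rounded up to a size class: divRoundUp computes an index into size_to_class8 or size_to_class128, and the resulting class is looked up in class_to_size. These tables let the allocator work with a fixed set of block sizes, which speeds up allocation and reduces memory fragmentation.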

// _NumSizeClasses = 67: there are 67 size classes for objects of specific sizes
var class_to_size = [_NumSizeClasses]uint16{0, 8, 16, 32, 48, 64, 80, 96, 112,...}

When the requested size is a large object, alignUp rounds it up to a multiple of the page size (_PageSize).
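The effect of both rounding paths can be approximated with a few lines of Go. This is a sketch under stated assumptions: sizeClasses contains only the entries visible in the excerpt above (the real table is longer and is looked up through index tables rather than scanned), roundUpSmall is a hypothetical helper, and the page size is taken to be the 8 KB mentioned in the comment.

```go
package main

import "fmt"

// sizeClasses holds only the classes listed in the excerpt above;
// the remaining entries of the real table are omitted here.
var sizeClasses = []uintptr{8, 16, 32, 48, 64, 80, 96, 112}

// roundUpSmall returns the smallest listed class that fits size.
// ok is false when size is beyond this truncated table.
func roundUpSmall(size uintptr) (rounded uintptr, ok bool) {
	for _, c := range sizeClasses {
		if size <= c {
			return c, true
		}
	}
	return 0, false
}

// alignUp rounds n up to a multiple of a, where a is a power of two.
func alignUp(n, a uintptr) uintptr {
	return (n + a - 1) &^ (a - 1)
}

func main() {
	// Five 8-byte elements need 40 bytes, which rounds up to 48.
	r, _ := roundUpSmall(40)
	fmt.Println(r, r/8) // 48 6

	// A large request is rounded up to whole pages instead.
	const pageSize = 8192
	fmt.Println(alignUp(40000, pageSize)) // 40960
}
```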

Memory allocation

if overflow || capmem > maxAlloc {
    panic(errorString("growslice: cap out of range"))
}

var p unsafe.Pointer
if et.ptrdata == 0 {
    p = mallocgc(capmem, nil, false)
    memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem)
} else {
    p = mallocgc(capmem, et, true)
    if lenmem > 0 && writeBarrier.enabled {
        bulkBarrierPreWriteSrcOnly(uintptr(p), uintptr(old.array), lenmem-et.size+et.ptrdata)
    }
}
memmove(p, old.array, lenmem)

return slice{p, old.len, newcap}

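If the second step overflowed, or the requested memory exceeds the maximum allocation limit, growslice panics.

mallocgc allocates an object of the size capmem computed above. A small object is allocated from the cache of the P that the current G is running on; a large object is allocated directly from the heap. If the element type contains no pointers (et.ptrdata == 0), memclrNoHeapPointers clears only the part of the new block beyond the new length, since everything before it is about to be overwritten. If the element type does contain pointers, mallocgc returns zeroed memory, and when the old slice is non-empty and the write barrier is enabled, bulkBarrierPreWriteSrcOnly is called to shade the pointers in the old array; the new block needs no such treatment because it holds only nil pointers at that point.

Finally, memmove copies the contents of the old array into the newly allocated memory, and a new slice is built from the new pointer p, the old length, and the new capacity, and returned.

Note that when growslice finishes, it has only copied the existing data into the new memory and computed the new capacity; it has not yet appended anything. For example, if slice currently has len=3 and cap=3 and we execute slice = append(slice, 1), growslice allocates a new array, copies the three existing elements into it, and returns a slice whose len is still 3. Following the code above for int elements on a 64-bit platform, the new cap is 6.

After growslice, the new slice holds a copy of the old data and its underlying array has enough free space for the appended data. All that remains is to copy the appended data into that free space and update len, which we will not go into further.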

Summary

Let's return to the example from the beginning of the article.

package main

func main() {
    s := []int{1, 2}
    s = append(s, 3, 4, 5)
    println(cap(s))
}

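The initial capacity of s is 2 and three elements are being appended, so append must trigger growth and call growslice with cap = 2+3 = 5. Doubling the original capacity gives doublecap = 2+2 = 4, which is less than cap, so the first stage produces a preliminary capacity of newcap = 5. In the second stage, on a 64-bit platform the size of int equals sys.PtrSize (8 bytes), so the slice needs 5 * 8 = 40 bytes, which roundupsize rounds up to capmem = 48 bytes. The new capacity is therefore newcap = 48 / 8 = 6, which explains the result.

When append finds that the underlying array has no room for the appended elements, the slice must grow. Growth does not extend the existing array in place. Instead, a new block of memory is allocated as the slice's underlying array, and the existing data and the appended data are copied into it.

Determining the new capacity is relatively complicated: it depends on the CPU word size, the element size, whether the element type contains pointers, the number of elements appended, and so on. After reading the growth logic in the source, it becomes clear that there is little point in fretting over the exact capacity it produces.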
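One way to see the dependence on element size is to repeat the opening example with different element types. This is a small sketch, not from the original article; capsAfterAppend is a hypothetical helper. No expected values are shown, because the capacities differ across Go versions and platforms.

```go
package main

import "fmt"

// capsAfterAppend appends three elements to two-element slices of
// different element types and returns the resulting capacities.
func capsAfterAppend() (byteCap, int32Cap, int64Cap int) {
	b := append([]byte{1, 2}, 3, 4, 5)
	i32 := append([]int32{1, 2}, 3, 4, 5)
	i64 := append([]int64{1, 2}, 3, 4, 5)
	return cap(b), cap(i32), cap(i64)
}

func main() {
	b, i32, i64 := capsAfterAppend()
	// All three slices have length 5; only the capacities may differ.
	fmt.Println("[]byte :", b)
	fmt.Println("[]int32:", i32)
	fmt.Println("[]int64:", i64)
}
```

The only property that holds everywhere is that each capacity is at least 5, which is a good reason not to write code that relies on the exact value.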

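In practice, if the range of the slice's capacity is known in advance, the better approach is to allocate enough capacity when the slice is created. Subsequent append calls then do not incur the cost of growth.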

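// Save in a file whose name ends in _test.go, for example append_test.go.
package main

import "testing"

// BenchmarkAppendFixCap appends into a slice with preallocated capacity.
func BenchmarkAppendFixCap(b *testing.B) {
    for i := 0; i < b.N; i++ {
        a := make([]int, 0, 1000)
        for j := 0; j < 1000; j++ {
            a = append(a, j)
        }
    }
}

// BenchmarkAppend appends into a slice that starts with zero capacity.
func BenchmarkAppend(b *testing.B) {
    for i := 0; i < b.N; i++ {
        a := make([]int, 0)
        for j := 0; j < 1000; j++ {
            a = append(a, j)
        }
    }
}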

The benchmark results are shown below, and the difference is clear at a glance.

$ go test -bench=. -benchmem

BenchmarkAppendFixCap-8          1953373               617 ns/op               0 B/op          0 allocs/op
BenchmarkAppend-8                 426882              2832 ns/op           16376 B/op         11 allocs/op
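The allocs/op column can be cross-checked without the benchmark harness by counting how often the capacity changes while appending. The sketch below uses two hypothetical helpers, countGrowths and preallocGrowths. The exact growth count depends on the Go version, so it is printed rather than assumed; on the same Go version as the benchmark it should roughly match the reported allocation count.

```go
package main

import "fmt"

// countGrowths appends n ints to an empty slice and counts how often
// the capacity changes, i.e. how often a new array is allocated.
func countGrowths(n int) (growths, finalCap int) {
	var s []int
	lastCap := cap(s)
	for i := 0; i < n; i++ {
		s = append(s, i)
		if cap(s) != lastCap {
			growths++
			lastCap = cap(s)
		}
	}
	return growths, cap(s)
}

// preallocGrowths does the same with capacity reserved up front.
func preallocGrowths(n int) int {
	s := make([]int, 0, n)
	growths := 0
	for i := 0; i < n; i++ {
		s = append(s, i)
		if cap(s) != n {
			growths++
		}
	}
	return growths
}

func main() {
	g, c := countGrowths(1000)
	fmt.Println("growths without preallocation:", g, "final cap:", c)
	fmt.Println("growths with preallocation:", preallocGrowths(1000)) // 0
}
```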

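Golang 技術分享 (Golang Tech Sharing)

Focused on sharing Go language knowledge

This article was AMP-transcoded by Readfog; copyright belongs to the original author.
Source: https://mp.weixin.qq.com/s/C8Dauv3xt_83Cg1p8wC9fA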
