A Deep Dive into the Structure of Golang Channels

Golang implements the CSP (Communicating Sequential Processes) model with goroutines and channels, and channels play a central role in communication and synchronization between goroutines.

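At GopherCon 2017, Go expert Kavya gave an in-depth talk on the internals of Go channels and on how the runtime scheduler and the memory management system support them. This article follows Kavya's slides to study and analyze how Go channels work, in the hope of offering some guidance for using Go concurrency correctly and efficiently.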

Let's start with a simple channel application: a task queue built from goroutines and a channel that processes multiple tasks in parallel.

func main() {
    // buffered channel
    ch := make(chan Task, 3)

    // start a fixed number of workers
    for i := 0; i < numWorkers; i++ {
        go worker(ch)
    }

    // send tasks to the workers
    hellaTasks := getTasks()

    for _, task := range hellaTasks {
        ch <- task
    }

    ...
}

func worker(ch chan Task) {
    for {
        // receive a task
        task := <-ch
        process(task)
    }
}

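As the code above shows, Go's goroutines and channels make it easy to implement a producer-consumer task queue, and the result is much more concise than the equivalent in Java or C++. Channels provide the following four properties out of the box: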

goroutine-safe
store and pass values between goroutines
provide FIFO semantics (for buffered channels)
can cause goroutines to block and unblock
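
The task queue above elides several pieces, so here is a minimal runnable sketch of the same pattern. The Task type, the runPool helper, and the squaring "work" are assumptions made for illustration; the WaitGroup and the close(ch) call are added only so the program can terminate cleanly.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical unit of work; the original example does not define it.
type Task struct{ N int }

// runPool starts numWorkers goroutines, feeds them tasks over a buffered
// channel, and returns the sum of the squares they compute.
func runPool(numWorkers int, tasks []Task) int {
	ch := make(chan Task, 3)
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		sum int
	)
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Blocks while ch is empty; exits once ch is closed and drained.
			for task := range ch {
				sq := task.N * task.N
				mu.Lock()
				sum += sq
				mu.Unlock()
			}
		}()
	}
	for _, t := range tasks {
		ch <- t // blocks while the buffer is full
	}
	close(ch)
	wg.Wait()
	return sum
}

func main() {
	tasks := []Task{{1}, {2}, {3}, {4}, {5}}
	fmt.Println(runPool(2, tasks)) // prints 55
}
```

Several workers receive from the same channel without any extra locking around it, which is the goroutine-safety property in action; the mutex here only protects the shared sum.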

So how do channels implement these properties? Let's look at what happens when we call make to create a channel.

make chan

The third line of the task queue example above uses make to create a buffered channel with a capacity of 3. Under the hood, a channel is an hchan struct, defined in src/runtime/chan.go as follows:

type hchan struct {
    qcount   uint           // total data in the queue
    dataqsiz uint           // size of the circular queue
    buf      unsafe.Pointer // points to an array of dataqsiz elements
    elemsize uint16
    closed   uint32
    elemtype *_type // element type
    sendx    uint   // send index
    recvx    uint   // receive index
    recvq    waitq  // list of recv waiters
    sendq    waitq  // list of send waiters

    // lock protects all fields in hchan, as well as several
    // fields in sudogs blocked on this channel.
    //
    // Do not change another G's status while holding this lock
    // (in particular, do not ready a G), as this can deadlock
    // with stack shrinking.
    lock mutex
}

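When make creates a channel, it allocates a block of memory on the process's heap, initializes an hchan struct there, and returns a pointer to that memory. The ch variable we get back is therefore itself a pointer, so when it is passed between functions they all refer to the same channel.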
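
The pointer-like behavior can be observed from ordinary code. The sketch below is only an illustration; fill and sharedLen are hypothetical helper names. It copies a channel variable and passes it to a function, then checks that the original variable sees the values that were sent.

```go
package main

import "fmt"

// fill sends n values into ch. ch is passed by value, but that value refers
// to the same underlying channel the caller holds.
func fill(ch chan int, n int) {
	for i := 0; i < n; i++ {
		ch <- i
	}
}

// sharedLen returns how many items the caller sees buffered after fill has run.
func sharedLen(n int) int {
	ch := make(chan int, n)
	fill(ch, n)
	return len(ch)
}

func main() {
	ch := make(chan int, 3)
	alias := ch // copying the variable does not copy the channel
	fill(alias, 2)
	fmt.Println(len(ch), cap(ch), ch == alias) // prints 2 3 true
}
```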

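The hchan struct uses a circular queue to hold the data passed between goroutines (for a buffered channel), two lists to hold the goroutines waiting to send to and receive from the channel, and a mutex to keep operations on these structures safe.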

Sending and receiving

Sending to and receiving from a channel mainly involves four fields of hchan: buf, sendx, recvx, and lock. The walkthrough below follows the diagrams in Kavya's slides.

We continue with the task queue example from earlier:

//G1
func main(){
    ...

    for _, task := range hellaTasks {
        ch <- task    //sender
    }

    ...
}

//G2
func worker(ch chan Task){
    for {
       // receive a task
       task := <- ch  //receiver
       process(task)
    }
}

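Here G1 is the sender and G2 is the receiver. Because ch is a buffered channel with a capacity of 3, the buf of the hchan struct is initially empty and sendx and recvx are both 0. When G1 sends to ch, it first acquires the channel's lock (the lock field of hchan), copies the value being sent into buf, increments sendx, and then releases the lock. When G2 later receives, it acquires the same lock, copies the value from buf into the memory behind the task variable, increments recvx, and releases the lock. Throughout this process G1 and G2 share no memory other than the hchan itself; they communicate by copying values through its buf, and that is how they end up sharing data. This matches the CSP design philosophy exactly: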

Do not communicate by sharing memory; instead, share memory by communicating.
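
The lock, copy, advance-index, unlock sequence described above can be modeled in a few lines. This is a simplified toy, not the runtime's implementation: toyChan, trySend, and tryRecv are hypothetical names, the element type is fixed to int, and instead of blocking the operations just report whether they could proceed.

```go
package main

import (
	"fmt"
	"sync"
)

// toyChan is a simplified model of the buffered part of hchan.
// It is NOT the runtime's implementation: it never blocks, it only
// reports whether the operation could proceed.
type toyChan struct {
	lock   sync.Mutex
	buf    []int // circular queue
	qcount int   // number of elements currently stored
	sendx  int   // index of the next slot to write
	recvx  int   // index of the next slot to read
}

func newToyChan(size int) *toyChan {
	return &toyChan{buf: make([]int, size)}
}

// trySend copies v into buf and advances sendx; it returns false if buf is full.
func (c *toyChan) trySend(v int) bool {
	c.lock.Lock()
	defer c.lock.Unlock()
	if c.qcount == len(c.buf) {
		return false
	}
	c.buf[c.sendx] = v
	c.sendx = (c.sendx + 1) % len(c.buf)
	c.qcount++
	return true
}

// tryRecv copies the oldest element out of buf and advances recvx;
// it returns false if buf is empty.
func (c *toyChan) tryRecv() (int, bool) {
	c.lock.Lock()
	defer c.lock.Unlock()
	if c.qcount == 0 {
		return 0, false
	}
	v := c.buf[c.recvx]
	c.recvx = (c.recvx + 1) % len(c.buf)
	c.qcount--
	return v, true
}

func main() {
	c := newToyChan(3)
	for i := 1; i <= 4; i++ {
		fmt.Println(c.trySend(i)) // true, true, true, then false once buf is full
	}
	v, ok := c.tryRecv()
	fmt.Println(v, ok) // prints 1 true
}
```

The failed fourth send is exactly the point where a real channel would park the sending goroutine instead of returning false.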

In practice G2 often consumes more slowly than G1 produces, so buf gradually fills up. Once buf is full, the next send from G1 to ch blocks. So what actually happens when it blocks?

Goroutine Pause/Resume

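A goroutine is a lightweight user-space thread implemented by Go and scheduled by the runtime scheduler. Many goroutines are multiplexed onto a smaller number of operating system threads (M:N scheduling). The data structures involved are as follows: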

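M is an operating system thread, G is a goroutine started by the user, and P is a scheduling context. An M must hold a P in order to run goroutines, and each P maintains a queue of runnable goroutines for its thread to execute.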

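When G1 sends to ch while its buf is already full, the runtime detects that the hchan's buf is full and calls into the scheduler. The scheduler sets G1's state to waiting, detaches it from its thread M, and picks another goroutine from P's run queue to execute on M. G1 is now blocked, but the operating system thread is not, so the blocked goroutine costs very few resources.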

So where does G1 go once it is set to waiting, and how is it resumed? Going back to the hchan struct, notice the sendq field, whose type is waitq. The source looks like this:

type hchan struct {
    ...
    recvq waitq // list of recv waiters
    sendq waitq // list of send waiters
    ...
}

type waitq struct {
    first *sudog
    last  *sudog
}

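In fact, when G1 is about to go into the waiting state it creates a sudog struct that represents itself and puts it on the sendq list. The sudog holds a pointer to the waiting goroutine and, in its elem field, a pointer to the channel-related variable: for a sender this is the address of the value waiting to be sent, and for a receiver it is the address of the variable that will receive the data. It stores an address because, as mentioned earlier, data is transferred by copying.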
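
To make the shape of the wait list concrete, here is a toy model of a waitq holding sudog-like entries. It is a sketch under stated assumptions: toySudog and toyWaitq are hypothetical names, the goroutine is represented by a plain string label rather than a runtime pointer, and the list is singly linked for brevity.

```go
package main

import "fmt"

// toySudog is a simplified stand-in for the runtime's sudog: it records which
// goroutine is waiting (here just a name) and where its value lives.
type toySudog struct {
	g    string // hypothetical goroutine label; the runtime stores a pointer to the G
	elem *int   // address of the value to send, or of the variable to receive into
	next *toySudog
}

// toyWaitq mirrors the first/last shape of waitq as a FIFO linked list.
type toyWaitq struct {
	first *toySudog
	last  *toySudog
}

// enqueue appends s to the tail of the list.
func (q *toyWaitq) enqueue(s *toySudog) {
	s.next = nil
	if q.last == nil {
		q.first, q.last = s, s
		return
	}
	q.last.next = s
	q.last = s
}

// dequeue removes and returns the head of the list, or nil if it is empty.
func (q *toyWaitq) dequeue() *toySudog {
	s := q.first
	if s == nil {
		return nil
	}
	q.first = s.next
	if q.first == nil {
		q.last = nil
	}
	s.next = nil
	return s
}

func main() {
	var sendq toyWaitq
	pending := 42
	sendq.enqueue(&toySudog{g: "G1", elem: &pending})

	s := sendq.dequeue()
	fmt.Println(s.g, *s.elem) // prints G1 42
}
```

Because the entry stores an address rather than a value, whoever dequeues it can copy the data directly from or into the waiting goroutine's variable.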

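When G2 receives a value from ch, it takes the oldest element out of buf, dequeues G1's sudog from sendq, and copies G1's pending value into the freed slot of buf. It then asks the scheduler to set G1's state to runnable, and G1 is added to P's run queue, where it waits for a thread to execute it.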
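
The pause and resume of a sender can be observed from user code, even though the scheduler internals cannot. The following sketch is an illustration only: senderBlocksUntilReceive is a hypothetical helper, and the 50 ms and 1 s timeouts are arbitrary values chosen to detect "still blocked" and "resumed".

```go
package main

import (
	"fmt"
	"time"
)

// senderBlocksUntilReceive fills a channel of capacity 1, starts a second
// send in another goroutine, and reports whether that send was still pending
// before a receive and whether it completed after one.
func senderBlocksUntilReceive() (blockedBefore, doneAfter bool) {
	ch := make(chan int, 1)
	ch <- 1 // buffer is now full

	done := make(chan struct{})
	go func() {
		ch <- 2 // cannot complete until a slot frees up
		close(done)
	}()

	select {
	case <-done:
		blockedBefore = false
	case <-time.After(50 * time.Millisecond):
		blockedBefore = true // the send has not completed
	}

	<-ch // frees a slot, which lets the pending send finish

	select {
	case <-done:
		doneAfter = true
	case <-time.After(time.Second):
		doneAfter = false
	}
	return blockedBefore, doneAfter
}

func main() {
	fmt.Println(senderBlocksUntilReceive()) // prints true true
}
```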

wait empty channel

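So far we assumed that G1 runs first. What if G2 runs first? In that case G2 tries to receive from an empty channel and blocks. Just like G1 above, G2 creates a sudog struct holding the address of the variable that will receive the data, but this sudog is placed on the recvq list. When G1 then sends to ch, the runtime does not go through the buf of the hchan struct at all. Instead it copies the value G1 is sending directly into the memory pointed to by elem in G2's sudog, so G2 does not have to take the lock and read buf again after it is woken up.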
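
The receiver-first case looks like this from user code. This is a small sketch with a hypothetical helper, receiveFirst. Note that the direct copy itself is not observable here, and whether the receiver has actually parked before the send happens depends on scheduling; the example only shows that the value arrives and nothing is left in the buffer.

```go
package main

import "fmt"

// receiveFirst starts the receiver before anything is sent, then sends one
// value and returns what the receiver got plus the buffer length afterwards.
func receiveFirst(v int) (got int, buffered int) {
	ch := make(chan int, 3)
	result := make(chan int)

	go func() {
		result <- <-ch // waits on the empty ch until a value arrives
	}()

	ch <- v
	got = <-result // the receiver has taken the value by the time this returns
	return got, len(ch)
}

func main() {
	fmt.Println(receiveFirst(7)) // prints 7 0
}
```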

Summary

One of Go's defining features is its simple, efficient, built-in concurrency mechanism, which implements the CSP model with goroutines and channels.

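Understanding how channels work under the hood is a great help when writing concurrent programs in Go. Having gone through Kavya's talk alongside the relevant Go runtime source (which is open source and itself written in Go, a real gift), I now understand channels much better. Some questions remain, for example about how goroutine scheduling is implemented, and for those the best teacher is still the source code.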

Reposted from:

zhuanlan.zhihu.com/p/27917262

This article was AMP-transcoded by Readfog; copyright belongs to the original author.
Source: https://mp.weixin.qq.com/s/7BL9RsnW5fSEcgR3PxJr3A