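Building a Front-End Monitoring System from 0 to 1 (Source Code Included)

For most of my career so far, front-end code has been running naked, with no monitoring at all, and that is why I wrote this article. Besides, this is the age of data: without data there is often no feedback and no next step, and therefore no progress.

A complete front-end monitoring platform has three parts: data collection and reporting, data processing and storage, and data presentation. This article covers only collection and reporting.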

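1. Starting from 0

Names matter. A good name is easy for users to remember, gives them an odd sense of pride, and spreads more easily; if it also happens to fit some grander idea, things fall into place all the more naturally. The monitoring SDK in this article is called 四維 ("four dimensions"), suggesting a God's-eye view. The full name is four-dimension, abbreviated FD.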

class FourDimension {
    constructor() {
        this.init()
    }

    // Initialization
    init() {

    }
}

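This defines a FourDimension class. Its constructor takes no parameters and simply calls init, the only method for now. init is where the collection of performance, error, and behavior data will be set up.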

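2. Reporting data

The mature approach in the industry is to report through a 1x1-pixel gif image, and this article uses it as well. navigator.sendBeacon can also be used: it is a browser API for sending small amounts of data to a server. It has the following advantages:

Asynchronous and non-blocking: navigator.sendBeacon is asynchronous and does not block other browser work. This matters for performance monitoring, because nobody wants the monitoring itself to slow the page down.
Still delivers data while the page unloads: when the user leaves the page (closing it or navigating elsewhere), navigator.sendBeacon can still send data. This is useful for capturing and reporting the last pieces of data before the page goes away.
Low priority: requests sent by navigator.sendBeacon are low priority and do not compete with the page's other network requests.
Easy to use: the API is minimal; a report URL and the data are all it needs.

navigator.sendBeacon also has limitations. It can only send POST requests, not GET requests. It also gives no access to the server's response: the return value is only a boolean that says whether the browser accepted the data for sending.

Finally, some older browsers may not support navigator.sendBeacon, so a compatibility fallback is needed when using it.

The approach here is to prefer navigator.sendBeacon, fall back to the 1x1-pixel gif image, and use xhr where the situation calls for it.

Create report.js and add the upload functions: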

import { isSupportSendBeacon } from './util'

// Fall back to an image beacon when the browser does not support sendBeacon
const sendBeacon = (function () {
    if (isSupportSendBeacon()) {
        return window.navigator.sendBeacon.bind(window.navigator)
    }
    const reportImageBeacon = function (url, data) {
        reportImage(url, data)
    }
    return reportImageBeacon
})()

export function reportImage(url, data) {
    // Accept either an object or an already serialized JSON string
    const payload = typeof data === 'string' ? data : JSON.stringify(data)
    const img = new Image()
    img.src = url + '?reportData=' + encodeURIComponent(payload)
}
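
The isSupportSendBeacon helper imported from util is not shown in this article. A minimal sketch of what it might look like follows; the optional nav parameter is my own addition so the check can be exercised outside a browser, and the real helper in the repository may differ.

```javascript
// Hypothetical sketch of the util helper; the `nav` parameter is an assumption
// added for testability and defaults to the browser's navigator object.
function isSupportSendBeacon(nav = globalThis.navigator) {
    return Boolean(nav) && typeof nav.sendBeacon === 'function'
}

// Usage: decide which transport to use
const fakeModernNavigator = { sendBeacon: () => true }
const fakeLegacyNavigator = {}
console.log(isSupportSendBeacon(fakeModernNavigator))
console.log(isSupportSendBeacon(fakeLegacyNavigator))
```

Feature detection on the function itself, rather than on the browser version, keeps the fallback decision independent of user-agent strings.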

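3. When to report

Following other articles on the topic, the timing is chosen to affect the current page as little as possible.

There are three options:

Deferred reporting with requestIdleCallback/setTimeout.

Reporting in the beforeunload callback.

Caching report data and sending it once a certain amount has accumulated.

Combining all three:

Cache report data first; once enough has accumulated, send it in a deferred manner via requestIdleCallback/setTimeout. When the page is left, send whatever has not been reported yet in one go.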
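
The combined strategy can be sketched as a small batcher. This is only an illustration under my own assumptions: createBatcher, maxSize, delay, and send are hypothetical names that do not appear in the SDK, and the SDK's own files split the same idea across cache.js and report.js.

```javascript
// Hypothetical helper: queues items and flushes them by size, after a quiet
// period, or on demand (for example when the page is being left).
function createBatcher({ send, maxSize = 10, delay = 3000 }) {
    let queue = []
    let timer = null

    function flush() {
        clearTimeout(timer)
        timer = null
        if (queue.length === 0) return
        const batch = queue
        queue = []
        send(batch)
    }

    function add(item) {
        queue.push(item)
        if (queue.length >= maxSize) {
            flush() // size threshold reached: send right away
            return
        }
        clearTimeout(timer)
        timer = setTimeout(flush, delay) // otherwise wait for a quiet period
    }

    return { add, flush }
}

// Usage with a stand-in send function
const sentBatches = []
const batcher = createBatcher({ send: batch => sentBatches.push(batch), maxSize: 2 })
batcher.add({ subType: 'pv' })
batcher.add({ subType: 'click' }) // second item reaches maxSize and triggers a flush
```

In a browser, flush would additionally be registered on a page-leave event so that the remaining items are sent before the page goes away.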

Create the cache file cache.js:

import { deepCopy } from './util'

const cache = []

export function getCache() {
    return deepCopy(cache)
}

export function addCache(data) {
    cache.push(data)
}

export function clearCache() {
    cache.length = 0
}

deepCopy is defined as:

export function deepCopy(target) {
    // typeof null is also 'object', so null has to be excluded explicitly
    if (typeof target === 'object' && target !== null) {
        const result = Array.isArray(target) ? [] : {}
        for (const key in target) {
            // Recursion returns primitives unchanged and copies nested objects
            result[key] = deepCopy(target[key])
        }

        return result
    }

    return target
}

Update the upload file report.js:

import { addCache, getCache, clearCache } from './cache'
import config from '../config'
import {isSupportSendBeacon, generateUniqueID} from './util'

const sendBeacon = (function(){
    if(isSupportSendBeacon()){
      return window.navigator.sendBeacon.bind(window.navigator)
    }
    const reportImageBeacon = function(url, data){
        reportImage(url, data)
    }
    return reportImageBeacon
})()

const sessionID = generateUniqueID()
export function report(data, isImmediate = false) {
    if (!config.reportUrl) {
        console.error('Please set the report url')
        return
    }

    const reportData = JSON.stringify({
        id: sessionID,
        appID: config.appID,
        userID: config.userID,
        data,
    })

    if (isImmediate) {
        sendBeacon(config.reportUrl, reportData)
        return
    }

    if (window.requestIdleCallback) {
        window.requestIdleCallback(() => {
            sendBeacon(config.reportUrl, reportData)
        }, { timeout: 3000 })
    } else {
        setTimeout(() => {
            sendBeacon(config.reportUrl, reportData)
        })
    }
}

let timer = null
export function lazyReportCache(data, timeout = 3000) {
    addCache(data)

    clearTimeout(timer)
    timer = setTimeout(() => {
        const data = getCache()
        if (data.length) {
            report(data)
            clearCache()
        }
    }, timeout)
}

export function reportWithXHR(data) {
    // 1. Create the xhr object
    const xhr = new XMLHttpRequest()
    // 2. Open the request
    xhr.open('POST', config.reportUrl)
    // 3. Send the payload
    xhr.send(JSON.stringify(data))
}

export function reportImage(url, data) {
    // report() passes an already serialized string; do not stringify it twice
    const payload = typeof data === 'string' ? data : JSON.stringify(data)
    const img = new Image()
    img.src = url + '?reportData=' + encodeURIComponent(payload)
}
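
report.js also imports generateUniqueID from util, which is not shown here. The following is a minimal sketch of one way to produce such a session id, assuming that uniqueness within a monitoring session is all that is required; the fd- prefix and the fallback format are my own choices.

```javascript
// Hypothetical sketch of generateUniqueID. Prefers crypto.randomUUID where the
// runtime provides it and falls back to a time + random string otherwise.
function generateUniqueID() {
    const cryptoObj = globalThis.crypto
    if (cryptoObj && typeof cryptoObj.randomUUID === 'function') {
        return cryptoObj.randomUUID()
    }
    const timePart = Date.now().toString(36)
    const randomPart = Math.random().toString(36).slice(2, 10)
    return `fd-${timePart}-${randomPart}`
}

// Usage: one id per page session, created when report.js is first loaded
const exampleSessionID = generateUniqueID()
```

The fallback is not cryptographically strong; it is only meant to keep sessions apart on the server side.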

The config.js file:

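const config = {
    reportUrl: 'http://localhost:8000/reportData',
    projectName: 'fd-example',
    // report.js reads these two fields; declaring them lets setConfig fill them in
    appID: '',
    userID: '',
}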

export function setConfig(options) {
    for (const key in config) {
        if (options[key]) {
            config[key] = options[key]
        }
    }
}
export default config

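4. Collecting and reporting performance data

According to the original plan, performance monitoring needs to collect these metrics: FP, FCP, LCP, DOMContentLoaded, onload, resource load time, and API request time.

FP, FCP, LCP, and resource load time are collected through the browser's Performance API. For details see: Performance[1]

Collecting and reporting FP

FP (First Paint) is the point at which the browser starts painting the page. It includes any user-defined painting and marks the start of rendering text, images, SVG, and so on.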

import { getPageURL, isSupportPerformanceObserver } from '../utils/util'
import { lazyReportCache } from '../utils/report'

export default function observePaint() {
    if (!isSupportPerformanceObserver()) return

    const entryHandler = (list) => {
        for (const entry of list.getEntries()) {
            // The 'paint' type also delivers first-contentful-paint; only FP is reported here
            if (entry.name !== 'first-paint') continue

            observer.disconnect()

            const json = entry.toJSON()
            delete json.duration

            const reportData = {
                ...json,
                subType: entry.name,
                type: 'performance',
                pageURL: getPageURL(),
            }

            lazyReportCache(reportData)
        }
    }

    const observer = new PerformanceObserver(entryHandler)
    // buffered: true also delivers entries recorded before the observer was created,
    // so it does not matter if this code runs after the paint has already happened.
    observer.observe({ type: 'paint', buffered: true })
}

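observer.disconnect() in the code is a PerformanceObserver method that stops the observation and detaches the callback.

In fact,

observer.observe({ type: 'paint', buffered: true })

delivers two metrics: first-paint and first-contentful-paint.

Once observer.disconnect() is called, the PerformanceObserver stops observing and receives no further updates. The callback is detached as well, so it will not fire even if new entries are produced.

It is usually called when a metric is no longer needed, to avoid wasted resources.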
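
The two util helpers used by these collectors, isSupportPerformanceObserver and getPageURL, are not listed in this article. A minimal sketch under my own assumptions follows; the scope parameter is added only so the functions can be checked outside a browser, and the versions in the repository may be written differently.

```javascript
// Hypothetical sketches of the util helpers; `scope` defaults to the global object.
function isSupportPerformanceObserver(scope = globalThis) {
    return typeof scope.PerformanceObserver === 'function'
}

function getPageURL(scope = globalThis) {
    // location is missing outside a browser, so fall back to an empty string
    return scope.location ? scope.location.href : ''
}

// Usage with stand-in global objects
const fakeBrowser = {
    PerformanceObserver: function PerformanceObserver() {},
    location: { href: 'https://example.com/home' },
}
console.log(isSupportPerformanceObserver(fakeBrowser), getPageURL(fakeBrowser))
```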

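Collecting and reporting FCP

FCP (First Contentful Paint) is the moment the browser first paints DOM content such as text, images, or SVG.

FCP looks identical to FP, but there is a difference:

FCP (First Contentful Paint): the moment any text, image, non-blank canvas, or SVG is first rendered on the page. It is when the user first sees actual content, that is, when the page starts to show something meaningful.
FP (First Paint): the moment anything at all is first rendered on the page, including a background color, images, or text. It marks the start of visible output, not necessarily meaningful content.

In short, FCP is about the first meaningful content and FP is about the first visible output of any kind. FCP is closer to the load time the user perceives, because it marks when there is something to read. FP marks when rendering starts, whether or not the content is meaningful.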

import { getPageURL, isSupportPerformanceObserver } from '../utils/util'
import { lazyReportCache } from '../utils/report'

export default function observePaint() {
    if (!isSupportPerformanceObserver()) return

    const entryHandler = (list) => {
        for (const entry of list.getEntries()) {
            // The 'paint' type also delivers first-paint; only FCP is reported here
            if (entry.name !== 'first-contentful-paint') continue

            observer.disconnect()

            const json = entry.toJSON()
            delete json.duration

            const reportData = {
                ...json,
                subType: entry.name,
                type: 'performance',
                pageURL: getPageURL(),
            }

            lazyReportCache(reportData)
        }
    }

    const observer = new PerformanceObserver(entryHandler)
    // buffered: true also delivers entries recorded before the observer was created,
    // so it does not matter if this code runs after the paint has already happened.
    observer.observe({ type: 'paint', buffered: true })
}

Collecting and reporting LCP

LCP (Largest Contentful Paint) is the moment the largest image or text block in the viewport finishes rendering.

import { getPageURL, isSupportPerformanceObserver } from '../utils/util'
import { lazyReportCache } from '../utils/report'

export default function observeLCP() {
    if (!isSupportPerformanceObserver()) {
        return
    }
    
    const entryHandler = (list) => {

        if (observer) {
            observer.disconnect()
        }
        
        for (const entry of list.getEntries()) {
            const json = entry.toJSON()
            delete json.duration

            const reportData = {
                ...json,
                target: entry.element?.tagName,
                name: entry.entryType,
                subType: entry.entryType,
                type: 'performance',
                pageURL: getPageURL(),
            }
            
            lazyReportCache(reportData)
        }
    }

    const observer = new PerformanceObserver(entryHandler)
    observer.observe({ type: 'largest-contentful-paint', buffered: true })
}
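
One thing to keep in mind: the browser may emit several largest-contentful-paint entries as larger elements finish rendering, and it stops once the user interacts with the page. The code above disconnects after the first batch. An alternative, sketched here with hypothetical names (createLCPTracker, onEntries, finalize), is to remember only the newest candidate and report it once, for example when the page is hidden.

```javascript
// Hypothetical sketch: keep the newest LCP candidate and report it a single time.
function createLCPTracker(report) {
    let latest = null
    let done = false

    return {
        // Call from the PerformanceObserver callback with list.getEntries()
        onEntries(entries) {
            if (done) return
            for (const entry of entries) {
                latest = entry // later candidates supersede earlier ones
            }
        },
        // Call once, e.g. when the page is hidden or the user first interacts
        finalize() {
            if (done || !latest) return
            done = true
            report({
                type: 'performance',
                subType: 'largest-contentful-paint',
                startTime: latest.startTime,
                size: latest.size,
            })
        },
    }
}

// Usage with plain objects standing in for LargestContentfulPaint entries
const lcpReports = []
const tracker = createLCPTracker(data => lcpReports.push(data))
tracker.onEntries([{ startTime: 120, size: 900 }])
tracker.onEntries([{ startTime: 480, size: 5200 }])
tracker.finalize()
```

The trade-off is that the value is reported later, so it has to go out through a transport that still works while the page is being left.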

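Collecting and reporting DOMContentLoaded

DOMContentLoaded: this event fires once the HTML document has been fully loaded and parsed, without waiting for stylesheets, images, and subframes to finish loading.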

import { lazyReportCache } from '../utils/report'

export default function observerLoad() {
    ['DOMContentLoaded'].forEach(type => onEvent(type))
}

function onEvent(type) {
    function callback() {
        lazyReportCache({
            type: 'performance',
            subType: type.toLocaleLowerCase(),
            startTime: performance.now(),
        })

        window.removeEventListener(type, callback, true)
    }

    window.addEventListener(type, callback, true)
}

Collecting and reporting onload

onload: the moment all resources that must load immediately (such as images and stylesheets) have finished loading.

import { lazyReportCache } from '../utils/report'

export default function observerLoad() {
    ['load'].forEach(type => onEvent(type))
}

function onEvent(type) {
    function callback() {
        lazyReportCache({
            type: 'performance',
            subType: type.toLocaleLowerCase(),
            startTime: performance.now(),
        })

        window.removeEventListener(type, callback, true)
    }

    window.addEventListener(type, callback, true)
}
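
Both listeners only work if they are registered before the event fires. When the SDK is loaded late, the same two timestamps can instead be read from the Navigation Timing entry. The sketch below is an assumption-laden alternative, not part of the SDK: navigationMetrics is a hypothetical name, and it takes the entry as a parameter so it can be tried with a plain object.

```javascript
// Hypothetical sketch: derive both metrics from a PerformanceNavigationTiming-like entry.
// Fields that are still 0 mean the corresponding event has not happened yet.
function navigationMetrics(entry) {
    const metrics = []
    if (entry.domContentLoadedEventStart > 0) {
        metrics.push({
            type: 'performance',
            subType: 'domcontentloaded',
            startTime: entry.domContentLoadedEventStart,
        })
    }
    if (entry.loadEventStart > 0) {
        metrics.push({
            type: 'performance',
            subType: 'load',
            startTime: entry.loadEventStart,
        })
    }
    return metrics
}

// Usage with a stand-in entry; in a browser the entry would come from
// performance.getEntriesByType('navigation')[0]
const navMetrics = navigationMetrics({ domContentLoadedEventStart: 312.4, loadEventStart: 890.1 })
```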

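Collecting and reporting resource load time

Resource load time is collected with:

observer.observe({ type: 'resource', buffered: true })

What exactly is resource load time? It should be entry.duration in the code below. I find writing a monitoring SDK worthwhile: it is a way to learn the browser model in more depth and to understand how the browser treats the various resources of an HTML document.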

import { executeAfterLoad, isSupportPerformanceObserver } from '../utils/util'
import { lazyReportCache } from '../utils/report'

export default function observeEntries() {
    executeAfterLoad(() => {
        observeEvent('resource')
    })
}

export function observeEvent(entryType) {
    function entryHandler(list) {
        // Everything loaded so far arrives in one buffered batch; stop observing after it
        if (observer) {
            observer.disconnect()
        }

        for (const entry of list.getEntries()) {
            lazyReportCache({
                name: entry.name, // resource URL
                subType: entryType,
                type: 'performance',
                sourceType: entry.initiatorType, // what initiated the request
                duration: entry.duration, // total load time
                dns: entry.domainLookupEnd - entry.domainLookupStart, // DNS lookup time
                tcp: entry.connectEnd - entry.connectStart, // TCP connection time
                redirect: entry.redirectEnd - entry.redirectStart, // redirect time
                ttfb: entry.responseStart - entry.requestStart, // time to first byte
                protocol: entry.nextHopProtocol, // network protocol
                responseBodySize: entry.encodedBodySize, // response body size
                responseHeaderSize: entry.transferSize - entry.encodedBodySize, // response header size
                resourceSize: entry.decodedBodySize, // size after decompression
                startTime: entry.startTime, // when the fetch of this resource started
            })
        }
    }

    let observer
    if (isSupportPerformanceObserver()) {
        observer = new PerformanceObserver(entryHandler)
        observer.observe({ type: entryType, buffered: true })
    }
}
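
executeAfterLoad comes from util and is not listed in this article. A minimal sketch of the behavior its name suggests: run the callback right away if the page has already loaded, otherwise wait for the load event. The doc and win parameters are assumptions added for testability.

```javascript
// Hypothetical sketch of executeAfterLoad; `doc` and `win` default to the browser globals.
function executeAfterLoad(callback, doc = globalThis.document, win = globalThis.window) {
    if (doc.readyState === 'complete') {
        callback()
        return
    }

    const onLoad = () => {
        win.removeEventListener('load', onLoad, true)
        callback()
    }
    win.addEventListener('load', onLoad, true)
}

// Usage with stand-in objects: the page is already loaded, so the callback runs immediately
let ranImmediately = false
executeAfterLoad(() => { ranImmediately = true }, { readyState: 'complete' }, {})
```

Checking readyState first matters when the SDK is loaded asynchronously, because the load event may already have fired by the time this code runs.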

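Collecting and reporting API request time

This is done by overriding methods of the native xhr object and intercepting them to measure and report request time.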

import { originalOpen, originalSend, originalProto } from '../utils/xhr'
import { lazyReportCache } from '../utils/report'

function overwriteOpenAndSend() {
    originalProto.open = function newOpen(...args) {
        this.url = args[1]
        this.method = args[0]
        originalOpen.apply(this, args)
    }

    originalProto.send = function newSend(...args) {
        this.startTime = Date.now()

        const onLoadend = () => {
            this.endTime = Date.now()
            this.duration = this.endTime - this.startTime

            const { status, duration, startTime, endTime, url, method } = this
            const reportData = {
                status,
                duration,
                startTime,
                endTime,
                url,
                method: (method || 'GET').toUpperCase(),
                success: status >= 200 && status < 300,
                subType: 'xhr',
                type: 'performance',
            }

            lazyReportCache(reportData)
            
            this.removeEventListener('loadend', onLoadend, true)
        }

        this.addEventListener('loadend', onLoadend, true)
        originalSend.apply(this, args)
    }
}

export default function xhr() {
    overwriteOpenAndSend()
}
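
Requests made with fetch do not go through XMLHttpRequest, so the override above does not see them. The same interception idea can be applied to fetch. The sketch below is my own illustration rather than part of the SDK: wrapFetch and its parameters are hypothetical, and it reports the same fields as the xhr version with subType set to 'fetch'.

```javascript
// Hypothetical sketch: wrap a fetch-like function and report timing for each call.
function wrapFetch(originalFetch, report, now = Date.now) {
    return function monitoredFetch(input, init) {
        const startTime = now()
        const url = typeof input === 'string' ? input : (input.url || String(input))
        const rawMethod = (init && init.method) || (typeof input === 'object' && input.method) || 'GET'
        const method = String(rawMethod).toUpperCase()

        const finish = (status, success) => {
            const endTime = now()
            report({
                status,
                success,
                url,
                method,
                startTime,
                endTime,
                duration: endTime - startTime,
                subType: 'fetch',
                type: 'performance',
            })
        }

        return originalFetch(input, init).then(
            (response) => {
                finish(response.status, response.ok)
                return response
            },
            (error) => {
                finish(0, false) // network failure: no HTTP status is available
                throw error
            },
        )
    }
}

// Usage with a stand-in fetch; in a browser this would wrap window.fetch
const fetchReports = []
const fakeFetch = () => Promise.resolve({ status: 200, ok: true })
const monitoredFetch = wrapFetch(fakeFetch, data => fetchReports.push(data))
const pendingRequest = monitoredFetch('/api/users', { method: 'post' })
```

The wrapper passes the response and any rejection through unchanged, so calling code should not notice the difference.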

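5. Collecting and reporting error data

According to the original plan, resource load errors, js errors, and promise errors need to be collected.

Collecting and reporting resource load errors

Load errors for JavaScript, CSS, and images are collected by listening for errors with window.addEventListener.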

import { lazyReportCache } from '../utils/report'
import { getPageURL } from '../utils/util'

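export default function error() {

    // Capture resource load failures (js, css, img...). These error events do not
    // bubble, so the listener has to be registered in the capture phase.
    window.addEventListener('error', e => {
        const target = e.target
        if (!target) return

        if (target.src || target.href) {
            const url = target.src || target.href
            // e.path is non-standard; prefer the standard composedPath()
            const path = e.composedPath ? e.composedPath() : e.path || []
            lazyReportCache({
                url,
                type: 'error',
                subType: 'resource',
                startTime: e.timeStamp,
                html: target.outerHTML,
                resourceType: target.tagName,
                paths: path.map(item => item.tagName).filter(Boolean),
                pageURL: getPageURL(),
            })
        }
    }, true)
}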

Collecting and reporting js errors

JavaScript errors can be collected with window.onerror or window.addEventListener('error', callback).

import { lazyReportCache } from '../utils/report'
import { getPageURL } from '../utils/util'

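export default function error() {

    // Listen for js errors
    window.onerror = (msg, url, line, column, error) => {
        lazyReportCache({
            msg,
            line,
            column,
            // error can be null, e.g. for errors thrown by cross-origin scripts
            error: error?.stack,
            subType: 'js',
            url, // URL of the script that raised the error
            pageURL: getPageURL(),
            type: 'error',
            startTime: performance.now(),
        })
    }

}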

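Note that window.onerror cannot capture resource load errors, so it can be used on its own to listen for js errors.

Collecting and reporting promise errors

Promise errors can be collected with window.addEventListener('unhandledrejection', callback).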

import { lazyReportCache } from '../utils/report'
import { getPageURL } from '../utils/util'

export default function error() {

    // Listen for promise errors; the drawback is that no column information is available
    window.addEventListener('unhandledrejection', e => {
        lazyReportCache({
            reason: e.reason?.stack,
            subType: 'promise',
            type: 'error',
            startTime: e.timeStamp,
            pageURL: getPageURL(),
        })
    })

}
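
One gap in the handler above: e.reason is whatever value the promise was rejected with, so e.reason?.stack is undefined when code rejects with a string or a plain object. A small normalizing helper can keep those cases readable. This is a sketch under my own naming (describeReason is hypothetical and not part of the SDK).

```javascript
// Hypothetical sketch: turn any rejection reason into a reportable { message, stack } pair.
function describeReason(reason) {
    if (reason instanceof Error) {
        return { message: reason.message, stack: reason.stack }
    }
    if (typeof reason === 'string') {
        return { message: reason, stack: undefined }
    }
    try {
        const text = JSON.stringify(reason)
        // JSON.stringify returns undefined for values such as undefined or functions
        return { message: text === undefined ? String(reason) : text, stack: undefined }
    } catch (err) {
        // Circular structures cannot be serialized
        return { message: String(reason), stack: undefined }
    }
}

// Usage with the kinds of values a promise may be rejected with
const fromError = describeReason(new Error('request failed'))
const fromString = describeReason('timeout')
const fromObject = describeReason({ code: 42 })
```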

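To reduce interference with the code in the html file, error collection can add a cache proxy; see ByteDance's front-end monitoring practice [2] for details.

6. Collecting and reporting behavior data

According to the original plan, behavior data covers pv, uv, page stay duration, and user clicks.

Collecting and reporting pv and uv

To collect pv (Page View) and uv (Unique Visitor) data, a request is sent to the server on every page load and the counting is done on the server.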

import { lazyReportCache } from '../utils/report'
import getUUID from './getUUID'
import { getPageURL } from '../utils/util'

export default function pv() {
    lazyReportCache({
        type: 'behavior',
        subType: 'pv',
        startTime: performance.now(),
        pageURL: getPageURL(),
        referrer: document.referrer,
        uuid: getUUID(),
    })
}

Only pv data is collected here; uv has to be computed on the server.
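
For the server to count uv, each visitor needs a stable id, which is what getUUID supplies. Its source is not shown in this article, so the following is only a sketch of one plausible implementation: the storage key name, the id format, and the explicit storage parameter are all my own assumptions.

```javascript
// Hypothetical sketch of getUUID: a per-browser visitor id persisted in storage.
const UUID_STORAGE_KEY = 'fd-uuid' // assumed key name
let memoryUUID = null

function createVisitorID() {
    return `${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 10)}`
}

function getUUID(storage) {
    try {
        const saved = storage.getItem(UUID_STORAGE_KEY)
        if (saved) return saved
        const id = createVisitorID()
        storage.setItem(UUID_STORAGE_KEY, id)
        return id
    } catch (err) {
        // Storage missing or blocked: keep the id in memory for this page only
        if (!memoryUUID) memoryUUID = createVisitorID()
        return memoryUUID
    }
}

// Usage with a Map-backed stand-in; in a browser, pass window.localStorage
const backing = new Map()
const fakeStorage = {
    getItem: key => (backing.has(key) ? backing.get(key) : null),
    setItem: (key, value) => { backing.set(key, String(value)) },
}
const firstVisitID = getUUID(fakeStorage)
const secondVisitID = getUUID(fakeStorage)
```

An id stored this way identifies a browser rather than a person, so the uv figure derived from it is an approximation.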

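Reporting page stay duration

To collect the stay duration, record a start time when the page loads and an end time when it unloads; the difference is the stay duration. This logic can live in the beforeunload event.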

import { report } from '../utils/report'
import { onBeforeunload, getPageURL } from '../utils/util'
import getUUID from './getUUID'

export default function pageAccessDuration() {
    onBeforeunload(() => {
        report({
            type: 'behavior',
            subType: 'page-access-duration',
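            // performance.now() counts from the start of this page load,
            // so at unload time it equals the stay duration in milliseconds
            startTime: performance.now(),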
            pageURL: getPageURL(),
            uuid: getUUID(),
        }, true)
    })
}

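Reporting user clicks

User clicks are collected with addEventListener. The listener is attached to window and relies on event bubbling; the code below listens for mousedown and touchstart rather than click.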

import { lazyReportCache } from '../utils/report'
import { getPageURL } from '../utils/util'
import getUUID from './getUUID'

export default function onClick() {
    ['mousedown', 'touchstart'].forEach(eventType => {
        let timer
        window.addEventListener(eventType, event => {
            // event.path is non-standard, and composedPath() returns an empty array
            // once dispatch has finished, so read the path before the timeout
            const path = event.composedPath ? event.composedPath() : event.path || []

            clearTimeout(timer)
            timer = setTimeout(() => {
                const target = event.target
                const { top, left } = target.getBoundingClientRect()

                lazyReportCache({
                    top,
                    left,
                    eventType,
                    pageHeight: document.documentElement.scrollHeight || document.body.scrollHeight,
                    scrollTop: document.documentElement.scrollTop || document.body.scrollTop,
                    type: 'behavior',
                    subType: 'click',
                    target: target.tagName,
                    paths: path.map(item => item.tagName).filter(Boolean),
                    startTime: event.timeStamp,
                    pageURL: getPageURL(),
                    outerHTML: target.outerHTML,
                    innerHTML: target.innerHTML,
                    width: target.offsetWidth,
                    height: target.offsetHeight,
                    viewport: {
                        width: window.innerWidth,
                        height: window.innerHeight,
                    },
                    uuid: getUUID(),
                })
            }, 500)
        })
    })
}

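7. Completing the FourDimension monitoring class

The collection functions exported by the performance, error, and behavior entry files are initialized inside the init method of the FourDimension class.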

import performance from './performance/index'
import behavior from './behavior/index'
import error from './error/index'

class FourDimension {
    constructor() {
        this.init()
    }

    // Initialization
    init() {
        performance()
        error()
        behavior()
    }
}

// The constructor already calls init(); calling it again would register every collector twice
new FourDimension()

In real use, the SDK is loaded asynchronously.

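Summary

If there is no concrete data to prove that a strategy is the best one, choose the one that is best in theory. That is one of the premises behind this article, since I have no real-world data to validate it.

The other premise is to imitate first: understand someone else's work, then work out my own line of thought. Writing is also a way of pushing myself to learn. This article draws heavily on "An analysis of key technical points and principles of front-end monitoring SDKs" [3]. As I wrote, I found that the crux is how data is collected and reported; the concrete data model and the reporting approach need to be studied and iterated in real scenarios. 😭

One thing I gained from writing this article, as mentioned earlier, is a deeper understanding of how the browser treats the various resources of an HTML document. Front-end development, or development in general, is always dealing with logic and data, but I don't think it often touches quantitative metrics of the underlying machinery. Writing monitoring forces you to explore those metrics, which makes it a worthwhile direction to dig into.

The other point was made at the start: without feedback there is no progress, and monitoring is feedback.

I also think that if this kind of monitoring were visualized, it would be like surveillance of a person: even with no alarms, it would still record all sorts of behavior of the monitored subject. That would be very interesting. Even with no errors there would be something to look at.

Code: github.com/zhensg123/r…[4]

End of article.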

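Further reading

An analysis of key technical points and principles of front-end monitoring SDKs [5]

A thorough explanation of self-built front-end error monitoring [6]

ByteDance's front-end monitoring practice [7]

References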

[1]

https://developer.mozilla.org/zh-CN/docs/Web/API/Performance

[2]

https://juejin.cn/post/7195496297150709821#heading-17

[3]

https://juejin.cn/post/7017974567943536671

[4]

https://github.com/zhensg123/rareRecord/tree/main/fourDemension

[5]