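Rust Macros

About Compilation

What is a "Rust token"? When the compiler starts compiling a program, it first reads in the source file. For simplicity, assume the compiler stores the source code in a single string. The next step is to walk that string character by character and split it into "tokens".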

For example, a Rust snippet like this:

let foo: u32 = 30;

can be "tokenized" into:

[
    KeywordLet,
    Identifier("foo"),
    Colon,
    Identifier("u32"),
    SingleEquals,
    NumericLiteral("30"),
    Semicolon,
]

Note that this is an entirely fictional example; it is not the representation the Rust compiler actually uses.
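To make the idea concrete, here is a minimal sketch of a tokenizer that produces the fictional token list above. The Token enum and the tokenize function are assumptions made for this illustration and handle only the handful of characters in the sample line.

```rust
/// The fictional token kinds from the listing above.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    KeywordLet,
    Identifier(String),
    Colon,
    SingleEquals,
    NumericLiteral(String),
    Semicolon,
}

/// Walks the source character by character and groups characters into tokens.
/// Anything outside this tiny grammar is reported as an error.
fn tokenize(source: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();

    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            // Whitespace only separates tokens; it produces none itself.
            chars.next();
        } else if c.is_ascii_digit() {
            // Collect a run of digits into one numeric literal.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_ascii_digit() {
                    text.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::NumericLiteral(text));
        } else if c.is_alphabetic() || c == '_' {
            // Collect a word, then decide whether it is a keyword or an identifier.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_alphanumeric() || d == '_' {
                    text.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(if text == "let" {
                Token::KeywordLet
            } else {
                Token::Identifier(text)
            });
        } else {
            // Single-character punctuation.
            let token = match c {
                ':' => Token::Colon,
                '=' => Token::SingleEquals,
                ';' => Token::Semicolon,
                other => return Err(format!("unexpected character: {other:?}")),
            };
            tokens.push(token);
            chars.next();
        }
    }
    Ok(tokens)
}
```

Calling tokenize("let foo: u32 = 30;") yields the seven tokens shown in the listing, in the same order.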

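A macro takes a token stream like the one above as input and produces a token stream as output. This has two major implications:

Rust macros can add new code: add trait implementations, create new structs, write new functions, and so on.
Rust macros cannot interact with the logic of the code (for example, check whether a type implements a trait, or call a function declared in the source code), because that logic has not actually been built yet when macros run.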
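As a rough illustration of the first point, the sketch below uses macro_rules! (covered in the next section) to generate a new struct together with a trait implementation. The Describe trait, the make_described macro, and the type names are all hypothetical and exist only for this example.

```rust
/// Hypothetical trait used only for this illustration.
trait Describe {
    fn describe(&self) -> String;
}

/// Generates a new unit struct and a `Describe` implementation for it.
/// The macro only sees tokens: it cannot check whether `$name` already
/// exists or already implements `Describe`.
macro_rules! make_described {
    ($name:ident, $text:expr) => {
        struct $name;

        impl Describe for $name {
            fn describe(&self) -> String {
                String::from($text)
            }
        }
    };
}

// Each invocation adds one struct and one trait impl to the program.
make_described!(Cat, "a small carnivorous mammal");
make_described!(Rock, "a solid mineral aggregate");

fn describe_all() -> Vec<String> {
    vec![Cat.describe(), Rock.describe()]
}
```

If Cat were already defined elsewhere in the same module, the macro would have no way of noticing; the conflict would only surface later as an ordinary compiler error on the generated code.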

Rust macros fall into two main categories: declarative macros and procedural macros.

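Declarative Macros

Declarative macros can be declared and used right alongside other code. They are defined with the macro_rules! construct and have their own distinctive syntax: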

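macro_rules! my_macro {
    // Arm 1: `name => expression` defines a function that prints the expression.
    ($a: ident => $b: expr) => {
        fn $a() {
            println!("{}", $b);
        }
    };
    // Arm 2: `name, expression` prints an existing variable and the expression.
    ($a: ident, $b: expr) => {
        println!("{} {}", $a, $b);
    };
}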

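A declarative macro takes Rust tokens as input and pattern-matches on them. In the example above, the macro my_macro matches two different patterns:

An identifier and an expression separated by an arrow (=>)
An identifier and an expression separated by a comma

This macro can be invoked like this: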

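my_macro!(foo, 45);        // needs a variable `foo` in scope; prints it followed by 45
my_macro!(bar => "hello"); // defines fn bar(), which prints "hello"
my_macro!(baz => 9 * 8);   // defines fn baz(), which prints 72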
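To make the two arms easier to observe, here is a small variant of the macro above that returns strings instead of printing them. The name my_macro_str and the pair function are assumptions made for this sketch; they are not part of the original example.

```rust
/// Variant of `my_macro` that builds strings instead of printing (hypothetical).
macro_rules! my_macro_str {
    // `name => expression`: define a function returning the formatted expression.
    ($a:ident => $b:expr) => {
        fn $a() -> String {
            format!("{}", $b)
        }
    };
    // `name, expression`: format an existing variable together with the expression.
    ($a:ident, $b:expr) => {
        format!("{} {}", $a, $b)
    };
}

// Item position: each invocation expands to a function definition.
my_macro_str!(bar => "hello");
my_macro_str!(baz => 9 * 8);

fn pair() -> String {
    // Expression position: `foo` must already be a variable in scope.
    let foo = "answer:";
    my_macro_str!(foo, 45)
}
```

Here bar() returns "hello", baz() returns "72" (the expression 9 * 8 is evaluated inside the generated function), and pair() returns "answer: 45".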

Nested Macros and Recursion

serde_json, one of the most popular Rust crates, includes the declarative macro json!(), which lets you write JSON-like syntax in Rust code. It returns a serde_json::Value.

json!({
    "id": 42,
    "name": {
        "first": "John",
        "last": "Zoidberg",
    },
});

As it turns out, any valid Rust expression can be used as a value:

json!({
    "id": 21 + 21,  // Computed expression
    "name": json!({ // This is another macro invocation!
        "first": "John",
        "last": "Zoidberg",
    }),
});

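This capability also extends to code generated by macros. For example, we can write a basic parser that recursively translates AND (∧, written ^ in the code) and OR (∨, written v in the code) into their Rust equivalents.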

macro_rules! andor {
    ($a: ident ^ $b: ident $($tail: tt)*) => {
        $a && andor!($b $($tail)*) // Recursive invocation
    };
    ($a: ident v $b: ident $($tail: tt)*) => {
        $a || andor!($b $($tail)*) // Recursive invocation
    };
    // Base case: nothing left to translate, emit the tokens unchanged.
    ($($a: tt)*) => {
        $($a)*
    }
}

fn main() {
    // Each recursive call is its own expression, so the result nests to the
    // right: true && (false || (false && true))
    let result = andor!(true ^ false v false ^ true);
    println!("{}", result); // false
}

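Because recursion is potentially unbounded, the Rust compiler enforces a maximum depth for macro expansion. The default limit is 128, and a crate can change it with the crate-level attribute #![recursion_limit = "..."].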
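The sketch below shows why the depth matters: a macro that recurses once per input token, so the recursion depth grows with the length of the input. The count_tokens macro and the short_count function are hypothetical names chosen for this example.

```rust
// A crate can raise the default limit of 128 by placing
// `#![recursion_limit = "256"]` at the top of its root file (main.rs or lib.rs).

/// Counts its token trees by recursing once per token.
macro_rules! count_tokens {
    // Base case: no tokens left.
    () => { 0usize };
    // Recursive case: peel off one token tree and count the rest.
    ($head:tt $($tail:tt)*) => { 1usize + count_tokens!($($tail)*) };
}

fn short_count() -> usize {
    // Five tokens, so five levels of recursion plus the base case.
    count_tokens!(a b c d e)
}
```

A parenthesized or bracketed group counts as a single token tree, so count_tokens!(x (y z) [w]) evaluates to 3. An invocation with several hundred tokens would exceed the default limit and fail to compile unless the limit is raised.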

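Procedural Macros

Procedural macros are written in ordinary Rust code (rather than a special syntax), compiled, and then run by the compiler whenever they are invoked. For this reason, procedural macros are sometimes informally described as "compiler plugins".

Because a crate is the unit of compilation, and a procedural macro has to be compiled before it can be executed, procedural macros must be defined in a separate crate (and exported from that crate). That crate must set the following in its Cargo.toml: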

[lib]
proc-macro = true
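As a sketch of how the two crates might fit together, the fragment below shows the manifest of a macro crate and the dependency entry of a crate that uses it. The crate names my_macros and my_app and the relative path are assumptions for illustration only.

```toml
# my_macros/Cargo.toml: the procedural macro crate (hypothetical name)
[package]
name = "my_macros"
version = "0.1.0"
edition = "2021"

[lib]
proc-macro = true

# my_app/Cargo.toml: a crate that uses the macros (hypothetical name and path)
[dependencies]
my_macros = { path = "../my_macros" }
```

Cargo then builds my_macros first, for the host machine, so that the compiler can load and run it while compiling my_app.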

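Procedural macros come in three forms, each invoked differently:

1. Attribute macros

Input: the annotated item.

The output replaces the input. (The original input is not present in the final token stream.)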

#[my_attribute_macro]
struct MyStruct; // This struct is the input to the macro
struct AnotherStruct; // This struct is not part of the macro's input

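2. Derive macros

Input: the annotated item.

The output is appended to the input. (The original input is still present in the final token stream.)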

#[derive(MyDeriveMacro)]
struct MyStruct; // This struct is the input to the macro
struct AnotherStruct; // This struct is not part of the macro's input
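The "appended to the input" behaviour can be observed with the derives that ship with the standard library. They are built into the compiler rather than defined in a proc-macro crate, but they are invoked the same way. The field and the derived_behaviour function below are assumptions added for this sketch.

```rust
// The struct definition stays in the program; the derives add
// `Debug`, `Clone`, and `PartialEq` implementations next to it.
#[derive(Debug, Clone, PartialEq)]
struct MyStruct {
    id: u32,
}

// No derive here, so no implementations are generated for this type.
struct AnotherStruct;

fn derived_behaviour() -> (String, bool) {
    let _untouched = AnotherStruct;
    let original = MyStruct { id: 7 }; // the original item still exists
    let copy = original.clone(); // uses the generated Clone impl
    // Uses the generated Debug and PartialEq impls.
    (format!("{:?}", original), copy == original)
}
```

Calling derived_behaviour() returns ("MyStruct { id: 7 }", true). Trying to clone or debug-print AnotherStruct would not compile, because nothing was generated for it.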

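3. Function-like macros

Input: the enclosed token stream. The delimiters may be [], {}, or ().

The output replaces the input. (The original input is not present in the final token stream.)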

my_function_like_macro!(arbitrary + token : stream 00);
// is the same as
my_function_like_macro![arbitrary + token : stream 00];
// is the same as
my_function_like_macro!{arbitrary + token : stream 00};
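The same delimiter rule applies to declarative macros, which makes it easy to try out without setting up a proc-macro crate. The sum macro below is a hypothetical stand-in for my_function_like_macro.

```rust
/// Hypothetical declarative macro that sums a comma-separated list.
macro_rules! sum {
    ($($x:expr),* $(,)?) => { 0 $(+ $x)* };
}

fn three_delimiters() -> [i32; 3] {
    // The macro receives the same tokens whichever delimiter is used.
    let round = sum!(1, 2, 3);
    let square = sum![1, 2, 3];
    let curly = sum!{1, 2, 3};
    [round, square, curly]
}
```

All three invocations expand to 0 + 1 + 2 + 3, so three_delimiters() returns [6, 6, 6].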

Writing a Procedural Macro

At first glance, writing a procedural macro from scratch can be daunting:

use proc_macro::TokenStream;

#[proc_macro_attribute]
pub fn my_attribute_macro(attr: TokenStream, item: TokenStream) -> TokenStream {
    todo!("Good luck!")
}

This macro can be invoked like this:

#[my_attribute_macro]
struct AnnotatedItem;

In this case, the attr token stream is empty, and the item token stream contains the AnnotatedItem struct.

If you instead invoke the macro like this:

#[my_attribute_macro(attribute_tokens)]
fn my_function() {}

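In this case, the attr token stream contains attribute_tokens, and the item token stream contains the my_function function.

With the basic scaffolding in place, all that remains is to parse the incoming token streams.

The Rust compiler does not build a syntax tree for us. All we get is a token stream, which we must somehow parse into something meaningful (a struct definition, an impl block, and so on), manipulate in some way, and then turn back into valid code that the compiler can understand as output. For more, see the article "深入瞭解 Rust 過程宏 - 2" (A Deep Dive into Rust Procedural Macros, Part 2).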
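The parse, manipulate, and synthesize steps can be sketched using nothing but the standard proc_macro crate. This is a deliberately naive illustration, not a recommended implementation: it only handles non-generic structs, and the generated TYPE_NAME constant is a hypothetical feature invented for this example. In practice, the syn and quote crates are commonly used for parsing and code generation. The code must live in the lib.rs of a crate with proc-macro = true.

```rust
use proc_macro::{TokenStream, TokenTree};

#[proc_macro_attribute]
pub fn my_attribute_macro(_attr: TokenStream, item: TokenStream) -> TokenStream {
    // 1. Parse: scan the tokens and take the identifier right after `struct`.
    let mut struct_name = None;
    let mut tokens = item.clone().into_iter();
    while let Some(tree) = tokens.next() {
        if let TokenTree::Ident(ident) = &tree {
            if ident.to_string() == "struct" {
                if let Some(TokenTree::Ident(name)) = tokens.next() {
                    struct_name = Some(name.to_string());
                }
                break;
            }
        }
    }

    // Anything that is not a struct is returned unchanged.
    let name = match struct_name {
        Some(name) => name,
        None => return item,
    };

    // 2. Synthesize: build new code as text and parse it back into tokens.
    //    (Hypothetical output: an associated constant holding the type name.)
    let generated: TokenStream = format!(
        "impl {name} {{ pub const TYPE_NAME: &'static str = \"{name}\"; }}"
    )
    .parse()
    .expect("generated code should be valid Rust");

    // 3. Output: an attribute macro's output replaces its input, so the
    //    original item has to be emitted again alongside the new code.
    let mut output = item;
    output.extend(generated);
    output
}
```

Under these assumptions, annotating struct AnnotatedItem; with #[my_attribute_macro] would make AnnotatedItem::TYPE_NAME available with the value "AnnotatedItem".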

This article was translated from:

https://geeklaunch.net/blog/fathomable-rust-macros/#procedural-macros

Source: https://mp.weixin.qq.com/s/VvdvxRiafBcI7cJNPlGSEQ