Code generation in Rust
So you want to write Rust code that writes Rust code? Here are your options.
- Macros
- Build script (build.rs)
- Source generation
When to use each of them?
- Macros
  - Available to external users in their Rust code
  - Easy to incorporate anywhere in the Rust codebase
- Build script
  - Use external non-Rust tools
  - Generate files that can be included in Rust code
- Source generation
  - Custom code generation
  - Large code generation with custom logic
Preliminaries
Let me explain each approach first. Each approach has its own use.
Macros
Macros are an advanced Rust feature for generating code in place. Besides declarative macros (macro_rules!), there are three types of procedural macros:
- Derive macros (#[derive(MyMacro)])
- Attribute macros (#[my_macro])
- Function-like macros (my_macro!())
See its chapter in the Rust book. This post discusses declarative macros in a friendly way too.
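As a small illustration of generating code in place, here is a sketch that uses a declarative macro (`macro_rules!`), which needs no separate crate or extra dependencies. The macro name `make_getters` and the `Point` struct are made up for this example.

```rust
// Hypothetical declarative macro: defines a struct and, for every field,
// a getter method with the same name as the field.
macro_rules! make_getters {
    ($name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        #[derive(Debug, Clone, PartialEq)]
        pub struct $name {
            $($field: $ty),*
        }

        impl $name {
            $(
                pub fn $field(&self) -> &$ty {
                    &self.$field
                }
            )*
        }
    };
}

// Expands in place to `struct Point { x: i32, y: i32 }` plus `x()` and `y()`.
make_getters!(Point { x: i32, y: i32 });
```

The same idea scales up to procedural macros, where the expansion is computed by ordinary Rust code instead of pattern matching.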
Build script
A build script is a piece of code that runs before a crate is built. Some example use cases are:
- Building a bundled C library.
- Finding a C library on the host system.
- Generating a Rust module from a specification.
- Performing any platform-specific configuration needed for the crate.
See the reference.
Source generation
We call source generation the process of generating Rust code, as a build script might, but outside the usual compilation process, that is, without the help of cargo.
Some reasons to do this are:
- Does not take compilation time for consumers of your crate.
- Compilation does not depend on non-Rust code used for generation.
- Full source code is available for other tools.
- Full control (and responsibility) over the synchronization of generated source code and the rest of the crate.
Considerations
Macros
Procedural macros require their own separate crate (declared with proc-macro = true in its Cargo.toml); declarative macros do not.
TODO
Build script
- They might not play well with other tools that analyze your crate.
  - For example, rustdoc.
  - Check out some guidelines
Typically you do something like this in build.rs:
let out_path = PathBuf::from(env::var("OUT_DIR").unwrap());
std::fs::write(out_path.join("generated.rs"), contents).unwrap();
Then include it in your code (e.g. in lib.rs) like this:
include!(concat!(env!("OUT_DIR"), "/generated.rs"));
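To make the pieces above concrete, here is a rough sketch of a build script that generates a Rust module from a specification. The `NAME = value` spec format, the file name `constants.spec`, and the helpers `render_constants` and `write_generated` are all invented for illustration; only `OUT_DIR` and `cargo:rerun-if-changed` come from cargo itself.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Turns lines such as `MAX_USERS = 64` into `pub const MAX_USERS: u64 = 64;`.
/// The `NAME = value` format is invented for this sketch.
fn render_constants(spec: &str) -> String {
    let mut out = String::from("// @generated by build.rs; do not edit.\n");
    for line in spec.lines() {
        // Skip lines that do not look like `NAME = value`.
        if let Some((name, value)) = line.split_once('=') {
            out.push_str(&format!(
                "pub const {}: u64 = {};\n",
                name.trim(),
                value.trim()
            ));
        }
    }
    out
}

/// Writes the generated module into `out_dir` and returns its path.
fn write_generated(out_dir: &Path, spec: &str) -> io::Result<PathBuf> {
    let path = out_dir.join("generated.rs");
    fs::write(&path, render_constants(spec))?;
    Ok(path)
}

// In a real build.rs, `main` would read the spec and pass `OUT_DIR` along:
//
// fn main() {
//     println!("cargo:rerun-if-changed=constants.spec");
//     let spec = std::fs::read_to_string("constants.spec").unwrap();
//     let out_dir = PathBuf::from(std::env::var("OUT_DIR").unwrap());
//     write_generated(&out_dir, &spec).unwrap();
// }
```

Keeping the rendering in a pure function that returns a `String` makes the generator easy to unit test without running cargo.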
Source generation
Keeping things in sync is now your responsibility.
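One possible way to handle this is a test that regenerates the code and compares it with the checked-in file. The helper below is a hypothetical sketch of that idea, not the API of any particular project.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns `Ok(true)` if `path` already holds `expected`. Otherwise the file
/// is rewritten with `expected` and `Ok(false)` is returned, so that a test
/// can fail and ask the developer to commit the refreshed file.
fn ensure_file_contents(path: &Path, expected: &str) -> io::Result<bool> {
    match fs::read_to_string(path) {
        Ok(current) if current.as_str() == expected => Ok(true),
        // Stale file: rewrite it and report the mismatch.
        Ok(_) => fs::write(path, expected).map(|_| false),
        // Missing file: create it and report the mismatch.
        Err(err) if err.kind() == io::ErrorKind::NotFound => {
            fs::write(path, expected).map(|_| false)
        }
        Err(err) => Err(err),
    }
}
```

A test in the crate could call the generator, pass its output to this helper, and assert that the result is `true`. A stale file then fails CI, and the refreshed file is already on disk for the developer to commit.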
Examples
Macros
Build script
Source generation
- Web interface in Rust
  - web-sys crate is generated from web-idl specification files
  - wasm-bindgen-webidl crate is the Rust code source generator that parses the web-idl specification files
- Rust-analyzer grammar
  - internal sourcegen crate is the infrastructure to generate the Rust code
  - Rust-analyzer generating code from ungrammar files (grammar specification) that uses its internal sourcegen crate
    - https://github.com/rust-lang/rust-analyzer/blob/master/crates/syntax/src/tests/sourcegen_ast.rs
- Rome grammar generator
  - Rome generating code from ungrammar files
    - https://github.com/rome/tools/tree/main/xtask/codegen
Tools for writing Rust code
Parsing input
- syn
  - Parses a stream of Rust tokens into a syntax tree of Rust source code.
- Any parser library
  - pest, nom, pom, combine, etc…
Writing Rust code
- duplicate
- quote
  - Turns Rust syntax tree data structures into tokens of source code.
- codegen
  - Builder pattern for Rust code, resulting in a String
- Templates
  - format!, write!, etc…
    - Write your code by interpolating strings
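The template approach needs nothing beyond the standard library. As a minimal sketch, the hypothetical function `render_enum` below builds the source of an enum with an `as_str` method using `writeln!`; the shape of the generated code is only an example.

```rust
use std::fmt::Write;

/// Generates the source of an enum with an `as_str` method by interpolating
/// strings. Literal braces in the output must be doubled (`{{` and `}}`).
fn render_enum(name: &str, variants: &[&str]) -> String {
    let mut out = String::new();
    // Writing to a `String` cannot fail, so `unwrap` is fine here.
    writeln!(out, "pub enum {name} {{").unwrap();
    for v in variants {
        writeln!(out, "    {v},").unwrap();
    }
    writeln!(out, "}}\n").unwrap();
    writeln!(out, "impl {name} {{").unwrap();
    writeln!(out, "    pub fn as_str(&self) -> &'static str {{").unwrap();
    writeln!(out, "        match self {{").unwrap();
    for v in variants {
        writeln!(out, "            {name}::{v} => \"{v}\",").unwrap();
    }
    writeln!(out, "        }}").unwrap();
    writeln!(out, "    }}").unwrap();
    writeln!(out, "}}").unwrap();
    out
}
```

This is easy to start with, but nothing checks that the output is valid Rust, which is where the syntax-tree-based crates above tend to pay off for larger generators.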
Formatting Rust code
- prettyplease
  - Crate available in crates.io
- rustfmt
  - Format Rust code
  - Shell out to the rustfmt command
  - rustfmt-wrapper crate
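Shelling out to rustfmt could look roughly like the sketch below. It assumes a `rustfmt` binary is installed and on `PATH`; the function name `rustfmt` and the choice of the 2021 edition are assumptions for this example.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Pipes `source` through the `rustfmt` binary and returns the formatted code.
/// Assumes `rustfmt` is installed and on `PATH`; with no file arguments it
/// reads from stdin and prints the result to stdout.
fn rustfmt(source: &str) -> io::Result<String> {
    let mut child = Command::new("rustfmt")
        .args(["--edition", "2021"])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    // Write the input, then drop the handle so rustfmt sees end-of-file.
    child
        .stdin
        .take()
        .expect("stdin was requested as piped")
        .write_all(source.as_bytes())?;

    let output = child.wait_with_output()?;
    if !output.status.success() {
        // Surface rustfmt's own error message (e.g. a syntax error).
        let msg = String::from_utf8_lossy(&output.stderr).into_owned();
        return Err(io::Error::new(io::ErrorKind::Other, msg));
    }
    String::from_utf8(output.stdout)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

Because this depends on an external binary, a generator may want to fall back to the unformatted source when the call returns an error.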