bernsteinbear.com
Value numbering
Welcome back to compiler land. Today we’re going to talk about value numbering, which is like SSA, but more. Static single assignment (SSA) gives names to values: every expression has a name, and each name corresponds to exactly one expression. It transforms programs like this:

    x = 0
    x = x + 1
    x = x + 1

where the variable x is assigned more than once in the program text, into programs like this:

    v0 = 0
    v1 = v0 + 1
    v2 = v1 + 1

where each assignment to x has been replaced with an assignment to a ...
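The renaming in this excerpt can be sketched in a few lines of Python. This is only an illustration for straight-line code (no branches, so no phi nodes or block parameters); the `to_ssa` function and the `(target, tokens)` statement shape are made up for this example and are not taken from the post.

```python
def to_ssa(stmts):
    """Rename straight-line assignments so every name is assigned exactly once.

    `stmts` is a list of (target, tokens) pairs. Tokens that name a variable
    assigned earlier are rewritten to that variable's latest SSA name.
    """
    current = {}  # source variable -> most recent SSA name
    out = []
    for target, tokens in stmts:
        # Rewrite the uses first, so `x = x + 1` reads the *previous* x.
        renamed = [current.get(tok, tok) for tok in tokens]
        fresh = f"v{len(out)}"
        current[target] = fresh
        out.append((fresh, renamed))
    return out


program = [("x", ["0"]), ("x", ["x", "+", "1"]), ("x", ["x", "+", "1"])]
for name, tokens in to_ssa(program):
    print(name, "=", " ".join(tokens))
```

Running it prints the `v0`, `v1`, `v2` program shown in the excerpt.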
Using Perfetto in ZJIT
Originally published on Rails At Scale. Look! A trace of slow events in a benchmark! A sneak preview of what the trace looks like. Now read on to see what the slow events are and how we got this pretty picture.

The rules

The first rule of just-in-time compilers is: you stay in JIT code. The second rule...
A fuzzer for the Toy Optimizer
It’s hard to get optimizers right. Even if you build up a painstaking test suite by hand, you will likely miss corner cases, especially corner cases at the interactions of multiple components or multiple optimization passes. I wanted to see if I could write a fuzzer to catch some of these bugs automatically. But a fuzzer alone isn’t much use without some correctness oracle—in this case, we want a more interesting bug than accidentally crashing the optimizer. We want to see if the optimizer intro...
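One common shape for such an oracle is differential testing: run the original and the optimized program through an interpreter and compare the results. The sketch below assumes that approach; the excerpt is cut off before it says which oracle the post uses. The mini-IR, `random_program`, `interpret`, `fold_constants`, and `fuzz` are all hypothetical names for illustration, not the Toy Optimizer's API.

```python
import random

NUM_ARGS = 2  # hypothetical: every generated program takes two integer inputs


def random_operand(rng, i):
    """Pick an operand for instruction i: an earlier result, an input, or a constant."""
    roll = rng.random()
    if i > 0 and roll < 0.4:
        return ("ref", rng.randrange(i))
    if roll < 0.7:
        return ("arg", rng.randrange(NUM_ARGS))
    return ("const", rng.randint(-3, 3))


def random_program(rng, length=8):
    return [(rng.choice(["add", "mul"]), random_operand(rng, i), random_operand(rng, i))
            for i in range(length)]


def interpret(prog, args):
    """Reference semantics: evaluate every instruction, return the last value."""
    values = []

    def read(operand):
        kind, n = operand
        if kind == "const":
            return n
        return args[n] if kind == "arg" else values[n]

    for op, a, b in prog:
        x, y = read(a), read(b)
        values.append(x + y if op == "add" else x * y)
    return values[-1]


def fold_constants(prog):
    """The pass under test: fold instructions whose operands are both constant."""
    known = {}  # instruction index -> constant value
    out = []
    for i, (op, a, b) in enumerate(prog):
        a, b = (("const", known[v[1]]) if v[0] == "ref" and v[1] in known else v
                for v in (a, b))
        if a[0] == "const" and b[0] == "const":
            known[i] = a[1] + b[1] if op == "add" else a[1] * b[1]
            # Keep indices stable by leaving a trivial instruction in place.
            out.append(("add", ("const", known[i]), ("const", 0)))
        else:
            out.append((op, a, b))
    return out


def fuzz(optimize, iterations=200, seed=0):
    """Return a (program, args) counterexample, or None if none was found."""
    rng = random.Random(seed)
    for _ in range(iterations):
        prog = random_program(rng)
        args = [rng.randint(-5, 5) for _ in range(NUM_ARGS)]
        if interpret(prog, args) != interpret(optimize(prog), args):
            return prog, args
    return None
```

Calling `fuzz(fold_constants)` checks the pass against the interpreter; passing a deliberately broken optimizer is a quick way to confirm the harness can actually detect a miscompile.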
Type-based alias analysis in the Toy Optimizer
Another entry in the Toy Optimizer series. Last time, we did load-store forwarding in the context of our Toy Optimizer. We managed to cache the results of both reads from and writes to the heap, all at compile-time! We were careful to mind object aliasing: we separated our heap information into alias classes based on what offset the reads/writes referenced. This way, if we didn’t know if object a and b aliased, we could at least know that different offsets would never alias (assuming our objects don...
A multi-entry CFG design conundrum
Background and bytecode design

The ZJIT compiler compiles Ruby bytecode (YARV) to machine code. It starts by transforming the stack machine bytecode into a high-level graph-based intermediate representation called HIR. We use a more or less typical control-flow graph (CFG) in HIR. We have a compilation unit, Function, which has multiple basic blocks, Block. Each block contains multiple instructions, Insn. HIR is always in SSA form, and we use the variant of SSA with block parameters instead...
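As a rough picture of that containment hierarchy, here is a Python sketch. ZJIT itself is written in Rust and its real definitions are not shown in this excerpt, so only the names `Function`, `Block`, and `Insn` come from the text; the fields, the `new_block` helper, and the `jump` operand layout are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Insn:
    opcode: str
    operands: list = field(default_factory=list)


@dataclass
class Block:
    id: int
    # Block parameters: values flowing in from predecessors are named here,
    # and each jump to this block supplies matching arguments.
    params: list = field(default_factory=list)
    insns: list = field(default_factory=list)


@dataclass
class Function:
    blocks: list = field(default_factory=list)

    def new_block(self, params=()):
        block = Block(id=len(self.blocks), params=list(params))
        self.blocks.append(block)
        return block


# Hypothetical usage: the entry block jumps to `join`, passing v0 as the
# argument for join's single parameter v1.
fn = Function()
entry = fn.new_block()
join = fn.new_block(params=["v1"])
entry.insns.append(Insn("const", ["v0", 1]))
entry.insns.append(Insn("jump", [join.id, ["v0"]]))
join.insns.append(Insn("return", ["v1"]))
```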
The GDB JIT interface
GDB is great for stepping through machine code to figure out what is going on. It uses debug information under the hood to present you with a tidy backtrace and also determine how much machine code to print when you type disassemble. This debug information comes from your compiler. Clang, GCC, rustc, etc. all produce debug data in a format called DWARF and then embed that debug information inside the binary (ELF, Mach-O, …) when you do -ggdb or equivalent. Unfortunately, this means that by defau...
ZJIT is now available in Ruby 4.0
Originally published on Rails At Scale. ZJIT is a new just-in-time (JIT) Ruby compiler built into the reference Ruby implementation, YARV, by the same compiler group that brought you YJIT. We (Aaron Patterson, Aiden Fox Ivey, Alan Wu, Jacob Denbeaux, Kevin Menard, Max Bernstein, Maxime Chevalier-Boisvert, Randy Stauner, Stan Lo, and Takashi Kokubun) have been working on ZJIT since the beginning of this year. In case you missed the last post, we’re building a new compiler for Ruby because we wa...
Load and store forwarding in the Toy Optimizer
Another entry in the Toy Optimizer series. A long, long time ago (two years!) CF Bolz-Tereick and I made a video about load/store forwarding and an accompanying GitHub Gist about load/store forwarding (also called load elimination) in the Toy Optimizer. I said I would write a blog post about it, but never found the time; it got lost amid a sea of large life changes. It’s a neat idea: do an abstract interpretation over the trace, modeling the heap at compile-time, eliminating redundant loads and ...
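To make the idea concrete, here is a minimal Python sketch of forwarding over a linear trace. It is not the code from the post or the Gist: the tuple-based instruction format and the `forward_loads` name are invented for this example, and the invalidation rule (a store drops everything known at the same offset, since any two objects might alias) is one conservative choice among several.

```python
def forward_loads(trace):
    """Remove loads whose result is already known at compile time.

    Instructions are tuples: ("load", dst, obj, offset),
    ("store", obj, offset, src), or (opcode, *operands) for anything else.
    """
    heap = {}         # (obj, offset) -> name of the value known to live there
    replacement = {}  # result of a removed load -> name to use instead
    out = []
    for insn in trace:
        if insn[0] == "load":
            _, dst, obj, offset = insn
            obj = replacement.get(obj, obj)
            if (obj, offset) in heap:
                # Redundant load: reuse the value we already know about.
                replacement[dst] = heap[(obj, offset)]
                continue
            heap[(obj, offset)] = dst
            out.append(("load", dst, obj, offset))
        elif insn[0] == "store":
            _, obj, offset, src = insn
            obj, src = replacement.get(obj, obj), replacement.get(src, src)
            # Any object might alias `obj`, so forget every entry at this offset.
            heap = {key: val for key, val in heap.items() if key[1] != offset}
            heap[(obj, offset)] = src
            out.append(("store", obj, offset, src))
        else:
            out.append((insn[0],) + tuple(replacement.get(x, x) for x in insn[1:]))
    return out


trace = [
    ("load", "v1", "a", 0),
    ("load", "v2", "a", 0),   # same location, nothing written since: becomes v1
    ("store", "b", 0, "v0"),  # b might alias a, so a[0] is no longer known
    ("load", "v3", "b", 0),   # forwarded from the store: becomes v0
    ("load", "v4", "a", 0),   # must stay in the trace
    ("print", "v2", "v3", "v4"),
]
optimized = forward_loads(trace)
```

Of the four loads, only the first and the last survive; the `print` ends up reading `v1`, `v0`, and `v4`.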
How to annotate JITed code for perf/samply
Brief one today. I got asked “does YJIT/ZJIT have support for [Linux] perf?” The answer is yes, and it also works with samply (including on macOS!), because both understand the perf map interface. This is the entirety of the implementation in ZJIT:

    fn register_with_perf(iseq_name: String, start_ptr: usize, code_size: usize) {
        use std::io::Write;
        let perf_map = format!("/tmp/perf-{}.map", std::process::id());
        let Ok(file) = std::fs::OpenOptions::new().create( ...
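Since the Rust snippet is cut off in this excerpt, here is the same idea as a small Python sketch, not a transcription of ZJIT's code. A perf map file holds one line per code region: start address and size in hexadecimal, then the symbol name. The `/tmp/perf-<pid>.map` path matches the excerpt; the `perf_map_line` helper and the `directory` parameter are assumptions added to keep the example easy to test.

```python
import os


def perf_map_line(name, start, size):
    """Format one perf map entry: '<start hex> <size hex> <symbol name>'."""
    return f"{start:x} {size:x} {name}\n"


def register_with_perf(name, start, size, directory="/tmp"):
    """Append an entry for a JIT-compiled region to this process's perf map."""
    path = os.path.join(directory, f"perf-{os.getpid()}.map")
    # Append mode, so every newly compiled region adds a line to the same file.
    with open(path, "a") as f:
        f.write(perf_map_line(name, start, size))
    return path
```

A JIT would call `register_with_perf` once per compiled function, right after emitting its machine code, so that profilers can map sampled addresses back to names.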
Sorry for marking all the posts as unread
I noticed that the URLs were all a little off (had two slashes instead of one) and went in and fixed it. I did not think everyone's RSS software was going to freak out the way it did. PS: this is a special RSS-only post that is not visible on the site. Enjoy.