It's not being abandoned, we're just shifting focus to evaluate a new style of compiler. YJIT will still get bug fixes and performance improvements.
ZJIT is a method based JIT (the type of compiler traditionally taught in schools) where YJIT is a lazy basic block versioning (LBBV) compiler. We're using what we learned developing and deploying YJIT to build an even better JIT compiler. IOW we're going to fold some of YJIT's techniques in to ZJIT.
> And if so, will these YJIT features likes Fast Allocations be brought to ZJIT?
It may not have been clear from the post, but this fast allocation strategy is actually implemented in the byte code interpreter. You will get a speedup without using any JIT compiler. We've already ported this fast-path to YJIT and are in the midst of implementing it in ZJIT.
Thanks for all the work you all are putting into Ruby! The improvements in the past few years have been incredible and I'm excited to see the continuous efforts in this area.
Why is a traditional method based JIT better than an LBBV JIT? I thought YJIT is LBBV because it's a better fit for Ruby, whereas traditional method based JIT is more suitable for static languages like Java.
One reason is that we think we can make better use of registers. Since LBBV doesn't "see" all blocks in a particular method all at once, it's much more challenging to optimize register use across basic blocks. We've added type profiling, so ZJIT can "learn" types from the runtime.
>For this reason, we will continue maintaining YJIT for now and Ruby 3.5 will ship with both YJIT and ZJIT. In parallel, we will improve ZJIT until it is on par (features and performance) with YJIT.
I guess YJIT will always be faster in warmup and minimal increase of memory usage. ZJIT being more traditional should bring more speedup than YJIT.
But most of the speedup right now is still coming from rewriting C into Ruby.
The sibling comments mention that C is used in a lot of places in Ruby that incur cross-language overheads, which is true, but it's also just true that in general, even ignoring this overhead, JIT'd functions are going to be faster then their comparable C functions, because 1) they have more profiling information to be able to work from, 2) they have more type information, and (as a consequence of 1&2) 3) they're more likely to be monomorphized, and the compiler is more able to inline specialized variants of them into different chunks of the code. Among other optimizations!
> ...they have more profiling information to be able to work from... more type information... more likely to be monomorphized, and the compiler is more able to inline specialized variants of them into different chunks of the code.
this is fascinating to me. i always assumed C had everything in the language that was needed for the compiler to use. in other words, the compiler may have a lot to work through, but the pieces are all available. but this makes it sound like JIT'd functions provide more info to the compiler (more pieces to work with). is there another language besides C that does have language features to indicate to the compiler how to make things as performant as possible?
Unless your JIT can analyse the full code, a transition between byte code and native code is often costly because the JIT won't be able to optimize the full path. Once your JIT generates good enough code, it then becomes faster to avoid that transition even in cases when in isolation native code might still be faster.
EDIT: Note that this isn't an inherent limit. You could write a JIT that could analyze the compiled C code too. It's just that it's much harder to do.
C itself is fast; it's calls to C from Ruby that are slow. [1]
Crossing the Ruby -> C boundary means that a JIT compiler cannot optimize the code as much; because it cannot alter or inline the C code methods. Counterintuitively this means that rewriting (certain?) built-in methods in Ruby leads to performance gains when using YJIT. [2]
> It’s very rare for code to allocate exactly the same type of object many times in a row, so the class of the instance local variable will change quite frequently.
That’s dangerous thinking because constructors will be a bimodal distribution.
Either a graph of calls or objects will contain a large number of unique objects, layers of alternating objects, or a lot of one type of object. Any map function for instance will tend to return a bunch of the same object. When the median and the mean diverge like this your thinking about perf gets muddy. An inline cache will make bulk allocations in list comprehensions faster. It won’t make creating DAGs faster. One is better than none.
> Any map function for instance will tend to return a bunch of the same object.
Yes, but if it ends up creating any ephemeral objects in the process of determining those returned objects, then the allocation sequence is still not homogeneous. In Ruby, according to the article, even calling a constructor with named arguments allocates, so it's very easy to still end up cycling through allocating different types.
At the same time, the callsite for any given `.new()` invocation will almost always be creating an instance of the exact same class. The target expression is nearly always just a constant name. That makes it a prime candidate for good inline caching at those callsites.
Not necessarily. An inline cache is cheap but it's not free, even less so when it also comes with the expense of moving Class#new from C to Ruby. It's probably not worth speeding up the 1% at the expense of the 99%.
> An inline cache will make bulk allocations in list comprehensions faster.
Only if such comprehensions create exactly one type of object, if they create two it's going to slow them down, and if they create zero (just do data extraction) it won't do anything.
> Only if such comprehensions create exactly one type of object,
We just had this conversation maybe a month ago. If it’s 50-50 then you are correct. However if it’s skewed then it depends. I can’t recall what ratio was discovered to be workable, it was more than 50% and less than or equal to 90%.
> I’ve been interested in speeding up allocations for quite some time. We know that calling a C function from Ruby incurs some overhead, and that the overhead depends on the type of parameters we pass.
> it seemed quite natural to use the triple-dot forwarding syntax (...).
> Unfortunately I found that using ... was quite expensive
> This lead me to implement an optimization for ... .
That’s some excellent yak shaving. And speaking up … in any language is good news even if allocation is not faster.
It seems to me like all languages are converging towards something like WASM. I wonder if in 20 years we will see WASM become the de facto platform that all apps can compile to and all operating systems can run near-natively with only a thin like WASI but more convenient.
Java bytecode was originally never intended to be used with anything other than Java - unlike WASM it's very much designed to describe programs using virtual dispatch and automatic memory management. Sun eventually added stuff like invokedynamic to make it easier to implement dynamic languages (at the time, stuff like Ruby and Python), but it was always a bit of round peg in square hole.
By comparison, WASM is really more like traditional assembly, only running inside a sandbox.
I think so, but that was the 90s where we needed a lot more hindsight to get it right. Plus that was mostly just Sun, right? WASM is backed by all browsers and it looks like MS might be looking at bridging it with its own kernel or something?
Sure, I just think that's a very odd way to characterize the project. Basically anything can be universal vm if you put enough effort to reimplementing the languages. Much of what sets Parrot aside is its support for frontend tooling.
I certainly think the humor in parrot/rakudo (and why they come up today still) is how little of their own self image the proponents could perceive. The absolute irony of thinking that perl's strength was due to familiarity with text-manipulation rather than the cultural mass....
I know I may be jumping the gun a little here but I wonder what percentage speedup could we expect on typical rails applications. Especially with Active Record.
Can someone explain, is YJIT being abandoned over the new ZJIT? [0]
And if so, will these YJIT features likes Fast Allocations be brought to ZJIT?
https://railsatscale.com/2025-05-14-merge-zjit/
It's not being abandoned, we're just shifting focus to evaluate a new style of compiler. YJIT will still get bug fixes and performance improvements.
ZJIT is a method based JIT (the type of compiler traditionally taught in schools) where YJIT is a lazy basic block versioning (LBBV) compiler. We're using what we learned developing and deploying YJIT to build an even better JIT compiler. IOW we're going to fold some of YJIT's techniques in to ZJIT.
> And if so, will these YJIT features likes Fast Allocations be brought to ZJIT?
It may not have been clear from the post, but this fast allocation strategy is actually implemented in the byte code interpreter. You will get a speedup without using any JIT compiler. We've already ported this fast-path to YJIT and are in the midst of implementing it in ZJIT.
Thanks for all the work you all are putting into Ruby! The improvements in the past few years have been incredible and I'm excited to see the continuous efforts in this area.
Awesome, thanks for all the good work on Ruby!
Why is a traditional method based JIT better than an LBBV JIT? I thought YJIT is LBBV because it's a better fit for Ruby, whereas traditional method based JIT is more suitable for static languages like Java.
One reason is that we think we can make better use of registers. Since LBBV doesn't "see" all blocks in a particular method all at once, it's much more challenging to optimize register use across basic blocks. We've added type profiling, so ZJIT can "learn" types from the runtime.
>For this reason, we will continue maintaining YJIT for now and Ruby 3.5 will ship with both YJIT and ZJIT. In parallel, we will improve ZJIT until it is on par (features and performance) with YJIT.
I guess YJIT will always be faster in warmup and minimal increase of memory usage. ZJIT being more traditional should bring more speedup than YJIT.
But most of the speedup right now is still coming from rewriting C into Ruby.
> But most of the speedup right now is still coming from rewriting C into Ruby.
Quick glance, this statement seems backwards - shouldn't C always be faster? or maybe i'm misunderstanding how the JIT truly works
The sibling comments mention that C is used in a lot of places in Ruby that incur cross-language overheads, which is true, but it's also just true that in general, even ignoring this overhead, JIT'd functions are going to be faster then their comparable C functions, because 1) they have more profiling information to be able to work from, 2) they have more type information, and (as a consequence of 1&2) 3) they're more likely to be monomorphized, and the compiler is more able to inline specialized variants of them into different chunks of the code. Among other optimizations!
> ...they have more profiling information to be able to work from... more type information... more likely to be monomorphized, and the compiler is more able to inline specialized variants of them into different chunks of the code.
this is fascinating to me. i always assumed C had everything in the language that was needed for the compiler to use. in other words, the compiler may have a lot to work through, but the pieces are all available. but this makes it sound like JIT'd functions provide more info to the compiler (more pieces to work with). is there another language besides C that does have language features to indicate to the compiler how to make things as performant as possible?
Unless your JIT can analyse the full code, a transition between byte code and native code is often costly because the JIT won't be able to optimize the full path. Once your JIT generates good enough code, it then becomes faster to avoid that transition even in cases when in isolation native code might still be faster.
EDIT: Note that this isn't an inherent limit. You could write a JIT that could analyze the compiled C code too. It's just that it's much harder to do.
C itself is fast; it's calls to C from Ruby that are slow. [1]
Crossing the Ruby -> C boundary means that a JIT compiler cannot optimize the code as much; because it cannot alter or inline the C code methods. Counterintuitively this means that rewriting (certain?) built-in methods in Ruby leads to performance gains when using YJIT. [2]
[1]: https://railsatscale.com/2023-08-29-ruby-outperforms-c/ [2]: https://jpcamara.com/2024/12/01/speeding-up-ruby.html
after reading your source I'd say YJIT still there up until ZJIT is ready and on par with YJIT
and the features is there when its there
> It’s very rare for code to allocate exactly the same type of object many times in a row, so the class of the instance local variable will change quite frequently.
That’s dangerous thinking because constructors will be a bimodal distribution.
Either a graph of calls or objects will contain a large number of unique objects, layers of alternating objects, or a lot of one type of object. Any map function for instance will tend to return a bunch of the same object. When the median and the mean diverge like this your thinking about perf gets muddy. An inline cache will make bulk allocations in list comprehensions faster. It won’t make creating DAGs faster. One is better than none.
> Any map function for instance will tend to return a bunch of the same object.
Yes, but if it ends up creating any ephemeral objects in the process of determining those returned objects, then the allocation sequence is still not homogeneous. In Ruby, according to the article, even calling a constructor with named arguments allocates, so it's very easy to still end up cycling through allocating different types.
At the same time, the callsite for any given `.new()` invocation will almost always be creating an instance of the exact same class. The target expression is nearly always just a constant name. That makes it a prime candidate for good inline caching at those callsites.
> One is better than none.
Not necessarily. An inline cache is cheap but it's not free, even less so when it also comes with the expense of moving Class#new from C to Ruby. It's probably not worth speeding up the 1% at the expense of the 99%.
> An inline cache will make bulk allocations in list comprehensions faster.
Only if such comprehensions create exactly one type of object, if they create two it's going to slow them down, and if they create zero (just do data extraction) it won't do anything.
> Only if such comprehensions create exactly one type of object,
We just had this conversation maybe a month ago. If it’s 50-50 then you are correct. However if it’s skewed then it depends. I can’t recall what ratio was discovered to be workable, it was more than 50% and less than or equal to 90%.
> I’ve been interested in speeding up allocations for quite some time. We know that calling a C function from Ruby incurs some overhead, and that the overhead depends on the type of parameters we pass.
> it seemed quite natural to use the triple-dot forwarding syntax (...).
> Unfortunately I found that using ... was quite expensive
> This lead me to implement an optimization for ... .
That’s some excellent yak shaving. And speaking up … in any language is good news even if allocation is not faster.
It seems to me like all languages are converging towards something like WASM. I wonder if in 20 years we will see WASM become the de facto platform that all apps can compile to and all operating systems can run near-natively with only a thin like WASI but more convenient.
Like predicted in 2014 here: https://www.destroyallsoftware.com/talks/the-birth-and-death...
Wasn't this the idea of the JVM?
Java bytecode was originally never intended to be used with anything other than Java - unlike WASM it's very much designed to describe programs using virtual dispatch and automatic memory management. Sun eventually added stuff like invokedynamic to make it easier to implement dynamic languages (at the time, stuff like Ruby and Python), but it was always a bit of round peg in square hole.
By comparison, WASM is really more like traditional assembly, only running inside a sandbox.
I think so, but that was the 90s where we needed a lot more hindsight to get it right. Plus that was mostly just Sun, right? WASM is backed by all browsers and it looks like MS might be looking at bridging it with its own kernel or something?
I don't know. The integration of Java applets was way smoother than WASM.
Security wise, perhaps a different story, though let's wait until WASM is in wide use with filesystem access and bugs start to appear.
> that was the 90s
In the meantime the CLR happened too.
And - to an extent - LLVM IR.
And of course the ill-fated Parrot VM associated with the Perl 6 project.
I think that was more of a language-oriented effort rather than runtime/abi oriented effort.
Parrot was intended to be a universal VM. It wasn’t just for Perl.
https://www.slideshare.net/slideshow/the-parrot-vm/2126925
Sure, I just think that's a very odd way to characterize the project. Basically anything can be universal vm if you put enough effort to reimplementing the languages. Much of what sets Parrot aside is its support for frontend tooling.
“The Parrot VM aims to be a universal virtual machine for dynamic languages…”
That’s how the people working on the project characterized it.
I certainly think the humor in parrot/rakudo (and why they come up today still) is how little of their own self image the proponents could perceive. The absolute irony of thinking that perl's strength was due to familiarity with text-manipulation rather than the cultural mass....
did it means more speeds to all rails/active records collections?
I know I may be jumping the gun a little here but I wonder what percentage speedup could we expect on typical rails applications. Especially with Active Record.
so far no diff here (https://speed.yjit.org/). But the build is from May 14 so maybe it will show up in new build?
At this point from the outside looking in Ruby is Rails at this point.
[flagged]
There's no discussion there, so not much value other than imaginary internet points