# compiler
t
I wonder if there is any notable difference between `a is SomeClassA` and `a.isA` (performance-wise).
interface SomeInterface {
    val isA: Boolean
}

class SomeClassA : SomeInterface {
    override val isA
        get() = true
}

class SomeClassB : SomeInterface {
    override val isA
        get() = false
}
d
Property access is a virtual call, and an `is` check is an iteration over the supertype hierarchy. So the property may be faster than the `is` check if the hierarchy is big and `SomeInterface` is buried somewhere deep in it, so the overhead of the virtual call becomes smaller than this iteration. But in most cases `is` will be faster.
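A minimal sketch of the two forms being compared (the class names mirror the snippet above; the helper functions and their names are mine, added for illustration):

```kotlin
interface SomeInterface {
    val isA: Boolean
}

class SomeClassA : SomeInterface {
    override val isA get() = true
}

class SomeClassB : SomeInterface {
    override val isA get() = false
}

// `is` check: the runtime walks the supertype hierarchy of the actual class.
fun viaIsCheck(a: SomeInterface): Boolean = a is SomeClassA

// Property access: a virtual call dispatched through the object's method table.
fun viaProperty(a: SomeInterface): Boolean = a.isA

fun main() {
    // Both forms give the same answer; only the dispatch mechanism differs.
    check(viaIsCheck(SomeClassA()) == viaProperty(SomeClassA()))
    check(viaIsCheck(SomeClassB()) == viaProperty(SomeClassB()))
    println("ok")
}
```

For a flat two-class hierarchy like this one, the difference is unlikely to be visible outside a microbenchmark.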
There is also the thing that interfaces are actually much slower than classes:
• an `is` check for abstract classes checks only the line of superclasses, while for interfaces it traverses the (potentially complex) graph of interfaces
• virtual invocation on classes is also faster, as there are fewer inherited virtual tables
Originally the whole FIR tree was interfaces (except the actual implementations), and during performance investigations we discovered that this difference actually matters. After we replaced most of the hierarchy with abstract classes, we gained about a 10-15% performance boost for the frontend
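A hypothetical sketch of that kind of refactoring (made-up names, not actual FIR types): the base types become abstract classes, so an `is` check only walks a single superclass chain and calls dispatch through one class method table.

```kotlin
// Before: every node type is an interface, so `is` checks and calls
// have to deal with a graph of interfaces.
interface Element { val text: String }
interface Expression : Element
class Call(override val text: String) : Expression

// After: the same hierarchy as abstract classes. A class has exactly one
// superclass, so the supertype "graph" is a simple chain.
abstract class ElementBase { abstract val text: String }
abstract class ExpressionBase : ElementBase()
class CallImpl(override val text: String) : ExpressionBase()

fun main() {
    val e: ElementBase = CallImpl("f()")
    check(e is ExpressionBase)  // walks only superclasses
    check(e.text == "f()")      // dispatched through the class vtable
    println("ok")
}
```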
t
Hm, this is really good info, thx for sharing.
d
But in most cases this difference is not noticeable if there are more significant time blockers, like heavy IO (especially network IO). So it makes sense to do such optimizations only if you are really sure that your program spends a lot of time actually executing code, not waiting for something
t
I'm thinking about massive data serialization and rendering, it might matter.
d
Yeah, in such application it makes sense
When talking about performance, there are two metrics which should be considered not at some low level (like a particular algorithm optimization) but at the application architecture level: code locality and memory locality
• code locality is a measure of how fragmented the executing code is (process all data with one step of the pipeline vs. process a chunk of data with all steps of the pipeline)
• memory locality determines how the data you are processing is stored in memory (one global storage for everything vs. each object containing all the data related to it directly inside)
Both of them matter at the CPU level: more locality means more hits in the instruction cache and data caches (L1/L2/L3), so the CPU won't spend time waiting for new data to be loaded from RAM. Actually, the main reason the FIR compiler is much faster than the old one is better memory locality (we tried to improve code locality too, but failed to split the compiler logic into such small chunks)
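A small illustration of the memory-locality point (the particle example is mine, not from the thread): an object-per-record layout scatters data across the heap, while parallel primitive arrays keep each field contiguous, so a sequential scan hits the data cache far more often.

```kotlin
// Object-per-record: each Particle is a separate heap allocation,
// so iterating means a pointer chase per element.
class Particle(val x: Double, val y: Double)

fun sumXBoxed(ps: List<Particle>): Double {
    var s = 0.0
    for (p in ps) s += p.x
    return s
}

// Structure-of-arrays: one contiguous DoubleArray per field,
// so iterating is a linear scan over contiguous memory.
class Particles(val xs: DoubleArray, val ys: DoubleArray)

fun sumXFlat(ps: Particles): Double {
    var s = 0.0
    for (x in ps.xs) s += x
    return s
}

fun main() {
    val n = 4
    val boxed = List(n) { Particle(it.toDouble(), 0.0) }
    val flat = Particles(DoubleArray(n) { it.toDouble() }, DoubleArray(n))
    // Same result, different layout; only the layout affects cache behavior.
    check(sumXBoxed(boxed) == sumXFlat(flat))
    println("ok")
}
```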
IIRC right now the CPU spends about 40-50% of its time waiting for new instructions to load (I mean during kotlinc compilation). It may sound scary, but it's actually quite a good result for a complex modern application
t
I've read some books about the CPU cache, so I typically try to pay attention to these things. It is a bit hard to account for all the factors, especially on multiplatform.
d
Yeah, I agree. IMO for almost all cases it's enough to just follow some simple rules, like "storing everything in one huge hash map is a bad idea"
d
> virtual invocation on classes is also faster, as there are fewer inherited virtual tables
My understanding is that the main reason for this (and why there are different instructions for `invoke-interface` and `invoke-virtual`) is that the position of methods in the table is (relatively) fixed when using classes (since there can only be one superclass), but for interfaces a method could be anywhere, and thus the table needs to be searched - is that accurate, and/or is that what you're referring to?
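A small sketch of the two call shapes (the types and names here are made up; the dispatch difference itself is only visible in the emitted bytecode, e.g. by running `javap -c` on the compiled classes):

```kotlin
interface Greeter { fun greet(): String }
abstract class AbstractGreeter { abstract fun greet(): String }

class A : Greeter { override fun greet() = "hi" }
class B : AbstractGreeter() { override fun greet() = "hi" }

// Compiles to an interface-dispatch instruction: the method's slot can
// differ between implementing classes, so the runtime may need to search
// the interface method table.
fun throughInterface(g: Greeter) = g.greet()

// Compiles to a virtual-dispatch instruction: the slot is fixed along the
// single superclass chain, so dispatch is a direct table load.
fun throughClass(g: AbstractGreeter) = g.greet()

fun main() {
    check(throughInterface(A()) == "hi")
    check(throughClass(B()) == "hi")
    println("ok")
}
```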
d
Yes, you are right. Thank you for the clean description