# announcements
b
Hi all, I was interested in doing some micro benchmarks for Kotlin vs Java, but also Kotlin 1.3.11 vs Kotlin 1.3.20. I found some earlier code based on Kotlin Hidden Costs (https://sites.google.com/a/athaydes.com/renato-athaydes/posts/kotlinshiddencosts-benchmarks). I've updated the code to the latest JMH, and I've run it for Kotlin 1.3.11 and 1.3.20. I've found some surprising results, but I might also be misinterpreting them. Is there anyone here with some knowledge of JMH/microbenchmarks who can take a look at some of my results?
k
b
Hi Karel, I'm sorry you feel that way. That was not what I was aiming for; I just want to make sure I'm not going to publish any nonsense about Kotlin without knowing the details, and an extra pair of eyes would be appreciated.
k
Ah no, I'm not trying to put you down, sorry it came across that way. I'd suggest just posting those results and seeing if anyone sees something in them 🙂
b
See the results directory
I just didn't want to come across as: here are my results, please look at them. I'm more interested in learning how to interpret them myself
k
And which ones are you surprised about?
A general tip is to look at the generated bytecode, but you probably already knew that.
b
Well, for one, I'm not sure I completely understand the error rate, or its significance.
Secondly, if you compare the results, they are quite similar, except that in benchmark 1, there are 2 tests which show a doubling in throughput. One could argue that for that benchmark, Kotlin 1.3.20 is now twice as fast.
But is that really true?
k
The error rate means how much the result differed between runs; it's basically the standard deviation from statistics. Things that can cause it: other OS activity, the GC, CPU throttling, ...
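That spread statistic can be sketched in plain Java (the sample values below are hypothetical per-iteration throughput numbers, not taken from the actual results):

```java
public class ErrorSketch {
    // Hypothetical per-iteration throughput samples (ops/s) from one JMH fork.
    static final double[] SAMPLES = {503.1e6, 502.4e6, 504.0e6, 501.9e6, 503.3e6};

    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    // Sample standard deviation: the spread that OS activity, GC pauses
    // and CPU throttling inflate between iterations.
    static double stdDev(double[] xs) {
        double m = mean(xs), sq = 0;
        for (double x : xs) sq += (x - m) * (x - m);
        return Math.sqrt(sq / (xs.length - 1));
    }

    public static void main(String[] args) {
        System.out.printf("mean=%.0f sd=%.0f%n", mean(SAMPLES), stdDev(SAMPLES));
    }
}
```

JMH derives the reported error from this kind of spread (scaled into a confidence interval), which is why a noisy machine directly widens the error column.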
The doubling in throughput, are you talking about between versions 1.3.11 and 1.3.20? I don't see any difference at all...
(for part 1)
b
you're right. I was wrong, I meant part 2.
for example, this one:
kotlinLocalFunctionWithoutCapturingLocalVariable
k
Okay, there the change is within the margin of error, so technically you can't deduce that anything has changed from that.
b
Screen Shot 2019-01-30 at 9.55.51 pm.png
k
The measurements for 1.3.11 have huge margins of errors there, something must have gone wrong while measuring.
b
Okay, that was what I was wondering about. When is an error rate 'huge'?
Looking at the sample above: the score is 502,946,478 and the error is 3 million. How bad is that?
I'm more than happy to rerun both tests, see what results they produce.
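One quick sanity check (a rule of thumb, not anything JMH prescribes): divide the error by the score to get a relative error. A sketch in Java with the figures above:

```java
public class RelativeError {
    // Relative error: how large the error is compared to the score itself.
    static double relErr(double score, double error) {
        return error / score;
    }

    public static void main(String[] args) {
        // Figures from the screenshot discussed above:
        // score ~502.9M ops/s, error ~3M ops/s.
        System.out.printf("%.2f%%%n", 100 * relErr(502_946_478.0, 3_000_000.0));
        // about 0.6%, i.e. a fairly tight measurement
    }
}
```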
k
Pretty good, it means you can be 95% confident that the actual speed is in the interval [500m, 506m].
(I don't know the exact % JMH uses for their confidence intervals, but it's going to be somewhere between 90% and 99%)
But eg. for kotlinLocalFunctionCapturingLocalVariable in 1.3.11, it means the actual speed is with 95% confidence in [183m, 432m]. That doesn't tell you a lot, does it?
b
it doesn't indeed 🙂
that's a pretty wide gap.
k
Were you using your computer when the benchmarks were running? Did the OS updater kick in? An antivirus? Those kinds of things kill benchmarks.
b
btw, where did you get the 183m and 432m from?
k
The score +- the error.
b
And no, I wasn't using my computer, but I might have different programs open in the background. I can turn those off
k
[309m - 125.8m, 309m + 125.8m]
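That score ± error arithmetic can be sketched directly (values in millions of ops/s, taken from the figures quoted above):

```java
public class Interval {
    final double lo, hi;

    // A confidence interval built from a JMH score and its reported error.
    Interval(double score, double error) {
        this.lo = score - error;
        this.hi = score + error;
    }

    public static void main(String[] args) {
        // Score +- error as quoted above, in millions of ops/s.
        Interval ci = new Interval(309.0, 125.8);
        System.out.printf("[%.1fm, %.1fm]%n", ci.lo, ci.hi);
        // roughly [183m, 435m]
    }
}
```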
b
right. I was looking at the 1.3.20 results. Sorry!
k
Yeah, the thing is the 1.3.20 result is nearly in that interval, so you can't really draw conclusions.
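Karel's overlap argument can be sketched as a simple check (a hypothetical helper, not part of JMH; the 420m ± 5m figure below is made up for illustration):

```java
public class Overlap {
    // Two results are only clearly distinguishable when their
    // score +- error intervals do not overlap at all.
    static boolean distinguishable(double score1, double err1,
                                   double score2, double err2) {
        double lo1 = score1 - err1, hi1 = score1 + err1;
        double lo2 = score2 - err2, hi2 = score2 + err2;
        return hi1 < lo2 || hi2 < lo1;
    }

    public static void main(String[] args) {
        // A hypothetical second score of 420m +- 5m lies inside the wide
        // [183m, 435m] interval from above: no difference can be claimed.
        System.out.println(distinguishable(309, 125.8, 420, 5)); // false
    }
}
```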
b
ah, now I feel like I'm getting it. I really have to imagine a standard deviation graph in my mind.
k
Yup exactly!
b
Mind blown
k
I just took a look at the generated bytecode for the 1.3.11 and 1.3.20 versions of that function, and I think it's identical.
b
Right. So there shouldn't be any performance difference.
Well, now I sort of understand what I'm looking at. I've loaded my results in JMH visualizer, and that black line, is that the error range?
k
Yes. I'm on the same site 🙂
b
Screen Shot 2019-01-30 at 10.10.42 pm.png
ah 🙂
Thanks! If I looked at these results, I might interpret them as: wow, Kotlin 1.3.20 has some great enhancements, while realistically the conclusion should be: Erik, your measurement is very wrong, try again.
k
Yep!
b
This is why I was looking for a peer review, to make sure I'm not publishing anything which is nonsense.
Is there a way to easily visualise this information as a standard deviation graph?
k
Not sure; I've found it's hard to find information about JMH.
b
ah, I thought that was just me.
Thanks for the help Karel, much appreciated. I'll see what I'll do with these benchmarks for now, but I won't publish them, that's for sure 🙂
k
My pleasure. Be sure to notify me if you eventually publish something, I'm always interested in performance!
b
sure, will do. It will be published on www.jworks.io, just like my other blog posts.