I’ve heard a rumor that the `when` statement can evaluate it... (kotlinlang #announcements)

Grant Park

12/16/2020, 10:14 PM

I’ve heard a rumor that the `when` statement can evaluate its cases in constant time, because the compiler will change the switch structure to a map. Does anyone know if this is true?

Marc Knaup

12/16/2020, 10:17 PM

`when` cases can be super complex. Some cases can be optimized, like having only `Int` and `String` cases.

Grant Park

12/16/2020, 10:19 PM

I have a feeling the rumor is false, partly for that reason: most likely developers want the option to short-circuit some logic, and some cases should never have to run, especially if they are long and blocking

Grant Park

12/16/2020, 10:19 PM

it would be cool though if there was a compiler plugin to allow the optimization in that rumor with an annotation or something

Marc Knaup

12/16/2020, 10:20 PM

I’m not sure what kind of optimization you are talking about. Do you have an example `when`?

Grant Park

12/16/2020, 10:22 PM

Imagine a `when` statement with a million `Int` cases, for example. If the compiler replaced it with a map, the entire `when` statement can be evaluated instantly, whereas a typical switch structure would have to evaluate in linear time in the worst case.

Grant Park

12/16/2020, 10:24 PM

Super contrived example, but imagine a project that makes heavy use of a single when block with a lot of cases that can sacrifice some memory for speed (cough, redux apps). This would be useful I think.

Marc Knaup

12/16/2020, 10:31 PM

I think that’s already optimized on JVM with a switchmap internally.

Nir

12/16/2020, 10:32 PM

when you say "instantly" I assume you just mean "in constant time"

Nir

12/16/2020, 10:33 PM

the thing is that hash table lookups are pretty expensive, so you'd need quite a lot of cases for that to be worthwhile, may not be as sexy as it seems

Nir

12/16/2020, 10:33 PM

the most common case where this happens in compiled languages is when you have a contiguous range of integers, and then you can just change it into a jump table

Nir

12/16/2020, 10:33 PM

surprisingly even that is not worth it if you have less than I think 4-5 options

Kirill Grouchnikov

12/16/2020, 10:44 PM

I don't think bytecode spec will even allow a class with a "million cases" to fit into the various 65K limitations.

Grant Park

12/16/2020, 10:55 PM

There are plenty of use cases for this optimization for Kotlin frontends using unidirectional patterns. An example of a `when` that could easily grow in size to 10+ cases for a particular feature in an Android app: https://github.com/grant-park/ReduxEngineSampleProject/blob/master/app/src/main/ja[…]i/grant/reduxenginesampleproject/model/reducers/NotesReducer.kt

Kirill Grouchnikov

12/16/2020, 10:57 PM

You jumped from million down to 10+...

Grant Park

12/16/2020, 10:57 PM

The “million” was just for illustration when I was explaining what I was talking about haha

Grant Park

12/16/2020, 11:00 PM

In reality, the frontends I work with just loop through multiple series of `when` statements at a time on a single thread, so we are probably looking at 50 cases at most. That is still a lot of iterations before rendering something on a screen.

Grant Park

12/16/2020, 11:01 PM

It would be nice to even only apply this optimization for sealed classes…

Nir

12/16/2020, 11:02 PM

for 10 cases, it's extremely unlikely a hash table is better

Nir

12/16/2020, 11:02 PM

at N = 10 linear search will beat everything, except a jump table

Nir

12/16/2020, 11:03 PM

maybe about tied; looking at some benchmarks online

Nir

12/16/2020, 11:04 PM

hard to say though, especially considering the secondary effects of creating a hash table, possible code bloat, reducing inlining, etc

Nir

12/16/2020, 11:04 PM

at 50 yes

Grant Park

12/16/2020, 11:04 PM

I can imagine real world use cases reaching even 1-3k cases.

Grant Park

12/16/2020, 11:05 PM

If you are familiar with redux, that is an unacceptable amount of iterations per render loop.

Grant Park

12/16/2020, 11:05 PM

but alas, states can get arbitrarily heavy

Nir

12/16/2020, 11:06 PM

1-3K i assume would have to be the result of codegen

Nir

12/16/2020, 11:06 PM

but once you get to this number of cases, it seems to make more sense to me to just have a hash table of key -> function anyway

Nir

12/16/2020, 11:06 PM

and then you can control the data structure exactly as you want
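
A minimal sketch of the "hash table of key -> function" idea Nir describes. The action names, the `Int` state, and the `handlers` and `dispatch` names are all hypothetical, chosen only for illustration; a real reducer would likely use richer types.

```kotlin
// Hypothetical action names and handlers, made up for illustration.
// Each map entry plays the role of one `when` branch.
val handlers: Map<String, (Int) -> Int> = mapOf(
    "increment" to { n: Int -> n + 1 },
    "decrement" to { n: Int -> n - 1 },
    "reset" to { _: Int -> 0 }
)

// Looks up the handler by key. Unknown actions leave the state unchanged,
// much like an `else -> state` branch would.
fun dispatch(action: String, state: Int): Int {
    val handler = handlers[action] ?: return state
    return handler(state)
}
```

With this shape the lookup cost is that of a `Map` lookup regardless of how many entries exist, and the choice of map implementation stays in the caller's hands.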

Marc Knaup

12/16/2020, 11:06 PM

`HashMap` in JVM is incredibly fast. Creating a `HashMap` is not though.

Grant Park

12/16/2020, 11:06 PM

Not at all, you can have thousands of employees working on individual features for one frontend.

Grant Park

12/16/2020, 11:07 PM

Such a simple optimization could make certain architectures possible.

Nir

12/16/2020, 11:07 PM

any architecture where you have a when like that over thousands of cases seems pretty dubious

Nir

12/16/2020, 11:07 PM

the same comment stands in any case

Grant Park

12/16/2020, 11:07 PM

That is opinion 🙂

Nir

12/16/2020, 11:08 PM

yes, it is 🙂 but one based on large scale consensus

Nir

12/16/2020, 11:08 PM

anyway, not going to argue

Nir

12/16/2020, 11:08 PM

like I said, if you want a hash table behavior, you can just use a hash map, not sure what the issue is

Nir

12/16/2020, 11:09 PM

this is the same pattern that's used when you have a polymorphic factory, usually if you have more than a handful of derived classes you want to switch from a single centralized when, to some kind of factory that you register your derived with

Grant Park

12/16/2020, 11:09 PM

I only say that it would be nice to have a compiler plugin out there.

Grant Park

12/16/2020, 11:10 PM

It is as arbitrary as many of the other plugins available that do things we could perform with the language itself. It’s just a matter of convenience for developers like me 🙂

Nir

12/16/2020, 11:11 PM

🤷 i feel like even if it existed, I would prefer a hash table of string to function as it's literally a more robust solution

Grant Park

12/16/2020, 11:11 PM

gotcha, well we live in different worlds what can I say haha

Nir

12/16/2020, 11:11 PM

possibly I'm missing something about your specific case. in the situations I've encountered a when with a thousand cases would be pretty brittle compared to a Map of functions

Nir

12/16/2020, 11:12 PM

Can you explain what the downside of a Map of functions is?

Grant Park

12/16/2020, 11:12 PM

that's reasonable to say

Grant Park

12/16/2020, 11:12 PM

I don’t expect this to apply to a single `when` with a thousand cases

Grant Park

12/16/2020, 11:12 PM

more realistically a multitude of `when`s with much smaller amounts of cases

Grant Park

12/16/2020, 11:14 PM

downside of a Map is that it is less semantically intuitive than a `when`

Grant Park

12/16/2020, 11:19 PM

I think I would be appalled by a map of several functions over a `when` when walking into a codebase. I would flag that in a code review and ask why not use a simpler construct?

Marc Knaup

12/16/2020, 11:31 PM

btw, the compiler becomes super slow when you have a lot of (nested) `when` cases. I guess due to type inference. I had to split up my generated code into multiple functions. https://raw.githubusercontent.com/fluidsonic/fluid-i18n/master/modules/data-regions/sources/common/RegionNames_normal.generated.kt

Marc Knaup

12/16/2020, 11:33 PM

(~30 minutes build time -> seconds)

Grant Park

12/16/2020, 11:37 PM

excellent experiment 👏 how is the runtime?

Marc Knaup

12/16/2020, 11:37 PM

super fast. nanoseconds

Marc Knaup

12/16/2020, 11:37 PM

no, microseconds

Marc Knaup

12/16/2020, 11:37 PM

us 😄

Grant Park

12/16/2020, 11:38 PM

I think you had mentioned something about the internal JVM implementation using a switchmap

Grant Park

12/16/2020, 11:38 PM

if it's microseconds, that’s probably the case, right? do you have any more info on that tidbit?

Nir

12/16/2020, 11:38 PM

well, it really depends how big the `when` is; if it's 4 cases, then yes, I'd agree the Map is weird. But Map is already a super standard technique for many situations over a `when`, like a polymorphic factory. The fact that it's extensible non-locally is a big advantage; in your example, you can have thousands of developers each registering their own cases into this Map, and affect the behavior of the "when" without touching a central (and perhaps incomprehensible) piece of code
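
The non-local registration Nir mentions could look roughly like the sketch below. `ReducerRegistry`, `registerCounterFeature`, and the action names are hypothetical and only illustrate the pattern; nothing here comes from the linked projects.

```kotlin
// Hypothetical registry: every name here is illustrative only.
class ReducerRegistry {
    private val reducers = HashMap<String, (Int) -> Int>()

    // Any feature can add a case without editing a central `when`.
    fun register(action: String, reducer: (Int) -> Int) {
        reducers[action] = reducer
    }

    // Applies the registered reducer, or returns the state unchanged
    // when no reducer is registered for the action.
    fun reduce(action: String, state: Int): Int =
        reducers[action]?.invoke(state) ?: state
}

// A feature team could keep this function in its own file or module.
fun registerCounterFeature(registry: ReducerRegistry) {
    registry.register("counter/add") { it + 1 }
    registry.register("counter/clear") { 0 }
}
```

The trade-off raised in the thread still applies: the compiler cannot check a registry like this for exhaustiveness the way it can for a `when` over a sealed class or enum.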

Marc Knaup

12/16/2020, 11:40 PM

@Grant Park easiest is to just compile a file with a larger `when` and decompile the generated `.class` file. Then you can see how it was translated. The Kotlin Bytecode Viewer in IDEA doesn’t show all optimizations.

Grant Park

12/16/2020, 11:42 PM

Wow TIL. Thank you for that!

Grant Park

12/16/2020, 11:42 PM

I wonder if there are similar things in the works or already the case for sealed classes

Grant Park

12/16/2020, 11:49 PM

@Nir you make valid points about really specific patterns like the factory you mentioned, but I don’t think it would be productive to have thousands of developers working on a single map. If you are interested in my use case, I would look into the Redux architecture with Kotlin. The usage of `when` statements gets pretty heavy when looping through a bunch of them to produce new UI states in time to render on a screen.

Grant Park

12/16/2020, 11:51 PM

One reason why this is a concern is that you probably don’t want to loop through too many cases to update your app state before inducing a change that is to be visually seen by the end user. Even a 200ms delay can be noticeable and contribute to what’s known as “Jank” (It is an official term dubbed by Google https://developer.android.com/topic/performance/vitals/render)

Grant Park

12/17/2020, 12:10 AM

It looks like there is only optimization for Strings and Enums: https://github.com/JetBrains/kotlin/tree/81b30b7399dcd9fde04b7bebceed252a6acd688f/compiler/testData/codegen/bytecodeText/whenStringOptimization and https://github.com/JetBrains/kotlin/tree/81b30b7399dcd9fde04b7bebceed252a6acd688f/compiler/testData/codegen/bytecodeText/whenEnumOptimization
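
For reference, the two shapes those test directories cover look like the sketch below. `Filter`, `label`, and `parse` are hypothetical names. On the JVM, a `when` over enum entries is typically compiled to a switch over an ordinal mapping table, and a `when` over string constants to a switch on `hashCode()` followed by `equals` checks, but the exact bytecode depends on the compiler version, so decompiling is the way to confirm it.

```kotlin
// Hypothetical enum, made up for illustration.
enum class Filter { ALL, ACTIVE, DONE }

// `when` with an enum subject and enum-entry branches.
fun label(filter: Filter): String = when (filter) {
    Filter.ALL -> "All notes"
    Filter.ACTIVE -> "Active notes"
    Filter.DONE -> "Finished notes"
}

// `when` with a String subject and constant string branches.
fun parse(name: String): Filter = when (name) {
    "all" -> Filter.ALL
    "active" -> Filter.ACTIVE
    "done" -> Filter.DONE
    else -> Filter.ALL
}
```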

Nir

12/17/2020, 12:17 AM

They are working on a single map regardless :-)

raulraja

12/17/2020, 10:57 AM

TABLESWITCH 0: L6 1: L7 default: L8

raulraja

12/17/2020, 10:57 AM

I think the original optimization mentioned is related to when desugaring into tableswitches or similar at the JVM level when the subject cases are primitive types which can then be indexed. Something that matches on ints like this can be turned for example into a table switch
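
A small illustration of the kind of `when` raulraja describes, assuming the Kotlin/JVM backend; `describe` and its strings are hypothetical. With dense `Int` constants like these, decompiling the compiled class (for example with `javap -c`) would typically show a `tableswitch` similar to the one quoted above, though the exact output depends on the compiler version.

```kotlin
// Hypothetical function: the subject is an Int and every branch is a constant,
// so the JVM can index into a jump table instead of testing branches in order.
fun describe(code: Int): String = when (code) {
    0 -> "zero"
    1 -> "one"
    else -> "other"
}
```

Branches with arbitrary conditions (ranges, `is` checks, function calls) are a different matter and are generally evaluated in order.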
