# language-evolution
e
There is a very nice article about returning multiple values in Kotlin without any allocation whatsoever by Louis. Beyond Android, this is also quite an important concern in real-time 3D graphics. Since there are several fields where this would be happily welcomed, I think the best option would be to have support built into the Kotlin language itself. Is there any KEEP in this regard? Would other people also find this useful?
e
The Kotlin compiler already transparently wraps vars in a kotlin.jvm.internal.Ref as needed. I think using that to implement out-parameters would be more likely to get into the compiler than forcing a function to be inlined at every call site just for the sake of removing some allocations.
That is of course not free of allocations, but they're allocations that the Kotlin compiler will silently make under many other circumstances anyway.
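A minimal sketch of the pattern being discussed, with illustrative names (not taken from the article): "returning" two values by assigning to captured vars. Because readSignal below is not inline, the lambda is a real object and the captured vars are boxed into kotlin.jvm.internal.Ref.IntRef instances; marking it inline would remove both allocations.
```kotlin
// Illustrative sketch: "returning" two values by assigning to captured vars.
// Since readSignal is not inline, the lambda allocates and the captured vars
// signal/levels are boxed into kotlin.jvm.internal.Ref.IntRef behind the scenes.
fun readSignal(block: (signal: Int, levels: Int) -> Unit) {
    block(3, 5) // stand-in values for an actual measurement
}

fun main() {
    var signal = 0
    var levels = 0
    readSignal { s, l ->
        signal = s
        levels = l
    }
    println("signal=$signal levels=$levels")
}
```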
TBH in @louiscad's original article, I feel that the better solution to that particular issue would have been something like
```kotlin
@JvmInline value class SignalLevel private constructor(private val value: ULong) {
    constructor(signal: Int, levels: Int) : this(signal.toULong() shl 32 or levels.toUInt().toULong())
    val signal: Int get() = (value shr 32).toInt()
    val levels: Int get() = value.toInt()
}
```
as the JVM's long is already a supported way to use two stack slots for a return value
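A quick usage sketch, assuming the SignalLevel class above: on the JVM, measure() should compile down to a method returning a primitive long, so no object is allocated for the two values.
```kotlin
// Usage sketch, assuming the SignalLevel class above. On the JVM this should
// compile to a method returning a primitive long, so no allocation occurs.
fun measure(): SignalLevel = SignalLevel(signal = -42, levels = 5) // illustrative values

fun main() {
    val m = measure()
    println("signal=${m.signal} levels=${m.levels}") // signal=-42 levels=5
}
```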
l
I agree that the inline value class is a good solution. I didn't use it because I wasn't comfortable checking the correctness of the low-level operations for something so simple. It would be great to have inline value classes made of 2 Ints stored in a Long, and the same for sub-64-bit numbers, as long as the total doesn't exceed 64 bits.
e
I'd much prefer an option where I'm sure a priori that there will be no allocation, instead of relying on compiler optimizations, which may not work out for unknown reason(s).
e
It would be interesting to have built-in Int×2, Short×2/×4, Byte×4/×8 (and corresponding unsigned) value types. Float×2 could be doable as well, but I think the use case is narrow enough that it's unlikely to get into the stdlib, and implementing it in an external library should work just fine.
e
You keep thinking only about primitives; there are also classes and so on...
Maybe a symmetric syntax:
```
typealias Callback = (Int, Foo) -> (Long, Bar)
val callback = { int, foo ->
   ...
   -> long, bar
}

...
val (a, b) = callback(c, d)
```
e
there's simply not a good way to implement it for general classes on the JVM, only trade-offs
e
Well, with inline, the compiler may do the work for us.
e
A callback without inlining results in allocating both the lambda and kotlin.jvm.internal.Ref boxes for anything it mutates. A callback with inlining needs to be done with care to avoid code explosion. Using a temporary immutable data structure to return multiple values is almost always fine: it comes from the bump allocator, and if it's temporary and never stored as a reference anywhere, nursery collection is fast. It's the best trade-off for most general purposes.
c
but is point 2 really true? you can just return a pair and use destructuring.
✔️ 1
```kotlin
fun return2(): Pair<String, Int> = Pair("String", 10)

val (a, b) = return2()
```
e
Sure, but with an allocation, if the compiler can't eliminate it via escape analysis.
c
Yeah, sorry, I didn't read this whole thread before posting; I was just commenting on the blog post.
In a lot of cases this is just premature optimisation and not worth the lost readability. But there could be a Pair<Int, Int> as an inline value class; that should also work with destructuring.
✔️ 1
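A minimal sketch of such an IntPair (illustrative, not an existing stdlib type): both Ints packed into one Long, still destructurable via operator componentN functions.
```kotlin
// Sketch of the Pair<Int, Int>-as-value-class idea above; IntPair is illustrative,
// not an existing stdlib type. It packs both Ints into one Long and still supports
// destructuring via operator componentN functions.
@JvmInline
value class IntPair(private val packed: Long) {
    constructor(first: Int, second: Int) :
        this(first.toLong() shl 32 or (second.toLong() and 0xFFFF_FFFFL))

    operator fun component1(): Int = (packed shr 32).toInt()
    operator fun component2(): Int = packed.toInt()
}

fun minMax(a: Int, b: Int): IntPair = if (a <= b) IntPair(a, b) else IntPair(b, a)

fun main() {
    val (lo, hi) = minMax(7, 3)
    println("$lo..$hi") // prints 3..7
}
```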
e
We are talking about low-level libs.
l
Or hotspots (code called at high frequencies)
e
I'm tempted to file an issue on YouTrack; I'm curious to see if this might lead to something concrete.
l
Go ahead, I'd love to have @JvmInline capabilities expand. BTW, a Long can work for storage even for Float types.
e
Yeah, as long as they go through Float.toRawBits()/Float.fromBits().
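A sketch of that raw-bits technique (FloatPair is illustrative, not from any existing library): two Floats stored in one Long by packing their bit patterns, so on the JVM the pair travels as a primitive long.
```kotlin
// Sketch of the raw-bits technique mentioned above; FloatPair is illustrative,
// not from any existing library. Both Floats are packed into one Long.
@JvmInline
value class FloatPair private constructor(private val bits: Long) {
    constructor(first: Float, second: Float) :
        this(first.toRawBits().toLong() shl 32 or (second.toRawBits().toLong() and 0xFFFF_FFFFL))

    val first: Float get() = Float.fromBits((bits ushr 32).toInt())
    val second: Float get() = Float.fromBits(bits.toInt())
}

fun main() {
    val p = FloatPair(3.5f, -0.25f)
    println("${p.first} ${p.second}") // 3.5 -0.25
}
```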
y
I commented something similar on the #feed post, but here it is again since it's useful here, too. The idea that Louis presented can be extended to instead returning lambdas from inline functions and having those lambdas inlined too. The idea probably seems confusing, so here's a code example:
```kotlin
fun main() {
    val (stuff, bonus) = getStuff()
}

inline fun getStuff(): ZeroCostPair<Stuff, Bonus> {
    val stuff = grabStuffFromCargoBikeBasket()
    val bonus = inspirationElixir()
    return ZeroCostPair(stuff, bonus)
}

// Implementation details
typealias ZeroCostPair<F, S> = (PairCall, F?, S?) -> Any?

enum class PairCall {
    First,
    Second
}

// Mimicking a constructor for the type.
inline fun <F, S> ZeroCostPair(first: F, second: S): ZeroCostPair<F, S> =
    { call, _, _ ->
        when (call) {
            PairCall.First -> first
            PairCall.Second -> second
        }
    }

// Again, the parameters are useless, so just pass in null for them, since at runtime the JVM
// won't actually know what F and S are (they get erased).
// We can safely cast the result of invoking the function as F or S because we know that
// ZeroCostPairs created using the factory function always return an F when PairCall.First is
// passed in, and likewise an S for PairCall.Second.
inline val <F, S> ZeroCostPair<F, S>.first get() = this(PairCall.First, null, null) as F
inline val <F, S> ZeroCostPair<F, S>.second get() = this(PairCall.Second, null, null) as S

// Needed for the destructuring in main() above: destructuring declarations resolve to
// operator componentN functions rather than to `first`/`second`.
inline operator fun <F, S> ZeroCostPair<F, S>.component1(): F = first
inline operator fun <F, S> ZeroCostPair<F, S>.component2(): S = second

// Illustrative placeholders so the snippet is self-contained (not part of the original idea).
class Stuff
class Bonus
fun grabStuffFromCargoBikeBasket() = Stuff()
fun inspirationElixir() = Bonus()
```
And the idea is that the returned lambda would just be fully inlined at compile time. I currently have a ticket for this and a prototype compiler plugin. However, I realise that the issue of code explosion due to inlining still persists here, but I think there are some possibilities to prevent that. For example, users themselves can do what the above example does, splitting the function into two other non-inline ones (namely grabStuffFromCargoBikeBasket and inspirationElixir), so the issue of code explosion is almost eliminated. I think this could also be implemented at the compiler level, albeit with a compile-time performance hit, by some analysis of how much each part of a function affects the rest. Since in general programmers are shifting towards a functional style of programming, code like this is quite common:
```kotlin
inline fun foo(param1: A, param2: B): Pair<X, Y> {
    val firstThing = // some calculations
    val secondThing = // even more calculations
    // do stuff with firstThing and secondThing
    // derive x and y from firstThing and secondThing
    return x to y
}
```
and so in that case this function could be split into 5 parts: 2 parts for the first 2 calculations, one part for the processing in the middle, then 2 parts for the derivation of x and y. In a case like this, the compiler, noticing that foo is inline, could split foo into 5 different synthetic methods, and then whenever foo gets inlined it would only add the bytecode needed to call those 5 methods. It could even be done on a case-by-case basis by calculating what the bytecode impact of the splitting would be compared to just letting it be inlined normally. The compiler already does some pretty complex analysis, so it doesn't seem heretical to suggest this.
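A manual sketch of that splitting idea (the names and bodies below are illustrative stand-ins): keep the inline shell tiny and push the heavy work into non-inline helpers, so each call site only gains a few call instructions.
```kotlin
// Manual sketch of the splitting idea above; names and bodies are stand-ins.
fun firstCalculation(param1: Int): Int = param1 * 31          // "some calculations"
fun secondCalculation(param2: Int): Int = param2 + 7          // "even more calculations"
fun process(first: Int, second: Int): Int = first xor second  // "do stuff with both"

inline fun foo(param1: Int, param2: Int): Pair<Int, Int> {
    val firstThing = firstCalculation(param1)
    val secondThing = secondCalculation(param2)
    val combined = process(firstThing, secondThing)
    return firstThing + combined to secondThing - combined     // derive x and y
}

fun main() {
    val (x, y) = foo(2, 3)
    println("$x $y")
}
```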
e
For a moment, you truly got me super excited! Then I read that it needs to be implemented by Kotlin 😕
y
Well, the prototype currently works and produces optimized bytecode. I think it is also multiplatform right now, though I haven't verified that. I'll be polishing it up soon and adding usage instructions, but yeah, it should be as easy as applying one plugin in your build.gradle.kts.
e
I can't wait to try it out.
y
Meanwhile, like the issue (if you haven't already), and if you come up with any use cases over the next months, it would be helpful to show me examples where the plugin would help, because I try to, y'know, keep a lot of samples that I test against to make sure that everything works properly.
e
For Vulkan, I created a library (vkk) which is object oriented. In C you usually have to pass a pointer into which the newly created object will be written, while the return value is a VkResult. With vkk you instead get the object directly as the return value, which is much more useful and common, and if you want to inspect the result, you can use the inlined lambda at the end and put your logic about it there. With your plugin, I'd be able to return the object and the result directly:
val (obj, res) = vk..
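A hypothetical sketch of the pattern described above; the types and function below are illustrative stand-ins, not vkk's actual API.
```kotlin
// Hypothetical sketch of the pattern described above; these are stand-ins,
// not vkk's actual API.
enum class VkResult { SUCCESS, ERROR_OUT_OF_DEVICE_MEMORY }
class Buffer

// Today: the created object is the return value, and the result code is exposed
// to an optional inlined lambda at the end.
inline fun createBuffer(onResult: (VkResult) -> Unit = {}): Buffer {
    val result = VkResult.SUCCESS // stand-in for the native call's result code
    onResult(result)
    return Buffer()
}

fun main() {
    val buffer = createBuffer { result -> check(result == VkResult.SUCCESS) }
    println(buffer)
    // With zero-cost multi-value returns, both could come back at once, e.g.:
    // val (buffer2, result) = createBuffer()
}
```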
e
For what it's worth, I had some fun building a code generator which produces every U?(Byte|Short|Int|Float)(Pair|Quad|Oct) that can be packed into an (Int|Long), all tested, including checking that they all get returned as primitives on the JVM.
e
@Youssef Shoaib [MOD] could you please keep me/us updated on your progress? That's extremely interesting.
l
Watching the repo releases could work, no?
e
yeah, but quick updates directly from the author are much easier to get
l
Depends on the scale though 🙃
🙂 1
j

https://www.youtube.com/watch?v=OFgxAFdxYAQ

At about minute 35 he finishes his asm examples and summarizes that no matter what you do, your performance comes down to counting the cache misses among a stack of ~10 concurrent speculative branches. However you want to fold your two values together for a return is a single factor in the overall locality of reference of everything involved, hard stop. Indirection pushes your program toward the "spill" threshold of too much inlining and loops too fat to stay in cache. Cliff Click designed the C2 HotSpot compiler, FWIW.
👀 1
c
Thanks, that's a great link!
👍 1
e
I finally found some time to watch it. Let's say I already had the feeling, but that was very enlightening. Thanks, Jim.
👍 1
A second thought about this: it'd be cool if inline on destructuring declarations automatically led to this kind of code, i.e.:
```kotlin
val (theStuff, theBonus) = getStuff()
```