Derek Peirce
03/27/2021, 6:04 AMval listA = listOf("foo", "bar")
val listB = listOf("alpha", "beta", "gamma")
val firstLetters = listA.map { it.first() } + listB.map { it.first() }
+
works, and it's as concise as can be, so everything is fine, right? Well, it works, but not very efficiently. If we inline the methods, we see that each map
creates its own ArrayList
, and then a third ArrayList
is created with +
to store all of them, which is two more list objects than we actually needed, because +
respected that the individual lists shouldn't be mutated, even though they're never used again beyond this block of code.
Consider a rewritten, more efficient form:
val firstLetters = ArrayList<Char>(listA.size + listB.size).apply {
listA.mapTo(this) { it.first() }
listB.mapTo(this) { it.first() }
}
This creates exactly the single list that we need, no wasted objects. It's also terribly ugly. But what if Kotlin allowed us to use the concise expressiveness of the first functional line with the efficiency of the four lines above? It would have to recognize that map
has a corresponding mapTo
, and then that the results of map
are used only by +
. The second requirement is easy, but the first requires a bit more. What if mapTo
didn't exist as a separate method from map
?
Suppose map
were written as:
public inline fun <T, R> Iterable<T>.map(transform: (T) -> R): List<R> {
@CollectionBuilder val destination = ArrayList<R>(collectionSizeOrDefault(10))
for (item in this)
destination.add(transform(item))
return destination
}
@CollectionBuilder
would indicate to the compiler that it should build two versions of map
, the one above and a transformed one, `map_collectionBuilder`:
public inline fun <T, R, C : MutableCollection<in R>> Iterable<T>.map_collectionBuilder(destination: C, transform: (T) -> R): C {
for (item in this)
destination.add(transform(item))
return destination
}
This is, of course, the original mapTo
. As long as the result of add
is never checked, it should be possible for any method that builds its own collection to instead append to an existing one, with the type List
being effectively the default output.
The compiler can then create its own empty list and fill it with each map_collectionBuilder
call automatically. The only remaining problem is how to right-size that list. Maybe it would add the two existing collectionSizeOrDefault
calls together, but that could get tricky to find. Or maybe the map
method would need an annotation to give a hint to this collection-optimization step that this method creates a collection that is exactly as large as the input, @SameSizeOutput
, while others like filter
may indicate that as a ceiling, @SameSizeOrSmallerOutput
. In such a case, perhaps the methods would be inverted, and we write this one method instead:
public inline fun <T, R, C : MutableCollection<in R>> Iterable<T>.mapTo(@CollectionBuilder(directName = "map", size = SAME_SIZE) destination: C, transform: (T) -> R): C {
for (item in this)
destination.add(transform(item))
return destination
}
and the corresponding map
would be generated from there.
If it seems strange for the language to specifically optimize +
on collections like this, we've already done it for String
, converting repeated `+`s into `StringBuilder`s, we may as well offer something similar to collections.Youssef Shoaib [MOD]
03/27/2021, 9:26 AMval firstLetters = buildList(listA.size + listB.size) {
listA.mapTo(this) { it.first() }
listB.mapTo(this) { it.first() }
}
and personally I think this one is clearer and pretty much concise because the intention is immensely clear: I'm building a list, with an initial size, and mapping some items into it.Derek Peirce
03/27/2021, 8:34 PMArrayList<Char>(...).apply {
into buildList(...) {
, it still requires separate mapTo(this)
calls. listA.map { it.first() } + listB.map { it.first() }
is by far the most concise form, as it doesn't include manually specifying the size (which a sufficiently advanced compiler could infer on its own) and avoids having to mapTo(this)
.Youssef Shoaib [MOD]
03/27/2021, 9:23 PM// Could be named much much better, but I'm trying to make this type explicitly visible here
// The weird nullability, extra boolean param, and Int return type can be abstracted away
// in a @JvmInline value class, which is part of what KT-44530 proposes.
typealias ListInTheProcessOfBuilding<T> = (MutableList<T>?, Boolean) -> Int
inline operator fun <T> ListInTheProcessOfBuilding<T>.plus(noinline other: ListInTheProcessOfBuilding<T>): ListInTheProcessOfBuilding<T> =
{ list, shouldReturnSize ->
this@plus.invoke(list, shouldReturnSize) + other.invoke(list, shouldReturnSize)
}
/* Use this one if you dislike the call to toList at the end of the chain
inline operator fun <T> ListInTheProcessOfBuilding<T>.plus(noinline other: ListInTheProcessOfBuilding<T>): List<T> {
val operation: ListInTheProcessOfBuilding<T> = (this@plus + other)
return buildList(operation(null, true)) {
operation(this, false)
}
}
*/
fun main() {
val listA = listOf("foo", "bar")
val listB = listOf("alpha", "beta", "gamma")
// Note that you lose non-local returns here because of noinline.
val firstLetters = listA.map { it.first() } + listB.map { it.first() }
println(firstLetters.toList())
}
/* noinline is needed just to get the code compiling, but hopefully with the compiler plugin that modifier won't be needed. I haven't figured out how to suppress that error though yet lol */
inline fun <T, R> List<T>.map(noinline transform: (T) -> R): ListInTheProcessOfBuilding<R> =
{ list, shouldReturnSize -> if (shouldReturnSize) size else mapTo(list!!, transform).let { 0 } }
inline fun <T> ListInTheProcessOfBuilding<T>.toList() = buildList { this@toList(this, false) }
(or, if you want it without the toList call (playground):
// Could be named much much better, but I'm trying to make this type explicitly visible here
// The weird nullability, extra boolean param, and Int return type can be abstracted away
// in a @JvmInline value class, which is part of what KT-44530 proposes.
typealias ListInTheProcessOfBuilding<T> = (MutableList<T>?, Boolean) -> Int
inline operator fun <T> ListInTheProcessOfBuilding<T>.plus(noinline other: ListInTheProcessOfBuilding<T>): List<T> {
return buildList(this@plus(null, true) + other(null, true)) {
this@plus(this, false)
other(this, false)
}
}
fun main() {
val listA = listOf("foo", "bar")
val listB = listOf("alpha", "beta", "gamma")
// Note that you lose non-local returns here because of noinline.
val firstLetters = listA.map { it.first() } + listB.map { it.first() }
println(firstLetters)
}
/* noinline is needed just to get the code compiling, but hopefully with the compiler plugin that modifier won't be needed. I haven't figured out how to suppress that error though yet lol */
inline fun <T, R> List<T>.map(noinline transform: (T) -> R): ListInTheProcessOfBuilding<R> =
{ list, shouldReturnSize -> if (shouldReturnSize) size else mapTo(list!!, transform).let { 0 } }
Or or or if you wanted to get really creative (playground):// Could be named much much better, but I'm trying to make this type explicitly visible here
// The weird nullability, extra boolean param, and Int return type can be abstracted away
// in a @JvmInline value class, which is part of what KT-44530 proposes.
typealias ListInTheProcessOfBuilding<T> = (list: MutableList<T>?, shouldReturnSize: Boolean) -> Int
typealias FinalListInTheProcessOfBuilding<T> = (list: MutableList<T>?, shouldReturnSize: Boolean, unit: Unit) -> Int
inline val <T> ListInTheProcessOfBuilding<T>.size: Int get() = this(null, true)
inline fun <T> ListInTheProcessOfBuilding<T>.addTo(list: MutableList<T>) { this(list, false) }
inline val <T> FinalListInTheProcessOfBuilding<T>.size: Int get() = this(null, true, Unit)
inline fun <T> FinalListInTheProcessOfBuilding<T>.addTo(list: MutableList<T>) { this(list, false, Unit) }
inline operator fun <T> ListInTheProcessOfBuilding<T>.plus(noinline other: ListInTheProcessOfBuilding<T>): ListInTheProcessOfBuilding<T> =
{ list, shouldReturnSize ->
this@plus.invoke(list, shouldReturnSize) + other.invoke(list, shouldReturnSize)
}
inline operator fun <T> ListInTheProcessOfBuilding<T>.plus(noinline other: FinalListInTheProcessOfBuilding<T>): List<T> = buildList(size + other.size) {
this@plus.addTo(this)
other.addTo(this)
}
fun main() {
val listA = listOf("foo", "bar")
val listB = listOf("alpha", "beta", "gamma")
// Note that you lose non-local returns here because of noinline.
val firstLetters: List<Char> = listA.map { it.first() } + listB.mapFinal { it.first() }
val firstLettersInDifferentOrder: List<Char> = listB.map { it.first() } + listA.map { it.first() }.final
println(firstLetters)
println(firstLettersInDifferentOrder)
}
/* noinline is needed just to get the code compiling, but hopefully with the compiler plugin that modifier won't be needed. I haven't figured out how to suppress that error though yet lol */
inline fun <T, R> List<T>.map(noinline transform: (T) -> R): ListInTheProcessOfBuilding<R> =
{ list, shouldReturnSize -> if (shouldReturnSize) size else mapTo(list!!, transform).let { 0 } }
inline fun <T, R> List<T>.mapFinal(noinline transform: (T) -> R): FinalListInTheProcessOfBuilding<R> = map(transform).final
inline val <T> ListInTheProcessOfBuilding<T>.final: FinalListInTheProcessOfBuilding<T>
get() = { list, shouldReturnSize, _ ->
this(
list,
shouldReturnSize
)
}
inline fun <T> ListInTheProcessOfBuilding<T>.toList() = buildList { this@toList(this, false) }
toList
call at the end, which IMHO is acceptable since it looks like what you would do to a Sequence to make it into a list for example and so it seems idiomatic. The second version immediately converts any 2 added map calls into a list, which works for simple usecases but is probably confusing for anyone who tries to use an odd number of map calls. The last one is what I would go for if toList
was absolutely out of the question (because I strongly prefer the toList
option) but if it was out of question then I'd go for this one because it still allows multiple calls to happen with only the last call collapsing the chain.
Also initially I wanted to have 2 different plus operators based on the needed return type (so that you can either "request" a ListInTheProcessOfBuilding or a List with type inference) but after diving into it a bit it seemed impossible to do (although I'm sure that there probably is a way that I'm too stupid to come up with) and so if you do find a way to get that to work then please LMK@file:Suppress("OVERRIDE_BY_INLINE", "NOTHING_TO_INLINE")
import java.util.*
import java.util.function.Consumer
import java.util.function.IntFunction
import java.util.stream.Stream
enum class ListInTheProcessOfBuildingCall {
AddTo, Size, Materialize
}
inline class ListInTheProcessOfBuilding<T> @PublishedApi internal constructor(@PublishedApi internal val lambda: (list: MutableList<T>?, call: ListInTheProcessOfBuildingCall) -> Any?) :
List<T> {
override inline fun contains(element: T): Boolean = materialize.contains(element)
override inline fun containsAll(elements: Collection<T>): Boolean = materialize.containsAll(elements)
override inline fun get(index: Int): T = materialize.get(index)
override inline fun indexOf(element: T): Int = materialize.indexOf(element)
override inline fun isEmpty(): Boolean = materialize.isEmpty()
override inline fun iterator(): Iterator<T> = materialize.iterator()
override inline fun lastIndexOf(element: T): Int = materialize.lastIndexOf(element)
override inline fun listIterator(): ListIterator<T> = materialize.listIterator()
override inline fun listIterator(index: Int): ListIterator<T> = materialize.listIterator(index)
override inline fun spliterator(): Spliterator<T> = materialize.spliterator()
override inline fun subList(fromIndex: Int, toIndex: Int): List<T> = materialize.subList(fromIndex, toIndex)
override inline fun forEach(action: Consumer<in T>?) = materialize.forEach(action)
override inline fun parallelStream(): Stream<T> = materialize.parallelStream()
override inline fun stream(): Stream<T> = materialize.stream()
override inline fun toString(): String = "ListInTheProcessOfBuilding(materialize=$materialize)"
override inline val size: Int get() = lambda(null, ListInTheProcessOfBuildingCall.Size) as Int
inline val materialize: List<T> get() = lambda(null, ListInTheProcessOfBuildingCall.Materialize) as List<T>
inline fun addTo(list: MutableList<T>) {
lambda(list, ListInTheProcessOfBuildingCall.AddTo)
}
}
inline fun <T> ListInTheProcessOfBuilding(
size: Int,
noinline add: (MutableList<T>) -> Unit
): ListInTheProcessOfBuilding<T> {
// variables captured in a closure are basically the same as a var in a class
var materializedValue: List<T>? = null
return ListInTheProcessOfBuilding { list, call ->
when (call) {
ListInTheProcessOfBuildingCall.AddTo -> add(list!!)
ListInTheProcessOfBuildingCall.Size -> size
ListInTheProcessOfBuildingCall.Materialize -> materializedValue
?: buildList { add(this) }.also { materializedValue = it }
}
}
}
operator fun <T> ListInTheProcessOfBuilding<T>.plus(other: ListInTheProcessOfBuilding<T>): ListInTheProcessOfBuilding<T> =
ListInTheProcessOfBuilding(size + other.size) {
addTo(it)
other.addTo(it)
}
operator fun <T> ListInTheProcessOfBuilding<T>.plus(other: List<T>): ListInTheProcessOfBuilding<T> =
ListInTheProcessOfBuilding(size + other.size) {
addTo(it)
it.addAll(other)
}
fun main() {
val listA = listOf("foo", "bar")
val listB = listOf("alpha", "beta", "gamma")
val listC = listOf("function", "property", "object", "class")
val listD = listOf("test", "best", "nest")
val listE = listOf('x', 'y')
// Note that you lose non-local returns here because of noinline.
val firstLetters: List<Char> = listA.map { it.first() } + listB.map { it.first() } + listC.map { it.first() } + listD.map { it.first() } + listE
println(firstLetters)
}
/* noinline is needed just to get the code compiling, but hopefully with the compiler plugin that modifier won't be needed. I haven't figured out how to suppress that error though yet lol */
inline fun <T, R> List<T>.map(noinline transform: (T) -> R): ListInTheProcessOfBuilding<R> = ListInTheProcessOfBuilding(size) { mapTo(it, transform) }
so yeah personally I prefer this one actually because it works perfectly as a drop-in replacement for List and it looks neat on the call site (Also I'm not splitting my messages on purpose, Slack for some reason just does that automatically so sorry for that)Derek Peirce
03/29/2021, 5:26 AMbuildList
with buildList(size)
to get it right-sized) as long as the compiler's ability to inline were improved. However, it also relies on a final call to materialize
that would be awkwardly wrapped around the output: (listA + listB + ... + listE).materialize
. At that point, it becomes clear that this would have to be optimized at the level of the language, like how concatenating multiple strings together becomes a StringBuilder
call. It's necessary so that "a" + "b" + "c"
doesn't require a toString()
call around it, the only way for the compiler to know when +
is adding to the StringBuilder
vs building the final string is a better representation in the language.Youssef Shoaib [MOD]
03/30/2021, 2:25 AMListInTheProcessOfBuilding
extends List on its own, and materialize is just simply an implementation detail of ListInTheProcessOfBuilding
that any consumer doesn't need to worry about since, again, any List-related method just operates on List as an interface (or in a lot of cases Collection which is a superinterface of List). Every single method in the List interface is delegated inside of ListInTheProcessOfBuilding
to a materialize
call followed by the intended method. I could provide a more detailed example if you want one where a normal listOf is replaced with a ListInTheProcessOfBuilding
in normal code so that you can see that it works right away with no awkward calls to materialize
or anything like that, but simply think of it as if StringBuilder
was a subtype of String, which means that you won't need a toString
call. (I didn't realize by the way that I missed up the buildList size thing so whoops nice catch). Again, I'm currently working on a compiler plugin to support this, and there's an issue already for this to be supported, and, tbh, from what I've been seeing so far while trying to implement this plugin, this isn't a complicated optimization to implement, and like the compiler itself already has more complicated things in it, and so yeah hopefully this'll be implemented soon enoughDerek Peirce
03/31/2021, 2:51 AMList
. However, it still doesn't function quite like a naturally-built List
, as it is instead a lazy List
. If the list is only used within the current method, then a theoretically powerful enough inlining compiler could optimize that out, but if the list is passed into a non-inline method or returned from a non-inline method, then you're forced to keep the list-building components in a closure unless you manually call toList()
. This incurs performance and garbage-collection penalties if laziness isn't desired.Youssef Shoaib [MOD]
03/31/2021, 3:54 AMDerek Peirce
03/31/2021, 4:19 AM