Let’s talk a bit about Kotlin’s collection extensi...
# language-evolution
d
Let’s talk a bit about Kotlin’s collection extension methods versus Java 8’s collection default methods, both designed to solve the issue of wanting to add more behaviors to existing collections, usually those that use lambdas, but in very different ways.
list.forEach { println(it) }
The Kotlin extension methods have an advantage in that, because they’re effectively static methods, they can be inlined, which makes this equivalent to:
for (value in list) { println(value) }
Meanwhile, the Java default methods have an advantage in that because it’s a default method, it can be specialized to the collection. For example, if
list
is an
ArrayList
, here’s how its
forEach
is implemented:
Copy code
public void forEach(Consumer<? super E> action) {
        Objects.requireNonNull(action);
        final int expectedModCount = modCount;
        @SuppressWarnings("unchecked")
        final E[] elementData = (E[]) this.elementData;
        final int size = this.size;
        for (int i=0; modCount == expectedModCount && i < size; i++) {
            action.accept(elementData[i]);
        }
        if (modCount != expectedModCount) {
            throw new ConcurrentModificationException();
        }
    }
This is significantly more efficient than using its iterator, which requires creation and use of an iterator object as well as more frequent
modCount
checks. The cost is that action is no longer inlined, so a lambda object is created instead (either once or per
forEach
call, depending on what is in its scope), and there’s another level of indirection in executing it. Of course, this doesn’t consider additional performance improvements from JIT optimizations, but I would expect that inlining the lambda is much simpler than inlining the logic of an iterator object, and the resulting code would still be more efficient due to the
modCount
checks. However, we haven’t covered another advantage of specialized methods: they can operate significantly more efficiently on their collections. Take, for example, Kotlin’s
getOrPut
method versus Java 8’s
computeIfAbsent.
They are effectively the same in what they do. However,
getOrPut
requires a
get
call that may be followed by a
put
call. On a
HashMap
, this means hashing the object twice and finding the proper location in the proper hash bucket twice. On a
TreeMap
, this means traversing the tree twice. However, `HashMap`’s
computeIfAbsent
is specialized so that as soon as it’s determined that there’s no prior value at the correct hash bucket, the new value can be put in the correct place immediately, without rehashing or renavigating. (Strangely,
TreeMap
doesn’t have a similar specialized
computeIfAbsent
implementation, and I can’t tell why.) There’s also the issue that extension methods, being implementation-agnostic, can’t guarantee that a
ConcurrentHashMap
is operated on safely, while its specialized
computeIfAbsent
can. So where does that leave Kotlin? Are there plans to ever find some way that a more advanced or JIT compiler can benefit from both inlining and specialization, or are we doomed to create iterators over `ArrayList`s forever?
m
There are default methods on interfaces in Kotlin too. I don't think they're compiled to Java 8 default methods yet though.
Copy code
interface Example {
    fun a()
    fun b() = a()
}
s
(Inline) extension functions often delegate to implementation specific instance methods if that would be a better or more efficient one, if there is such one exposed.
Copy code
inline BaseClass<T>.someFun(...) : ... {
  when (this) {
    is SubClass -> someFun(...) // better implementation
    else -> ... ...
  }
}
d
Kotlin would be able to convert its stdlib extension methods to default methods, but then if someone already wrote a specialized
forEach
method for their
List
implementation (which would currently only work when the object is known to be their specific class at compile-time), it would suddenly fail to compile for missing the
override
modifier. It would also be possible to use the
when (this)
check as you describe if the method is named slightly differently, but then suddenly you're replacing what should be a simple v-table lookup with a series of
instanceof
checks and, to make matters worse, whatever lambda you pass in to
someFun
will have to be repeated for the case of every specialized class as well as the default implementation, which would lead to a large increase in bytecode. Also, if we use Kotlin's default methods, we give up the ability to inline the methods, as methods can only be inlined if the implementation is known at compile-time.
m
@marstran Kotlin default methods on interfaces are compiled to Bytecode default interfaces IFF you're targeting JDK8 and use the @JvmDefault annotation. This was experimental in 1.3. https://kotlinlang.org/docs/reference/java-to-kotlin-interop.html#default-methods-in-interfaces
d
Kotlin would be able to convert its stdlib extension methods to default methods
This can't happen in arbitrary case for at least 2 reasons: 1) Kotlin uses Java collection classes, and can't add "missing" default methods. 2) Kotlin inline lambdas have different semantics than Java lambdas (for example, non-local return).
d
Ah, right, I had forgotten that part. Kotlin is currently entirely committed to using extension methods for the collections framework. So are we doomed to create
Iterator
objects for looping over
ArrayList
, or is there some hope that the JIT compiler will someday optimize that away to match what may happen with the JIT for the default methods on the JVM?
j
I have the same concerns about using iterators all over the place, where most of these lists are in fact `ArrayList`s. Couldn’t the Kotlin stdlib simply include some more specialized extensions for -say-
ArrayList
? Ones that don’t use an iterator but instead use an indexed
for
loop? This will obviously create some code duplication, but the same can be said for any approach that allows writing specialized code as far as I can tell.
d
The issue there is that you generally only specify
ArrayList
at creation time, and shouldn't require that a method take only an
ArrayList
, so you instead deal with `List`s. It may be virtually guaranteed to be an
ArrayList
at runtime, but extension methods are applied statically, so it can only use the
List
method. One could write the inline method with an
is ArrayList
check, and I expect that this would lead to a speedup for the
ArrayList
case in many cases, but at the often-unacceptable cost of having to repeat the entire block of code that is inlined. If we could get some compilation optimization to detect the cases in which a list is always an
ArrayList
or
ImmutableList
(or any other
RandomAccess
list), we could make improvements (though maintaining
ConcurrentModificationException
tracking will always be a challenge), but that would not be trivial to write and would require whole-program analysis such as R8.
j
@Derek Peirce right, I missed the static dispatch here 😥