I propose adding the following method to the stdli...
# stdlib
d
I propose adding the following method to the stdlib:
public inline fun <T1, T2: Comparable<T2>> compareBy(crossinline selector: (T1) -> T2): Comparator<T1> =
        Comparator { a, b -> selector(a).compareTo(selector(b)) }
This looks similar to the existing `compareBy`:
public inline fun <T> compareBy(crossinline selector: (T) -> Comparable<*>?): Comparator<T> =
    Comparator { a, b -> compareValuesBy(a, b, selector) }
But its loosely typed, nullable selector makes it inefficient, because it has to box primitives. I ran a benchmark comparing the two `compareBy` implementations for finding the maximum of 100,000 objects by their random int fields, and got these results:
Benchmark                                           (N)  Mode  Cnt  Score   Error  Units
ComparatorBenchmarkPrimitiveHelper.maxKt         100000  avgt   10  2.824 ± 0.102  ms/op
ComparatorBenchmarkPrimitiveHelper.maxPrimitive  100000  avgt   10  0.381 ± 0.155  ms/op
The results speak for themselves: the mass boxing and unboxing in the existing implementation has major performance implications and ought to be avoided with a more specialized `compareBy` method. This naturally extends to the methods that use it, such as `sortBy` and `sortedBy`. (I ran the benchmark using `maxBy` because it gave the most contrast; `sortBy` spent more time on intermediate operations.) Strangely, `maxBy` and similar functions already require non-null values and work as they are.
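To make the comparison concrete, here is a minimal sketch of how the two comparators could be used side by side. It is not the benchmark itself and measures nothing; the `Item` class, the name `compareByTyped` (chosen so it does not clash with the stdlib `compareBy`), the fixed random seed, and the use of `maxWithOrNull` are all assumptions made for illustration.

```kotlin
import kotlin.random.Random

// Hypothetical element type with an Int field to compare by.
data class Item(val id: Int, val score: Int)

// The proposed function under a hypothetical name, to avoid clashing with the stdlib compareBy.
inline fun <T1, T2 : Comparable<T2>> compareByTyped(crossinline selector: (T1) -> T2): Comparator<T1> =
    Comparator { a, b -> selector(a).compareTo(selector(b)) }

fun main() {
    val random = Random(42) // fixed seed, only so that runs are repeatable
    val items = List(100_000) { Item(it, random.nextInt()) }

    // Existing stdlib comparator: the selector returns Comparable<*>?.
    val viaStdlib = items.maxWithOrNull(compareBy<Item> { it.score })

    // Proposed comparator: the selector returns a non-null Comparable<T2>.
    val viaTyped = items.maxWithOrNull(compareByTyped<Item, Int> { it.score })

    // Both comparators order by the same key, so they agree on the maximum score.
    println(viaStdlib?.score == viaTyped?.score)
}
```

Both calls should pick an element with the same score; any speed difference would have to be measured with a proper harness such as the one that produced the table above.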
i
We had investigated the option of introducing a `compareBy` with a constrained selector, mainly for the benefit of better type safety rather than better performance, and found that introducing an additional type parameter might be unfortunate for some quite common usages of `compareBy`: https://youtrack.jetbrains.com/issue/KT-34043#focus=Comments-27-3715015.0-0
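One plausible example of such a usage is a call site that passes a single explicit type argument. This is an assumption for illustration, not something taken from the linked issue. The sketch below again uses the hypothetical name `compareByTyped` for the two-parameter variant.

```kotlin
// Two-type-parameter variant under a hypothetical name, to avoid clashing with the stdlib compareBy.
inline fun <T1, T2 : Comparable<T2>> compareByTyped(crossinline selector: (T1) -> T2): Comparator<T1> =
    Comparator { a, b -> selector(a).compareTo(selector(b)) }

fun main() {
    val words = listOf("ccc", "a", "bb")

    // Existing stdlib function: one explicit type argument is enough.
    val byLength = compareBy<String> { it.length }

    // With two type parameters, explicit type arguments must cover both.
    val byLengthTyped = compareByTyped<String, Int> { it.length }

    // The single-argument form is rejected for the two-parameter function:
    // val broken = compareByTyped<String> { it.length } // does not compile

    println(words.sortedWith(byLength))      // [a, bb, ccc]
    println(words.sortedWith(byLengthTyped)) // [a, bb, ccc]
}
```

If the stdlib function itself gained a second type parameter, call sites written like `compareBy<String> { ... }` would need to be updated.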
d
Would it be possible to include both, so that when only one type is specified or the value is nullable, the first is used, but when two types are specified, the second is used?
i
This needs an investigation to ensure that this overload wouldn't introduce conflicts. Perhaps it could be possible to avoid them now with the help of the recently introduced feature, overload resolution by lambda return type.