So just out of curiosity I wanted to test how long...

# announcementsm

Michael Pohl

07/12/2021, 6:29 PMSo just out of curiosity I wanted to test how long a simple

`QuickSort`

implementation takes vs. Kotlin's own `List<T>.sorted()`

.
While I was on it, I also tested variations of my implementation as an extension function and without explicitly storing values in the function. And I noticed they are very different in how fast they are processed. What is causing this? See thread ->Michael Pohl

07/12/2021, 6:31 PMSo I had four different options to sort a random list of integers:
1. Kotlin's own

`myList.sorted()`

2. My basic implementation
Copy code

```
fun quickSort(input: List<Int>): List<Int> {
if (input.size < 2) return input
val pivot = input[input.size / 2]
val smaller = input.filter { it < pivot }
val equal = input.filter { it == pivot }
val bigger = input.filter { it > pivot }
return quickSort(smaller) + equal + quickSort(bigger)
}
```

3. The same as an extension
Copy code

```
fun List<Int>.quickSortExtension(): List<Int> {
if (this.size < 2) return this
val pivot = this[this.size / 2]
val smaller = this.filter { it < pivot }
val equal = this.filter { it == pivot }
val bigger = this.filter { it > pivot }
return smaller.quickSortExtension() + equal + bigger.quickSortExtension()
}
```

4. Without explicitly storing the vals
Copy code

```
fun quickSort2(input: List<Int>): List<Int> {
if (input.size < 2) return input
val pivot = input[input.size / 2]
return quickSort2(
input.filter { it < pivot }) +
input.filter { it == pivot } +
quickSort2(input.filter { it > pivot })
}
```

Now what I found curious was that they take extremely differnet times amounts to complete. I measured them in a very simple way:
Copy code

```
val result = measureNanoTime {
[myQuickSortFunction(myRandomArray)]
}/1000 // returns microseconds
```

So for an array with 1000 elements the average times in microseconds are like this:
1. Kotlin: 1800
2. No-val QuickSort: 3400
3. Basic QuickSort: 3600
4. Extension QuickSort: 13000
While the difference between no.2 and 3 might be negligible, it is still there - where does it come from? And how can it be explained that the extension function is so much slower - it takes more than 3 times as long? And what does the stdlib `.sorted()`

do different?
Now to make this even more interesting (to me at least 🙂 ), when testing the same methods with an array that just contains 10 items, the results are very different:
1. Basic QuickSort: 85
2. No-val QuickSort: 95
3. Kotlin: 650
4. Extension QuickSort: 6500
What causes these differences? Especially, why is Kotlin's standard sorting so much slower for small arrays than it is for larger ones? And what causes the extension function to be so much slower? Does that mean there is a general performance impact when using extensions vs. "regular" functions?a

andylamax

07/12/2021, 9:04 PMwell, this is interesting

n

Nir

07/13/2021, 12:09 AMKeep in mind that your implementations are not actually quick sort

Nir

07/13/2021, 12:09 AMQuick sort is in-place

👍 2

Nir

07/13/2021, 12:11 AMit's hard to say what exactly is safe to conclude from these benchmarks, if you want to show that extensions are slower I would try to just measure their overhead without anything else

Nir

07/13/2021, 12:11 AMAnd then for a recursive function

Nir

07/13/2021, 12:15 AMYou should probably also provide the full code you used to do the benchmarks. It's incredibly easy to make mistakes when micro benchmarking

☝️ 1

Nir

07/13/2021, 12:15 AMNot that I personally could spot any said mistakes for a JVM languy but I'm sure other people here could

p

pniederw

07/13/2021, 12:24 AMThe above code is very inefficient in that every

`filter()`

and `+`

creates a new list. Another difference is that Kotlin's `sorted()`

uses Java's Timsort, which can be significantly faster than QuickSort. To compare the performance of sorting algorithms, consider sorting arrays (see Kotlin's `Array.sort()`

). Always use JMH for JVM microbenchmarks.d

Dave K

07/13/2021, 1:32 AMSome sorting algorithms are faster on small inputs and worse with larger or vice versa and may contain overhead for setting up. Some are better when the data is fairly sorted already but less well if they’re random. Hybrid algorithms tend to cover this nuance.
The majority of the time, trying to hyper optimize something like sort is meaningless in the context of your code; consider Amdahl’s law. Or the saying that premature optimization is the root of all evil (Donald Knuth). I wouldn’t worry about the speed of each implementation too much until you know it’s a bottleneck.
And lastly, if you know what you need for the input you’re expecting (or if you know the data is largely pre sorted versus random), it’s better to use the algorithm you need. The default implementation Kotlin provides A) may change and B) is suitable when you don’t care exactly how it is implemented (I believe the contract is only that it’s stable but I could be wrong). Consider the same for listOf - this is currently an array list (and may change). If you know you’ll be inserting a lot of items in the middle of your list, you might choose to explicitly use a linked list. If you know you’ll be doing a lot of random accesses, you may choose to use an array list instead (explicitly since Kotlin can change its current implementation at any time)

m

Michael Pohl

07/13/2021, 5:03 AMThanks for enlightening me! Of course I am aware that neither my implementations nor my measuring are rather sophisticated...performance does not play a big role in my day to day so I usually don't benchmark things. I just did this out of interest and apparently didn't put much thought into it. Not planning to optimize anything actually, just looking around.
Still curious though about the huge difference the extension function produces...

p

pniederw

07/13/2021, 5:26 AMBest write a JMH benchmark to see if the difference is real.

e

ephemient

07/13/2021, 5:43 AMright, JMH or other benchmarking systems are designed to handle JVM behavior, such as warmup, GC, and blackholing (so that the optimizer doesn't just strip out all your code)

ephemient

07/13/2021, 5:45 AMKotlin's

`.sorted()`

copies to an Array, which can be accessed via bytecode insructions rather than List's `get()`/`set()`, uses Java's built-in sort on that (which is likely timsort under the covers), and copies back to a Listephemient

07/13/2021, 5:46 AMwhich is 2 * n element copies while your implementations are approximately n^2 log n element copies

ephemient

07/13/2021, 5:53 AMtimsort is like a stable in-place mergesort variant with heuristics that optimizes for minimizing comparison operations on inputs that are already somewhat structured... in the end it's O(n log n) like other comparison-based sorting methods, but it's effectively O(n) when you're just re-sorting a pre-sorted list after inserting a single element, or other common scenarios

ephemient

07/13/2021, 5:55 AMnote that you will get different timings again if you sort IntArray instead of Array<Int>, as that uses yet a different algorithm under the covers, since there's no need to use a stable sort on value types

ephemient

07/13/2021, 5:57 AM(not to mention the difference between boxed and unboxed integers... not something that matters in many cases, but it can certainly show up in microbenchmarks)

p

Paul Griffith

07/13/2021, 8:45 PMI did a not very exact benchmark using JMH and all it really shows is that primitive arrays are really important if you care about speed

Paul Griffith

07/13/2021, 8:45 PM Copy code

```
MyBenchmark.quickSort2 thrpt 25 2787.005 ± 66.497 ops/s
MyBenchmark.quickSortExtension thrpt 25 2848.910 ± 11.635 ops/s
MyBenchmark.quickSortNonExtension thrpt 25 3112.069 ± 45.936 ops/s
MyBenchmark.sortBoxedArrayInPlace thrpt 25 511593.816 ± 12580.323 ops/s
MyBenchmark.sortBoxedArrayToArray thrpt 25 11702.525 ± 112.049 ops/s
MyBenchmark.sortBoxedArrayToList thrpt 25 12085.770 ± 360.113 ops/s
MyBenchmark.sortListInPlace thrpt 25 1140047.824 ± 13258.572 ops/s
MyBenchmark.sortListToList thrpt 25 12124.497 ± 165.510 ops/s
MyBenchmark.sortPrimitiveArrayInPlace thrpt 25 4523362.961 ± 176819.205 ops/s
MyBenchmark.sortPrimitiveArrayToArray thrpt 25 86727.839 ± 1426.855 ops/s
MyBenchmark.sortPrimitiveArrayToList thrpt 25 10678.672 ± 120.942 ops/s
```

e

ephemient

07/13/2021, 8:59 PMI don't think you're testing quite right; sort will mutate the list in-place, and you're sorting the same mutated array/list over and over

ephemient

07/13/2021, 9:00 PM(but yes, in a straight shootout of sorting integers, primitive arrays make a big difference)

p

Paul Griffith

07/13/2021, 9:02 PMAh, that's a good point. JMH does re-initialize the list a few times, but not per-operation. I don't know if I care enough to rerun, but that is definitely throwing things off

e

ephemient

07/13/2021, 9:04 PMit won't affect the trivial quicksort implementations, but timsort will perform a single pass check and bail out early 🙂

p

Paul Griffith

07/13/2021, 11:11 PMupdated benchmark with fresh data per call

3 Views