https://kotlinlang.org logo
Title
f

ferrytan

04/23/2019, 3:49 AM
Hi everyone, i've got a question about collection, is there any function or trick in kotlin that could merge only adjacent items in collection? for example, i have an object containing string and an integer. let's say i have this array of object: Raw:
[
   {"A",1}, 
   {"A",1}, 
   {"A",1}, 
   {"B",1}, 
   {"C",1},
   {"D",1}, 
   {"D",1}, 
   {"A",1},
   {"A",1}, 
   {"D",1}
]
I'm aiming to merge the adjacent items and add the integer value for each adjacent items, the end result should transform as this: Result:
[
   {"A",3}, 
   {"B",1}, 
   {"C",1}, 
   {"D",2}, 
   {"A",2}, 
   {"D",1}
]
I've searched through basic extensions for Collection, closest i can found is using
windowed
with
step
1 and
size
1, but it seems like i have to loop and map the call multiple times until there are no pairs to get the result i wanted. Is there any other way to get around this? I also tried
zipWithNext
but it gets tricky if there are more than 2 adjacent items on the list
y

yen

04/23/2019, 5:41 AM
I believe you can do something like
list.groupBy { ... }
a

Alowaniak

04/23/2019, 5:44 AM
I think group by would group everything though? As in the later "A"s would be grouped with the earlier ones. Wouldn't just using
fold
work though? https://pl.kotl.in/FYrYZ9Zxg
👍 3
p

pindab0ter

04/23/2019, 7:55 AM
list.groupBy { … }
won’t work because it doesn’t preserve separated groups.
list.groupingBy { … }
does, but when I tried to implement a solution with it it quickly became more complex than the
fold
@Alowaniak posted.
r

ribesg

04/23/2019, 8:09 AM
Maybe you can do things with
windowed
k

karelpeeters

04/23/2019, 8:13 AM
This is a commonly asked question, and there isn't really a nice solution. People usually end up writing their own utility function. Maybe this should be added to the stdlib at some point.
👍 1
s

stellarspot

04/23/2019, 8:43 AM
I tried to implement the most forward approach just to go over collection, sum the same elements and put them to new list when key is changed:
fun <Key> List<Pair<Key, Int>>.countAdjacent(): List<Pair<Key, Int>> {

    val res = mutableListOf<Pair<Key, Int>>()

    if (this.isEmpty()) {
        return res
    }

    var key = this.first().first
    var count = this.first().second

    this.forEachIndexed { index, pair ->

        if (index != 0) {
            if (pair.first == key) {
                count += pair.second
            } else {
                res.add(Pair(key, count))
                key = pair.first
                count = pair.second
            }
        }
    }

    res.add(Pair(key, count))

    return res
}
The same solution which does not use indices looks like below but I found that it requires to use !! with key even it has been explicitly checked to null:
fun <Key> List<Pair<Key, Int>>.countAdjacent(): List<Pair<Key, Int>> {

    val res = mutableListOf<Pair<Key, Int>>()

    var key: Key? = null
    var count = 0

    this.forEach { pair ->

        if (pair.first == key) {
            count += pair.second
        } else {
            if (key != null) {
                res.add(Pair(key!!, count)) // Compilation error if !! is not used with key
            }
            key = pair.first
            count = pair.second

        }
    }

    if (key != null) {
        res.add(Pair(key!!, count)) // Compilation error if !! is not used with key
    }

    return res
}
p

pindab0ter

04/23/2019, 9:31 AM
@Alowaniak’s solution is short and simple. I changed it to be a function and to use an immutable List internally mostly as an exercise:
fun <T> List<Pair<T, Int>>.countAdjacent(): List<Pair<T, Int>> = fold(emptyList()) { acc, e ->
    acc.lastOrNull()
        ?.takeIf { it.first == e.first }
        ?.let { acc.dropLast(1).plus(it.first to it.second + e.second) }
        ?: acc.plus(e)
}
👍 1
r

ribesg

04/23/2019, 9:34 AM
This is one of those cases where immutability introduces so much inefficiency. Imagine calling that on a list of 100k or 1m items. How many
ArrayList
instances to you create with a single function call there? I would be the kind of guy to implement this on
MutableList
and attempt to do it in place and in
O(n)
with a
MutableIterator
. But I guess it depends on the use case. If you know that you’ll have at most 30 items in your list, immutability is the way to go
:tnx: 1
👍 1
I quickly wrote that
data class Entry<T>(val value: T, var count: Int)
fun <T> MutableList<Entry<T>>.countAdjacent(): MutableList<Entry<T>> {
    if (size <= 1) return this
    val iterator = listIterator()
    var previous = iterator.next()
    do {
        val current = iterator.next()
        if (current.value == previous.value) {
            previous.count++
            iterator.remove()
        }
        previous = current
    } while (iterator.hasNext())
    return this
}
👍 1
f

ferrytan

04/23/2019, 10:46 AM
Wow thank you for all the replies! Many interesting approaches, i will try them and see if it fits
I just tested your approaches, the
fold
works perfectly, i have around 250+ data and it took around 70ms to fold the collection. fast enough, and my data count won't be higher than that, so i guess the
fold
method fits my requirement 👍🏻 The extension function is awesome @pindab0ter, thank you so much! gets amazed even more by `kotlin`'s cleanliness although the suggestion from @stellarspot @ribesg might be preferable if i have a very huge data count. Anyway, thank you guys for the answers! solved my problem🙏
☺️ 1
k

karelpeeters

04/23/2019, 6:16 PM
70 ms
just to do some trivial operation? That kind of stuff adds up quickly 😕
😆 1
☝️ 1
p

pindab0ter

04/23/2019, 8:52 PM
@ribesg and @karelpeeters are right. The solution I posted is incredibly inefficient. The iterative solution @ribesg posted is the most efficient. The fold solution @Alowaniak posted also performs decently. The adjustments I made make it absolutely glacial. Please don’t use that. You can fiddle around with it here: https://pl.kotl.in/-Ne-2Voow
👍 1
a

Alowaniak

04/23/2019, 10:43 PM
When I was fiddling around with it the "Imperative" one performs worse than just folding with immutable list. I guess because it copies the list first. Afaik the fold is just an inline for loop, so it shouldn't make a difference in performance compared to explicitly going over it yourself afaict Btw you could also use
apply
which makes it look a bit nicer IMHO (I don't like the trailing acc) but ymmv, it can get confusing because you're basically having multiple receivers
fun <T> List<Pair<T, Int>>.sumAdjacentFoldMutable() = fold(mutableListOf<Pair<T, Int>>()) { acc, e -> 
    acc.apply {
    	if (lastOrNull()?.first == e.first) set(lastIndex, e.first to e.second + last().second) 
        else add(e)
    }
} //as List<Pair<T, Int>>
👍 1
f

ferrytan

04/24/2019, 10:55 AM
wow the results are so distinct... thank you guys
sumAdjacentFoldMutable took 267ms
sumAdjacentFoldImmutable took 10496ms
countAdjacentImperative took 142ms