Is there a function in the kotlin stdlib to split ...
# stdlib
n
Is there a function in the kotlin stdlib to split a list into a list of sublists with a predicate to choose where to separate each sublist, and where the result preserves the order of elements.
List::chunked
doesn’t do what I want, nor does
List::windowed
.  E.g. if I have a list, x = [“a”, “b”, “C”, “d”, “E”, “F”, “g”] and I want to split it into sublists where the case is the same within the sublist, I could pass in a predicate to this function something like…
Copy code
x.split { x, b -> a.isUpperCase() != b.isUpperCase() }
giving me:
Copy code
[["a", "b"], ["C"], ["d"], ["E", "F"], ["g"]]
I often need this operation, but always end up writing it with imperative code.  Is there something in the stdlib I’m missing.
☝️ 1
m
In this case you can use
x.partition { it.isUpperCase() }
no red 1
About splitting to more than 2 lists there is a thread https://kotlinlang.slack.com/archives/C0B8Q383C/p1634581886069400
n
What I gather from that thread is that (a) there is no such function, and (b) groupBy does not do what I need.
Specifically, I want the result to preserve the order of the input list.
e
n
I don’t think it is
What I’m looking/asking for is a function that splits a list into a list of lists, allowing me to identify between which elements the list should be split, and preserving the order of elements. Almost the opposite of flatten. Unflatten, say.
e
Copy code
x.groupConsecutiveBy { it.isUpperCase() }
(or whatever it is named, if it existed) would give you the same groupings you asked for
n
For this specific case, which is just an illustrative example
For my actual use case, I need to compare consecutive elements to decide whether to split the list between them
I think the use cases of groupConsecutiveBy are a subset of the function I’m proposing — you only need to examine one or the two consecutive elements
p
For my actual use case, I need to compare consecutive elements to decide whether to split the list between them (edited)
That's sort of the opposite of a functional pattern, which is probably why there's no builtin stdlib function that's generalizable
e
inefficient (due to Map) hack (due to mutation), but works:
Copy code
private object Sentinel@Suppress("UNCHECKED_CAST")
fun <T> Iterable<T>.splitWhen(predicate: (T, T) -> Boolean): List<List<T>> {
    var last: Any? = Sentinel
    var counter = 0
    return groupBy {
        if (last !is Sentinel && predicate(last as T, it)) {
            counter++
        }
        last = it
        counter
    }.values.toList()
}
efficient (by streaming) but prone to misuse (breaks if inner sequence is not iterated exactly once fully):
Copy code
private object Sentinel
@Suppress("UNCHECKED_CAST")
fun <T> Sequence<T>.splitWhen(predicate: (T, T) -> Boolean): Sequence<Sequence<T>> = sequence {
    var next: Any? = Sentinel
    val iterator = this@splitWhen.iterator()
    while (next !== Sentinel || iterator.hasNext()) {
        yield(sequence loop@{
            for (item in iterator) {
                val last = next
                next = item
                if (last !== Sentinel) {
                    yield(last as T)
                    if (predicate(last, item)) return@loop
                }
            }
            if (next !== Sentinel) yield((next as T).also { next = Sentinel })
        })
    }
}
either way I don't see an elegant functional solution
n
@Paul Griffith I don’t know why you think it’s “the opposite of a functional pattern”. The function is in the standard library of languages that are more functional than Kotlin, such as Clojure.
i
What is you actual use case then? How do you compare consecutive elements? What is the name of that function in Clojure?
n
The consecutive elements are passed to the predicate.
As shown in the example at the top of this thread
My actual use case is partitioning lists of machine instructions into blocks by whether pairs of instructions have deterministic relative timing or not in the instruction processor.
But that’s by the by.
e
Haskell doesn't have a function like this -
Data.List.groupBy
is close, but it always performs comparisons between the first item of a group and the item to be considered (in your original example, it'll compare "a" and "C", not "b" and "C")
I'm not familiar with Clojure, but at a glance, it doesn't either. it does have
group-by
, which is like Kotlin's
groupBy
, and
partition-by
, which is similar to aforementioned proposal on YouTrack.
n
You’re right. I was thinking of partition by in Clojure. In Kotlin we have windowed and zipWithNext which apply a function consecutive elements of a collection, and then the elements can be combined into sublists with a fold. I worry that using fold to collect into Java lists will involve a lot of copying, so use imperative code and a list builder instead.
e
the sequence version I left earlier could be easily made resilient against misuse just by list-ifying it, e.g.
Copy code
fun <T> Iterable<T>.splitWhen(predicate: (T, T) -> Boolean): List<List<T>> = this.asSequence().splitWhen(predicate).map { it.toList() }.toList()
or avoiding Sequence by building the lists in-line, as you mentioned
in any case, I can see some value in it, but maybe not a lot - no other functional language I'm aware of has quite what you're looking for built-in. Haskell's
groupBy
is close, but it really only works with equivalence classes, as it does not use consecutive values