# coroutines
n
We have a function that converts on-disk data into a stream of objects. We have multiple files and want to use coroutines to process them concurrently. The solution we came up with looks like this:
```kotlin
fun readAll(): Stream<Metadata> {
    return runBlocking {
        channelFlow<Stream<Metadata>> {
            Type.supportedTypes.forEach { type ->
                launch(Dispatchers.IO + CoroutineName("Read all for $type")) {
                    send(metadataPersistenceHelper(type).readAll())
                }
            }
        }.reduce { accumulator, value -> Stream.concat(accumulator, value) }
    }
}
```
(`Type` has multiple supported types and `metadataPersistenceHelper` does the heavy IO). Does this make sense or are we abusing coroutines here?
o
unless you have some need to stick with Streams, it's probably better if you make it all Flow
☝️ 1
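A minimal sketch of what an all-`Flow` version might look like, using hypothetical stand-ins for `Metadata`, `supportedTypes`, and `metadataPersistenceHelper` (the real types are not shown in the thread); one flow per type is moved onto `Dispatchers.IO` with `flowOn` and the flows are then merged so the reads run concurrently:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.*

// Hypothetical stand-ins for the types discussed in the thread.
data class Metadata(val name: String)

interface MetadataReader {
    fun readAll(): List<Metadata>
}

val supportedTypes = listOf("audio", "video", "image")

fun metadataPersistenceHelper(type: String): MetadataReader =
    object : MetadataReader {
        // Stand-in for the heavy IO read.
        override fun readAll() = listOf(Metadata("$type-1"), Metadata("$type-2"))
    }

// All-Flow sketch: one flow per type, each moved onto Dispatchers.IO,
// then merged so the per-type reads run concurrently.
fun readAllFlow(): Flow<Metadata> =
    supportedTypes
        .map { type ->
            flow { emitAll(metadataPersistenceHelper(type).readAll().asFlow()) }
                .flowOn(Dispatchers.IO)
        }
        .merge()
```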
n
No real reason to stick with streams; that is just a left-over from an earlier Java implementation. We always consume all the results anyway, so I was actually thinking of converting to use `List` (and thus return `List<Metadata>`)
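A sketch of that `List<Metadata>` variant, again with hypothetical stand-ins for the thread's types: one `async` per type on `Dispatchers.IO`, then `awaitAll` and `flatten`, which keeps the reads concurrent while dropping `Stream` entirely:

```kotlin
import kotlinx.coroutines.*

// Hypothetical stand-ins for the types discussed in the thread.
data class Metadata(val name: String)

interface MetadataReader {
    fun readAll(): List<Metadata>
}

val supportedTypes = listOf("audio", "video", "image")

fun metadataPersistenceHelper(type: String): MetadataReader =
    object : MetadataReader {
        override fun readAll() = listOf(Metadata("$type-1"), Metadata("$type-2"))
    }

// List-based sketch: one async per type, then awaitAll and flatten
// into a single List<Metadata>.
suspend fun readAllList(): List<Metadata> = coroutineScope {
    supportedTypes
        .map { type ->
            async(Dispatchers.IO + CoroutineName("Read all for $type")) {
                metadataPersistenceHelper(type).readAll()
            }
        }
        .awaitAll()
        .flatten()
}
```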
h
I'm wondering if it's a good or bad idea to use `runBlocking` in a non-test scenario 🤔
o
it depends on how far upwards you can propagate the `suspend` modifier
if you're being called by some Java code that won't know what a `suspend` function is, you either have to `launch` or use `runBlocking`
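A sketch of that bridging pattern, with hypothetical function names: the `suspend` function stays the primary Kotlin API, and a thin `runBlocking` wrapper is exposed as an entry point for Java callers until everything is migrated:

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Primary Kotlin API: a suspend function.
suspend fun readCount(): Int {
    delay(10)  // stand-in for suspending IO
    return 42
}

// Java-friendly entry point: blocks the calling thread via runBlocking,
// since Java callers cannot use suspend functions directly.
fun readCountBlocking(): Int = runBlocking { readCount() }
```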
h
Got it
@nkiesel are you calling this code from Java?
n
yes (until we switch everything to Kotlin but that might be a few years...).
h
😄 well... Okay
n
Really I have 2 concerns: (1) does that do what we want (i.e. concurrently run the internal `readAll`) and (2) are we abusing coroutines here
o
yes, it works. I don't really know if it's abusing them, but I can think of a better way to write this
n
and perhaps (3) is `channelFlow` + `launch` + `reduce` the right approach (I do understand that it will blow up with a large set of supportedTypes, but this comes from an enum which will never have more than a dozen or so items)
o
I'm curious, what does `metadataPersistenceHelper(type)` return?
n
it returns a type-specific reader; they all implement an interface containing `readAll`
o
so personally I prefer to implement like this:
```kotlin
return runBlocking {
    flow {
        // coroutineScope plus flowOn below, rather than withContext here:
        // emitting from inside withContext would violate the flow
        // context-preservation invariant and fail at runtime.
        coroutineScope {
            Type.supportedTypes.forEach { type ->
                val deferredStream = async(CoroutineName("Read all for $type")) {
                    metadataPersistenceHelper(type).readAll()
                }
                emit(deferredStream)
            }
        }
    }
        .flowOn(Dispatchers.IO)
        .buffer(Channel.UNLIMITED)
        .map { it.await() }
        .reduce { accumulator, value -> Stream.concat(accumulator, value) }
}
```
since it's a little more sequential-like
n
interesting! I will go through this and try to understand why we e.g. need the `.buffer(Channel.UNLIMITED)` here
o
as a hint, the UNLIMITED part is not strictly necessary; a smaller buffer size would put a bound on how many tasks run in parallel, but since you said that's irrelevant (the set of types is already small and bounded), I left it UNLIMITED
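A toy demonstration (hypothetical tasks, not the thread's code) of what the buffer changes: each emitted value is a `Deferred` that `async` has already started, but without a buffer `emit` suspends until the downstream `map` has awaited the previous value, so the tasks effectively run one at a time:

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.flow.*
import kotlin.system.measureTimeMillis

// Runs four 100ms tasks through the flow-of-Deferreds pipeline and
// returns the elapsed time. Unbuffered, emit suspends on every value
// until map has awaited it (~4 x 100ms); with buffer(Channel.UNLIMITED)
// all Deferreds are emitted (and thus started) up front (~100ms).
fun runTasks(buffered: Boolean): Long = runBlocking {
    measureTimeMillis {
        flow {
            coroutineScope {
                repeat(4) {
                    emit(async { delay(100) })
                }
            }
        }
            .let { if (buffered) it.buffer(Channel.UNLIMITED) else it }
            .map { it.await() }
            .collect()
    }
}
```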
n
in my original version, looks like I could use `channelFlow {` instead of `channelFlow<Stream<Metadata>>`, no?
o
I think so, it might depend on the new inference system which is enabled in the IDE but not in the compiler by default
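A small sketch of the inference point (hypothetical example): the element type of `channelFlow` can be inferred from the declared return type, or with builder-type inference from the `send` calls, so the explicit type argument can usually be dropped:

```kotlin
import kotlinx.coroutines.flow.*

// No explicit type argument on channelFlow: the compiler infers the
// element type Int from the declared return type Flow<Int>.
fun numbers(): Flow<Int> = channelFlow {
    send(1)
    send(2)
}
```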
n
Good point; forgot about that. But actually in this case it also works when building with `./gradlew jar`