kotlinlang #coroutines

I'm using Dispatchers.IO while reading data from...

Tim Malseed

10/27/2020, 1:08 AM

I'm using Dispatchers.IO while reading data from files on disk. It can take anywhere from 1-5 mins to complete. But, it's using up basically all of the available resources. Is there a way to kind of cap the amount of work being done on a particular dispatcher?

Tim Malseed

10/27/2020, 1:09 AM

For example, create some kind of Dispatcher similar to the IO one, but it only uses 75% of the available threads?

Zach Klippenstein (he/him) [MOD]

10/27/2020, 1:11 AM

Use a semaphore to limit how much work you're doing
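
A minimal sketch of what the semaphore suggestion could look like, assuming kotlinx.coroutines is on the classpath. `parseFile`, `parseAll`, and `maxConcurrent` are hypothetical names used for illustration; `parseFile` stands in for the blocking native call described in this thread.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit
import kotlinx.coroutines.withContext

// Hypothetical stand-in for the blocking native read-and-parse call.
fun parseFile(path: String): Int = path.length

// Processes every path, but allows at most maxConcurrent calls in flight.
suspend fun parseAll(paths: List<String>, maxConcurrent: Int): List<Int> = coroutineScope {
    val semaphore = Semaphore(maxConcurrent)
    paths.map { path ->
        async {
            // Suspends, without blocking a thread, until a permit is free.
            semaphore.withPermit {
                withContext(Dispatchers.IO) { parseFile(path) }
            }
        }
    }.awaitAll()
}

fun main() = runBlocking {
    val results = parseAll(List(20) { "file$it.dat" }, maxConcurrent = 4)
    println(results.size)
}
```

All coroutines are created up front, but only `maxConcurrent` of them occupy a Dispatchers.IO thread at any moment; the rest stay suspended on the semaphore, which leaves the remaining threads free for other work.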

Tim Malseed

10/27/2020, 1:12 AM

Could you elaborate a little?

TwoClocks

10/27/2020, 1:16 AM

I'm confused. Are you launching more than one disk operation at a time?

TwoClocks

10/27/2020, 1:17 AM

because one launch shouldn't ever use more than one thread. at least I don't think so...

Tim Malseed

10/27/2020, 1:19 AM

The operation involves reading a file from disk, then parsing it. It's bottlenecked by CPU, rather than disk access

TwoClocks

10/27/2020, 1:19 AM

makes sense. It still shouldn't use more than one CPU though. (unless it spawns other threads somehow).

Tim Malseed

10/27/2020, 1:20 AM

I'm not sure what happens under the hood. Running these operations in parallel is about 7x faster (on my device). But, it means all other coroutine based operations are pretty much suspended until this work is complete

TwoClocks

10/27/2020, 1:21 AM

yeah. Dispatchers.IO == threads. it's the same as if you spun up a bunch of threads that don't yield anywhere. They take up all the resources.

Tim Malseed

10/27/2020, 1:22 AM

Ok. So, should I be using some other Dispatcher?

TwoClocks

10/27/2020, 1:23 AM

maybe.. depends. try doing the file IO on Dispatchers.IO but do the CPU work on a different type of dispatcher.

Tim Malseed

10/27/2020, 1:24 AM

I don't have the luxury - it's all happening in native code, under a single function entry point

TwoClocks

10/27/2020, 1:24 AM

but you probably need to look at the guts of the CPU intensive code, and figure out if it's creating threads, or what it's doing.

TwoClocks

10/27/2020, 1:24 AM

is the code kotlin and coroutine aware, or is it java?

Tim Malseed

10/27/2020, 1:25 AM

I'm calling a native function, which takes a file descriptor, reads in the file and parses it, and returns an object back to Java. This is called from Kotlin, with Coroutines

TwoClocks

10/27/2020, 1:26 AM

oh.. native like... C or some other language.. hmm

Tim Malseed

10/27/2020, 1:26 AM

Yes, sorry for not being clear. C++/JNI

Tim Malseed

10/27/2020, 1:26 AM

I think there's nothing wrong with the approach. I just don't want to use max threads for this task

TwoClocks

10/27/2020, 1:26 AM

you can try a different dispatcher, but I don't think it'll matter. The C code is just gonna run (unless you modify it); you can't control it from the JVM.

TwoClocks

10/27/2020, 1:27 AM

I'm not sure what you mean by "max threads"?

Tim Malseed

10/27/2020, 1:28 AM

I don't want to use all available resources to do this work

TwoClocks

10/27/2020, 1:28 AM

you mean JVM threads, or just all the processors are being used?

TwoClocks

10/27/2020, 1:29 AM

are you sure you aren't launching more than one of these native things at a time?

Tim Malseed

10/27/2020, 1:29 AM

Whatever is going on behind the scenes, when I pass 3500 tasks to be completed with Dispatchers.IO - I would like it to use less than the maximum available resources

TwoClocks

10/27/2020, 1:29 AM

oh! yeah!

TwoClocks

10/27/2020, 1:29 AM

that's your problem!

Tim Malseed

10/27/2020, 1:29 AM

I am intentionally launching more than one native task at a time

TwoClocks

10/27/2020, 1:29 AM

don't do that

Tim Malseed

10/27/2020, 1:29 AM

lol

Tim Malseed

10/27/2020, 1:29 AM

That's not helpful at all

TwoClocks

10/27/2020, 1:30 AM

you can limit the # of threads Dispatchers.IO will use.

Tim Malseed

10/27/2020, 1:30 AM

I want parallelism. But I don't want to use max available resources.

TwoClocks

10/27/2020, 1:30 AM

yeah. you should launch N, and wait for one to finish before you launch the next, where N is < the number of CPUs you have.

TwoClocks

10/27/2020, 1:31 AM

I think there is launch code to do that for you.

Tim Malseed

10/27/2020, 1:31 AM

Yeah, so that's what I'm asking about here. I don't want to actually calculate number of available CPUs if I don't have to. I'm hoping there's some sort of Coroutine abstraction that does this for me.

TwoClocks

10/27/2020, 1:31 AM

https://kotlin.github.io/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/-dispatchers/-i-o.html

TwoClocks

10/27/2020, 1:33 AM

I remember looking at the constructor for Dispatchers.IO. It knows how many CPUs there are, and you can pull that value out of it. Use that - 1 as the cap for the # of threads and you should be fine.

Tim Malseed

10/27/2020, 1:37 AM

OK, sounds like a reasonable approach

TwoClocks

10/27/2020, 1:38 AM

here ya go

Runtime.getRuntime().availableProcessors()

TwoClocks

10/27/2020, 1:38 AM

limit the number of threads to that -1 and your system won't get swamped.
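
A rough sketch of the "cores - 1" idea, assuming kotlinx.coroutines is available. It uses a dedicated fixed-size thread pool as a dispatcher instead of changing Dispatchers.IO itself, which is one possible way to apply the cap and not necessarily what was meant above. `workerThreadCount` and `parseFile` are hypothetical names.

```kotlin
import java.util.concurrent.Executors
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.runBlocking

// One fewer thread than the reported processors, but never fewer than one.
fun workerThreadCount(processors: Int = Runtime.getRuntime().availableProcessors()): Int =
    (processors - 1).coerceAtLeast(1)

// Hypothetical stand-in for the CPU-heavy native call.
fun parseFile(path: String): Int = path.length

fun main() {
    // A dedicated pool: the parsing work can never use more than this many threads.
    val dispatcher = Executors.newFixedThreadPool(workerThreadCount()).asCoroutineDispatcher()
    try {
        val results = runBlocking {
            List(20) { "file$it.dat" }
                .map { path -> async(dispatcher) { parseFile(path) } }
                .awaitAll()
        }
        println(results.size)
    } finally {
        dispatcher.close() // shuts down the underlying executor
    }
}
```

Because the pool is separate from Dispatchers.IO and Dispatchers.Default, other coroutines keep their own threads while the parsing runs. Whether cores - 1 is the right number for a given device is something to measure.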

Tim Malseed

10/27/2020, 1:40 AM

Yeah. I'm wondering whether cores - 1 is the optimal number. I actually have no idea - I don't want to jam everything up, but I also want this to complete fairly quickly (relatively speaking)

TwoClocks

10/27/2020, 1:41 AM

well, once you exceed the number of CPUs, it's not going to go any faster. in fact it'll likely slow down (assuming it's all CPU bound).

Tim Malseed

10/27/2020, 1:42 AM

Does this assume no hyperthreading (possibly dumb question for mobile devices)

TwoClocks

10/27/2020, 1:43 AM

in theory availableProcessors() should give you execution channels, cores or hyperthreads or whatever.

Tim Malseed

10/27/2020, 1:43 AM

Ah OK

Tim Malseed

10/27/2020, 1:44 AM

Thanks for your help

TwoClocks

10/27/2020, 1:44 AM

in reality, I'm not sure anyone uses hyperthreads any more. That was kinda a 90's Intel thing...

TwoClocks

10/27/2020, 1:44 AM

but I'm not up on my ARM microcode these days.
