#coroutines

Tim Malseed

10/27/2020, 1:08 AM
I'm using Dispatchers.IO while reading data from files on disk. It can take anywhere from 1-5 mins to complete. But, it's using up basically all of the available resources. Is there a way to kind of cap the amount of work being done on a particular dispatcher?
For example, create some kind of Dispatcher similar to the IO one, but it only uses 75% of the available threads?

Zach Klippenstein (he/him) [MOD]

10/27/2020, 1:11 AM
Use a semaphore to limit how much work you're doing
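A minimal sketch of the semaphore idea, assuming kotlinx.coroutines is on the classpath. `parseFile` is a hypothetical stand-in for the blocking native call, and `maxConcurrent` is an illustrative parameter, not something from this thread. Every task is still started on Dispatchers.IO, but only `maxConcurrent` of them can be inside the blocking call at once; the rest suspend while waiting for a permit instead of holding a thread.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit

// Hypothetical stand-in for the blocking native parse call.
fun parseFile(id: Int): Int {
    Thread.sleep(5)
    return id * 2
}

// Runs parseFile for every id with at most maxConcurrent calls in flight.
suspend fun parseAll(ids: List<Int>, maxConcurrent: Int): List<Int> = coroutineScope {
    val semaphore = Semaphore(maxConcurrent)
    ids.map { id ->
        async(Dispatchers.IO) {
            // Suspends (without blocking a thread) until a permit is free.
            semaphore.withPermit { parseFile(id) }
        }
    }.awaitAll() // results come back in the same order as ids
}

fun main() {
    runBlocking {
        val results = parseAll((1..20).toList(), maxConcurrent = 3)
        println(results.size) // prints 20
    }
}
```

The permit count is the knob: a lower value leaves more threads free for other coroutines, at the cost of a longer total run.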

Tim Malseed

10/27/2020, 1:12 AM
Could you elaborate a little?

TwoClocks

10/27/2020, 1:16 AM
I'm confused. Are you launching more than one disk operation at a time?
because one launch shouldn't ever use more than one thread. at least I don't think so...

Tim Malseed

10/27/2020, 1:19 AM
The operation involves reading a file from disk, then parsing it. It's bottlenecked by CPU, rather than disk access

TwoClocks

10/27/2020, 1:19 AM
makes sense. It still shouldn't use more than one CPU though. (unless it spawns other threads somehow).

Tim Malseed

10/27/2020, 1:20 AM
I'm not sure what happens under the hood. Running these operations in parallel is about 7x faster (on my device). But, it means all other coroutine based operations are pretty much suspended until this work is complete

TwoClocks

10/27/2020, 1:21 AM
yeah. Dispatchers.IO == threads. it's the same as if you spun up a bunch of threads that don't yield anywhere. They take up all the resources.

Tim Malseed

10/27/2020, 1:22 AM
Ok. So, should I be using some other Dispatcher?

TwoClocks

10/27/2020, 1:23 AM
maybe.. depends. try doing the file IO on Dispatchers.IO but do the CPU work on a different type of dispatcher.
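A rough sketch of that split, assuming the read and the parse are separable steps (which, as it turns out further down, is not the case for the native function in question). `parse` and `loadAndParse` are hypothetical names used only for illustration. The blocking read runs on Dispatchers.IO and the CPU-bound step runs on Dispatchers.Default, whose thread count is tied to the number of cores.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import java.io.File

// Hypothetical CPU-bound parse step; here it only counts lines.
fun parse(bytes: ByteArray): Int = String(bytes, Charsets.UTF_8).lines().size

suspend fun loadAndParse(file: File): Int {
    // Blocking disk read on the IO dispatcher.
    val bytes = withContext(Dispatchers.IO) { file.readBytes() }
    // CPU-bound work on the Default dispatcher.
    return withContext(Dispatchers.Default) { parse(bytes) }
}

fun main() {
    val tmp = File.createTempFile("sample", ".txt").apply { writeText("a\nb\nc") }
    runBlocking {
        println(loadAndParse(tmp)) // prints 3
    }
    tmp.delete()
}
```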

Tim Malseed

10/27/2020, 1:24 AM
I don't have the luxury - it's all happening in native code, under a single function entry point

TwoClocks

10/27/2020, 1:24 AM
but you probably need to look at the guts of the CPU intensive code, and figure out if it's creating threads, or what it's doing.
is the code kotlin and coroutine aware, or is it java?

Tim Malseed

10/27/2020, 1:25 AM
I'm calling a native function, which takes a file descriptor, reads in the file and parses it, and returns an object back to Java. This is called from Kotlin, with Coroutines

TwoClocks

10/27/2020, 1:26 AM
oh.. native like... C or some other language.. hmm

Tim Malseed

10/27/2020, 1:26 AM
Yes, sorry for not being clear. C++/JNI
I think there's nothing wrong with the approach. I just don't want to use max threads for this task

TwoClocks

10/27/2020, 1:26 AM
you can try a different dispatcher, but I don't think it'll matter. The C code is just gonna run (unless you modify it); you can't control it from the JVM.
I'm not sure what you mean by "max threads"?

Tim Malseed

10/27/2020, 1:28 AM
I don't want to use all available resources to do this work

TwoClocks

10/27/2020, 1:28 AM
you mean JVM threads, or just all the processors are being used?
are you sure you aren't launching more than one of these native things at a time?

Tim Malseed

10/27/2020, 1:29 AM
Whatever is going on behind the scenes, when I pass 3500 tasks to be completed with Dispatchers.IO - I would like it to use less than the maximum available resources

TwoClocks

10/27/2020, 1:29 AM
oh! yeah!
that's your problem!

Tim Malseed

10/27/2020, 1:29 AM
I am intentionally launching more than one native task at a time

TwoClocks

10/27/2020, 1:29 AM
don't do that

Tim Malseed

10/27/2020, 1:29 AM
lol
That's not helpful at all

TwoClocks

10/27/2020, 1:30 AM
you can limit the # of threads Dispatchers.IO will use.

Tim Malseed

10/27/2020, 1:30 AM
I want parallelism. But I don't want to use max available resources.

TwoClocks

10/27/2020, 1:30 AM
yeah. you should launch N, and wait for one to finish before you launch the next, where N is < the number of CPUs you have.
I think there is launch code to do that for you.
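One way to get "launch N, and start the next only when one finishes" is a fixed set of worker coroutines pulling from a shared channel. This is a sketch under assumptions, not a built-in launch option: `runTask` is a hypothetical stand-in for the native call, and `runWithWorkers` and `workers` are illustrative names.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.ConcurrentLinkedQueue

// Hypothetical stand-in for one blocking native task.
fun runTask(id: Int): Int = id * id

// Starts `workers` coroutines; each takes its next task only after
// finishing the current one, so at most `workers` tasks run at a time.
suspend fun runWithWorkers(taskIds: List<Int>, workers: Int): List<Int> {
    val results = ConcurrentLinkedQueue<Int>()
    coroutineScope {
        val queue = Channel<Int>(Channel.UNLIMITED)
        for (id in taskIds) queue.send(id) // never suspends: the buffer is unlimited
        queue.close()                      // workers stop once the queue is drained
        repeat(workers) {
            launch(Dispatchers.IO) {
                for (id in queue) results.add(runTask(id))
            }
        }
    }
    // Completion order is not deterministic, so sort before returning.
    return results.sorted()
}

fun main() {
    runBlocking {
        println(runWithWorkers((1..5).toList(), workers = 2)) // prints [1, 4, 9, 16, 25]
    }
}
```

Compared with a semaphore, this creates only `workers` coroutines rather than one per task, which may matter with thousands of tasks.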

Tim Malseed

10/27/2020, 1:31 AM
Yeah, so that's what I'm asking about here. I don't want to actually calculate number of available CPUs if I don't have to. I'm hoping there's some sort of Coroutine abstraction that does this for me.
I remember looking at the constructor for Dispatchers.IO. It knows how many CPUs there are, and you can pull that value out of it. Use that-1 for the cap for the # of threads and you should be fine.

Tim Malseed

10/27/2020, 1:37 AM
OK, sounds like a reasonable approach

TwoClocks

10/27/2020, 1:38 AM
here ya go
Runtime.getRuntime().availableProcessors()
limit the number of threads to that -1 and your system won't get swamped.
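A sketch of that cap as a dedicated dispatcher, assuming kotlinx.coroutines on the JVM. `parseOne`, `cappedThreadCount`, and `parseOnCappedPool` are hypothetical names. The pool is separate from Dispatchers.IO, so the parse work cannot occupy more than `availableProcessors() - 1` threads regardless of how many tasks are queued.

```kotlin
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.runBlocking
import java.util.concurrent.Executors

// One fewer than the reported processor count, but never less than 1.
fun cappedThreadCount(): Int =
    (Runtime.getRuntime().availableProcessors() - 1).coerceAtLeast(1)

// Hypothetical stand-in for the blocking native parse call.
fun parseOne(id: Int): Int = id + 1

fun parseOnCappedPool(ids: List<Int>): List<Int> {
    val dispatcher = Executors.newFixedThreadPool(cappedThreadCount()).asCoroutineDispatcher()
    // use {} closes the dispatcher, which shuts the thread pool down afterwards.
    return dispatcher.use { pool ->
        runBlocking {
            ids.map { id -> async(pool) { parseOne(id) } }.awaitAll()
        }
    }
}

fun main() {
    println(parseOnCappedPool(listOf(1, 2, 3))) // prints [2, 3, 4]
}
```

In a long-lived app the dispatcher would more likely be created once and reused rather than built and closed per batch.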

Tim Malseed

10/27/2020, 1:40 AM
Yeah. I'm wondering whether cores - 1 is the optimal number. I actually have no idea - I don't want to jam everything up, but I also want this to complete fairly quickly (relatively speaking)

TwoClocks

10/27/2020, 1:41 AM
well, once you exceed the number of CPUs, it's not going to go any faster. in fact it'll likely slow down (assuming it's all CPU bound).

Tim Malseed

10/27/2020, 1:42 AM
Does this assume no hyperthreading? (possibly dumb question for mobile devices)

TwoClocks

10/27/2020, 1:43 AM
in theory availableProcessors() should give you execution channels, cores or hyperthreads or whatever.

Tim Malseed

10/27/2020, 1:43 AM
Ah OK
Thanks for your help

TwoClocks

10/27/2020, 1:44 AM
in reality, I'm not sure anyone uses hyperthreads any more. That was kinda a '90s Intel thing...
but I'm not up on my ARM microcode these days.