# coroutines
t
I'm using Dispatchers.IO while reading data from files on disk. It can take anywhere from 1-5 mins to complete. But, it's using up basically all of the available resources. Is there a way to kind of cap the amount of work being done on a particular dispatcher?
For example, create some kind of Dispatcher similar to the IO one, but it only uses 75% of the available threads?
z
Use a semaphore to limit how much work you're doing
t
Could you elaborate a little?
t
I'm confused. Are you launching more than one disk operation at a time?
because one launch shouldn't ever use more than one thread. at least I don't think so...
t
The operation involves reading a file from disk, then parsing it. It's bottlenecked by CPU, rather than disk access
t
makes sense. It still shouldn't use more than one CPU though. (unless it spawns other threads somehow).
t
I'm not sure what happens under the hood. Running these operations in parallel is about 7x faster (on my device). But, it means all other coroutine based operations are pretty much suspended until this work is complete
t
yeah. Dispatchers.IO == threads. It's the same as if you spun up a bunch of threads that don't yield anywhere. They take up all the resources.
t
Ok. So, should I be using some other Dispatcher?
t
maybe.. depends. Try doing the file IO on Dispatchers.IO but do the CPU work on a different type of dispatcher.
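Roughly what I mean, as a sketch only: it assumes the read and the parse can be called separately, and `parse` and `loadAndParse` are made-up names standing in for your real code.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import java.io.File

// Hypothetical stand-in for the CPU-heavy parsing step.
fun parse(bytes: ByteArray): Int = bytes.size

// Reads on Dispatchers.IO (blocking disk access), then parses on
// Dispatchers.Default, which is sized to the number of CPU cores.
suspend fun loadAndParse(file: File): Int {
    val bytes = withContext(Dispatchers.IO) { file.readBytes() }
    return withContext(Dispatchers.Default) { parse(bytes) }
}

fun main(): Unit = runBlocking {
    val file = File.createTempFile("demo", ".bin")
    file.writeBytes(byteArrayOf(1, 2, 3))
    println(loadAndParse(file))
    file.delete()
}
```

If the read and the parse are fused into one call, this split isn't available and you'd have to cap concurrency some other way.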
t
I don't have the luxury - it's all happening in native code, under a single function entry point
t
but you probably need to look at the guts of the CPU intensive code, and figure out if it's creating threads, or what it's doing.
is the code kotlin and coroutine aware, or is it java?
t
I'm calling a native function, which takes a file descriptor, reads in the file and parses it, and returns an object back to Java. This is called from Kotlin, with Coroutines
t
oh.. native like... C or some other language.. hmm
t
Yes, sorry for not being clear. C++/JNI
I think there's nothing wrong with the approach. I just don't want to use max threads for this task
t
you can try a different dispatcher, but I don't think it'll matter. The C code is just gonna run (unless you modify it); you can't control it from the JVM.
I'm not sure what you mean by "max threads"?
t
I don't want to use all available resources to do this work
t
you mean JVM threads, or just that all the processors are being used?
are you sure you aren't launching more than one of these native things at a time?
t
Whatever is going on behind the scenes, when I pass 3500 tasks to be completed with Dispatchers.IO - I would like it to use less than the maximum available resources
t
oh! yeah!
that's your problem!
t
I am intentionally launching more than one native task at a time
t
don't do that
t
lol
That's not helpful at all
t
you can limit the # of threads Dispatchers.IO will use.
t
I want parallelism. But I don't want to use max available resources.
t
yeah. you should launch N, and wait for one to finish before you launch the next, where N is < the number of CPUs you have.
I think there is launch code to do that for you.
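Something like this is what the semaphore suggestion earlier amounts to. It's a minimal sketch: `parseFile` is a made-up placeholder for the native call, and `parseAll` and `maxConcurrent` are names I picked for illustration.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit

// Hypothetical stand-in for the native read-and-parse call.
fun parseFile(path: String): Int = path.length

// Starts one coroutine per path, but lets at most maxConcurrent
// of them run parseFile at the same time.
suspend fun parseAll(paths: List<String>, maxConcurrent: Int): List<Int> = coroutineScope {
    val semaphore = Semaphore(maxConcurrent)
    paths.map { path ->
        async(Dispatchers.IO) {
            // Suspends (without blocking a thread) until a permit is free.
            semaphore.withPermit { parseFile(path) }
        }
    }.awaitAll()
}

fun main(): Unit = runBlocking {
    val paths = List(3500) { "file-$it.dat" }
    val results = parseAll(paths, maxConcurrent = 4)
    println(results.size)
}
```

Coroutines waiting for a permit are suspended rather than holding a thread, so the other work on the dispatcher can still get scheduled.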
t
Yeah, so that's what I'm asking about here. I don't want to actually calculate number of available CPUs if I don't have to. I'm hoping there's some sort of Coroutine abstraction that does this for me.
t
I remember looking at the constructor for Dispatchers.IO. It knows how many CPUs there are, and you can pull that value out of it. Use that - 1 as the cap for the # of threads and you should be fine.
t
OK, sounds like a reasonable approach
t
here ya go
Runtime.getRuntime().availableProcessors()
limit the number of threads to that - 1 and your system won't get swamped.
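One way to wire that up, as a sketch: it assumes kotlinx.coroutines 1.6 or later, where `limitedParallelism` exists (it was marked experimental in the earlier releases that have it). `workerCount` and `parseFile` are made-up names for illustration.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.runBlocking

// Leave one core free, but never drop below a single worker.
fun workerCount(cores: Int = Runtime.getRuntime().availableProcessors()): Int =
    (cores - 1).coerceAtLeast(1)

// Hypothetical stand-in for the native read-and-parse call.
fun parseFile(path: String): Int = path.length

fun main(): Unit = runBlocking {
    // A view of Dispatchers.IO that runs at most workerCount() tasks at once.
    val parsingDispatcher = Dispatchers.IO.limitedParallelism(workerCount())
    val results = List(3500) { "file-$it.dat" }
        .map { path -> async(parsingDispatcher) { parseFile(path) } }
        .awaitAll()
    println(results.size)
}
```

Whether cores - 1 is the right number is something you'd have to measure on the device.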
t
Yeah. I'm wondering whether cores - 1 is the optimal number. I actually have no idea - I don't want to jam everything up, but I also want this to complete fairly quickly (relatively speaking)
t
well, once you exceed the number of CPUs, it's not going to go any faster. In fact it'll likely slow down (assuming it's all CPU bound).
t
Does this assume no hyperthreading? (possibly dumb question for mobile devices)
t
in theory
availableProcessors()
should give you execution channels, cores or hyperthreads or whatever.
t
Ah OK
Thanks for your help
t
in reality, I'm not sure anyone uses hyperthreads any more. That was kinda a '90s Intel thing...
but I'm not up on my ARM microcode these days.