Is there a way to read/write a file without blocki...
# coroutines
l
Is there a way to read/write a file without blocking a thread at all that I could put in a simple extension function? Right now, the easiest way I know is blocking a thread from
Dispatchers.IO
and using
readText()
or
readBytes()
, but this still blocks a thread while we are waiting for storage and not the CPU.
d
You could use Java's
AsynchronousFileChannel
, which is callback based and could be adapted to
suspend
functions.
👍 1
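For illustration, adapting the callback-based `AsynchronousFileChannel` to `suspend` functions could look roughly like the sketch below. The names `readSuspending` and `readBytesAsync` are hypothetical helpers made up for this example, not part of kotlinx.coroutines or the JDK. The sketch only shows the callback-to-suspend adaptation and makes no claim about how the channel performs the read underneath.

```kotlin
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.suspendCancellableCoroutine
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer
import java.nio.channels.AsynchronousFileChannel
import java.nio.channels.CompletionHandler
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardOpenOption
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException

// Hypothetical helper: suspends until a single read completes.
// Returns the number of bytes read, or -1 at end of file.
suspend fun AsynchronousFileChannel.readSuspending(buffer: ByteBuffer, position: Long): Int =
    suspendCancellableCoroutine { cont ->
        read(buffer, position, Unit, object : CompletionHandler<Int, Unit> {
            override fun completed(result: Int, attachment: Unit) {
                cont.resume(result)
            }

            override fun failed(exc: Throwable, attachment: Unit) {
                cont.resumeWithException(exc)
            }
        })
    }

// Hypothetical helper: reads the whole file by issuing reads one after another.
suspend fun Path.readBytesAsync(): ByteArray =
    AsynchronousFileChannel.open(this, StandardOpenOption.READ).use { channel ->
        val out = ByteArrayOutputStream()
        val buffer = ByteBuffer.allocate(8192)
        var position = 0L
        while (true) {
            val n = channel.readSuspending(buffer, position)
            if (n < 0) break               // end of file
            out.write(buffer.array(), 0, n)
            position += n
            buffer.clear()                 // reuse the buffer for the next read
        }
        out.toByteArray()
    }

fun main(): Unit = runBlocking {
    val path = Files.createTempFile("demo", ".bin")
    try {
        Files.write(path, "hello async".toByteArray())
        println(String(path.readBytesAsync()))
    } finally {
        Files.deleteIfExists(path)
    }
}
```

Note that cancelling the coroutine here does not abort a read that is already in flight; the completion callback is simply ignored.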
p
I don't get it. Doesn't a thread have to be blocked somewhere while performing the disk read?
o
technically yes, but with async io (on the JVM) it is only ever one thread in total that waits for the IO, and then runs the callback. in theory it could be even better, because at the native level IO can be completely non blocking due to
select
, but in Java async io simply runs
select
forever in a jvm-managed thread
l
Doesn't Vert.x do this? It is like Node.js on the JVM: it has an event loop and uses epoll to place a task on the event loop's queue when data is ready.
o
yes, but since Java 7 (NIO.2) there is native support for it using the aforementioned
AsynchronousFileChannel
s
There was this, I have a feeling it was deprecated/removed but might be a basis for something - https://github.com/Kotlin/kotlinx.coroutines/tree/87eaba8a287285d4c47f84c91df7671fcb58271f/integration/kotlinx-coroutines-nio
l
Interesting! @elizarov Is there a reason this Java NIO integration got removed since then?
e
AsynchronousFileChannel is not a good solution for async file IO.
And the APIs based on those channels were not easy to use.
l
So you'd recommend me to keep using
Dispatchers.IO { file.readText() }
?
s
Yeah, if @elizarov says AsynchronousFileChannel is a bad solution then that's good enough for me but I would be very interested in knowing why that is.
e
First of all, it does not really work. On Linux there’s really no async file api that would work across different file systems. Essentially, what
AsynchronousFileChannel
does inside is not different from
Dispatchers.IO { file.readText() }
, but with much more overhead and a much less convenient API on top of that.
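As a minimal sketch of the "simple extension function" asked about at the start of the thread, the blocking call can be wrapped in `withContext(Dispatchers.IO)`. The names `readTextSuspending` and `writeTextSuspending` are hypothetical and chosen for this example only. A thread from the IO dispatcher is still blocked during the read; the calling coroutine's thread is not.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import java.io.File

// Hypothetical helpers: run the blocking stdlib calls on Dispatchers.IO
// so that the caller's thread is free while the file is read or written.
suspend fun File.readTextSuspending(): String =
    withContext(Dispatchers.IO) { readText() }

suspend fun File.writeTextSuspending(text: String): Unit =
    withContext(Dispatchers.IO) { writeText(text) }

fun main(): Unit = runBlocking {
    val file = File.createTempFile("demo", ".txt")
    try {
        file.writeTextSuspending("hello")
        println(file.readTextSuspending())
    } finally {
        file.delete()
    }
}
```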
u
I think posix does not define non-blocking io on files. Only on sockets.
https://www.remlab.net/op/nonblock.shtml
Regular files are always readable and they are also always writeable. This is clearly stated in the relevant POSIX specifications. I cannot stress this enough. Putting a regular file in non-blocking has ABSOLUTELY no effects other than changing one bit in the file flags.
Linux being the exception, its posix AIO implementation in glibc emulates async operations with user level threads, whereas its native async I/O interface (io_submit() etc.) are truly asynchronous all the way down to the driver, assuming the driver supports it.
l
So on Apple's platforms, we could use async filesystem I/O because its implementation is actually async, unlike on Linux?
e
But you don’t write backend apps on Apple platforms anyway, so you should not really care
l
I wasn't thinking about writing a backend app, just about making a program as efficient as possible so that it can do more, or let other programs do more, at the same time. But now that makes me wonder… if it is more efficient to use macOS for I/O-intensive apps rather than Linux, would that finally make macOS more cost-efficient for backends? When Apple announced the new Mac Mini, they advertised rack/cluster capabilities and said that some customers were doing it (e.g. for CI). They did the same when talking about the new Mac Pro.
e
It does not really matter. In fact, blocking IO is often more efficient (has higher throughput). The reason we want to use non-blocking IO for network is to scale our services to thousands of connections. It makes no difference for a typical disk IO usage.
🙏 1
g
@elizarov May I ask a question: what would you recommend if the file reading operation should be cancellable?
e
I’d recommend structuring your code in a way that you wouldn’t need cancellable file IO. In the rare event that you need to read a really large file, which takes a lot of time, you can read it in limited-size blocks and call
ensureActive
between blocks to support cancellation.
👍 1
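A minimal sketch of the block-by-block approach described above, assuming a hypothetical helper named `readBytesCancellable` and an arbitrary default block size of 64 KiB. Cancellation is only observed between blocks, so a single blocking `read` call still runs to completion.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.ensureActive
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import java.io.ByteArrayOutputStream
import java.io.File

// Hypothetical helper: reads the file in limited-size blocks and checks
// for cancellation before each block.
suspend fun File.readBytesCancellable(blockSize: Int = 64 * 1024): ByteArray =
    withContext(Dispatchers.IO) {
        inputStream().use { input ->
            val out = ByteArrayOutputStream()
            val block = ByteArray(blockSize)
            while (true) {
                ensureActive()             // throws CancellationException if cancelled
                val n = input.read(block)  // blocking read of at most blockSize bytes
                if (n < 0) break           // end of file
                out.write(block, 0, n)
            }
            out.toByteArray()
        }
    }

fun main(): Unit = runBlocking {
    val file = File.createTempFile("demo", ".bin")
    try {
        file.writeBytes(ByteArray(200_000) { (it % 11).toByte() })
        val bytes = file.readBytesCancellable(blockSize = 4096)
        println("read ${bytes.size} bytes")
    } finally {
        file.delete()
    }
}
```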
g
I see, thank you!
e
Most applications don’t read large files and do not care whether file IO is cancellable or not.
l
If we were to read very big text files (like
View.java
from Android, ~30K lines of code), the best thing to do would be to use
bufferedReader()
, reading chunk by chunk and calling
ensureActive()
before reading each chunk?
e
I don’t think that 30K lines is big enough to really care. I’d say if it takes <1s don’t do anything special. No benefit.
l
A better example would be insanely huge log files. I recall waiting 5+ seconds to open one like that in Sublime Text in the past, but yeah, I agree with your point that it's not really needed if it's sub-second.
e
Yes. If you are writing a log viewer app, you’d better avoid reading it all into memory in the first place. Then if the user does something like “search”, you’d design it in such a way that it not only periodically checks for cancellation, but also periodically updates the UI to report progress.
👍 1
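One way such a search could be sketched is a streaming scan that never holds the whole file in memory and, every so often, both checks for cancellation and reports progress. Everything here is an assumption for illustration: the helper name `countMatches`, the `onProgress` callback shape, and the interval of 10,000 lines. The progress figure is approximate because it counts characters rather than bytes.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.ensureActive
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import java.io.File

// Hypothetical helper: counts lines containing `query` while streaming the file.
suspend fun File.countMatches(
    query: String,
    onProgress: (scanned: Long, total: Long) -> Unit,
): Int = withContext(Dispatchers.IO) {
    val total = length()
    var scanned = 0L      // approximate: characters plus one per line break
    var lineNo = 0
    var matches = 0
    useLines { lines ->
        for (line in lines) {
            if (line.contains(query)) matches++
            scanned += line.length + 1
            if (++lineNo % 10_000 == 0) {
                ensureActive()             // stop early if the search was cancelled
                onProgress(scanned, total) // called on the IO thread in this sketch
            }
        }
    }
    onProgress(total, total)               // final report: done
    matches
}

fun main(): Unit = runBlocking {
    val log = File.createTempFile("demo", ".log")
    try {
        log.writeText((1..25_000).joinToString("\n") {
            if (it % 5 == 0) "ERROR line $it" else "info line $it"
        })
        val hits = log.countMatches("ERROR") { scanned, total ->
            println("scanned ~$scanned of $total")
        }
        println("matches: $hits")
    } finally {
        log.delete()
    }
}
```

In a real UI app the progress callback would need to switch to the main thread before touching views; that part is left out here.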