# io
a
@e5l I will need to revisit IO soon and will try to bring my fork closer to the primary repository. There is still one important question about the usage of
Bytes
though. Will it be possible to use bytes for multi-read operations? The major problem for me so far was that once input is called on
Bytes
, it becomes invalid. If this is still the case, I will still need to introduce
Binary
for lazy multi-read, and I think that the API should be changed somehow to avoid rereading a
Bytes
object.
e
Hi, @altavir. We’re planning to make
Bytes
internal, and provide
BytesInput/BytesOutput
instead (with
size
and
remaining
).
BytesInput
should work well for caching purposes via
preview
and
discard
. Could you tell me if you have different use cases to cover?
a
Indeed, I have different needs. We quite frequently need to work with contiguous random-access multi-read binary blocks like ByteArray. Basically what I need is to be able to read from it via
Input
, fill it via
Output
and be able to read from it multiple times in different places in the code (and be sure it won't be freed until the GC collects it). The API for such binary blocks could be completely external, except for one point.
ByteArray
is not the only structure that has this behavior. The same goes for random access files. It does not make sense to pollute memory with long-living large objects when you can re-read from the file. My proposal is to introduce this Binary interface for multi-reads, then add one method to the Input API that reads a Binary of a given length. For most Inputs it will use the default ByteArray implementation, but specific inputs like ByteArrayInput or FileInput could use a more optimized approach (view-based for arrays and file-system based for files).
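The proposal above could look roughly like the following minimal sketch. Only the name `Binary` comes from the message; `SimpleInput`, `readBinary`, `ByteArrayBinary` and `ArrayBackedInput` are hypothetical stand-ins and are not the kotlinx-io API. The default `readBinary` copies into a `ByteArray`, while the array-backed input overrides it to return a view.

```kotlin
// Hypothetical sketch: none of these types are taken from kotlinx-io.

/** A one-shot sequential reader, standing in for the library's Input. */
interface SimpleInput {
    /** Returns the next byte as 0..255, or -1 when the input is exhausted. */
    fun read(): Int

    /** Default implementation: copy [length] bytes into a fresh array and wrap it. */
    fun readBinary(length: Int): Binary {
        val copy = ByteArray(length)
        for (i in 0 until length) {
            val b = read()
            require(b >= 0) { "Unexpected end of input at byte $i of $length" }
            copy[i] = b.toByte()
        }
        return ByteArrayBinary(copy, 0, length)
    }
}

/** A block of bytes that can be read any number of times. */
interface Binary {
    val size: Int

    /** Opens a fresh, independent reader positioned at the start of the block. */
    fun input(): SimpleInput
}

/** Binary backed by a region of an array; the array is shared, not copied. */
class ByteArrayBinary(
    private val data: ByteArray,
    private val offset: Int,
    override val size: Int
) : Binary {
    override fun input(): SimpleInput = ArrayBackedInput(data, offset, size)
}

/** Input over an array region, with the optimized view-based readBinary. */
class ArrayBackedInput(
    private val data: ByteArray,
    private var position: Int,
    length: Int
) : SimpleInput {
    private val end = position + length

    override fun read(): Int =
        if (position < end) data[position++].toInt() and 0xFF else -1

    // View-based override: no copy, so later writes to the array would be visible.
    override fun readBinary(length: Int): Binary {
        require(length in 0..(end - position)) { "Not enough bytes remaining" }
        val view = ByteArrayBinary(data, position, length)
        position += length
        return view
    }
}

fun main() {
    val source = ArrayBackedInput(byteArrayOf(1, 2, 3, 4, 5), 0, 5)
    val block = source.readBinary(3)
    val first = block.input()
    val second = block.input()
    println(first.read())   // 1
    println(first.read())   // 2
    println(second.read())  // 1, the second reader is independent of the first
    println(source.read())  // 4, the source continues after the block
}
```

A file-backed input could override `readBinary` in the same way and return a `Binary` that re-reads from the file on each `input()` call instead of holding the bytes in memory.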
j
The same goes for random access files. It does not make sense to pollute memory with long-living large objects when you can re-read from the file.
mmap ^
a
🤦‍♂️
j
how is what you said anything original or unique, given that mmap does the exact thing you specified, but without being named?
this is microsoft's slide for what you said.
"File mapping allows the process to use both random input and output (I/O) and sequential I/O. It also allows the process to work efficiently with a large data file, such as a database, without having to map the whole file into memory. Multiple processes can also use memory-mapped files to share data."
perhaps this is not specific enough ?
a
mmap does not have anything to do with the question being discussed. We were talking about the IO API for inputs and outputs. Files are just examples. We could talk about Apache Arrow buffers or Java ByteBuffer in the same way. Your favorite memory-mapped files need the same API to work through Input/Output interfaces.
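To illustrate this on the JVM: a memory-mapped file is exposed as a `ByteBuffer`, and reading it several times from different places still needs some convention for handing out independent cursors. The sketch below uses the standard `java.nio` calls `FileChannel.map` and `ByteBuffer.duplicate`; the helper names `mapReadOnly` and `freshReader` are hypothetical and chosen only for this example.

```kotlin
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardOpenOption

/** Maps [length] bytes of [path] starting at [offset] as a read-only buffer. */
fun mapReadOnly(path: Path, offset: Long, length: Long): ByteBuffer =
    FileChannel.open(path, StandardOpenOption.READ).use { channel ->
        // The mapping stays valid after the channel is closed.
        channel.map(FileChannel.MapMode.READ_ONLY, offset, length)
    }

/** Returns an independent cursor over the same bytes, positioned at the start. */
fun ByteBuffer.freshReader(): ByteBuffer = duplicate().also { it.position(0) }

fun main() {
    val file = Files.createTempFile("mapped", ".bin")
    file.toFile().deleteOnExit()
    Files.write(file, byteArrayOf(10, 20, 30, 40))

    val mapped = mapReadOnly(file, 0, 4)
    val a = mapped.freshReader()
    val b = mapped.freshReader()
    println(a.get())  // 10
    println(a.get())  // 20
    println(b.get())  // 10, b has its own position
}
```

The mapped buffer plays the same role as a `ByteArray` here: the storage differs, but callers still need an agreed way to obtain repeatable reads from it.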
j
actually, you just described mmap, and for whatever reason, you implied that @e5l should give special consideration to the two things you need, one of which is identical in every word to mmap functionality.
a
I described any buffer with random access functionality. Could you please read the previous discussion and possibly see how Input and Output are currently implemented and why it is a problem to introduce a constant buffer into the API?
j
no interest at all, but i would probably say that the last time i donated time to kotlin IO (which may be slightly dated now, and i have no links, as it is not worth my time researching), there was not enough to say we've got efficient mappings to the underlying libc and posix kernel assumptions that every modern runtime relies on.
a
Then we shall probably continue working on the API. With a good API, it will someday be possible to write an implementation that meets your standards.
j
in 2019 i had to write my own from jvm options. kotlin can avoid some of those mistakes, but not by avoiding the options.
really, it seems as simple as: take a kernel IO handle, and make some object for its lifecycle and actions as kotlin. if there's some other actual outcome than mapping kernel handles like leveraging c++ iostreams, someone's doing it wrong.