# stdlib
a
Hi everyone! I am a member of the Kotlin Libraries team. I would like to hear your thoughts on the Base64 API in the standard library. You can find the issue for KEEP discussion here: https://github.com/Kotlin/KEEP/issues/373, and the KEEP document itself here: https://github.com/Kotlin/KEEP/blob/base64/proposals/stdlib/base64.md
Specifically, I would like to know what you think about the function `InputStream.decodingWith(base64: Base64): InputStream`. The input stream returned by this function interprets the padding character `'='` as the end of the symbol stream. As a result, subsequent symbols are not read, even if the end of the underlying input stream has not been reached. Do you find it counterintuitive that the returned input stream ignores symbols after the padding character? This behavior is similar to Java's `Base64.Decoder.wrap(InputStream)`.
An alternative would be to require that the underlying stream has indeed ended after the pad character is encountered.
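To make the behavior in question concrete, here is a minimal Kotlin sketch that uses Java's `Base64.Decoder.wrap(InputStream)`, the API the proposal is compared to. It does not call the proposed `decodingWith` function; the helper name `decodeAndRemainder` and the sample input are made up for illustration.

```kotlin
import java.io.ByteArrayInputStream
import java.util.Base64

// Hypothetical helper: decodes from the front of `text` with Java's wrapping
// decoder and returns the decoded text plus whatever was left unread in the source.
fun decodeAndRemainder(text: String): Pair<String, String> {
    val source = ByteArrayInputStream(text.toByteArray(Charsets.US_ASCII))
    // The wrapper is deliberately not closed, so `source` stays usable afterwards.
    val decoded = Base64.getDecoder().wrap(source).readBytes()
    val remainder = source.readBytes()
    return String(decoded, Charsets.US_ASCII) to String(remainder, Charsets.US_ASCII)
}

fun main() {
    // "SGk=" is the padded encoding of "Hi"; "|tail" is unrelated trailing data.
    val (decoded, remainder) = decodeAndRemainder("SGk=|tail")
    println("decoded=$decoded remainder=$remainder") // decoded=Hi remainder=|tail
}
```

With the alternative described above, the same input would presumably be rejected, because the underlying stream has not ended after the pad character.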
The documentation on the website is outdated. It should read:
```
 * The requirement, prohibition, or optionality of padding in the input symbols
 * is determined by the [PaddingOption] set for the [base64] instance.
```
Please see the latest documentation on GitHub.
It’s also valuable to know if someone relies on Java behavior and what their use cases are.
c
> An alternative would be to require that the underlying stream has indeed ended after the pad character is encountered.
Not fond of this. Consumers of a stream (such as a Base64 decoder) shouldn't be coupled to additional context about what comes before or after in the stream: they read what they need and stop, leaving the rest of the stream for others to use if need be. Perhaps the stream contains unrelated content to be read afterwards. The original behaviour is consistent with that. It's "decode base64 from this stream" vs "decode this stream of base64": the former is the desired behaviour, since it doesn't presume the stream is dedicated solely to base64 content. What if one had a stream of multiple base64-encoded things?
c
> As a result, subsequent symbols are not read, even if the end of the underlying input stream has not been reached.
I like this behavior. Input streams are often used for low-level stuff. If I were parsing some data, I may want to parse a base64 string that's within a more complex data structure, so the function not touching the rest seems convenient to me.
a
Unfortunately, this doesn't guarantee that the stream won't read the rest of the symbols. If the number of Base64-encoded bytes is a multiple of 3, no padding character is present in the underlying symbol stream. I guess for this use case to work, the user must know the number of encoded bytes beforehand, and the input symbols must be padded to a multiple of 4 symbols. Am I missing anything? Do you meet these conditions when decoding a stream?
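The point about payloads whose length is a multiple of 3 can be checked with a small sketch, again using Java's `Base64.Decoder.wrap` as a stand-in for the proposed function. The helper name `decodedAndUnread` and the inputs are illustrative assumptions.

```kotlin
import java.io.ByteArrayInputStream
import java.util.Base64

// Hypothetical helper: returns the number of decoded bytes and the number of
// source bytes that were left unread after decoding.
fun decodedAndUnread(text: String): Pair<Int, Int> {
    val source = ByteArrayInputStream(text.toByteArray(Charsets.US_ASCII))
    val decoded = Base64.getDecoder().wrap(source).readBytes()
    return decoded.size to source.available()
}

fun main() {
    // "SGV5" encodes the 3 bytes of "Hey", so it carries no '='. The trailing
    // "REST" happens to consist of valid Base64 symbols and is decoded as well.
    println(decodedAndUnread("SGV5REST")) // (6, 0)
    // "SGk=" encodes the 2 bytes of "Hi"; decoding stops at the pad character.
    println(decodedAndUnread("SGk=REST")) // (2, 4)
}
```

In the first case the decoder cannot tell where the payload ends, so it consumes the trailing symbols too.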
c
Ah, I missed that part. I'm a bit confused about what the intended use of this function is, then:
• It can't be intended to read a base64 string from within a larger stream, since it doesn't let you specify how many characters to read, so it may read too many (if there's no padding) or too few (if there are multiple padding characters)
• It can't be intended to read a base64 string that is the entirety of the stream, since it won't actually read it until the end / close the stream / etc
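For the first case, one possible workaround today is to read a known number of symbols yourself and decode them in one go. The sketch below assumes the caller knows the symbol count in advance (for example from a length prefix); `readBase64Segment` is a hypothetical helper, not part of the proposal or of any library.

```kotlin
import java.io.ByteArrayInputStream
import java.io.EOFException
import java.io.InputStream
import java.util.Base64

// Hypothetical helper: reads exactly `symbolCount` symbols from `source` and
// decodes them, leaving everything after them unread.
fun readBase64Segment(source: InputStream, symbolCount: Int): ByteArray {
    val symbols = ByteArray(symbolCount)
    var filled = 0
    while (filled < symbolCount) {
        val n = source.read(symbols, filled, symbolCount - filled)
        if (n < 0) throw EOFException("Expected $symbolCount symbols, got $filled")
        filled += n
    }
    return Base64.getDecoder().decode(symbols)
}

fun main() {
    // Two segments back to back: "SGV5" (unpadded, "Hey") and "SGk=" (padded, "Hi").
    val source = ByteArrayInputStream("SGV5SGk=".toByteArray(Charsets.US_ASCII))
    val first = String(readBase64Segment(source, 4), Charsets.US_ASCII)
    val second = String(readBase64Segment(source, 4), Charsets.US_ASCII)
    println("$first $second") // Hey Hi
}
```

This buffers the whole segment in memory, so it is not a streaming solution; it only shows that an explicit length removes the ambiguity around padding.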