# announcements
p
When reading Unicode characters from a StringReader, should I use the `read()` method, which returns `Int` values, or is there something else that converts the characters into `kotlin.Char` first? What is the correct way to represent Unicode code points?
r
That's the correct method. The reason it returns `int` instead of `char` is so it can return `-1` if you've reached the EOI.
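A minimal sketch of the read loop, added for illustration. One caveat on the code point part of the question: `read()` yields UTF-16 code units (0..65535) or `-1`, so a character outside the Basic Multilingual Plane arrives as two values (a surrogate pair). The helper names `readCodePoints` and `codePointsOf` are hypothetical, not part of any library.

```kotlin
import java.io.Reader
import java.io.StringReader

// Hypothetical helper: collects the code points delivered by a Reader.
// read() returns one UTF-16 code unit per call, or -1 at end of input,
// so a high/low surrogate pair is merged into a single code point here.
fun readCodePoints(reader: Reader): List<Int> {
    val result = mutableListOf<Int>()
    var unit = reader.read()
    while (unit != -1) {
        val c = unit.toChar()
        val next = reader.read() // one unit of lookahead
        if (c.isHighSurrogate() && next != -1 && next.toChar().isLowSurrogate()) {
            result += Character.toCodePoint(c, next.toChar())
            unit = reader.read()
        } else {
            // Ordinary character, or an unpaired surrogate kept as-is.
            result += unit
            unit = next
        }
    }
    return result
}

// Hypothetical convenience wrapper for trying the helper on a String.
fun codePointsOf(text: String): List<Int> = readCodePoints(StringReader(text))
```

For example, `codePointsOf("a\uD83D\uDE00b")` gives three code points, with the surrogate pair merged into `0x1F600`. For a tokenizer that only handles BMP characters, converting each non-negative result with `toChar()` is enough.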
p
so basically, after removing -1, Char and Int are equivalent?
Kotlin could probably provide a “nicer” method `readChar(): Char?` instead, I guess
since using Int for delivering -1 as a sentinel value feels like a quirk
r
This is a Java API, not a Kotlin API. On the JVM all primitives < 32 bits are internally `int`s.
p
yeah, I was just thinking of an extension method… thanks for clarifying this, Ruckus 🙂
I just do it like this now:
`input.read().takeIf { it != -1 }?.toChar()`
r
Meh, it's not really any different than any other sentinel value, and has the advantage that it's returning a primitive value, so there's no need for object allocation etc.
That will work, but it will box the `Int` and the `Char`. Whether that matters is entirely dependent on your use case. In general it's most likely fine, but worth keeping in mind if it's in a hot loop and you see any performance issues.
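To make the trade-off concrete, here is a rough sketch of both styles. The names `readCharOrNull`, `forEachChar` and `countDigits` are made up for illustration. Whether the nullable version actually allocates depends on the JIT and on how it is used, so treat the comments as the general expectation, not a measurement.

```kotlin
import java.io.Reader
import java.io.StringReader

// Hypothetical convenience extension: null marks end of input.
// Int? and Char? are nullable types, so the values may be boxed on the JVM.
fun Reader.readCharOrNull(): Char? =
    read().takeIf { it != -1 }?.toChar()

// Alternative for hot loops: keep the raw Int, compare it against -1
// directly, and convert to Char only once a real character is known.
inline fun Reader.forEachChar(action: (Char) -> Unit) {
    while (true) {
        val unit = read()
        if (unit == -1) return
        action(unit.toChar())
    }
}

// Small usage example: counts the decimal digits in a string.
fun countDigits(text: String): Int {
    var digits = 0
    StringReader(text).forEachChar { if (it.isDigit()) digits++ }
    return digits
}
```

Because `forEachChar` is `inline`, the lambda body is compiled into the loop, so the sentinel check stays on primitive values.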
p
I use it to implement a tokenizer used by a parser to evaluate expressions
thanks a lot for helping 🙂
r
I agree it would be a much cleaner API to use some sort of `Option` or `Result` type, but unfortunately those aren't free on the JVM.
p
with the introduction of value classes it will probably come sooner or later
^ which are currently experimental afaik
‘value class Option<T>’ for the world :-)
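A small sketch of what that could look like for this specific case, assuming a Kotlin version where `@JvmInline value class` is available (stable since Kotlin 1.5). `ReadResult`, `readResult` and `firstCharOrDefault` are hypothetical names. Note the limits: a value class is only represented as its underlying type when it is used as a non-nullable, non-generic type, so a general `Option<T>` would still box in many places.

```kotlin
import java.io.Reader
import java.io.StringReader

// Hypothetical wrapper around the raw result of Reader.read().
// Used as a plain (non-nullable, non-generic) type, it is represented
// as the underlying Int, so the sentinel is hidden without a wrapper object.
@JvmInline
value class ReadResult(val raw: Int) {
    val isEnd: Boolean get() = raw == -1

    // Only meaningful when isEnd is false.
    val char: Char get() = raw.toChar()
}

fun Reader.readResult(): ReadResult = ReadResult(read())

// Usage example: first character of the text, or a default if it is empty.
fun firstCharOrDefault(text: String, default: Char): Char {
    val result = StringReader(text).readResult()
    return if (result.isEnd) default else result.char
}
```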
r
I doubt it. There's way too much legacy in place to change that, and using `-1` as a sentinel is such a well-established standard in Java at this point I doubt many will see the need (or want) to change it. I could be wrong though...
Even the lowly `indexOf` uses it.
p
true … legacy vs modern api = 1:0 i guess 😛
r
That's my guess, yeah
e
even Kotlin's `indexOf()` uses -1 as a sentinel. that will stay unless there's an inexpensive alternative
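For comparison, a tiny sketch of wrapping the `indexOf()` sentinel in a nullable result. `indexOfOrNull` and `describe` are hypothetical helpers, not standard library functions, and the `Int?` return type brings back the possible boxing cost discussed above.

```kotlin
// Hypothetical helper that turns the -1 sentinel of indexOf() into null.
// The Int? return type may be boxed on the JVM.
fun String.indexOfOrNull(ch: Char): Int? =
    indexOf(ch).takeIf { it >= 0 }

// Usage example: null-safe operators replace the explicit -1 comparison.
fun describe(text: String, ch: Char): String =
    text.indexOfOrNull(ch)?.let { "found at $it" } ?: "not found"
```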