# squarelibraries
a
hi 👋 I’m trying to convert some Kotlin/JVM code to Kotlin Multiplatform, so I’m using Okio. The class in question reads an InputStream, determines the BOM (byte order mark), and then provides a correct stream. I can replace the InputStream with an Okio Source, and I can determine the correct encoding from the BOM, but I’m struggling to figure out how to use Okio to read a string with a different encoding. I’ve gotten pretty close, but if I try to peek() past the BOM, Okio always throws an exception. There’s no KDoc so I’m not clear what it’s for.
example.kt.cpp
hmm, it looks like that simple example is working but in my actual code it’s not. I’ll keep digging.
y
It won't really be KMP if you are using java.nio.charset.Charset
a
d’oh, I thought that Charset was multiplatform. Thanks for pointing it out.
j
Okio doesn’t yet implement charsets other than UTF-8 in multiplatform. We could probably add UTF-16 and its endianness variants pretty easily, and similarly for UTF-32. Maybe even ISO-8859-1 because it’s a small mapping table. But it’s unlikely we’ll do anything that requires a larger mapping table. What charsets are you dealing with? In 2023 I claim that UTF-8 won and anything else should only be used for legacy system interop.
a
It would be my preference to use UTF8 for everything too, but at the moment I’m porting SnakeYAML from Java to Kotlin Multiplatform. I want to be able to replicate the same functionality without platform-specific code. Being able to handle non-UTF8 charsets in commonMain would be particularly useful for the tests. SnakeYAML has good test coverage so I’ve been able to transfer (almost*) everything without editing the tests. *the only test I had to disable was a negative-test where Okio handles an invalid string successfully, which is fine by me! All I’d need is to be able to convert non-UTF8 into UTF8, I don’t need the reverse operations.
j
You could probably handle non-UTF-8 specially by converting it to UTF-8 out-of-band?
a
> You could probably handle non-UTF-8 specially by converting it to UTF-8 out-of-band?
if there’s an easy way to achieve it then that would be great! I’ve not heard about ‘out-of-band’ though, could you explain more please?
thanks for that OkHttp code, it’s helpful to see how Okio is supposed to be used properly
j
For UTF-16, read the entire Buffer one char at a time to make an Array<Char>, then use a stdlib function to convert that into a String, then call Buffer.writeUtf8() to get a Buffer back
For UTF-16LE, do the same thing but swap the two bytes of each Char before creating the string
For UTF-32, read the buffer an Int at a time, then call Buffer.writeUtf8CodePoint() with each int to transcode
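A minimal sketch of the three steps above, assuming the BOM has already been consumed and the whole input is in an okio.Buffer. The function names and the littleEndian parameter are hypothetical. The Okio calls (readShort, readShortLe, readInt, readIntLe, writeUtf8, writeUtf8CodePoint) are existing Buffer methods, and the Le variants do the byte swap, so no manual swap is needed.

```kotlin
import okio.Buffer

// Hypothetical helper: transcodes UTF-16 bytes to UTF-8.
// Assumes the BOM is already consumed; an odd trailing byte is left unread.
fun utf16ToUtf8(input: Buffer, littleEndian: Boolean = false): Buffer {
    val chars = CharArray((input.size / 2).toInt()) {
        // readShortLe() swaps the two bytes for little-endian input.
        val unit = if (littleEndian) input.readShortLe() else input.readShort()
        unit.toInt().toChar()
    }
    // The stdlib turns the chars into a String; Okio encodes it as UTF-8.
    return Buffer().writeUtf8(chars.concatToString())
}

// Hypothetical helper: transcodes UTF-32 bytes to UTF-8, one code point at a time.
fun utf32ToUtf8(input: Buffer, littleEndian: Boolean = false): Buffer {
    val output = Buffer()
    while (input.size >= 4) {
        val codePoint = if (littleEndian) input.readIntLe() else input.readInt()
        output.writeUtf8CodePoint(codePoint)
    }
    return output
}
```

Both helpers return a Buffer, so the result can be handed to the rest of the parser as a UTF-8 source. This reads everything into memory first, which is probably fine for tests but may not suit very large inputs.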