hi wave I m trying to convert some Kotlin JVM code to Kotlin kotlinlang #squarelibraries

Adam S

05/31/2023, 4:04 PM

hi 👋 I’m trying to convert some Kotlin/JVM code to Kotlin Multiplatform, so I’m using Okio. The class in question reads an InputStream, determines the BOM (byte order mark), and then provides a correct stream. I can replace the InputStream with an Okio Source, and I can determine the correct encoding from the BOM, but I’m struggling to figure out how to use Okio to read a string with a different encoding. I’ve gotten pretty close, but if I try and peek() past the BOM, Okio always throws an exception. There’s no KDoc so I’m not clear what it’s for.
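
For readers unfamiliar with peek(): a minimal sketch of how it behaves, assuming Okio is on the classpath. peekPrefix is a hypothetical helper name and the input bytes are made up for illustration; this is not the code from the attached example.

```kotlin
import okio.Buffer
import okio.BufferedSource
import okio.ByteString
import okio.ByteString.Companion.decodeHex

// Hypothetical helper: look at the first `count` bytes without consuming them.
// readByteString() throws EOFException if fewer than `count` bytes are available.
fun peekPrefix(source: BufferedSource, count: Long): ByteString {
    // peek() returns a second BufferedSource that reads ahead of `source`
    // without moving its position.
    return source.peek().readByteString(count)
}

fun main() {
    // Illustrative input: a UTF-8 BOM (EF BB BF) followed by the ASCII text "hi".
    val source = Buffer().write("efbbbf6869".decodeHex())
    println(peekPrefix(source, 3).hex()) // efbbbf
    source.skip(3)                       // the BOM is still in `source`, so skip it explicitly
    println(source.readUtf8())           // hi
}
```

One thing to keep in mind: a peeked source stops being valid once the original source is read or skipped, so finish with the peek before consuming from the original.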

Adam S

05/31/2023, 4:05 PM

example.kt.cpp

Adam S

05/31/2023, 4:10 PM

hmm, it looks like that simple example is working but in my actual code it’s not. I’ll keep digging.

yschimke

05/31/2023, 4:26 PM

It won't really be KMP if you are using java.nio.charset.Charset

Adam S

05/31/2023, 7:49 PM

d’oh, I thought that Charset was multiplatform. Thanks for pointing it out.

jessewilson

06/01/2023, 9:08 AM

Okio doesn’t yet implement charsets other than UTF-8 in multiplatform. We could probably add UTF-16 and its endianness variants pretty easily, and similarly for UTF-32. Maybe even ISO-8859-1 because it’s a small mapping table. But it’s unlikely we’ll do anything that requires a larger mapping table. What charsets are you dealing with? In 2023 I claim that UTF-8 won and anything else should only be used for legacy system interop.

Adam S

06/01/2023, 11:15 AM

It would be my preference to use UTF8 for everything too, but at the moment I’m porting SnakeYAML from Java to Kotlin Multiplatform. I want to be able to replicate the same functionality without platform-specific code. Being able to handle non-UTF8 charsets in commonMain would be particularly useful for the tests. SnakeYAML has good test coverage so I’ve been able to transfer (almost*) everything without editing the tests. *the only test I had to disable was a negative-test where Okio handles an invalid string successfully, which is fine by me! All I’d need is to be able to convert non-UTF8 into UTF8, I don’t need the reverse operations.

jessewilson

06/01/2023, 7:46 PM

You could probably handle non-UTF-8 specially by converting it to UTF-8 out-of-band?

jessewilson

06/01/2023, 7:50 PM

We don’t do this in OkHttp yet on anything but the JVM, but we do have a BOM detector https://github.com/square/okhttp/blob/a05ee927ebba6c23b9dd76c839e99ae3134bad02/okhttp/src/commonMain/kotlin/okhttp3/internal/-UtilCommon.kt#L48
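
A BOM detector along those lines can be sketched with Okio's Options and BufferedSource.select(). This is an illustration in the spirit of the linked OkHttp code, not a copy of it; the Bom enum and the readBom name are hypothetical.

```kotlin
import okio.Buffer
import okio.BufferedSource
import okio.ByteString.Companion.decodeHex
import okio.Options

// Hypothetical enum for this sketch; not an Okio or OkHttp type.
// The order must match the order of the options below.
enum class Bom { UTF_8, UTF_16_BE, UTF_32_LE, UTF_16_LE, UTF_32_BE }

// UTF-32LE is listed before UTF-16LE because FF FE is a prefix of FF FE 00 00
// and select() returns the first option that matches.
val BOMS: Options = Options.of(
    "efbbbf".decodeHex(),   // UTF-8
    "feff".decodeHex(),     // UTF-16BE
    "fffe0000".decodeHex(), // UTF-32LE
    "fffe".decodeHex(),     // UTF-16LE
    "0000feff".decodeHex(), // UTF-32BE
)

// Consumes a leading BOM if one is present and reports which one it was.
// Returns null (and consumes nothing) when the source does not start with a BOM.
fun readBom(source: BufferedSource): Bom? {
    val index = source.select(BOMS)
    return if (index == -1) null else Bom.values()[index]
}

fun main() {
    // Illustrative input: UTF-16LE BOM followed by the letter "a" in UTF-16LE.
    val source = Buffer().write("fffe6100".decodeHex())
    println(readBom(source)) // UTF_16_LE
    println(source.size)     // 2, only the payload bytes remain
}
```

Note that UTF-16LE text whose first character after the BOM is U+0000 is indistinguishable from a UTF-32LE BOM, so this ordering will report it as UTF-32LE.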

Adam S

06/01/2023, 9:19 PM

“You could probably handle non-UTF-8 specially by converting it to UTF-8 out-of-band?”

if there’s an easy way to achieve it then that would be great! I’ve not heard about ‘out-of-band’ though, could you explain more please?

Adam S

06/01/2023, 9:19 PM

thanks for that OkHttp code, it’s helpful to see how Okio is supposed to be used properly

jessewilson

06/01/2023, 11:03 PM

For UTF-16, read the entire Buffer one char at a time to make an Array<Char>, then use a stdlib function to convert that into a String, then call Buffer.writeUtf8() to get a Buffer back
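
A rough sketch of that UTF-16 recipe, assuming big-endian input with the BOM already consumed. The function name utf16BeToUtf8 is hypothetical, and a StringBuilder stands in for the Array<Char> plus stdlib conversion described above.

```kotlin
import okio.Buffer
import okio.BufferedSource
import okio.ByteString.Companion.decodeHex

// Transcodes big-endian UTF-16 (BOM already consumed) into a UTF-8 Buffer.
// An odd number of input bytes makes readShort() throw EOFException.
fun utf16BeToUtf8(source: BufferedSource): Buffer {
    val chars = StringBuilder()
    while (!source.exhausted()) {
        // readShort() reads two bytes in big-endian order: one UTF-16 code unit.
        // toInt().toChar() keeps the low 16 bits, so negative shorts map correctly.
        chars.append(source.readShort().toInt().toChar())
    }
    // writeUtf8() encodes the String as UTF-8, combining surrogate pairs.
    return Buffer().writeUtf8(chars.toString())
}

fun main() {
    // Illustrative input: "hi" encoded as UTF-16BE.
    val utf16 = Buffer().write("00680069".decodeHex())
    println(utf16BeToUtf8(utf16).readUtf8()) // hi
}
```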

jessewilson

06/01/2023, 11:03 PM

For UTF-16LE, do the same thing but swap the first and last bytes of the Char before creating a string
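
For the little-endian case, one possible shortcut is BufferedSource.readShortLe(), which does the byte swap while reading. A minimal sketch, with the hypothetical name utf16LeToUtf8 and the BOM assumed to be already consumed:

```kotlin
import okio.Buffer
import okio.BufferedSource
import okio.ByteString.Companion.decodeHex

// Transcodes little-endian UTF-16 (BOM already consumed) into a UTF-8 Buffer.
fun utf16LeToUtf8(source: BufferedSource): Buffer {
    val chars = StringBuilder()
    while (!source.exhausted()) {
        // readShortLe() reads two bytes and swaps them, so the low byte comes first.
        chars.append(source.readShortLe().toInt().toChar())
    }
    // writeUtf8() encodes the String as UTF-8, combining surrogate pairs.
    return Buffer().writeUtf8(chars.toString())
}

fun main() {
    // Illustrative input: "hi" encoded as UTF-16LE.
    val utf16 = Buffer().write("68006900".decodeHex())
    println(utf16LeToUtf8(utf16).readUtf8()) // hi
}
```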

jessewilson

06/01/2023, 11:04 PM

For UTF-32, read the buffer an Int at a time, then call Buffer.writeUtf8CodePoint() with each int to transcode
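
The UTF-32 recipe could look roughly like this. The function name utf32ToUtf8 and the littleEndian flag are assumptions added for illustration, and the sketch does not validate that each Int is a legal code point.

```kotlin
import okio.Buffer
import okio.BufferedSource
import okio.ByteString.Companion.decodeHex

// Transcodes UTF-32 (BOM already consumed) into a UTF-8 Buffer.
// A byte count that is not a multiple of four makes the read throw EOFException.
fun utf32ToUtf8(source: BufferedSource, littleEndian: Boolean = false): Buffer {
    val out = Buffer()
    while (!source.exhausted()) {
        // Each UTF-32 unit is one code point stored as a 4-byte integer.
        val codePoint = if (littleEndian) source.readIntLe() else source.readInt()
        // writeUtf8CodePoint() emits the 1 to 4 byte UTF-8 form of that code point.
        out.writeUtf8CodePoint(codePoint)
    }
    return out
}

fun main() {
    // Illustrative input: "hi" encoded as UTF-32BE.
    val utf32 = Buffer().write("0000006800000069".decodeHex())
    println(utf32ToUtf8(utf32).readUtf8()) // hi
}
```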
