# squarelibraries
c
I got a bug report that my app doesn't support Chinese characters when saving a user-entered field. I thought Retrofit + OkHttp all operate using UTF-8 (note: I'm not super comfortable with text encodings), but that doesn't seem to be the case. This seems easily googleable, but most of the questions I find on Stack Overflow have no answers, and searching the OkHttp/Retrofit GitHub also turns up a bunch of questions that get closed without an answer from the author. My googling might be failing me, but I did try. Am I missing something? Is this not something that OkHttp/Retrofit handle? Maybe it's Moshi that I'm after? Appreciate any pointers.
z
Retrofit, OkHttp, and Moshi all support UTF-8.
Why do you suspect this is a network IO issue?
c
@Zach Klippenstein (he/him) [MOD] is it UTF-8 out of the box though, or do I have to make some kind of change to support it? I suspect it's one of the networking libraries because (from what I read) Android's EditText will already give me UTF-8 characters, and my backend developer is saying that Android is sending incorrect data, whereas the iOS app is not. So I'm definitely a little lost, but all I can narrow it down to is the network layer libraries.
z
Java strings are actually encoded as UTF-16, but Okio will encode those as UTF-8 for you if you pass them as strings. Are you converting them to bytes manually anywhere?
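To make that concrete, here is a minimal sketch using only the JDK (not Okio itself; the class and method names are made up for illustration). It shows that a Java `String` counts UTF-16 code units, while encoding it as UTF-8 produces a variable number of bytes per character.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of OkHttp, Okio, Retrofit, or Moshi.
class Utf8WireDemo {

    // Encodes a String as UTF-8 bytes using the JDK.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Renders bytes as space-separated uppercase hex, e.g. "C3 A8".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String korean = "\uBB34"; // the character 무
        System.out.println(korean.length());        // 1 (one UTF-16 code unit)
        System.out.println(hex(toUtf8(korean)));    // EB AC B4 (three UTF-8 bytes)
        System.out.println(hex(toUtf8("\u00E8")));  // C3 A8 (è, two UTF-8 bytes)
        System.out.println(hex(toUtf8("a")));       // 61 (ASCII, one byte)
    }
}
```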
c
Nope... I wonder if it's just not URL-encoded or something and that's messing it up?
There's definitely a chance that my backend team is wrong about this one. I just don't know enough about this to give a canonical answer like you just did, i.e. "Java strings are actually encoded as UTF-16, but Okio will encode those as UTF-8 for you if you pass them as strings".
But this app is really simple, and we're really not doing anything crazy in terms of setup or modifying things. It's as plain vanilla as it gets with OkHttp + Moshi + Retrofit, so maybe it isn't my fault? I'm just not sure what the best next steps are for me to move this forward.
z
Have you looked at the bytes being sent on the wire?
c
That was my next question... If I sniff the traffic with Charles, I would have access to the data there... would that be enough to copy some text out and see what format it's encoded in?
I kind of don't get the concept of encoding. Like, even if I sniff the traffic in Charles, how can I tell what encoding something is in?
z
If you get a sample of ASCII text, you can at least tell if it's UTF-16 by how many bytes each character takes: ASCII characters take one byte each in UTF-8 and two bytes each in UTF-16.
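As a rough illustration of that check, the sketch below encodes the same ASCII sample both ways and compares the byte counts. The class and helper names are hypothetical, and `UTF_16BE` is chosen here only so that no byte-order mark is added to the output.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for comparing encodings of an ASCII sample.
class AsciiWidthCheck {

    // Counts 0x00 bytes. ASCII text encoded as UTF-16 has one per character;
    // the same text encoded as UTF-8 has none.
    static int zeroBytes(byte[] bytes) {
        int count = 0;
        for (byte b : bytes) {
            if (b == 0) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "gmail";
        byte[] utf8 = sample.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = sample.getBytes(StandardCharsets.UTF_16BE);

        System.out.println(utf8.length);       // 5  (one byte per ASCII character)
        System.out.println(utf16.length);      // 10 (two bytes per ASCII character)
        System.out.println(zeroBytes(utf8));   // 0
        System.out.println(zeroBytes(utf16));  // 5
    }
}
```

In a hex view of captured traffic, the UTF-16 case shows up as a `00` byte next to every ASCII letter.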
Did your backend counterpart elaborate on what "incorrect data" is specifically?
c
They said Chinese or Swedish chars don't work, with an example of `èx무mp`l@gmail.com`
Anyway, thanks @Zach Klippenstein (he/him) [MOD] for the help. I think the "Java is UTF-16 and Okio by default will convert things to UTF-8" is the information I really needed.
z
That would only be an issue if you're converting strings to bytes manually with a different encoding, which is unlikely if you're just using Moshi/Retrofit as normal. Again, not sure what "don't work" means. I would definitely start by looking at the network bytes to see what your app is actually sending, then work backward from there (e.g. using the debugger).
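For reference, this is roughly what such a manual-conversion mismatch looks like. The sketch assumes a hypothetical client that encodes with one charset and a hypothetical server that decodes with another; none of this is code from the app under discussion.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class simulating an encode/decode charset mismatch.
class CharsetMismatchDemo {

    // Encodes with the "client" charset, then decodes with the "server" charset.
    static String roundTrip(String text, Charset clientCharset, Charset serverCharset) {
        byte[] wire = text.getBytes(clientCharset);
        return new String(wire, serverCharset);
    }

    public static void main(String[] args) {
        String input = "\u00E8x"; // the text èx

        // Both sides agree on UTF-8: the text survives unchanged.
        System.out.println(input.equals(
                roundTrip(input, StandardCharsets.UTF_8, StandardCharsets.UTF_8))); // true

        // UTF-8 bytes read as ISO-8859-1: each byte becomes its own character.
        String garbled = roundTrip(input, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length()); // 3 (was 2)

        // ISO-8859-1 bytes read as UTF-8: the text no longer matches the input.
        System.out.println(input.equals(
                roundTrip(input, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8))); // false
    }
}
```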
c
Thank you @Zach Klippenstein (he/him) [MOD], I'll report back with results. As it stands... Charles seems to show that my characters are going through exactly as I expect.
@Zach Klippenstein (he/him) [MOD] just in case you're curious: it seems that my backend developer was in the wrong. He is looking into it, but I showed him that Charles shows the characters going up to the server as they were typed in, while the response from the server came back as `?`
I'm still not 100% sure whether the fact that my characters showed up in Charles actually means I'm completely off the hook, but it would make sense if it did. Still trying to read up on different encodings, etc. I'm assuming HTTP only works in a single encoding.
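One common way a literal `?` can appear is a lossy conversion into a charset that cannot represent the character. The sketch below is only an illustration of that mechanism under an assumed ISO-8859-1 storage step, not a diagnosis of this particular backend; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing lossy encoding into a narrower charset.
class LossyEncodingDemo {

    // Simulates storing text as ISO-8859-1 bytes and reading it back.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement byte, which is '?' for ISO-8859-1.
    static String storeAsLatin1(String text) {
        byte[] stored = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(stored, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // è exists in ISO-8859-1 and survives; 무 does not and becomes '?'.
        System.out.println(storeAsLatin1("\u00E8x\uBB34mp")); // èx?mp
    }
}
```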
z
Is your HTTP request correctly setting the encoding in the `Content-Type` header? Again, pretty sure the Square libraries should do this automatically, but worth verifying.
c
I seem to be sending up `application/json`
z
Ah, I guess that parameter isn’t defined for JSON, never mind (https://stackoverflow.com/a/26206930/1502069)
c
That SO link says that UTF-8 is the default for JSON.
👍 1
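A receiver could apply that rule along these lines. This is a hedged sketch with a hypothetical helper, `charsetFor`, that honors an explicit `charset` parameter if one is present and otherwise falls back to UTF-8; that fallback is only appropriate for JSON bodies, and real servers and frameworks have their own parsing rules.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical demo class; not how OkHttp or any particular server does it.
class JsonCharsetDemo {

    // Returns the charset named in a Content-Type value, or UTF-8 when the
    // header carries no charset parameter (the JSON default).
    static Charset charsetFor(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                return Charset.forName(name);
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(charsetFor("application/json"));                 // UTF-8
        System.out.println(charsetFor("text/plain; charset=ISO-8859-1"));   // ISO-8859-1
    }
}
```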
Anyway. Just wanted to say thanks again for your help. No need to spend any more of your time. Really appreciate the second pair of eyes!
👍 1
z
Good luck!