Is Unicode encoding in string literals working as it is supposed to? I’m trying to encode a flag emoji. Flags are encoded as two regional indicators, basically the two-letter ISO country code. So a German flag would be `\u1f1e9\u1f1ea`, but for some reason only the first 4 hex digits of each escape are recognized as part of the character. Do I have to encode Unicode code points as a surrogate pair if they aren’t part of the BMP, or is there a better way to do this?

Before I go dig up my code that uses it, are you sure your editor can display it?

No, not really. I’m working on a game engine, so this is part of a test I’m writing. I also printed all the code points of the string, and the first code point is `\u1f1e`, followed by a `9`.

Also, looking at IntelliJ's syntax highlighting, the `9` has a different color than the rest of the Unicode escape.
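For reference, a minimal sketch of what that observation amounts to: a `\u` escape in a Kotlin literal takes exactly four hex digits, so `\u1f1e9` is read as U+1F1E followed by an ordinary `9`. The helper name `codePointsOf` is made up for illustration, and the sketch assumes Kotlin/JVM on JDK 8 or newer.

```kotlin
// Hypothetical helper (not from the thread): lists the code points of a string as "U+XXXX".
fun codePointsOf(s: String): List<String> =
    s.codePoints().toArray().map { "U+%04X".format(it) }

fun main() {
    // The escape stops after four hex digits; the trailing '9' is a separate character.
    println(codePointsOf("\u1f1e9")) // [U+1F1E, U+0039]
}
```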

We don’t use it often, but I’m pretty sure it works:
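```
private val wantedRegex: Regex by lazy { "^[a-zA-Z]{2}$".toRegex() }

// Returns the flag emoji for a two-letter ISO country code, or null for any other input.
private fun isoToUnicodeFlag(iso: String): String? {
    return if (wantedRegex.matches(iso)) {
        val first = charToFunkyIsoChar(iso[0].lowercaseChar())
        val second = charToFunkyIsoChar(iso[1].lowercaseChar())
        // \uD83C is the high surrogate shared by all regional indicator symbols.
        "\uD83C$first\uD83C$second"
    } else null
}

// https://en.wikipedia.org/wiki/Regional_Indicator_Symbol#Unicode_Block
// Maps 'a'..'z' to the low surrogate of the matching regional indicator (U+1F1E6 is \uD83C\uDDE6).
private fun charToFunkyIsoChar(char: Char): Char {
    return ((char.code - 'a'.code) + '\uDDE6'.code).toChar()
}
```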


It’s odd that I don’t use the same `\u1f1e` as you do.

You are encoding them as surrogate pairs, which means you write them the way they are stored in UTF-16. The actual code point is U+1F1E9, which is also its representation in UTF-32.
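To make the relationship concrete, here is a small sketch of the standard UTF-16 arithmetic that turns U+1F1E9 into the `\uD83C\uDDE9` pair. The function name `toSurrogates` is an assumption for illustration; on the JVM, `Character.toChars` does the same job.

```kotlin
// Hypothetical helper: splits a supplementary code point (above U+FFFF)
// into its UTF-16 high and low surrogates.
fun toSurrogates(codePoint: Int): Pair<Char, Char> {
    require(codePoint in 0x10000..0x10FFFF) { "not a supplementary code point" }
    val offset = codePoint - 0x10000
    val high = (0xD800 + (offset shr 10)).toChar()   // top 10 bits of the offset
    val low = (0xDC00 + (offset and 0x3FF)).toChar() // bottom 10 bits of the offset
    return high to low
}

fun main() {
    val (high, low) = toSurrogates(0x1F1E9)
    println("%04X %04X".format(high.code, low.code)) // D83C DDE9
}
```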

I guess the problem is that the JVM represents strings in UTF-16, so a `\u` escape can’t express a code point that needs more than 16 bits, even when the literal is a string rather than a char.
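One way around hand-written surrogates, sketched under the assumption that this runs on Kotlin/JVM: build the string from whole code points with `StringBuilder.appendCodePoint`. The function name `flagOf` is hypothetical and not something from the code above.

```kotlin
// Hypothetical helper: builds the flag for a two-letter ISO country code
// from whole code points instead of spelling out surrogates.
fun flagOf(iso: String): String {
    require(iso.length == 2 && iso.all { it in 'a'..'z' || it in 'A'..'Z' }) {
        "expected a two-letter ISO code"
    }
    val sb = StringBuilder()
    for (c in iso.uppercase()) {
        // U+1F1E6 is REGIONAL INDICATOR SYMBOL LETTER A; the rest follow alphabetically.
        sb.appendCodePoint(0x1F1E6 + (c - 'A'))
    }
    return sb.toString()
}

fun main() {
    // Same UTF-16 content as the hand-written surrogate pairs for U+1F1E9 U+1F1EA.
    println(flagOf("de") == "\uD83C\uDDE9\uD83C\uDDEA") // true
}
```

The resulting string is still UTF-16 internally; only the source code gets to talk in code points.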

“Ah right, yeah” was my reaction exactly. I’ve been trying for the last 3 days to wrap my head around Unicode and how to use it on the JVM. Ever tried to ask Kotlin what the length of this string is: `"👨‍👩‍👦‍👦"`? I think the answer is 11, since `length` counts UTF-16 code units; counted in code points it is 7.
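A quick sketch to check that, assuming the family emoji is the usual man, woman, boy, boy sequence joined by zero width joiners (U+200D). It is spelled out with escapes here so the result does not depend on how the source file is encoded; the name `family` is just for illustration.

```kotlin
// Four emoji (two UTF-16 chars each) joined by three ZERO WIDTH JOINERs (one char each).
val family = "\uD83D\uDC68\u200D\uD83D\uDC69\u200D\uD83D\uDC66\u200D\uD83D\uDC66"

fun main() {
    println(family.length)                           // 11 UTF-16 code units
    println(family.codePointCount(0, family.length)) // 7 code points
}
```

Neither number matches the single glyph a reader sees, because `length` and `codePointCount` know nothing about grapheme clusters.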

Unicode is always a head-scratcher 🙃