If I have a rather large `Map<String, String&gt...
# getting-started
u
If I have a rather large
Map<String, String>
(keyed labels of UI elements), which is created dynamically via fetching the labels from api -- so not statically with string literals. Does it make sense to implement a
Trie
or is this somehow optimized under the hood in jvm since
String
is so special? I remember faintly about string interning, but that only applies to string literals and deals with copies, right? It might be more memory efficient, however what about lookup? I believe there are native implementations of this inside when simply matching
Strings
, no? Also, turning unicode strings to char arrays won't work all that great?
c
how large is "rather large"?
u
okay its not really really large, about 600 keys, hwoever they mostly looked like this
Copy code
"usage_weekly": "/ týždeň",
"used_1_data": "použitý",
"used_1_min": "prevolaná minúta",
"used_1_sms": "odoslaná správa",
"used_24_data": "použité",
"used_24_min": "prevolané minúty",
"used_24_sms": "odoslané správy",
"used_x_data": "použitých",
"used_x_min": "prevolaných minút",
"used_x_sms": "odoslaných správ",
so, trie could optimize the size
but my question mostly is since
Char
is a class in kotlin, I presume
charArrayOf('a', 'b', 'c')
creates 3 Char instances on heap; which I fear would nulify anything; as opposed to java where its just primitive ints
Copy code
public class Char private constructor() : Comparable<Char> {
    public fun toInt(): Int
}
I dont even understand what is this, virtual method with no implementation in a concrete class, huh?
okay guess I all ask this one separate first, thanks!
c
i don't know the answer to your questions but I think if your map is <10k entries, or even 100k I wouldn't worry. definitely measure before you optimize
☝️ 1
d
@ursus Hi! Everything is an object in Kotlin. This does not mean they are not handled as primitives under the hood on the JVM. For instance,
Char
is handled under the hood as
char
on the JVM. However, there are cases where the JVM uses boxes around those primitives. For instance,
List<Char>
boxes the characters but
CharArray
does not. Also,
String
are interned on the JVM. I can't say about the performance however. In general, the JVM is very good at optimizing standard patterns. You should not try to prematurely and manually optimize things.
☝️ 1
u
yes but interning is for literals, isnt it?
e
JVM string constants are automatically in the intern pool. strings can be manually interned, and there are some cases where it may make sense - e.g. if you expect to be re-constructing the same string repeatedly - but in practice the string pool is a fixed size hashmap, so make sure to have a realistic system benchmark before you consider interning strings (it may affect other code unexpectedly due to contention or eviction)
☝️ 1
as for a trie, it depends... for most cases I would expect the complexity wouldn't be worthwhile even if it did save memory, which isn't a guarantee - if you don't implement trie compression, or if you do but there are many short branches, you'll end up with a lot of object overhead for the tree nodes. have a realistic benchmark to measure before you commit to a trie.
u
Yea I was worried about that node representation, even though hashmap has those entry objects, but this would require object per char; guess value types cant come soon enough
e
even if there were value types, an uncompressed trie is very wasteful. 12-byte object header + 16-bit char key + 32- or 64-bit pointer to data + padding = 32 or 40 bytes per character, while modern JVM String uses compact strings and can usually be 1 byte per character
u
yes that was kind of the reason i posted here, i expected jvm string to be heavily optimized, unless there is some trick with primitive arrays to store not pointer but index of next one etc btw whats a compressed trie? compressed where, memory? disc?
e
for a set {"abc1", "abc2"}, a compressed trie is {"abc" => {"1", "2"}} instead of {'a' => {'b' => {'c' => {'1', '2'}}}}
🤔 1
u
I see, thanks!