Hi I need an identifier for a cache that is based ...
# announcements
d
Hi I need an identifier for a cache that is based on four strings and two ints. I need to do this without allocating (because I'll generate many of them in a tight loop). I think that java's hashCode for strings is always the same for the same characters, but I'm not quite sure how to combine multiple hashCodes
n
well, hashcodes aren't sufficient for identifiers because they could be the same...but re: multiple hashcodes, IntelliJ can generate hashcode methods for you from the fields of the object.
d
Yeah, my problem is that I can't create a class instance (that would be an allocation).
I'm thinking something like string1.hashCode() shl someN + string2.hashCode() shl someN ...
So I bitshift and add to combine them
n
is this 4-strings-2-ints thing not encapsulated in a single class?
d
It was, but I need to make a lot of them in a hot loop, and I'm trying to avoid the allocation
m
Doesn’t a cache mean that you have a hash map of some sorts already? I.e. at least one allocation. How’s the cache implemented? Maybe you can merge the hashing in there? Otherwise if you want to avoid an allocation and can’t piggyback on an already existing allocation then probably the best you can do is to use a 64 bit hash algorithm with low probability of collision. That fits into a
Long
. Maybe this is of help, but I don’t know what the collision probability is: https://lemire.me/blog/2018/08/15/fast-strongly-universal-64-bit-hashing-everywhere/
d
I didn't realize that. I'm totally new to programming while thinking about allocation. All I noticed was that the IDE didn't highlight a hash map insert
Right now I have a HashSet<Int> where Int is my hash
m
The IDE only highlights if you make unintentional copies of a map or list. What kind of
HashSet
do you use? The default uses
LinkedHashSet
which in turn uses
HashMap
. That’s probably two allocations per insertion if I’m not mistaken: One for the new node and one for boxing the
Int
.
d
Oh. Where can I learn about what allocates? I think I need to learn from something systemic
m
Have you measured that there actually is a performance problem? Usually allocation of small objects is quite fast.
d
I use the default HashSet
I guess if it isn't a problem, I shouldn't worry
In that case, should I just allocate a new data class and ignore the warning?
m
Regarding allocation of insert you’d need two things: • Look into the implementation of
add()
. If you follow it you’ll see at some point that it creates a new Node instance for each new entry. • The knowledge that every primitive type like
Int
etc. causes an allocation when using it in a generic context (e.g. putting it into a Map or List). There are some exceptions (values between -128 and +127).
What warning exactly?
d
The IDE warning about the allocation
m
Can you please share it here? Otherwise it's not fully clear what warning you're referring to.
d
Sure! This whole thing started because of the Android Studio warning "Avoid allocations during draw/layout operations, preallocate instead"
It's what made me look into computing hashes instead of taking the hash of an object
m
Ah. In that case you should usually ask yourself why you make that allocation in the draw/layout operation instead of either before that or in the background.
d
Every draw I check if a bunch of tiles are in a cache. For each of them if they are cached I draw them, otherwise I enqueue them to be fetched in the background
A tile is uniquely identified by a few factors. I used to have a
HashMap<TileIdentifier, Tile>
, now I have a
HashMap<Int, Tile>
Instead of
TileIdentifier(val a: String, val b: String)
(simplified example) I now compute
a.hashCode() * 31 + b.hashCode()
m
Hmm how do you determine the tile identifier? Aren’t tiles drawn next to each other and you can identify them by position/index or x|y coordinate?
Yeah
hashCode
is probably too unreliable for that
It has no guarantees about collisions. They could always return
0
and it’s still a valid implementation.