#announcements

jeggy

10/31/2019, 1:19 PM
Could someone help me find the source code of hashCode for data classes in Kotlin?
I have a data class with about 150 fields defined in the primary constructor, and I think that two different objects in different server environments, both containing exactly the same data, are not producing the same hashCode.

karelpeeters

10/31/2019, 1:32 PM
Do you want the compiler code that generates it or the generated code?

jeggy

10/31/2019, 1:33 PM
I'm just interested in the exact formula that Kotlin uses when I call myDataClassInstance.hashCode()

karelpeeters

10/31/2019, 1:35 PM
You can easily look at the generated code in IntelliJ: Tools > Kotlin > Show Kotlin Bytecode > Decompile
👍 1
You don't happen to be using an array or something else that doesn't have the expected hashCode implementation?
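For reference, the decompiled output usually boils down to a 31-multiplier fold over the primary-constructor properties. Here is a minimal sketch of that formula, assuming the current Kotlin/JVM compiler behaviour (an implementation detail, not a documented guarantee). Sample and manualHash are made-up names for illustration:

```kotlin
// Hypothetical data class used only for illustration.
data class Sample(val name: String, val count: Int?)

// Hand-written version of the formula the decompiled hashCode() typically shows:
// start from the first property's hash, then fold in each following property
// as result = 31 * result + hash, with null contributing 0.
fun manualHash(s: Sample): Int {
    var result = s.name.hashCode()
    result = 31 * result + (s.count?.hashCode() ?: 0)
    return result
}

fun main() {
    val s = Sample("abc", 7)
    // Expected to print true on Kotlin/JVM if the assumption above holds.
    println(s.hashCode() == manualHash(s))
}
```

So the result is only as stable as each property's own hashCode(). Any property whose type inherits Object.hashCode() makes the total change between JVM runs.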

jeggy

10/31/2019, 1:39 PM
The data class extends an abstract class with one abstract String val and one open var of an enum class. Other than that, my data class only has String and optional Int vals.

karelpeeters

10/31/2019, 1:50 PM
Let us know what you figure out!

jeggy

10/31/2019, 1:52 PM
Do you know if the hashCode of an enum class value is guaranteed to always return the same value?

karelpeeters

10/31/2019, 1:53 PM
Yes it does. Ah wait it's between different JVMs? Then I'm not sure actually.

jeggy

10/31/2019, 1:54 PM
yes it's between JVMs

karelpeeters

10/31/2019, 1:56 PM
Looks like no in theory, although I don't know which classes change in practice. https://stackoverflow.com/questions/1516843/java-object-hashcode-result-constant-across-all-jvms-systems

Dico

10/31/2019, 2:00 PM
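On the JVM, Enum hashCode is not the enum ordinal: java.lang.Enum declares hashCode() final and delegates to Object.hashCode(), i.e. the identity hash, which can differ between JVM runs.
Either way, your code should never rely on this implementation detail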
👍 1
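One way to avoid depending on Enum.hashCode() at all is to hash something that is fixed by the source code, such as the constant's name. A rough sketch, where Status and stableEnumHash are hypothetical names and not anything from the original code:

```kotlin
// Hypothetical enum standing in for the enum property of the data class.
enum class Status { ACTIVE, INACTIVE }

// Stable replacement for Status.hashCode(): hash the constant's name.
// String.hashCode() is specified by the Java API, so the value is the same on every JVM.
fun stableEnumHash(value: Enum<*>): Int = value.name.hashCode()

fun main() {
    // On the JVM, Enum.hashCode() is inherited from Object (identity hash)
    // and may print a different number on each run.
    println(Status.ACTIVE.hashCode())
    // Depends only on the text "ACTIVE".
    println(stableEnumHash(Status.ACTIVE))
}
```

The trade-off is that renaming a constant changes its hash. Using ordinal instead would change the hash whenever constants are reordered.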

jeggy

10/31/2019, 2:31 PM
We have a service that delivers data to customers, along with all later changes to that data. We have been using data classes for this data and saving the hashCode of each data class instance in a database; whenever the hashCode changes, we resend the data. But somehow the hashCode generated on the server is now not the same as the one generated locally, even though this has been running for years for us.
Decision has been made and we will do a new version of this that doesn't rely on hashCode 🙂
Anyway, thanks for all the help
Do you guys think it would be wrong to do md5(data.toString())?

Dico

10/31/2019, 2:55 PM
For this purpose, I would use SHA-1. MD5 is not good enough. SHA-1 can also be broken, but that requires a ridiculous amount of computing power. SHA-1 produces a 20-byte digest. You could use something like this:
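import java.nio.ByteBuffer
import java.security.MessageDigest

// Placeholder holder for the 20-byte SHA-1 digest; swap in your own type.
class Hash(val bytes: ByteArray)

// Do not share between threads.
class DataHashCtx {
    private val md = MessageDigest.getInstance("SHA-1")
    private val bb = ByteBuffer.allocate(20)

    fun reset() {
        md.reset()
        bb.clear()
    }

    fun update(value: Int) {
        // Write the Int at offset 0 and feed those 4 bytes to the digest.
        bb.putInt(0, value)
        md.update(bb.array(), 0, Int.SIZE_BYTES)
    }

    fun update(value: String) {
        // String hashCode implementation is part of API and unchanged since JDK 1.2
        update(value.hashCode())
    }

    fun hash(): Hash {
        // Writes the 20-byte digest into the buffer and resets the digest.
        md.digest(bb.array(), 0, 20)
        // Copy the bytes out, since the buffer is reused by later calls.
        return Hash(bb.array().copyOf())
    }
}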
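// Property names and types are placeholders for the real fields.
data class MyData(
    val dataItem1: String,
    val dataItem2: String,
    val dataItem3: String,
    val dataItem4: Int,
    val dataItem5: Int
) {
    fun dataHash(ctx: DataHashCtx): Hash {
        ctx.reset()
        ctx.update(this.dataItem1)
        ctx.update(this.dataItem2)
        ctx.update(this.dataItem3)
        ctx.update(this.dataItem4)
        ctx.update(this.dataItem5)
        return ctx.hash()
    }
}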
But for an initial implementation you could well use data.toString().
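A minimal sketch of that toString() route, assuming every property has a toString() that depends only on its contents (true for String, Int and enum constants, not for arrays). Item and sha1Hex are hypothetical names used for illustration:

```kotlin
import java.security.MessageDigest

// Hypothetical stand-in for the real data class.
data class Item(val id: String, val price: Int?)

// SHA-1 of the UTF-8 bytes of the text, rendered as 40 lowercase hex characters.
fun sha1Hex(text: String): String {
    val digest = MessageDigest.getInstance("SHA-1").digest(text.toByteArray(Charsets.UTF_8))
    return digest.joinToString("") { "%02x".format(it) }
}

fun main() {
    // Equal data produces equal text, and therefore an equal fingerprint.
    val fingerprint = sha1Hex(Item("a-1", 42).toString())
    println(fingerprint)
}
```

Keep in mind that the fingerprint then also changes when a property is renamed or reordered, because the generated toString() text changes with it.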

Alowaniak

10/31/2019, 6:29 PM
Personally, I'm not sure that relying on hashes being different is the best approach. I guess it depends on how bad it is if you do get a collision.

jeggy

11/01/2019, 11:21 AM
We are using keys alongside these hashes, so collisions aren't really a concern for us. We have been using hashCode, and I just checked: we did have a few collisions there, but as we are using a key alongside the hash, it just needs not to collide for the same key. So for us, going from 2^32 possible values to 2^160 (SHA-1 is 160 bits) should be just fine 🙂
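To put rough numbers on that, here is a back-of-the-envelope sketch using the birthday approximation. It assumes hash values are uniformly distributed, which a data class hashCode is not, so real 32-bit collision rates can be worse. collisionProbability is a made-up helper, and n is the number of distinct values hashed under the same key:

```kotlin
import kotlin.math.expm1
import kotlin.math.pow

// Birthday approximation: p ≈ 1 - exp(-n(n-1) / (2 * 2^bits))
fun collisionProbability(n: Long, bits: Int): Double {
    val pairs = n.toDouble() * (n - 1).toDouble()
    val space = 2.0.pow(bits)
    // -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -expm1(-pairs / (2.0 * space))
}

fun main() {
    // Chance of at least one collision among 100,000 values with a 32-bit hash.
    println(collisionProbability(100_000, 32))
    // The same number of values with a 160-bit hash such as SHA-1.
    println(collisionProbability(100_000, 160))
}
```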