I'd like to see extension methods like `equals`, `...
# stdlib
d
I'd like to see extension methods like
equals
,
hashCode
, and
toString
for collections that each take a lambda, so you can specify a specific field that you want to check for equality or a substitute method to determine if two elements are equal, and so on. I don't think there's a simple way to write the
equals
currently without involving sequences (preventing inlining), ensuring same size and then
all
on indices and manually comparing (which is how I expect a method to be implemented, edit: that requires efficient random access, using the iterators directly would be required), or mapping each list to the desired field to compare or hash or whatnot (which means creating two unnecessary intermediate collections).
e
I don't think this is generally possible without `map`'ing all the elements
e.g. in a hypothetical
Copy code
setOf(1 to 3, 1 to 4, 2 to 5).equalsBy(setOf(1 to 3, 2 to 4, 2 to 5)) { it.first }
do you want it to be set equality (requiring mapping all the elements to reduce duplicates) or list equality (relying on iteration order which isn't determinate for all sets) or something else?
d
I suppose general collections wouldn't be practical, but constraining to
List
and
Array
should be feasible enough
e
also not sure what you mean by
hashMap
method, but
.map(f).toString()
could be
.joinToString(prefix = "[", postfix = "]") { f(it).toString() }
without the intermediate list
d
Whoops, should be
hashCode
e
ah, then
.map(f).hashCode()
==
.fold(1) { acc, x -> 31 * acc + f(x).hashCode() }
anyhow, I'm not sure there's really that much general applicability. for equals-by, I more often encounter situations where I have a
List<A>
and a
List<B>
which I want to use different mappers on before comparing
d
That would also be a useful method,
listOfA.equals(listOfB) { a, b -> customEquals(a, b) }
c
@Derek Peirce but that's essentially
zip
+`all`
d
zip
creates an intermediate list of pairs, which shouldn't be necessary. If the two lists are usually not equal, so the method fails fast, we've made it
Ω(N)
unnecessarily and created N+1 objects. We could
zip
with `Sequence`s instead, but now we've instead created two
Sequence
objects, and our lambda is no longer inlined. Both cases still require that we first check for size equality. The optimal method would look something like this:
Copy code
inline fun <A, B> List<A>.equals(list: List<B>, equals: (A, B) -> Boolean): Boolean {
    if (this === list) return true
    if (size != list.size) return false
    val iterA = iterator()
    val iterB = list.iterator()
    repeat(size) {
        if (!equals(iterA.next(), iterB.next())) {
            return false
        }
    }
    return true
}
which is unwieldy enough that you definitely want it to be a dedicated method, it's just a question of whether or not it warrants a place in stdlib.
e
size checks aren't enough if the lists can be mutated concurrently - which definitely isn't typical, but
Copy code
val iteraA = this.iterator()
val iterB = that.iterator()
while (iterA.hasNext() && iterB.hasNext()) if (!equals(iterA.next(), iterB.next()) return false
return iterA.hasNext() == iterB.hasNext()
is safe under that
d
None of the existing
List
methods are specified as thread-safe, including the existing
equals
. You definitely want to do a size check before comparing the two lists. It would be a waste to iterate over both lists and check every single pair for equality only to realize at the very end that A had 701 elements and B had at least 702.
e
depends on the
List
.
CopyOnWriteArrayList#equals
is thread safe
d
Yes, specific lists like
CopyOnWriteArrayList
or
Collections::synchronizedList
are specified to be thread-safe, but Kotlin's other collection extension methods are not written to be thread-safe.
e
ArrayList.equals(ArrayList) will throw a CME on modification, not a NoSuchElementException or a false positive as your implementation will
d
Throwing
ConcurrentModificationException
would be an improvement, yes, but I think the size check should definitely come first.
equalsArrayList
checks for size equality before doing any element checking. I don't know why
equalsRange
wouldn't, though, that seems very easily wasteful.
String
also checks for size equality before any elements.
e
String is immutable
d
Optimization-wise, though, the initial assumption should be that the lists aren't being modified during the
equals
check, as if they are, the method will fail no longer how long it takes.
Actually, on lists that support comodification checks, the call to
next()
should be sufficient. Looking at
ArrayList
,
next()
calls
checkForComodification()
first, so
NoSuchElementException
will not be called unless the list is modified concurrently with the
next()
check itself, which any
equals
method will risk.
hasNext()
doesn't even have any comodification checks, so skipping it should be fine on any list that conforms to the rules of
List
.
e
there's a gap between when
size()
is called and when
iterator()
is called
d
Ah, we'd have to compare the sizes again after creating the iterators.
Or check the sizes only after creating the iterators, though that would mean that we allocated two objects that we possibly didn't need.
e
or just check that the iterations ran out at the same time
it's simple and as safe as best-effort can be
d
Depending on the complexity of the
equals
method, that can be tons of unnecessary work just to conclude that two lists had different sizes, though.
e
I'm not saying you can't check the size first
but it's super easy and strictly better to not use the size after that