# language-evolution
d
Would it be useful to have specialized versions of e.g.
map
and
flatMap
? Currently
map
on Iterable always returns a
List<T>
, but what if you applied map to a set and wanted a set as a result? There is
mapTo
, but it is a bit cumbersome to use. What about
mapToSet
or something? Then you could also have a proper initial size for the backing set
The implementation for
mapTo
just adds every transformed item to the destination, so if you pass a new HashSet as a destination, it could be resized a couple of times
m
I'd think this is like this because it can't be guaranteed in general that whatever you do in
map
will keep the uniqueness of the items. Your map function could e.g. map everything to the same value.
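A minimal sketch of that point, using hypothetical helper names (`parities`, `distinctParities`) that are not from the thread: mapping a three-element set can yield duplicate values, so the result only stays a set if duplicates are dropped.

```kotlin
// Hypothetical helpers for illustration only.

// map on a Set returns a List, so duplicate results are kept.
fun parities(numbers: Set<Int>): List<Int> = numbers.map { it % 2 }

// mapTo with a set destination silently drops the duplicates.
fun distinctParities(numbers: Set<Int>): Set<Int> =
    numbers.mapTo(mutableSetOf()) { it % 2 }
```

With `setOf(1, 2, 3)` as input, the list version keeps three elements while the set version keeps two.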
e
HashSet
sizing is additionally kind of tricky because even if the items don't collide, their hash codes may, or at least they might fall into the same buckets at certain sizes
d
Funnily enough, Kotlin's
toSet
also uses that heuristic in calculating the initial size for the set
So maybe it can be reused in this case
k
It's the same with the JDK. Basically, you need to choose your load factor based on likelihood of hash collisions. The JDK lets you choose it, but Guava uses a hardcoded default of 0.75.
If your JVM target is >= 19, you can do
mapTo(HashSet.newHashSet(size)) { ... }
.
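For JVM targets below 19, a rough equivalent can be sketched by hand. The helper name `hashSetWithExpectedSize` is hypothetical, and the calculation assumes the default load factor of 0.75 mentioned above.

```kotlin
// Hypothetical helper: a HashSet sized so that expectedSize elements should fit
// without a resize, assuming the default load factor of 0.75.
fun <T> hashSetWithExpectedSize(expectedSize: Int): HashSet<T> {
    require(expectedSize >= 0) { "expectedSize must be non-negative" }
    // Float-to-Int conversion saturates at Int.MAX_VALUE, so this cannot overflow.
    val capacity = (expectedSize / 0.75f + 1f).toInt()
    return HashSet<T>(capacity)
}

// Usage sketch: the destination is pre-sized to the number of source elements,
// which is an upper bound on the size of the result.
fun lengths(words: Collection<String>): Set<Int> =
    words.mapTo(hashSetWithExpectedSize<Int>(words.size)) { it.length }
```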
p
Another downside to
mapTo
and
filterTo
etc. is that the passed destination needs to be mutable, but usually I'm after an immutable result. I've taken to doing something like
buildSet(size) { a.mapTo(this) { ... } }
which I could/should probably wrap into a
fun Iterable<T>.mapToSet
helper...
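A possible shape for such a helper, as a sketch only. It is declared on `Collection<T>` rather than `Iterable<T>` (an assumption made here) because `buildSet(capacity)` needs a known size up front.

```kotlin
// Hypothetical helper, not part of the standard library.
// The source size is passed to buildSet as the expected capacity; the result
// may be smaller if the transform maps several inputs to the same value.
fun <T, R> Collection<T>.mapToSet(transform: (T) -> R): Set<R> =
    buildSet<R>(size) { this@mapToSet.mapTo(this, transform) }
```

The caller only ever sees the read-only `Set<R>` type, which matches the intent described above.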
e
.toSet()
doesn't bother wrapping in an immutable set, for the record
p
It doesn't wrap it, but it does return the immutable interface (I know ... you can always cast back to mutable and break things, but it's a matter of intent imo)
I suppose a helper of
mapToSet(f) = mapTo(mutableSetOf(), f) as Set<T>
kinda thing would do the same
e
or even with
.mapTo()
e.g.
val output: Set<...> = list.mapTo(mutableSetOf()) { ... }
k
Some functions take an optional lambda, e.g.
first
can be called as
collection.first()
or
collection.first { it > 42 }
. It could be nice if
toSet()
had an overload
toSet(transformation: (T) -> R)
.
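As a sketch of what that overload might look like if written as a user-defined extension (this is hypothetical; the standard library has no such overload):

```kotlin
// Hypothetical overload, mirroring how first() and first { ... } coexist.
// It lives alongside the standard parameterless toSet().
fun <T, R> Iterable<T>.toSet(transform: (T) -> R): Set<R> =
    mapTo(LinkedHashSet<R>(), transform)

// Usage sketch: the call with a lambda picks the overload above.
fun squares(numbers: List<Int>): Set<Int> = numbers.toSet { it * it }
```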
e
I think that's tricky: there's a combinatorial explosion of potential
indexedToSet()
notNullToSet()
flattenToSet()
etc. The existence of
mapTo()
etc. is already enough for all common cases
d
Actually, why doesn't
map
on
Set
simply return
Set
, without any additional ceremony? I wouldn't mind if this was the default but of course it would be an unnecessary breaking change now. Still, I'm curious about the design decisions behind this.
d
Yeah for
map
it can make sense, as uniqueness of the returned elements is not guaranteed and that may lead to surprises, but
Set<T>.filter()
also returns a
List<T>
while filtering a set will always yield a (mathematical) set with the same or fewer elements
Most (if not all) of these kinds of operations are defined on
Iterable<T>
returning a
List<T>
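To illustrate with hypothetical helper names (`evens`, `evensAsSet`): filtering a set gives back a list, and getting a set currently takes `filterTo` with an explicit destination.

```kotlin
// Hypothetical helpers for illustration only.

// filter is defined on Iterable, so the result is a List even for a Set receiver.
fun evens(numbers: Set<Int>): List<Int> = numbers.filter { it % 2 == 0 }

// filterTo with a set destination keeps the result a Set.
fun evensAsSet(numbers: Set<Int>): Set<Int> =
    numbers.filterTo(mutableSetOf()) { it % 2 == 0 }
```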
e
well, that's tricky. not all Sets are the same
for example,
val a = sortedSetOf(Collator.getInstance(Locale.US), "\u00C1")
val b = a.toSet()
"A\u0301" in a
"A\u0301" !in b
d
@ephemient I'm probably lacking some knowledge here 🙂 What is your point exactly?
k
That's a great example of how different kinds of set behave differently!
d
I think his point is that sets are collections of things that are different or unique, but whether something is different or unique depends on the uniqueness constraint the set is built with. His example may be a bit obscure, but you can have a case-insensitive set where a == A and a case-sensitive set where a != A
This quirk of sets can also make equality asymmetric, i.e. it can happen that a == b (or a.equals(b), according to a), while at the same time b != a
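The case-insensitive variant can be sketched like this. The function names are hypothetical, and the comparator-backed set is built with `sortedSetOf` and `String.CASE_INSENSITIVE_ORDER`.

```kotlin
// Hypothetical helpers for illustration only.

// A set whose notion of "same element" ignores case.
fun caseInsensitiveSet(): Set<String> = sortedSetOf(String.CASE_INSENSITIVE_ORDER, "a")

// A regular hash-based set that relies on String.equals.
fun caseSensitiveSet(): Set<String> = hashSetOf("A")

// Membership changes once the comparator-backed set is copied with toSet().
fun containsUpperA(set: Set<String>): Boolean = "A" in set

// Equality is evaluated from the receiver's point of view, so swapping the
// operands can change the answer.
fun equalsBothWays(a: Set<String>, b: Set<String>): Pair<Boolean, Boolean> =
    (a == b) to (b == a)
```

Here `containsUpperA(caseInsensitiveSet())` holds, but it no longer holds for `caseInsensitiveSet().toSet()`, and `equalsBothWays(caseInsensitiveSet(), caseSensitiveSet())` gives different answers in the two directions.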
d
different kinds of set behave differently
Even though this might be true, I still don't see why a map on a set shouldn't produce a set by default. In most code I've written or seen, there's a sequence
set.map { ... }.toSet()
. Obviously I can write the code myself and it's not such a big deal. I'm just interested in why it's not the default for sets.
Also, if I'm not mistaken, some JVM language behaves that way.