Is there a way in kotlin to do something like that...
# getting-started
s
Is there a way in kotlin to do something like that:
val someVar = "some value"
if(some-condition && someVar in(valueOne,valueTwo) ){
}
a
you could use a set:
if (foo && setOf("a", "b").contains(someVar)) {
}
r
contains
is an operator function, so
someVar in setOf("a", "b")
works as well
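To illustrate the point about the operator function, here is a minimal sketch. The function names and the values "a" and "b" are placeholders for illustration, not something from the original question:

```kotlin
// Explicit call to the `contains` operator function.
fun isAllowedViaContains(someVar: String): Boolean =
    setOf("a", "b").contains(someVar)

// `x in c` is resolved by the compiler to `c.contains(x)`, so this is the same check.
fun isAllowedViaIn(someVar: String): Boolean =
    someVar in setOf("a", "b")

fun main() {
    println(isAllowedViaContains("a")) // true
    println(isAllowedViaIn("a"))       // true
    println(isAllowedViaIn("c"))       // false
}
```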
s
Nice, Thank you !
c
setOf
is quite expensive though.
if (foo && (someVar == "a" || someVar == "b"))
is much faster and less memory-intensive
j
if you're going to optimize that, why not === ?
I think it depends on the use context
a
does the Kotlin compiler do some clever optimizations here? And/or could it?
c
@August Lilleaas it couldn't,
setOf
means the comparison is based on hashCode+equals, the compiler cannot know if it's safe to replace or not
@Jonathan Locke because
===
on strings is a trap
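A short sketch of why that is a trap, assuming the JVM target. Two identical string literals usually share one interned object, so `===` can appear to work in a quick test and then fail for strings built at runtime. The helper name below is made up for the example:

```kotlin
// Hypothetical helper: builds "ab" at runtime, so on the JVM the result is
// a new String object rather than the interned literal.
fun buildAtRuntime(): String = StringBuilder("a").append("b").toString()

fun main() {
    val literal = "ab"
    val built = buildAtRuntime()

    println(literal == built)  // true: structural equality, calls equals()
    println(literal === built) // false on the JVM: different object references
}
```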
w
Also, in most cases thinking about such optimizations will be a premature optimization. It's good to be aware of the performance overhead, but I'd always recommend starting with more "kotlinic" constructs such as
a in xOf(b, c)
. Especially when you're new to Kotlin or JVM languages. JVM has great profilers that can help you find your performance bottlenecks when necessary 🙂. But if you enjoy thinking about performance, you should of course! Everyone should find their way of enjoying Kotlin 🔥
c
I disagree. Premature optimization is one thing, but creating an entire set (one of the most complex data structures we have) just to test the equality of two objects is just waste. Also now you have to understand why that code doesn't work if your
hashCode
is wrong (and beginners' implementations of
hashCode
will be wrong).
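As a sketch of what a wrong hashCode looks like in practice, assume the JVM and a made-up class `Tag` whose `equals` ignores case while its `hashCode` does not. The list lookup only calls `equals`, but the set lookup goes through the hash first and misses:

```kotlin
// Hypothetical class for illustration. It breaks the rule that objects
// which are equal must also return equal hash codes.
class Tag(val name: String) {
    override fun equals(other: Any?): Boolean =
        other is Tag && name.equals(other.name, ignoreCase = true)

    // Bug: case-sensitive hash combined with a case-insensitive equals.
    override fun hashCode(): Int = name.hashCode()
}

fun main() {
    val wanted = Tag("A")
    println(wanted == Tag("a"))                   // true
    println(wanted in listOf(Tag("a"), Tag("b"))) // true: linear search using equals
    println(wanted in setOf(Tag("a"), Tag("b")))  // false: hash-based lookup misses
}
```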
w
I wouldn't use sets either, personally. In my pet projects I would write things like:
private fun <T> oneOf(vararg options: T) = options
a in oneOf(b, c)
(which is actually a micro-optimization as well). I have done quite some profiling on my pet projects (most of them run on Raspberry Pis), and these kinds of things have never been the cause (it's always higher-level repetitions, incorrect threading models, or stupid oversights like
someSet.random()
). And yes, you are right in saying that hashCode might be wrong for beginners, that's a good point. But I'd argue that it reinforces my point: they should be thinking about the semantics of their code first, then the performance. And aside from semantics, writing good hashCode functions can impact performance a lot, and that is even harder to determine. Also, for libraries like Decouple (love the concept btw), with code that will be run quite often, good documentation and a will to attract users, I totally understand that performance plays an important role. It's just that I personally teach beginners mostly about finding a way to enjoy programming first.
c
“should be thinking of the semantics of their code first”: that's exactly why sets shouldn't be used in such trivial cases. A dev should know of the associated costs of what they use, and should be able to quickly compare between solutions.
a == "first" || b == "second"
means “check the first one, then check the second one”. It means exactly how it reads, there are no hidden costs.
setOf
is expensive. Of course, there's a reason it's here, it's very convenient when there are no other solutions, and I'm not criticizing its usage. However, creating and destroying a set in a single line should immediately look wrong to developers. Of course, used once or twice, it won't be an issue. It's still a bad habit that will make code very slow when sprinkled everywhere—and you're right that it won't appear in many profilers. For the original poster: sorry to have derailed your thread that much, you should probably not concern yourself with most of this. In my opinion, just try to keep code as simple as you can (as in: you can explain exactly what it does), because that's how you get a good understanding of when to use things. I encourage you to read on sets and how they are implemented (not to be an expert, having a general understanding is good enough). Like many data structures, they are expensive to create but very efficient to use.
Also, thanks for the comment on Decouple. I agree with you, but it's often the case that developers who started with languages that hide all of this (e.g. Python) have a very hard time understanding the choices between data structures, because they never had to think of them in the first place. Sure on these examples the impact is not much, but this mentality at the level of an entire codebase leads to bad patterns. Also if anyone is curious, the best way (on the JVM) to compare small amounts (a few dozen) of strings is a
when
statement. It creates a jump-table, which is essentially a set that has no construction cost because it's all compile-time. (but indeed, you should not worry about this, I just wanted to add it for completion)
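For reference, a minimal sketch of the `when` form being described. The function name and the three values are placeholders, and whether the compiler really emits a switch here depends on the backend and compiler version, so treat that part as an assumption:

```kotlin
// Membership check written as a `when` over string constants.
// Several constants can share one branch by separating them with commas.
fun isAccepted(someVar: String): Boolean = when (someVar) {
    "a", "b", "c" -> true
    else -> false
}

fun main() {
    println(isAccepted("b")) // true
    println(isAccepted("z")) // false
}
```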
c
@CLOVIS In many cases within my own app, I choose to use the
x in setOf(a, b)
approach knowing it’s inefficient and preferring the readability, but I really don’t know exactly how inefficient it is. Would it be more efficient to use
listOf()
or
arrayOf()
instead of
setOf()
for these kinds of checks? (FWIW, it’s pretty much always with enums that I do this)
c
If it's an enum, and you're on the JVM, you could pre-create
EnumSet
instances. E.g.
import java.util.EnumSet

enum class Foo {
    First,
    Second,
    Third,
    ;

    companion object {
        // of course, find a better name than this
        val firstTwo = EnumSet.of(First, Second)
    }
}

// Example usage:
val a: Foo = TODO()
if (a in Foo.firstTwo) {
    …
}
EnumSet
is great because it is dirt cheap memory-wise and is extremely performant (it stores the enum as a bit field and does binary operations directly on it to implement all methods). I would only really use this if it happens in a lot of places in the app, but to my knowledge that's the only way to get
in
to be
O(1)
. For strings, I believe the best way is to use the bytecode jumptable (
when
), but for two elements I would still go with the
||
approach for readability reasons.
||
becomes inconvenient with 3+ cases, so it's perfect that
when
is this good then. Keep in mind that all of this is for a small number of values. For a large number of values, nothing beats a pre-created Set (that's their reason to exist!).
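A sketch of the pre-created Set case, with made-up data. The point is only that the set is built once and reused, instead of being created and thrown away on every check:

```kotlin
// Hypothetical example data: built once when the file is first loaded.
private val reservedWords: Set<String> =
    setOf("class", "fun", "val", "var", "when", "while", "return", "object")

// Each call reuses the same set, so only the hash lookup is paid per check.
fun isReserved(word: String): Boolean = word in reservedWords

fun main() {
    println(isReserved("when"))  // true
    println(isReserved("hello")) // false
}
```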
s
@Wout Werkman your
oneOf
function is basically a redefinition of
arrayOf
, so why not include the is-in check to make the functionality unique:
fun <T : Any> T.isIn(option1: T, option2: T): Boolean = this == option1 || this == option2

fun <T : Any> T.isIn(option1: T, option2: T, option3: T): Boolean = when {
    this == option1 -> true
    this == option2 -> true
    else -> this == option3
}

// Only this overload allocates: the vararg values are passed as an array.
fun <T : Any> T.isIn(option1: T, option2: T, option3: T, option4: T, vararg other: T): Boolean = when {
    this == option1 -> true
    this == option2 -> true
    this == option3 -> true
    this == option4 -> true
    else -> this in other
}
which you call like this
1.isIn(4, 5, 3, 1)
Vararg creates an array, but this should be cheaper than creating a set!? 🤔
w
Yes, it is indeed exactly arrayOf, I purposely don't use a type alias, so I can safely inline the function once Kotlin has collection literals (
x in [1, 2]
). And you're right, using an array for these contexts is an order of magnitude faster (no hashCode, only one object instantiation, data locality) than a set. Additionally, your variant will be even more space efficient. All flavours are subjective, and I vouch for freedom of choice 🙂
a
I wish Kotlin had Clojure-style “automatic optimizations” for data structures 🙂 Sets in Clojure are arrays until an internal size (8, I believe) is used, precisely because arrays are faster for small collections
c
Kotlin does have a keyword specifically optimized for this though:
when
w
Interesting! Python has (depending on interpreter/JIT compiler) optimizations for this as well: it inlines
x in [a, b]
to
x == a || x == b
, but it does not do this for sets.
c
Python can do that because the user has no control over how data structures work. There is only one list, there is only one set. And even then the user can override the hash so they can't optimize sets.
w
Well, the
when
optimization is actually backend dependent. The JVM might make a jump table or not, but currently the JVM (unlike C# .NET/Core) will never turn your switches or if-else chains into a hashtable lookup.
c
No, because JVM
switch
is limited to primitive values, enums and strings, which are the cases where a jumptable is optimal, so there's no reason to ever use anything else.