
rpillay

02/19/2021, 6:20 AM
I think we should revisit this as a topic of discussion - At the moment, this would be a blocker to upgrading my kotest version, but more importantly it's a bit of a blocker to properly setting up edgecases - there's an incentive to just ignore them so that our tests don't balloon out into the hundreds of thousands of iterations. What do you think @sam and @mitch?

mitch

02/19/2021, 6:20 AM
@sam i think we touched upon this topic a while back
I wonder what you’d think about introducing some kind of knob to tell kotest to do either the cartesian product or randomly weighted edgecases (i.e. the scalacheck way)

sam

02/19/2021, 6:23 AM
that link doesn't load for me
@rpillay how many arbs was that? and how big was the bind?

sam

02/19/2021, 6:24 AM
So basically the problem is, with arity k where k > some n, the edgecases explode and it all becomes silly

mitch

02/19/2021, 6:24 AM
yeah, which makes sense i guess.. at a certain point you’d have exponential growth
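
To make the blow-up concrete, here is a rough sketch of the arithmetic. It assumes every arb contributes the same three edge cases (0, -1, 1) and that the combined edge cases are the full cartesian product; `edgecaseCombinations` is a hypothetical helper for illustration, not a Kotest function.

```kotlin
// Hypothetical helper, not part of Kotest: multiplies the edge case counts
// of each arb to get the size of their cartesian product.
fun edgecaseCombinations(edgecasesPerArb: List<Int>): Long =
    edgecasesPerArb.fold(1L) { acc, n -> acc * n }

fun main() {
    // Assumption for illustration: each arb has 3 edge cases (0, -1, 1).
    for (arity in listOf(2, 5, 10)) {
        val total = edgecaseCombinations(List(arity) { 3 })
        println("arity=$arity -> $total edge case combinations")
    }
    // prints 9, 243 and 59049 combinations respectively
}
```

With three edge cases per arb the count is 3^k, so ten arbs already need 59049 iterations before a single random value is tried.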

sam

02/19/2021, 6:24 AM
right
It seems like the edgecases idea is good - 0, -1 and 1 are "better" values than 121234578234872394

mitch

02/19/2021, 6:25 AM
😆 that’s a big number, and yes

sam

02/19/2021, 6:25 AM
but not if it means the first 20000 tests are all edgecases

mitch

02/19/2021, 6:26 AM
i wonder if we should revisit the probabilistic edgecases idea

sam

02/19/2021, 6:27 AM
yes
Also I wonder if we should do something like, the first test picks random edge cases, but after that it's completely random

rpillay

02/19/2021, 6:27 AM
Yeah. Edgecases are definitely a useful concept, and if Kotest peppered them in (at a greater frequency than random values), I think we'd get a lot of the value without the downsides of a huge test space

sam

02/19/2021, 6:28 AM
that makes me think that a useful idea might be, edge cases are just included in the sample values, and each time a value is requested, we randomly pick either an edge case or a truly random value
so edge cases are not guaranteed, but likely
I guess the whole point of property testing is that over time you include a lot of values, you cannot ever include all values every time
does this give rise to a concept of "ranking" values? You want an Int generator, it's more likely to generate 0, 1 or -1 than 21323423

mitch

02/19/2021, 6:30 AM
i’m gonna try to research again how scalacheck does their thing
i mean they have that kind of behaviour, surely

rpillay

02/19/2021, 6:30 AM
Would you ever want a rank other than "edgecase - try more often" and "truly random"?

sam

02/19/2021, 6:31 AM
I've looked into scalacheck, it's quite basic (imo)

rpillay

02/19/2021, 6:31 AM
I can't see a case where saying that say, -1 is more likely to be a problem than 0

sam

02/19/2021, 6:31 AM
The best one is the haskell one that only does edgecases, hedgehog or something

mitch

02/19/2021, 6:31 AM
i haven’t used that
"I can’t see a case where saying that say, -1 is more likely to be a problem than 0"
i think this actually makes sense, it depends on the system that you’re testing

sam

02/19/2021, 6:32 AM
My idea of edgecases @rpillay was that -1 / 1 / 0 is more likely to throw errors than 2342342
It might not necessarily be true I suppose
but things like off by one errors
Or for strings, do we always test for "" or a single unicode (non ascii) character

rpillay

02/19/2021, 6:34 AM
Yeah, I get that (and agree - edge cases like that are more likely to cause problems). I thought you were suggesting ranking them further - which I don't see a use for

sam

02/19/2021, 6:35 AM
I'm wondering if we just say, pick 50/50 (or whatever) from an edge case vs a real random value
rather than currently where we iterate all edge cases first
in other words, we just promote edgecases to make them more likely to appear
we kind of have explored this already with some work in 4.5
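
The 50/50 idea above could be sketched roughly like this. It is only an illustration using `kotlin.random.Random`: `SketchGen` and its `edgecaseProbability` parameter are hypothetical names, and this is not how Kotest's `Gen` or `Arb` is actually implemented.

```kotlin
import kotlin.random.Random

// Hypothetical sketch, not Kotest's real API: a generator that knows its
// edge cases and how to produce an ordinary random value.
class SketchGen<T>(
    private val edgecases: List<T>,
    private val edgecaseProbability: Double, // assumed knob, 0.5 means "50/50"
    private val randomValue: (Random) -> T,
) {
    init {
        require(edgecaseProbability in 0.0..1.0) { "probability must be within 0.0..1.0" }
    }

    // Every request independently picks either an edge case or a random value,
    // so edge cases are likely to show up but never guaranteed.
    fun next(rs: Random): T =
        if (edgecases.isNotEmpty() && rs.nextDouble() < edgecaseProbability) {
            edgecases[rs.nextInt(edgecases.size)]
        } else {
            randomValue(rs)
        }
}

fun main() {
    val rs = Random(42) // fixed seed so a run can be repeated
    val ints = SketchGen(listOf(0, -1, 1), 0.5) { it.nextInt() }
    val samples = List(1000) { ints.next(rs) }
    val hits = samples.count { it in setOf(0, -1, 1) }
    println("edge cases in 1000 samples: $hits")
}
```

Because each value is drawn independently, combining several such generators never forces a cartesian product: the number of iterations stays whatever the test asks for, and the probability only controls how often edge cases are mixed in.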

mitch

02/19/2021, 6:36 AM
Nice. i think we can improve this further but i think we can make that work with the current ingredients. One question, say if we implement this - would that mean:
• a) we discard the idea of cartesian product and go all in on probabilistic, or
• b) we keep the cartesian product but give users a way to configure various behaviours?
i’m keen to keep that simple for now, like give some sort of configurable probability of edgecases appearing.

sam

02/19/2021, 6:38 AM
I'm open to anything

mitch

02/19/2021, 6:38 AM
on ranking maybe that’s the next iteration of the feature

sam

02/19/2021, 6:38 AM
I'm happy to release 5.0 if we come up with something great but breaking

mitch

02/19/2021, 6:38 AM
with probabilistic edgecases it’s possible to do it with 100% backward compatibility
because essentially that method is governed in Gen.generate(rs) i believe

sam

02/19/2021, 6:40 AM
Right

rpillay

02/19/2021, 6:41 AM
I'd be in favour of a) discard cartesian products. Mostly because it seems unlikely that you'd need the configurability. There seem to be other options within Kotest if you need that sort of thing - exhaustives instead of arbs, for example (?)

mitch

02/19/2021, 6:45 AM
hmm yeah that’s a good point, i haven’t been playing too much with exhaustives
@rpillay let’s raise an issue so that we can capture this properly
i’ll try looking into it soon

rpillay

02/19/2021, 6:48 AM
👍 I'll raise the issue

mitch

02/19/2021, 6:57 AM
awesome! thanks @rpillay
@rpillay @sam PR’s up, it’s a simple but widespread change. got some qs for Sam there. need you guys’ input to see what’s best for kotest. 🎉 https://github.com/kotest/kotest/pull/2126

sam

02/23/2021, 10:17 PM
I saw the PR email, and I'll review it as soon as I can. I know it'll be good 🙂

rpillay

02/23/2021, 10:17 PM
:awesome:

mitch

02/23/2021, 10:17 PM
you’re awesome Sam, my inbox is a bit unworldly to look at, i need to do some spring cleaning to get rid of far too much junk