
mitch

08/31/2020, 9:40 AM
@sam hey hey I finally got into this slack! thanks for answering my q on https://github.com/kotest/kotest/issues/1646. I'm currently prepping a PR for that (including introducing the new
value(rs: RandomSource): Sample<A>
). Now a lot of the code is currently using values (it shows the strikethrough in IntelliJ) and I'm fixing those as well. I realized that there are a fair few functions like this one
arb(...)
that want a Sequence in them, how would I go about that?

sam

09/01/2020, 8:54 AM
We would need to keep the original functions, and somehow come up with an alternative builder function that doesn't require sequences, but a simple function. Eg,
arb { n }
If we can't do that in the same package because of signature clashes, then we can either add that to a subpackage (arb.builders?) or add a function to the Arb companion,
fun <T> Arb.Companion.from(f: () -> T): Arb<T>
🎖️ 1
Arb.from could be Arb.builder or Arb.fn or whatever
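A minimal sketch of the builder idea, written against hypothetical stand-in types rather than the real Kotest classes. `Sample`, `SimpleArb` and `arbFrom` are names made up for illustration, and passing a `kotlin.random.Random` into the function is an assumption (the signature floated above takes no argument).

```kotlin
import kotlin.random.Random

// Hypothetical stand-ins for illustration; not the real Kotest types.
data class Sample<out A>(val value: A)

abstract class SimpleArb<out A> {
    abstract fun edgecases(): List<A>
    abstract fun sample(random: Random): Sample<A>
}

// Builds an arb from a plain function instead of a Sequence.
fun <A> arbFrom(edges: List<A> = emptyList(), f: (Random) -> A): SimpleArb<A> =
    object : SimpleArb<A>() {
        override fun edgecases(): List<A> = edges
        override fun sample(random: Random): Sample<A> = Sample(f(random))
    }
```

With this shape, a call such as `arbFrom(listOf(0)) { it.nextInt(1, 10) }` yields one sample per call to `sample`, which is the single emission model being discussed.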

mitch

09/01/2020, 9:14 AM
ah nice. Yeah I was thinking along the same lines
Arb.create
already exists with the correct types, so I wonder if I can just add several more creates
meanwhile I'm hitting another wall: bind. Still thinking how to make that work with the edgecases under the single emission model.
hmm, looking at scalacheck, the way they do it is by using varying frequencies, i.e. something like
Arb.choose(vararg pairs: Pair<Int, Arb<A>>)
in kotest. See https://github.com/typelevel/scalacheck/blob/master/src/main/scala/org/scalacheck/Gen.scala#L1224-L1232
@sam I can make that change but will need to run it by you as to whether that design aligns with what you have in mind

sam

09/01/2020, 9:40 AM
Why is bind problematic with the single emission model ?
Do you mean because of how to handle edgecases? If so, I would just ignore edge cases. They don't really make much sense once you get into composition
Alternatively, the edge cases in bind, could be the permutation of the edge cases of each contributing arb

mitch

09/01/2020, 9:44 AM
yeah tried the permutation approach (cartesian product) but it failed as soon as there's an arb with empty edgecases, i.e.
[1,2,3]
['a', 'b', 'c']
[]

-> yields []

sam

09/01/2020, 9:44 AM
Right, I guess in that case, you just return no edge cases.
If you're trying to compose a, b, c into (a, b, c) and c has no edge cases, then you can't have a union with edge cases.
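The empty-product behaviour can be reproduced with a small standalone sketch. `product` is a hypothetical helper, not a Kotest function; it only illustrates why a single arb without edge cases wipes out the combined edge cases.

```kotlin
// Cartesian product of edge-case lists; a single empty list makes the whole product empty.
fun product(lists: List<List<Any?>>): List<List<Any?>> =
    lists.fold(listOf(emptyList<Any?>())) { acc, list ->
        acc.flatMap { prefix -> list.map { prefix.plusElement(it) } }
    }
```

For `[1,2,3]` and `['a','b','c']` this gives 9 combinations; adding a third, empty list gives none, because every combination needs one element from each list.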

mitch

09/01/2020, 9:46 AM
yeah it's a product after all.. alternatively i'm pretty sure the approach using
Arb.choose
would work. as in, when you do a bind, you'd assign something like weight 1 to the edge cases, and weight 9 to the randomized values
edgecases, I feel, are one of the most powerful features in proptesting, so that's one thing that I don't want to sacrifice
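A rough sketch of the weighted idea, assuming weight 1 for edge cases and weight 9 for random values as in the message above. `weightedNext` is a hypothetical helper, not the ScalaCheck or Kotest API. Note that with this scheme an edge case is only likely to appear in a run, not certain.

```kotlin
import kotlin.random.Random

// Picks an edge case with probability edgeWeight / (edgeWeight + randomWeight),
// otherwise falls back to a freshly generated random value.
fun <A> weightedNext(
    random: Random,
    edgecases: List<A>,
    edgeWeight: Int = 1,
    randomWeight: Int = 9,
    generate: (Random) -> A
): A {
    // With no edge cases, the full weight goes to the random generator.
    val pickEdge = edgecases.isNotEmpty() &&
        random.nextInt(edgeWeight + randomWeight) < edgeWeight
    return if (pickEdge) edgecases.random(random) else generate(random)
}
```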

sam

09/01/2020, 9:48 AM
I'm not sure how many other prop libraries do edge cases like kotest
I think it's acceptable to say, if you're using bind and one of the contributing arbs doesn't provide edge cases, then the product doesn't have edge cases either.
All the "basic" types have edge cases, so most of the time, your bind will too
Another alternative is this - take the permutations of the edge cases and if an arb has zero edge cases, we treat it as an edge case of 1, where that 1 value is random. If all arbs have zero edge cases, then bind has zero edge cases.
[1,2,3]
['a', 'b', 'c']
[] -> we treat this as [random]
That would give you [1, a, random], [2, a, random] and so on
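A sketch of this alternative under stated assumptions: `Source` is a hypothetical pairing of an arb's edge cases with its random generator, and one random value is drawn per empty source and reused across all combinations.

```kotlin
import kotlin.random.Random

// Hypothetical pairing of an arb's edge cases with its random generator.
class Source(val edgecases: List<Any?>, val next: (Random) -> Any?)

fun productWithRandomFill(sources: List<Source>, random: Random): List<List<Any?>> {
    // If no source has edge cases, the bind has none either.
    if (sources.all { it.edgecases.isEmpty() }) return emptyList()
    // An empty edge-case list is treated as a list holding one random value.
    val columns = sources.map { s ->
        if (s.edgecases.isEmpty()) listOf(s.next(random)) else s.edgecases
    }
    return columns.fold(listOf(emptyList<Any?>())) { acc, column ->
        acc.flatMap { prefix -> column.map { prefix.plusElement(it) } }
    }
}
```

The extra `random` parameter is exactly the cost of this option: computing the edge cases now needs access to a random source.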

mitch

09/01/2020, 9:53 AM
hang on - edgecases don't have access to the random seed, do they?

sam

09/01/2020, 9:53 AM
Yeah that's a good point
we could introduce it

mitch

09/01/2020, 9:58 AM
Hmm.. we can do that, I'm not entirely sure that fits with the interface. I kinda like the edgecases being a list right now. So far we have a couple of options:
• cartesian product, with an empty list treated as a randomized list of 1. this needs access to rs
• the scalacheck approach using weighted arbs. no change needed in the interface.
I might bang my head to the keyboard a few more times for some inspiration.

sam

09/01/2020, 10:02 AM
How does the scalacheck one work? If the arb has no edge cases, then what does the weighting do?

mitch

09/01/2020, 10:03 AM
it's going to give the full weight to
value(rs)
i suppose

sam

09/01/2020, 10:03 AM
I think the weighting in scalacheck is how they do edgecases. Give a bigger weighting to edge cases. I don't like that because you're not guaranteed to get edge cases, which I feel you should be.
I think any function accepting an int should always be tested with 0 for example. Not just "likely" tested with 0.

mitch

09/01/2020, 10:04 AM
yeah agreed

sam

09/01/2020, 10:04 AM
I also agree with you that I like edgecases to not require a random source.
The only other thing I can think of, is to introduce a subtype of Arb,
ComposedArb
that has extra method(s) for dealing with this kind of thing.
You would require the property test framework to then be aware of the extra value in the ADT
My final suggestion is to have bind accept a random parameter of its own. This would require the user to provide their own random instance, or it could default.

mitch

09/01/2020, 10:11 AM
I'm tempted to experiment with the ComposedArb idea. The final suggestion feels quite leaky as an abstraction for the users.

sam

09/01/2020, 10:12 AM
Have a play with ComposedArb and see what you can come up with. Probably some function that generates edge cases from a random source delegating to the underlying arbs.
❤️ 1

mitch

09/01/2020, 10:12 AM
what's in my head right now is as if
edgecases(): Exhaustive<A>
(conceptually)

sam

09/01/2020, 10:13 AM
I think it would need to stay as List to avoid breakage

mitch

09/01/2020, 10:14 AM
yeah definitely a list - sorry that's just how it felt like in my head.
👍🏻 1
because exhaustive.toArb does shove the values into a list and randomly choosing one

sam

09/01/2020, 10:15 AM
Right
abstract class ComposedArb<out A> : Gen<A>() {

   abstract fun edgecases(rs: RandomSource): List<A>

   abstract fun value(rs: RandomSource): Sample<A>
}
It doesn't feel quite right, given that it's just an Arb with the rs parameter in edgecases

mitch

09/01/2020, 10:17 AM
indeed 🤔

sam

09/01/2020, 10:17 AM
What about if edgecases was to return an ADT itself.

mitch

09/01/2020, 10:18 AM
something like a
(RandomSource) -> List<A>
kleisli?

sam

09/01/2020, 10:18 AM
sealed class Edgecases<out A> {

   abstract fun values(rs: RandomSource): List<A>

   class Static<out A>(val values: List<A>) : Edgecases<A>() {
      override fun values(rs: RandomSource): List<A> = values
   }
   class Dynamic<out A>(val fn: (RandomSource) -> List<A>) : Edgecases<A>() {
      override fun values(rs: RandomSource): List<A> = fn(rs)
   }
}
yea

mitch

09/01/2020, 10:19 AM
ha that's so good

sam

09/01/2020, 10:19 AM
The trick is making that work with the existing signature
Again we could introduce a sibling function,
fun edges(): Edgecases<A>
🎖️ 1

mitch

09/01/2020, 10:21 AM
i think this is less of a user-facing thing, the old builder would still work because it's
_ -> List<A>

sam

09/01/2020, 10:21 AM
abstract class Arb<out A> : Gen<A>() {

   open fun edges(): Edgecases<A> = Edgecases.Static(edgecases())

   abstract fun edgecases(): List<A>

   abstract fun values(rs: RandomSource): Sequence<Sample<A>>

   companion object
}
Yes, some people do override Arb directly though. I also completely rewrote the prop test framework for 4.0.0 so I'm loath to introduce any breaking change now, no matter how small. Best to have stability for users who spent the time converting from 3.x to 4.x
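A self-contained sketch of the backward-compatible shape, using hypothetical stand-ins: `SketchArb`, `IntArb` and `PairArb` are made-up names, and `kotlin.random.Random` stands in for `RandomSource`. It is meant only to show how a sibling `edges()` with a default body could leave existing `edgecases()` overrides untouched.

```kotlin
import kotlin.random.Random

// Hypothetical stand-ins, not the real Kotest classes.
sealed class Edgecases<out A> {
    abstract fun values(random: Random): List<A>

    class Static<out A>(private val list: List<A>) : Edgecases<A>() {
        override fun values(random: Random): List<A> = list
    }

    class Dynamic<out A>(private val fn: (Random) -> List<A>) : Edgecases<A>() {
        override fun values(random: Random): List<A> = fn(random)
    }
}

abstract class SketchArb<out A> {
    // Existing overriders keep working: edges() wraps whatever edgecases() returns.
    open fun edges(): Edgecases<A> = Edgecases.Static(edgecases())
    open fun edgecases(): List<A> = emptyList()
    abstract fun sample(random: Random): A
}

// An arb written against the old contract, overriding only edgecases().
class IntArb : SketchArb<Int>() {
    override fun edgecases(): List<Int> = listOf(0, Int.MIN_VALUE, Int.MAX_VALUE)
    override fun sample(random: Random): Int = random.nextInt()
}

// A composed arb using the new hook, where edge cases may depend on the random source.
class PairArb<A, B>(
    private val left: SketchArb<A>,
    private val right: SketchArb<B>
) : SketchArb<Pair<A, B>>() {
    override fun edges(): Edgecases<Pair<A, B>> = Edgecases.Dynamic { random ->
        val ls = left.edges().values(random)
        val rs = right.edges().values(random)
        ls.flatMap { l -> rs.map { r -> l to r } }
    }

    override fun sample(random: Random): Pair<A, B> =
        left.sample(random) to right.sample(random)
}
```

The framework would call `edges()` everywhere, so both the old override point and the new one feed into the same path.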

mitch

09/01/2020, 10:22 AM
ah yeah me dumb, didn't consider peeps who override edgecases directly

sam

09/01/2020, 10:23 AM
But that works, if you override edgecases, you can continue to do so. Or you can override edges().

mitch

09/01/2020, 10:23 AM
users who spent the time converting from 3.x to 4.x
i can relate lol
😂 1

sam

09/01/2020, 10:23 AM
I think I prefer this to composed arb
I don't like the terminology of static and dynamic, so better names perhaps
Maybe we don't even bother with the ADT,
typealias Edgecases<A> = (RandomSource) -> List<A>

mitch

09/01/2020, 10:27 AM
or simply a data class for now:
data class Edgecases<A>(val edges: (RandomSource) -> List<A>)
i'll get something ready,

sam

09/01/2020, 10:27 AM
Although, arguing with myself, there's not much difference between returning a function that accepts a random source and just passing random source into the function
Let's take the Sequence -> Sample problem first, and leave bind as it is for now. Then we can introduce edge cases as a second PR.
✔️ 1
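sealed class Edgecases<out A> {
   abstract fun values(rs: RandomSource): kotlin.collections.List<A>
   // the subclass is named List, so the collection type is fully qualified here
   class List<out A>(val values: kotlin.collections.List<A>) : Edgecases<A>() {
      override fun values(rs: RandomSource): kotlin.collections.List<A> = values
   }
   class Randomized<out A>(val fn: (RandomSource) -> kotlin.collections.List<A>) : Edgecases<A>() {
      override fun values(rs: RandomSource): kotlin.collections.List<A> = fn(rs)
   }
}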

mitch

09/01/2020, 10:32 AM
awesome! yeah personally i like having something that's typed (like
Sample<A>
and in this case
Edgecases<A>
), which makes it easier to enrich and refactor in the future.
👍🏻 1

sam

09/01/2020, 10:34 AM
Ok sounds like a plan

mitch

09/01/2020, 10:34 AM
thanks @sam!! that's awesome, let me prep something for you
👍🏻 1
@sam phew - i suppose i'll check this in first - the first PR to introduce single emission is up. https://github.com/kotest/kotest/pull/1688
I refrained from changing all the arbs in kotest-property. I suppose I'd need to focus on backward compatibility (and given that we haven't introduced
Edgecases<A>
yet)
@sam qq - i'm currently doing some experiment on bind with
Edgecases<A>
I'm not entirely convinced by the cartesian product approach, as the minimum number of iterations can explode with the number of edgecase combinations, e.g.
a - [1, 2, 3]
b - [1, 2, 3, 4]
c - [1, 2]
d - [1, 2, 3, 4, 5]
e - [] - randomize 1 element
if we were to compute the product we'll end up with 3 * 4 * 2 * 5 * 1 = 120 minimum iterations. currently Kotest takes the edgecases linearly due to generate(rs) and iterator.
an alternative would be to follow what kotest currently does (linearly take them), but then the combo between a,b,c,d, and e will be very deterministic:
(a, b, c, d, e)
(1, 1, 1, 1, r)
(2, 2, 2, 2, r)
(3, 3, r, 3, r)
(r, 4, r, 4, r)
(r, r, r, 5, r)
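The linear selection in the table above can be sketched as follows. `Column` and `linearRows` are hypothetical names; the point is only that the row count equals the longest edge-case list (5 here) rather than the product (120).

```kotlin
import kotlin.random.Random

// Hypothetical pairing of one input's edge cases with its random generator.
class Column(val edgecases: List<Any?>, val next: (Random) -> Any?)

// Row i takes the i-th edge case of every column,
// or a random value once that column has run out of edge cases.
fun linearRows(columns: List<Column>, random: Random): List<List<Any?>> {
    val rows = columns.maxOfOrNull { it.edgecases.size } ?: 0
    return (0 until rows).map { i ->
        columns.map { c -> c.edgecases.getOrElse(i) { c.next(random) } }
    }
}
```

Because each row index is reused across all columns, combinations such as (1, 2, ...) never appear, which is the determinism concern raised in this message.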
I'm starting to reconsider this:
Do you mean because of how to handle edgecases? If so, I would just ignore edge cases. They don't really make much sense once you get into composition
questioning what's the correct thing to do in
bind
and
flatMap
case..
we can somewhat shuffle the list first (because we have access to random seed). However, in terms of flatMap the size of the list is still a problem. we can randomly pick candidates out of the edgecases, but that's going to be similar to the scalacheck approach using frequency 🤔

sam

09/06/2020, 5:03 PM
Tricky questions. The point of edgecases is to make some parts of the selection of values deterministic. Like I believe it's important to always test (Int) -> T with 0 for example. Most (all?) other frameworks are not going to guarantee you'll get the 0 every time.
So it appears our options are: a) combinatorial explosion b) linear select losing some combinations c) randomize the edge cases
a) is clearly the best option if it would work. I prefer b over c but other people may have a different opinion.
The framework will raise an error if we try to use an arb that generates more edge cases than the iteration count of our prop test. So if you have 2500 edge cases, and you try to use that arb in a forAll that only has 1000 iterations, it errors.
So that puts the focus on the user to not use bind in situations where you will end up with millions of edge cases (or you increase your iteration count appropriately).
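As a rough illustration of that rule, a guard of the following shape would reject the 2500 vs 1000 case. `requireEnoughIterations` and its message are hypothetical, not Kotest's actual check or wording.

```kotlin
// Hypothetical guard mirroring the rule described above; not Kotest's actual code.
fun requireEnoughIterations(edgecaseCount: Int, iterations: Int) {
    require(edgecaseCount <= iterations) {
        "Arb produces $edgecaseCount edge cases but the test runs only $iterations iterations"
    }
}
```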
An interesting function to add would be
Arb.randomOnly()
which returns a copy of the arb but with the edge cases removed (that randomOnly name is a bit lame though) and Arb.withEdgeCases to copy the arb with different edge cases.
Then you could use bind, and drop the edge cases from one or more of the inputs, or all of them from the result.
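A possible shape for those two helpers, sketched on a hypothetical `MiniArb` rather than the real Kotest `Arb`. The names `randomOnly` and `withEdgeCases` come from the message above; everything else is assumed.

```kotlin
import kotlin.random.Random

// Hypothetical minimal arb; the real Kotest Arb has a different shape.
class MiniArb<A>(val edgecases: List<A>, val next: (Random) -> A)

// Copy of the arb with its edge cases removed.
fun <A> MiniArb<A>.randomOnly(): MiniArb<A> = MiniArb(emptyList(), next)

// Copy of the arb with its edge cases replaced.
fun <A> MiniArb<A>.withEdgeCases(edgecases: List<A>): MiniArb<A> = MiniArb(edgecases, next)
```

Both return new instances and leave the original arb unchanged, so an input can be stripped of edge cases for one bind without affecting other tests.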

mitch

09/07/2020, 12:34 AM
we can even offload those decisions to the user at execution. With the single-emission and new edgecases model that supports random seed, it's possible to encode both behaviours at time of generation. I'm thinking out loud here, but it might look like this:
fun generate(rs: RandomSource, edgesSamplingMode: Edgecases.SamplingMode = Edgecases.SamplingMode.Exhaustive) = TODO() // implementation detail

// i'm making stuff up here
sealed class SamplingMode {
    object Exhaustive : SamplingMode() // all exhaustive permutations

    // if we want to dirty our generated distribution with edgecases with fixed probability
    data class FixedProbabilitySampling(val samplingProbability: Double = 0.1) : SamplingMode()

    // if we want more control over the ratio of the distribution
    data class DynamicProbabilitySampling(val startProbability: Double = 1.0, val targetProbability: Double = 0.1, val decayFunction: (previousProbability: Double, iterations: Int) -> Double) : SamplingMode()
}
for these options the advantage of shuffling is so that we can test all edgecases combination uniformly, i.e.
(a, b, c, d, e)
(1, r, 1, 3, r)
(2, 2, 3, 2, r)
(3, 1, r, r, r)
(r, 4, 3, 1, r)
(r, 3, r, 5, r)
It's not as powerful as the exhaustive combinations, because it gives only an approximation of it. If things fail, dev will be presented with the random seed which they can use. I won't be surprised if this is a nice compromise for a system with many inputs.
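To make the two behaviours concrete, here is a reduced sketch with only the exhaustive and fixed-probability modes (the decay variant is left out). `nextValue` and the shortened mode names are hypothetical, and `kotlin.random.Random` stands in for `RandomSource`.

```kotlin
import kotlin.random.Random

sealed class SamplingMode {
    object Exhaustive : SamplingMode()
    data class FixedProbability(val probability: Double = 0.1) : SamplingMode()
}

// Decides, per iteration, whether to emit an edge case or a random value.
fun <A> nextValue(
    mode: SamplingMode,
    iteration: Int,
    edgecases: List<A>,
    random: Random,
    generate: (Random) -> A
): A = when (mode) {
    // Exhaustive: walk every edge case once, then switch to random values.
    is SamplingMode.Exhaustive -> {
        if (iteration < edgecases.size) edgecases[iteration] else generate(random)
    }
    // FixedProbability: mix edge cases into the stream at a constant rate.
    is SamplingMode.FixedProbability -> {
        val pickEdge = edgecases.isNotEmpty() && random.nextDouble() < mode.probability
        if (pickEdge) edgecases.random(random) else generate(random)
    }
}
```

Only the exhaustive branch guarantees that every edge case is seen; the probabilistic branch trades that guarantee for a bounded number of iterations.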

sam

09/07/2020, 12:55 AM
It's a solution but it does go against the concept of edgecases a bit

mitch

09/07/2020, 3:32 AM
yeah I see your perspective, which I also strongly agree with. (I just happen to have been dealing with some developers who have a different perspective on this.) I think I like this approach. Users who want to turn their edgecases into more of a probabilistic Arbitrary can do that instead. Meanwhile I believe Kotest should do what it should do, i.e. cover all the edgecase permutations.
An interesting function to add would be 
Arb.randomOnly()
  which returns a copy of the arb but with the edge cases removed (that randomOnly name is a bit lame though) and Arb.withEdgeCases to copy the arb with different edge cases.
I believe I'll close #1646 and raise a new issue for this

sam

09/07/2020, 4:07 AM
Alright cool
we're making great improvements with this work
or you are

mitch

09/07/2020, 8:00 AM
Thanks to you! I like the framework. And my secret agenda is to make Kotest the go-to standard for testing Kotlin projects in my company. 😉
it’s by far superior to all the others
❤️ 1
and to me, the major selling point is definitely the property testing bit
💯 1