< sam> hey hey I finally got into this slack thanks for answ kotlinlang #kotest

mitch

08/31/2020, 9:40 AM

@sam hey hey I finally got into this slack! thanks for answering my q on https://github.com/kotest/kotest/issues/1646 I'm currently prepping a PR for that (including introducing the new

value(rs: RandomSource): Sample<A>

). now a lot of the code is currently using values (it shows as deprecated, with the strikethrough, in intellij) and I'm fixing those as well. I realized that there are a fair few functions like this one

arb(...)

that want a Sequence in them, how would I go about that?

sam

09/01/2020, 8:54 AM

We would need to keep the original functions, and somehow come up with an alternative builder function that doesn't require sequences, but a simple function. Eg,

arb { n }

sam

09/01/2020, 8:58 AM

If we can't do that in the same package because of signature clashes then we can either add that to a subpackage (arb.builders?) or add a function to the arb companion,

fun <T> Arb.Companion.from(f: () -> T): Arb<T>

🎖️ 1

sam

09/01/2020, 8:59 AM

Arb.from could be Arb.builder or Arb.fn or whatever
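
A minimal sketch of the builder idea above, assuming simplified stand-in types. `SimpleArb`, `RandomSource`, `Sample`, and the `edges` parameter are hypothetical names for this illustration, not the actual kotest-property API, and the function here takes the random source (unlike the `() -> T` signature above) so that it can produce a single value per call:

```kotlin
import kotlin.random.Random

// Hypothetical stand-ins for this sketch; not the real kotest-property types.
class RandomSource(val random: Random)
data class Sample<out A>(val value: A)

abstract class SimpleArb<out A> {
    abstract fun edgecases(): List<A>
    abstract fun sample(rs: RandomSource): Sample<A>
    companion object
}

// Builder that accepts a plain function instead of a Sequence.
fun <A> SimpleArb.Companion.from(
    edges: List<A> = emptyList(),
    f: (RandomSource) -> A
): SimpleArb<A> = object : SimpleArb<A>() {
    override fun edgecases(): List<A> = edges
    override fun sample(rs: RandomSource): Sample<A> = Sample(f(rs))
}
```

Usage would look like `SimpleArb.from(listOf(0)) { rs -> rs.random.nextInt(1, 10) }`.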

mitch

09/01/2020, 9:14 AM

ah nice. Yeah i was thinking along the same line

Arb.create

already exists with the correct types, so i wonder if I can just add several more creates

mitch

09/01/2020, 9:15 AM

meanwhile i'm hitting another wall: bind.. still thinking how to make that work with the edgecases with the single emission model..

mitch

09/01/2020, 9:36 AM

hmm, looking at scalacheck, the way they do it is by using varying frequency i.e.

Arb.choose(vararg pairs: Pair<Int, Arb<A>>)

in kotest https://github.com/typelevel/scalacheck/blob/master/src/main/scala/org/scalacheck/Gen.scala#L1224-L1232

mitch

09/01/2020, 9:38 AM

@sam i can make that change but will need to run it by you as to whether that design aligns to what you have in mind

sam

09/01/2020, 9:40 AM

Why is bind problematic with the single emission model ?

sam

09/01/2020, 9:41 AM

Do you mean because of how to handle edgecases? If so, I would just ignore edge cases. They don't really make much sense once you get into composition

sam

09/01/2020, 9:41 AM

Alternatively, the edge cases in bind, could be the permutation of the edge cases of each contributing arb

mitch

09/01/2020, 9:44 AM

yeah tried the permutation approach (cartesian product) but it failed as soon as there's an arb with an empty edgecases. i.e.

[1,2,3]
['a', 'b', 'c']
[]

-> yields []

sam

09/01/2020, 9:44 AM

Right, I guess in that case, you just return no edge cases.

sam

09/01/2020, 9:45 AM

If you're trying to compose a, b, c into (a, b, c) and c has no edge cases, then you can't have a union with edge cases.
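
To illustrate why one empty list wipes out the product, here is a small sketch of a cartesian product over edge case lists. The function name `cartesianProduct` is an assumption made for this example:

```kotlin
// Cartesian product of several edge case lists; each result is one combination.
fun <A> cartesianProduct(lists: List<List<A>>): List<List<A>> =
    lists.fold(listOf(emptyList<A>())) { acc, list ->
        // An empty `list` maps every prefix to nothing, so the result becomes empty.
        acc.flatMap { prefix -> list.map { prefix + it } }
    }
```

With `[1,2,3]` and `['a','b','c']` this gives 9 combinations; adding a third, empty list gives none.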

mitch

09/01/2020, 9:46 AM

yeah it's a product after all.. alternatively i'm pretty sure the approach using

Arb.choose

would work. as in, when you do a bind, you'd assign something like weight 1 to the edge cases, and weight 9 to the randomized values
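
A rough sketch of that weighted idea, assuming a hypothetical helper `weightedSample` (not a kotest or scalacheck function). The 1 and 9 defaults are just the weights mentioned above:

```kotlin
import kotlin.random.Random

// Picks an edge case with probability edgeWeight / (edgeWeight + randomWeight),
// otherwise generates a random value. Weights are illustrative only.
fun <A> weightedSample(
    edgecases: List<A>,
    randomValue: (Random) -> A,
    random: Random,
    edgeWeight: Int = 1,
    randomWeight: Int = 9
): A {
    // With no edge cases, the full weight goes to the random value.
    if (edgecases.isEmpty()) return randomValue(random)
    val roll = random.nextInt(edgeWeight + randomWeight)
    return if (roll < edgeWeight) edgecases.random(random) else randomValue(random)
}
```

Note that nothing here guarantees a particular edge case is ever selected; it only becomes likely over many samples.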

mitch

09/01/2020, 9:47 AM

edgecases i feel is one of the most powerful feature in proptesting so that's one thing that i don't want to sacrifice

sam

09/01/2020, 9:48 AM

I'm not sure how many other prop libraries do edge cases like kotest

sam

09/01/2020, 9:50 AM

I think it's acceptable to say, if you're using bind and one of the contributing arbs doesn't provide edge cases, then the product doesn't have edge cases either.

sam

09/01/2020, 9:50 AM

All the "basic" types have edge cases, so most of the time, your bind will too

sam

09/01/2020, 9:51 AM

Another alternative is this - take the permutations of the edge cases and if an arb has zero edge cases, we treat it as an edge case of 1, where that 1 value is random. If all arbs have zero edge cases, then bind has zero edge cases.

sam

09/01/2020, 9:51 AM

[1,2,3]
['a', 'b', 'c']
[] -> we treat this as [random]

sam

09/01/2020, 9:52 AM

That would give you [1, a, random], [2, a, random] and so on
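
A sketch of that alternative, under the assumption that each contributing arb can be represented by its edge case list plus a random generator. `bindEdgecases` and its parameters are hypothetical names for this illustration:

```kotlin
import kotlin.random.Random

// Edge cases of a bind where an arb with no edge cases contributes
// one random value instead of collapsing the whole product.
fun <A> bindEdgecases(
    edgecases: List<List<A>>,
    randoms: List<(Random) -> A>,
    random: Random
): List<List<A>> {
    // If no arb has edge cases, the bind has none either.
    if (edgecases.all { it.isEmpty() }) return emptyList()
    val filled = edgecases.mapIndexed { i, edges ->
        if (edges.isEmpty()) listOf(randoms[i].invoke(random)) else edges
    }
    // Cartesian product over the filled lists.
    return filled.fold(listOf(emptyList<A>())) { acc, list ->
        acc.flatMap { prefix -> list.map { prefix + it } }
    }
}
```

The random value is drawn once per empty arb, so every combination shares it, which is why this needs access to a random source.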

mitch

09/01/2020, 9:53 AM

hang on - don't edgecases not have access to random seed?

sam

09/01/2020, 9:53 AM

Yeah that's a good point

sam

09/01/2020, 9:54 AM

we could introduce it

mitch

09/01/2020, 9:58 AM

Hmm.. we can do that, i'm not entirely sure if that fits with the interface. I kinda like the edgecases being a list right now. So far we have a couple of options:
• cartesian product, with an empty list treated as a randomized list of 1. this needs access to rs
• the scalacheck approach using weighted arbs. no change needed in the interface.

mitch

09/01/2020, 9:58 AM

I might bang my head to the keyboard a few more times for some inspiration.

sam

09/01/2020, 10:02 AM

How does the scalacheck one work, if the arb has no edge cases, then what does the weighting do ?

mitch

09/01/2020, 10:03 AM

it's going to give the full weight to

value(rs)

i suppose

sam

09/01/2020, 10:03 AM

I think the weighting in scalacheck is how they do edgecases. Give a bigger weighting to edge cases. I don't like that because you're not guaranteed to get edge cases, which I feel you should be.

sam

09/01/2020, 10:04 AM

I think any function accepting an int should always be tested with 0 for example. Not just "likely" tested with 0.

mitch

09/01/2020, 10:04 AM

yeah agreed

sam

09/01/2020, 10:04 AM

I also agree with you that I like edgecases to not require a random source.

sam

09/01/2020, 10:05 AM

The only other thing I can think of, is to introduce a subtype of Arb,

ComposedArb

that has extra method(s) for dealing with this kind of thing.

sam

09/01/2020, 10:06 AM

You would require the property test framework to then be aware of the extra value in the ADT

sam

09/01/2020, 10:07 AM

My final suggestion is to have bind accept a random parameter of its own. This would require the user to provide their own random instance, or it could default.

mitch

09/01/2020, 10:11 AM

I'm tempted to experiment with the ComposedArb idea. The final suggestion feels quite leaky as an abstraction for the users.

sam

09/01/2020, 10:12 AM

Have a play with ComposedArb and see what you can come up with. Probably some function that generates edge cases from a random source delegating to the underlying arbs.

❤️ 1

mitch

09/01/2020, 10:12 AM

what's in my head right now is as if

edgecases(): Exhaustive<A>

(conceptually)

sam

09/01/2020, 10:13 AM

I think it would need to stay as List to avoid breakage

mitch

09/01/2020, 10:14 AM

yeah definitely a list - sorry that's just how it felt like in my head.

👍🏻 1

mitch

09/01/2020, 10:14 AM

because exhaustive.toArb does shove the values into a list and randomly choosing one

sam

09/01/2020, 10:15 AM

Right

sam

09/01/2020, 10:16 AM

abstract class ComposedArb<out A> : Gen<A>() {

   // edge cases may draw on the random source here
   abstract fun edgecases(rs: RandomSource): List<A>

   abstract fun value(rs: RandomSource): Sample<A>
}

sam

09/01/2020, 10:16 AM

It doesn't feel quite right, given that it's just an Arb with the rs parameter in edgecases

mitch

09/01/2020, 10:17 AM

indeed 🤔

sam

09/01/2020, 10:17 AM

What about if edgecases was to return an ADT itself.

mitch

09/01/2020, 10:18 AM

something like a

(RandomSource) -> List<A>

kleisli?

sam

09/01/2020, 10:18 AM

sealed class Edgecases<out A> {

   abstract fun values(rs: RandomSource): List<A>

   // fixed list; ignores the random source
   class Static<out A>(val values: List<A>) : Edgecases<A>() {
      override fun values(rs: RandomSource): List<A> = values
   }
   // computed from the random source on demand
   class Dynamic<out A>(val fn: (RandomSource) -> List<A>) : Edgecases<A>() {
      override fun values(rs: RandomSource): List<A> = fn(rs)
   }
}

sam

09/01/2020, 10:18 AM

yea

mitch

09/01/2020, 10:19 AM

ha that's so good

sam

09/01/2020, 10:19 AM

The trick is making that work with the existing signature

sam

09/01/2020, 10:20 AM

Again we could introduce a sibling function,

fun edges(): Edgecases<A>

🎖️ 1

mitch

09/01/2020, 10:21 AM

i think this is less of a user-facing thing, the old builder would still work because it's

_ -> List<A>

sam

09/01/2020, 10:21 AM

abstract class Arb<out A> : Gen<A>() {

   // open with a default, so existing overrides of edgecases() keep working
   open fun edges(): Edgecases<A> = Edgecases.Static(edgecases())

   abstract fun edgecases(): List<A>

   abstract fun values(rs: RandomSource): Sequence<Sample<A>>

   companion object
}

sam

09/01/2020, 10:22 AM

Yes, some people do override Arb directly though. I also completely rewrote the prop test framework for 4.0.0 so I'm loath to introduce any breaking change now, no matter how small. Best to have stability for users who spent the time converting from 3.x to 4.x

mitch

09/01/2020, 10:22 AM

ah yeah me dumb, didn't consider peeps who override edgecases directly

sam

09/01/2020, 10:23 AM

But that works, if you override edgecases, you can continue to do so. Or you can override edges().
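
The backward-compatibility point can be sketched as follows. All types here (`RandomSource`, `Edgecases`, `SketchArb`, and the two example arbs) are simplified stand-ins assumed for illustration, not the real kotest classes:

```kotlin
import kotlin.random.Random

// Hypothetical stand-ins for this sketch.
class RandomSource(val random: Random)

sealed class Edgecases<out A> {
    abstract fun values(rs: RandomSource): List<A>
    class Static<out A>(private val fixed: List<A>) : Edgecases<A>() {
        override fun values(rs: RandomSource): List<A> = fixed
    }
    class Dynamic<out A>(private val fn: (RandomSource) -> List<A>) : Edgecases<A>() {
        override fun values(rs: RandomSource): List<A> = fn(rs)
    }
}

abstract class SketchArb<out A> {
    // Default delegates to the existing list-based function.
    open fun edges(): Edgecases<A> = Edgecases.Static(edgecases())
    abstract fun edgecases(): List<A>
}

// Existing style: only edgecases() is overridden, edges() works unchanged.
class LegacyIntArb : SketchArb<Int>() {
    override fun edgecases(): List<Int> = listOf(0, Int.MAX_VALUE, Int.MIN_VALUE)
}

// New style: edges() is overridden to use the random source.
class RandomEdgeArb : SketchArb<Int>() {
    override fun edgecases(): List<Int> = emptyList()
    override fun edges(): Edgecases<Int> =
        Edgecases.Dynamic { rs -> listOf(rs.random.nextInt()) }
}
```

The framework would call only `edges()`, so both styles go through the same path.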

mitch

09/01/2020, 10:23 AM

users who spent the time converting from 3.x to 4.x

i can relate lol

😂 1

sam

09/01/2020, 10:23 AM

I think I prefer this to composed arb

✅ 1

sam

09/01/2020, 10:23 AM

I don't like the terminology of static and dynamic, so better names perhaps

sam

09/01/2020, 10:24 AM

Maybe we don't even bother with the ADT,

typealias Edgecases<A> = (RandomSource) -> List<A>

mitch

09/01/2020, 10:27 AM

or simply a data class for now:

data class Edgecases<A>(val edges: (RandomSource) -> List<A>)

i'll get something ready,

sam

09/01/2020, 10:27 AM

Although, arguing with myself, there's not much difference between returning a function that accepts a random source and just passing random source into the function

sam

09/01/2020, 10:28 AM

Let's take the Sequence -> Sample problem first, and leave bind as it is for now. Then we can introduce edge cases as a second PR.

✔️ 1

sam

09/01/2020, 10:29 AM

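sealed class Edgecases<out A> {
   abstract fun values(rs: RandomSource): kotlin.collections.List<A>
   // the nested name List shadows kotlin.collections.List in here, hence the qualified names
   class List<out A>(val values: kotlin.collections.List<A>) : Edgecases<A>() {
      override fun values(rs: RandomSource): kotlin.collections.List<A> = values
   }
   class Randomized<out A>(val fn: (RandomSource) -> kotlin.collections.List<A>) : Edgecases<A>() {
      override fun values(rs: RandomSource): kotlin.collections.List<A> = fn(rs)
   }
}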

mitch

09/01/2020, 10:32 AM

awesome! yeah personally i like having something that's typed (like

Sample<A>

), and in this case

Edgecases<A>

makes it easier to enrich and refactor in the future.

👍🏻 1

sam

09/01/2020, 10:34 AM

Ok sounds like a plan

mitch

09/01/2020, 10:34 AM

thanks @sam!! that's awesome, let me prep something for you

👍🏻 1

mitch

09/04/2020, 11:23 AM

@sam phew - i suppose i'll check this in first - the first PR to introduce single emission is up. https://github.com/kotest/kotest/pull/1688

mitch

09/04/2020, 11:26 AM

I refrained from changing all the arbs in kotest-property. I suppose i'd need to focus on backward compatibility (and given that we haven't introduced

Edgecases<A>

yet)

mitch

09/06/2020, 6:28 AM

@sam qq - i'm currently doing some experiment on bind with

Edgecases<A>

i'm not entirely convinced with the cartesian product approach, as the size of minimum iterations can explode with the number of edgecase combinations. i.e.

a - [1, 2, 3]
b - [1, 2, 3, 4]
c - [1, 2]
d - [1, 2, 3, 4, 5]
e - [] - randomize 1 element

if we were to compute the product we'll end up with 3 * 4 * 2 * 5 * 1 = 120 minimum iterations. currently Kotest takes the edgecases linearly due to generate(rs) and iterator.
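
The arithmetic above can be written out as a small sketch. `minimumIterations` is a hypothetical helper name, and it assumes an empty list counts as one randomized element:

```kotlin
// Number of combinations in the cartesian product of edge case lists,
// counting an empty list as a single randomized element.
fun minimumIterations(edgecaseSizes: List<Int>): Long =
    edgecaseSizes.fold(1L) { acc, size -> acc * maxOf(size, 1) }

// minimumIterations(listOf(3, 4, 2, 5, 0)) == 120L
```

Each extra arb multiplies the count, which is where the explosion comes from.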

mitch

09/06/2020, 6:31 AM

an alternative would be to follow what kotest currently does (linearly take them), but then the combo between a,b,c,d, and e will be very deterministic:

(a, b, c, d, e)
(1, 1, 1, 1, r)
(2, 2, 2, 2, r)
(3, 3, r, 3, r)
(r, 4, r, 4, r)
(r, r, r, 5, r)
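
A sketch of the linear approach in the table above, using `null` as a placeholder for `r` (a value to be filled in randomly). `linearEdgecases` is an assumed name for illustration:

```kotlin
// Row i takes the i-th edge case of each list, or null (standing in for "r",
// a random value) once that list is exhausted.
fun <A> linearEdgecases(lists: List<List<A>>): List<List<A?>> {
    val rows = lists.fold(0) { acc, list -> maxOf(acc, list.size) }
    return (0 until rows).map { i -> lists.map { it.getOrNull(i) } }
}
```

The row count equals the longest edge case list (5 here) rather than the product of all sizes.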

mitch

09/06/2020, 6:40 AM

I start to contemplate about this:

Do you mean because of how to handle edgecases? If so, I would just ignore edge cases. They don't really make much sense once you get into composition

questioning what's the correct thing to do in

bind

and

flatMap

case..

mitch

09/06/2020, 6:57 AM

we can somewhat shuffle the list first (because we have access to random seed). However, in terms of flatMap the size of the list is still a problem. we can randomly pick candidates out of the edgecases, but that's going to be similar to the scalacheck approach using frequency 🤔

sam

09/06/2020, 5:03 PM

Tricky questions. The point of edgecases is to make some parts of the selection of values deterministic. Like I believe it's important to always test (Int) -> T with 0 for example. Most (all?) other frameworks are not going to guarantee you'll get the 0 every time.

sam

09/06/2020, 5:05 PM

So it appears our options are:
a) combinatorial explosion
b) linear select losing some combinations
c) randomize the edge cases

sam

09/06/2020, 5:06 PM

a) is clearly the best option if it would work. I prefer b over c but other people may have a different opinion.

sam

09/06/2020, 5:09 PM

The framework will raise an error if we try to use an arb that generates more edge cases than the iteration count of our prop test. So if you have 2500 edge cases, and you try to use that arb in a forAll that only has 1000 iterations, it errors.
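
As a rough illustration of that check (not kotest's actual code; `requireEnoughIterations` and the message text are made up for this sketch):

```kotlin
// Fails fast when the edge cases alone would not fit in the configured iterations.
fun requireEnoughIterations(edgecaseCount: Int, iterations: Int) {
    require(edgecaseCount <= iterations) {
        "Need at least $edgecaseCount iterations to cover all edge cases, but only $iterations configured"
    }
}
```

With 2500 edge cases and 1000 iterations this throws an `IllegalArgumentException`; with 120 edge cases it passes.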

sam

09/06/2020, 5:10 PM

So that puts the focus on the user to not use bind in situations where you will end up with millions of edge cases (or you increase your iteration count appropriately).

sam

09/06/2020, 5:10 PM

An interesting function to add would be

Arb.randomOnly()

which returns a copy of the arb but with the edge cases removed (that randomOnly name is a bit lame though) and Arb.withEdgeCases to copy the arb with different edge cases.

sam

09/06/2020, 5:11 PM

Then you could use bind, and drop the edge cases from either one or more of the inputs ,or all them from the result.
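
A minimal sketch of those two copy functions on a simplified stand-in type. `SketchArb` is hypothetical, and the method names simply follow the working names floated above:

```kotlin
import kotlin.random.Random

// Simplified stand-in: an arb is an edge case list plus a random generator.
class SketchArb<A>(val edgecases: List<A>, val sample: (Random) -> A) {
    // Copy with the edge cases removed; random generation is untouched.
    fun randomOnly(): SketchArb<A> = SketchArb(emptyList(), sample)

    // Copy with a different set of edge cases.
    fun withEdgeCases(edges: List<A>): SketchArb<A> = SketchArb(edges, sample)
}
```

For example, `ints.randomOnly()` could be passed into a bind so that it no longer multiplies the number of combinations.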

mitch

09/07/2020, 12:34 AM

we can even offload those decisions to the user at execution. With the single-emission and new edgecases model that supports random seed, it's possible to encode both behaviours at time of generation. I'm thinking out loud here, but it might look like this:

fun generate(
    rs: RandomSource,
    edgesSamplingMode: Edgecases.SamplingMode = Edgecases.SamplingMode.Exhaustive
) = TODO("implementation detail")

// i'm making stuff up here (this would be nested inside Edgecases)
sealed class SamplingMode {
    object Exhaustive : SamplingMode() // all exhaustive permutations

    // if we want to dirty our generated distribution with edgecases with fixed probability
    data class FixedProbabilitySampling(val samplingProbability: Double = 0.1) : SamplingMode()

    // if we want more control over the ratio of the distribution
    data class DynamicProbabilitySampling(
        val startProbability: Double = 1.0,
        val targetProbability: Double = 0.1,
        val decayFunction: (previousProbability: Double, iterations: Int) -> Double
    ) : SamplingMode()
}

mitch

09/07/2020, 12:40 AM

for these options the advantage of shuffling is so that we can test all edgecases combination uniformly, i.e.

(a, b, c, d, e)
(1, r, 1, 3, r)
(2, 2, 3, 2, r)
(3, 1, r, r, r)
(r, 4, 3, 1, r)
(r, 3, r, 5, r)

It's not as powerful as the exhaustive combinations, because it gives only an approximation of it. If things fail, dev will be presented with the random seed which they can use. I won't be surprised if this is a nice compromise for a system with many inputs.

sam

09/07/2020, 12:55 AM

It's a solution but it does go against the concept of edgecases a bit

mitch

09/07/2020, 3:32 AM

yeah i see your perspective - which i also strongly agree with. (I just happen to have been dealing with some developers who have some perspective around this). I think I like this approach. Users who want to transform their edgecases into more of a probabilistic Arbitrary can do that instead. Meanwhile I believe Kotest should do what it should do, i.e. getting all the edgecases permutations.

An interesting function to add would be
Arb.randomOnly()
which returns a copy of the arb but with the edge cases removed (that randomOnly name is a bit lame though) and Arb.withEdgeCases to copy the arb with different edge cases.

mitch

09/07/2020, 3:34 AM

I believe I'll close #1646 and raise a new issue for this

sam

09/07/2020, 4:07 AM

Alright cool

sam

09/07/2020, 4:07 AM

we're making great improvements with this work

sam

09/07/2020, 4:07 AM

or you are

mitch

09/07/2020, 8:00 AM

Thanks to you! I like the framework. And my secret agenda is to make Kotest the go-to standard for testing Kotlin projects in my company. 😉

mitch

09/07/2020, 8:02 AM

it’s by far superior to every other ones

❤️ 1

mitch

09/07/2020, 8:04 AM

and to me, the major selling point is definitely the property testing bit

💯 1
