I am working on overhauling our property testing as part of the upcoming 4.0 release. To this end, I have brought together the requirements (based off tickets existing in this tracker), and come up with a basic design. I would like feedback on this design before I fully implement it. There are also some questions that I don't have an answer to yet that I would like to discuss. At this stage everything is open to change.
Property Test Requirements
Deterministic Re-runs
If a test fails it is useful to be able to re-run it with the same values, especially in cases where shrinking is not available. Therefore, the test functions accept a seed value which is used to create the Random instance used by the tests. This seed can then be set programmatically to re-run the tests with the same random instance.
By default the seed is null, which means the seed changes each time the test is run.
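To make the seeding behaviour concrete, a minimal sketch (the propertyTest name and the fixed 100-iteration loop are illustrative, not the proposed API):

```kotlin
import kotlin.random.Random

// Illustrative only: a null seed gives a fresh Random each run, while a
// non-null seed reproduces the exact same sequence of generated values.
fun <T> propertyTest(seed: Long?, gen: (Random) -> T, property: (T) -> Boolean): Boolean {
    val random = if (seed == null) Random(Random.nextLong()) else Random(seed)
    return (1..100).all { property(gen(random)) }
}
```

Two runs constructed from the same seed walk through identical values, which is what makes a failure reproducible.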
Exhaustive
The generators are passed an Exhaustivity enum value which determines the behavior of generated values.
- Random - all values should be randomly generated
- Exhaustive - every value should be generated at least once. If, for the given iteration count, all values cannot be generated, then the test should fail immediately.
- Auto - Exhaustive if possible and supported, otherwise random.
By default Auto mode is used.
Question - do we want to be able to specify exhaustivity per parameter?
In #1101 @EarthCitizen talks about Series vs Gen. I do like the nomenclature, but we would need another abstraction (Arbitrary?) on top which would then be able to provide a Gen or Series as required, based on the exhaustive flag.
Question - do we want to implement it this way, as opposed to the way I have outlined in the code below?
Min and Max Successes
These values determine bounds on how many tests should pass. Typically min and max success would be equal to the iteration count, which gives the forAll behavior. For forNone behavior, min and max would both be zero. Other values can be used to mimic behavior like forSome, forExactly(n), and so on.
By default, min and max success are set to the iteration count.
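A sketch of how the bounds could gate the overall result (the function name is illustrative):

```kotlin
// Illustrative: a run passes when the number of successful iterations falls
// within [minSuccess, maxSuccess].
//   forAll        -> minSuccess = maxSuccess = iterations
//   forNone       -> minSuccess = maxSuccess = 0
//   forExactly(n) -> minSuccess = maxSuccess = n
fun passes(successes: Int, minSuccess: Int, maxSuccess: Int): Boolean =
    successes in minSuccess..maxSuccess
```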
Distribution
It is quite common to want to generate values across a large number space, but with a bias towards certain values. For example, when testing a function that accepts emails, it is probably more useful to generate many short strings and fewer long ones, since most emails are under 50 characters.
The distribution mode can be used to bias values by setting the bound from which each test value is generated.
- Uniform - values are distributed evenly across the space. For an integer generator of values from 1 to 1000 with 10 runs, a random value would be generated from 1..100, another from 101..200, and so on.
- Pareto - values are biased towards the lower end on a roughly 80/20 rule.
By default the uniform distribution is used.
The distribution mode may be ignored by a generator if it has no meaning for the types produced by that generator.
The distribution mode has no effect if the generator is acting in exhaustive mode.
Question - it would be nice to be able to specify specific "biases" when using specific generators. For example, a generator of A-Z chars may choose to bias towards vowels. How to specify this when distribution is a sealed type? Use an interface and allow generators to create their own implementations?
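As a worked example of the uniform split described above (the exact bucket arithmetic here is my assumption of the intent; note the -1 so adjacent buckets don't overlap):

```kotlin
// Split a range into `iterations` equal buckets; iteration k draws from bucket k.
// For 1..1000 with 10 iterations this yields 1..100, 101..200, ..., 901..1000.
fun uniformBucket(k: Int, iterations: Int, range: LongRange): LongRange {
    val step = (range.last - range.first + 1) / iterations
    return (range.first + step * k)..(range.first + step * (k + 1) - 1)
}
```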
Shrinking Mode
The ShrinkingMode determines how failing values are shrunk.
- Off - Shrinking is disabled for this generator
- Unbounded - shrinking will continue until the minimum case is reached as determined by the generator
- Bounded(n) - the number of shrink steps is capped at n. After this, the shrinking process will halt, even if the minimum case has not been reached. This mode is useful to avoid long running tests.
By default shrinking is set to Bounded(1000).
Question 1 - do we want to be able to control shrinking per parameter? Turn it off for some parameters, and not others?
When mapping on a generator, shrinking becomes tricky.
If you have a mapper from Gen<T> to Gen<U> and a value u fails, you need to turn that u back into a t so you can feed that t into the original shrinker. So you can either keep the association between the original value and the mapped value, or precompute the shrinks (lazily?) along with the value.
Question 2 - which is the best approach?
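One way the bounded shrinking loop could look. This is only a sketch: the candidates and fails parameters are hypothetical stand-ins for the generator's shrinks and the failing property, and ShrinkingMode is repeated from the code below for self-containedness:

```kotlin
sealed class ShrinkingMode {
    object Off : ShrinkingMode()
    object Unbounded : ShrinkingMode()
    data class Bounded(val bound: Int) : ShrinkingMode()
}

// Repeatedly replace the failing value with the first smaller candidate that
// still fails the property, stopping when the mode's step budget is spent or
// no candidate reproduces the failure.
fun <T> shrink(initial: T, candidates: (T) -> List<T>, fails: (T) -> Boolean, mode: ShrinkingMode): T {
    if (mode is ShrinkingMode.Off) return initial
    var current = initial
    var steps = 0
    while (true) {
        if (mode is ShrinkingMode.Bounded && steps >= mode.bound) return current
        val next = candidates(current).firstOrNull(fails) ?: return current
        current = next
        steps++
    }
}
```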
Gen design
The gens accept the Random instance used for this run.
They accept an iterations parameter so they know the sample space when calculating based on a distribution.
They accept the exhaustivity mode and the distribution mode.
Question - move the iteration count into the distribution parameter itself?
Note that the gens no longer specify a shrinker, but should provide the shrinks along with the value (see shrinker section for discussion).
interface Gen<T> {
   fun edgecases(): Iterable<T>
   fun generate(
      random: Random,
      iterations: Int,
      exhaustivity: Exhaustivity = Exhaustivity.Auto,
      distribution: Distribution
   ): Sequence<Pair<T, Sequence<T>>>
   companion object
}
fun Gen.Companion.int(lower: Int, upper: Int) = object : Gen<Int> {
   private val literals = listOf(Int.MIN_VALUE, Int.MAX_VALUE, 0)
   override fun edgecases(): Iterable<Int> = literals
   override fun generate(
      random: Random,
      iterations: Int,
      exhaustivity: Exhaustivity,
      distribution: Distribution
   ): Sequence<Pair<Int, Sequence<Int>>> {
      val randomized = infiniteSequence { k ->
         val range = distribution.get(k, iterations, lower.toLong()..upper.toLong())
         random.nextLong(range).toInt()
      }
      val exhaustive = generateInfiniteSequence {
         require(iterations <= upper - lower)
         (lower..upper).iterator().asSequence()
      }.flatten()
      val seq = when (exhaustivity) {
         Exhaustivity.Auto -> when {
            iterations <= upper - lower -> exhaustive
            else -> randomized
         }
         Exhaustivity.Random -> randomized
         Exhaustivity.Exhaustive -> exhaustive
      }
      // lazily pair each value with its shrinks, here derived from the existing IntShrinker
      return seq.map { Pair(it, IntShrinker.shrinks(it).asSequence()) }
   }
}
fun <T, U> Gen<T>.map(f: (T) -> U): Gen<U> {
   val outer = this
   return object : Gen<U> {
      override fun edgecases(): Iterable<U> = outer.edgecases().map(f)
      override fun generate(
         random: Random,
         iterations: Int,
         exhaustivity: Exhaustivity,
         distribution: Distribution
      ): Sequence<Pair<U, Sequence<U>>> =
         outer.generate(random, iterations, exhaustivity, distribution)
            .map { (value, shrinks) ->
               Pair(f(value), shrinks.map { f(it) })
            }
   }
}
sealed class Distribution {
   abstract fun get(k: Int, iterations: Int, range: LongRange): LongRange
   object Uniform : Distribution() {
      override fun get(k: Int, iterations: Int, range: LongRange): LongRange {
         val step = (range.last - range.first) / iterations
         // offset by range.first so the buckets cover the actual range
         return (range.first + step * k)..(range.first + step * (k + 1))
      }
   }
   object Pareto : Distribution() {
      override fun get(k: Int, iterations: Int, range: LongRange): LongRange {
         val step = (range.last - range.first) / iterations
         return range.first..(range.first + step * k + 1)
      }
   }
}
sealed class Exhaustivity {
   object Auto : Exhaustivity()
   object Random : Exhaustivity()
   object Exhaustive : Exhaustivity()
}
sealed class ShrinkingMode {
   object Off : ShrinkingMode()
   object Unbounded : ShrinkingMode()
   data class Bounded(val bound: Int) : ShrinkingMode()
}
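To illustrate how the pieces are intended to fit together, a stripped-down end-to-end sketch. Everything here is a stand-in, not the proposed API: the generator emits (value, shrinks) pairs as described above, and the runner enforces the min/max success bounds:

```kotlin
import kotlin.random.Random

// Stand-in generator: emits (value, shrinks) pairs, with naive lazily-computed
// shrinks that repeatedly halve towards zero (sketch only).
fun intGen(random: Random, lower: Int, upper: Int): Sequence<Pair<Int, Sequence<Int>>> =
    generateSequence {
        val value = random.nextInt(lower, upper + 1)
        Pair(value, generateSequence(value / 2) { if (it > 0) it / 2 else null })
    }

// Stand-in runner: a run passes when the number of successful iterations
// falls within [minSuccess, maxSuccess].
fun runProperty(
    iterations: Int,
    minSuccess: Int,
    maxSuccess: Int,
    gen: Sequence<Pair<Int, Sequence<Int>>>,
    property: (Int) -> Boolean
): Boolean {
    val successes = gen.take(iterations).count { (value, _) -> property(value) }
    return successes in minSuccess..maxSuccess
}
```

With min = max = iterations this behaves like forAll, and with min = max = 0 like forNone.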