You know the most important quality of a good unit test? It has to be incredibly easy to run.
–Onorio Catenacci
Transparency is a close second. It should be obvious what the unit test tests, how it tests it, and, when it fails, how it failed and how to reproduce the failure.
FsCheck is the F# implementation of the well-known Haskell QuickCheck test combinator library (also available for many other languages). This article demonstrates some of the tools and techniques available in FsCheck, not only to make your tests transparent, but also to make FsCheck itself transparent.
(The rest of this article assumes some familiarity with FsCheck. If it is new to you, or you need to brush up on it, FsCheck – Breaking Your Code in New and Exciting Ways is an excellent place to start, and don’t forget the FsCheck documentation.)
Choose your Check
The examples in this article come from tests of new functions in FSharpx.Collections.Vector to support Vector<Vector<‘T>>. Let’s look at one test implementing familiar property-based test techniques.
[<Test>]
let WindowedTest() =
    let testWindowed =
        gen { let! windowLength = Gen.choose(1,5)
              let! source = Arb.generate<List<int>>
              return ((windowSeq windowLength source), (windowLength, source))
            }

    Check.QuickThrowOnFailure (Prop.forAll (Arb.fromGen testWindowed)
        (fun (vOfV, (windowLength, source)) ->
            let outerLength =
                if source.Length = 0 then 1
                else int (Math.Ceiling((float)source.Length/(float)windowLength))

            (outerLength = vOfV.Length &&
             flatten vOfV |> List.ofSeq = source)
            |> Prop.classify (source.Length > 0 && outerLength > 0) "windowLength, outerLength"
            |> Prop.classify (source.Length = 0) "empty"
            |> Prop.collect (windowLength, outerLength)
        )
    )
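The property relies on a small flatten helper from the test code. As a hedged sketch of what such a helper does (assuming, as the output further down suggests, that Vector<'T> enumerates as a seq<'T>), it simply collapses the inner vectors back into one sequence so the result can be compared with the source list:

// Sketch only, not the test project's actual helper.
let flattenSketch (vOfV: Vector<Vector<'T>>) : seq<'T> =
    seq { for inner in vOfV do
            for elem in inner do
                yield elem }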
My go-to form of FsCheck checking is usually a derivative of Check.Quick, which checks a single property with the default check configuration. Using the NUnit external runner on a compiled test project, as I usually do, calls for the Check.QuickThrowOnFailure method; otherwise the runner will leave your test green-lit even though it is reported as falsifiable in the Text Output tab.
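To see the difference in isolation, here is a throwaway sketch; the addition property is purely illustrative and has nothing to do with the Vector tests.

// Check.Quick only writes its verdict to the output; under an external
// runner a counterexample will not fail the test.
Check.Quick (fun (a: int) (b: int) -> a + b = b + a)

// Check.QuickThrowOnFailure throws on a counterexample, so the runner
// marks the test red.
Check.QuickThrowOnFailure (fun (a: int) (b: int) -> a + b = b + a)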
Check a falsification
Let's look at what FsCheck returns upon falsification by slipping a deliberate failure into the property under test:
(outerLength = vOfV.Length && true = false &&
which has the desired result of displaying the generated data that failed:
*** VectorTest.WindowedTest
Falsifiable, after 1 test (0 shrinks) (StdGen (1338874294,295749962)):
(seq [seq [1]], (3, [1]))
This test data came from the return of the gen {...} builder above. To read it, it helps to understand the function under test:
Returns a vector of vectors of given length from the seq. Result may be a jagged vector.
windowSeq : int -> seq<'T> -> Vector<Vector<'T>>
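To make that contract concrete, here is a hedged sketch of a function with the same shape. It is not the FSharpx implementation, and it assumes Vector.ofSeq, Vector.conj, and Vector.empty from FSharpx.Collections plus Seq.chunkBySize from a recent FSharp.Core.

// Sketch only: chunk the source into windows of windowLength and build the
// nested vector. An empty source yields one empty inner vector, which is
// why the test expects outerLength = 1 in that case.
let windowSeqSketch (windowLength: int) (source: seq<'T>) : Vector<Vector<'T>> =
    if Seq.isEmpty source then
        Vector.empty |> Vector.conj Vector.empty
    else
        source
        |> Seq.chunkBySize windowLength
        |> Seq.map (Array.toSeq >> Vector.ofSeq)
        |> Vector.ofSeq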
The first member of the outer tuple of test data is a Vector<Vector<'T>> produced from the parameter data in the second member, a tuple of window length and a source list. (FsCheck does not know about Vector<Vector<'T>>, so it prints it as seq [seq []].)
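The StdGen seed in that report is what makes the failure reproducible. Assuming the FsCheck version in use exposes Replay on its Config record (the API has moved around between versions), the failing run can be replayed exactly; windowedProperty below is a placeholder for the Prop.forAll property built inside WindowedTest.

// Hedged sketch: feed the printed seed back in through a config.
let replayConfig =
    { Config.QuickThrowOnFailure with
        Replay = Some (Random.StdGen (1338874294, 295749962)) }
Check.One (replayConfig, windowedProperty)   // windowedProperty is a placeholder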
Classify your input
Once we are satisfied with how FsCheck reports generated data, let's display more information about the range of the data it generates. Otherwise, upon success, FsCheck provides only this happy but otherwise unsatisfying report:
*** VectorTest.WindowedTest
Ok, passed 100 tests.
This is where classify and collect come in, allowing us to categorize the input and satisfy ourselves that its range is reasonable:
*** VectorTest.WindowedTest
Ok, passed 100 tests.
6% (5, 2), windowLength, outerLength.
6% (5, 1), windowLength, outerLength.
5% (5, 4), windowLength, outerLength.
5% (3, 1), windowLength, outerLength.
4% (5, 1), empty.
4% (4, 1), windowLength, outerLength.
...
1% (1, 11), windowLength, outerLength.
1% (1, 1), windowLength, outerLength.
1% (1, 1), empty.
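For reference, classify and collect compose onto any boolean property. A throwaway sketch, unrelated to the Vector tests: Prop.classify takes a predicate and a label, while Prop.collect records the generated value itself.

Check.Quick (fun (xs: int list) ->
    (List.rev (List.rev xs) = xs)
    |> Prop.classify xs.IsEmpty "empty"
    |> Prop.classify (xs.Length > 5) "length > 5"
    |> Prop.collect xs.Length)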
Verbosely putting it all together
Digging further, we can get FsCheck to report the data of every generated test case using Check.Verbose.
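As with the quick checks, there is a throw-on-failure variant for external runners. A throwaway sketch of both calls (the sorting property is just an illustration):

// Both variants print every generated case before the summary;
// only the ThrowOnFailure one will turn an NUnit test red.
Check.Verbose (fun (xs: int list) -> List.sort (List.sort xs) = List.sort xs)
Check.VerboseThrowOnFailure (fun (xs: int list) -> List.sort (List.sort xs) = List.sort xs)

Switching the WindowedTest over to a verbose check produces a run like this: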
*** VectorTest.WindowedTest
0:
(seq [seq []], (4, []))
1:
(seq [seq [-1]; seq [2]; seq [2]; seq [-2]], (1, [-1; 2; 2; -2]))
2:
(seq [seq [1; -3]; seq [-3]], (2, [1; -3; -3]))
3:
(seq [seq []], (5, []))
...
99:
(seq
[seq [-33; 8; 48]; seq [-77; -31; 10]; seq [50; -75; -29]; seq [12; 58; -69];
...],
(3,
[-33; 8; 48; -77; -31; 10; 50; -75; -29; 12; 58; -69; -27; 14; 60; -67; -32;
31; 4; -10; -27; 60; -21; -54; 39; 9; -68; -9; -11; 83; 11; 0; -43; 60; 39;
80; -41; -1; 41; 82; -39; 1; 43; -83; -37; 4; 44; -81; -35; 6; 46; -79; -33;
8; 48; -77; -31; 10; -37; -64; -22; 24; 65; -62; -20; 26; 67; -54; -18; 28;
69; -52; -12; 30; 71; -50; -10; 32; 73; -48; -7; 34; 75; -46]))
Ok, passed 100 tests.
8% (3, 1), windowLength, outerLength.
...
Model-based checking by Command
You now have full command of property-based tests. There is another testing paradigm available within FsCheck: progressing the object under test from one state to another by means of "commands" and checking its state against an expected model. Kurt Schelfthout shows us how to use this technique at the end of the FSharpx.Collections.Deque tests.
I'm not going to fully explain how this technique works; you can read about it here and study the Deque and Vector test examples. Instead I want to focus on transparency in stateful testing.
Out of the box a successful test gives us output like this:
*** VectorTest.Grow Vector<Vector<'T>>, check by flatten
Ok, passed 100 tests.
73% long sequnecs (>6 commands).
20% short sequences (between 1-6 commands).
1% trivial.
FsCheck.Commands.asProperty already provides statistics on the range of our generated tests, but the range is over something generated like this:
let ``Grow, check by flatten`` =
    [conjInner1Elem(checkFlatten); conjInnerEmpty(checkFlatten); appendInnerMulti(checkFlatten)]

[<Test>]
let ``Grow Vector<Vector<'T>>, check by flatten``() =
    Check.QuickThrowOnFailure (asProperty (specVofV ``Grow, check by flatten``))
The sequences referred to in the output are sequences of generated commands, each potentially altering the previous state of an object under test. This opens up two possibilities:
1) using commands as primitives in generating objects under test, and
2) testing multiple features in a single test.
I took advantage of both these possibilities to test the remaining new functions for Vector<Vector<'T>>.
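As a purely illustrative sketch (not part of the FSharpx tests), here is the shape of a single command on a toy system, an int "actual" mirrored by an int model, using the same ICommand<'Actual,'Model> members the Vector commands use:

// Toy command: advance the actual value and the model the same way,
// let Pre gate when the command may run, and compare them in Post.
let increment =
    Gen.constant <|
        { new ICommand<int,int>() with
            member x.RunActual c = c + 1
            member x.RunModel m = m + 1
            member x.Pre m = true
            member x.Post (c, m) = c = m
            override x.ToString() = "increment" }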
Commands verbosely
So what do generated tests look like? For that, let's turn again to Check.Verbose:
*** VectorTest.Grow Vector<Vector<'T>>, check by flatten
0:
[conjInner1Elem: elem = 2; conjInner1Elem: elem = -1]
1:
[conjInner1Elem: elem = 1; conjInnerEmpty; conjInnerEmpty]
2:
[conjInnerEmpty]
3:
[]
4:
[conjInnerEmpty; conjInner1Elem: elem = 0;
appendInnerMulti: elems = [-6; -2; 5; 6; -3; -2; 5];
appendInnerMulti: elems = [-2; 5; 6]; appendInnerMulti: elems = [-6; 1; -5];
conjInnerEmpty]
...
Ok, passed 100 tests.
75% long sequnecs (>6 commands).
16% short sequences (between 1-6 commands).
4% trivial.
It is necessary to override ToString() in the commands you write and provide as much information as needed; otherwise the record of generated test cases will be considerably less informative:
*** VectorTest.Grow Vector
Falsified command
So what output does a falsified command return? To make it interesting, we will plant a little time-bomb. The Pre member will prevent execution of this command until the other commands have generated an object of sufficient length. Then negating the check performed in the Post member with not will cause falsification.
let conjInnerEmpty check =
    Gen.constant <|
        { new ICommand<Vector2Actual,VectorModel>() with
            member x.RunActual c = c |> conj empty
            member x.RunModel m = m
            member x.Pre m = (length m) > 0              // wait until earlier commands have grown the structure
            member x.Post (c,m) = not (check (c,m))      // the planted time-bomb: negate the check
            override x.ToString() = sprintf "conjInnerEmpty" }
Resulting in this falsification output:
VectorTest.Grow Vector<Vector<'T>>, check by flatten:
System.Exception : Falsifiable, after 6 tests (2 shrinks) (StdGen (1351373830,295750028)):
[appendInnerMulti: elems = [0; -4; 0; -3]; conjInnerEmpty]
Conclusion
All in all, FsCheck is a powerful tool for unit test generation that provides full visibility into generation, execution, and repeatability. All of the practices I've outlined (verbose checking, classifying and collecting, and intentional falsification) are good exercises to run through every time you write a new test.