Gaining FsCheck Fluency through Transparency

You know the most important quality of a good unit test? It has to be incredibly easy to run.
–Onorio Catenacci

Transparency is a close second. It should be obvious what the unit test tests, how it tests it, and, when it fails, how it failed and how to reproduce the failure.

FsCheck is the F# implementation of the well-known Haskell QuickCheck test combinator library (also available for many other languages). This article demonstrates some of the tools and techniques FsCheck offers to make not only your tests transparent, but FsCheck itself.

(The rest of this article assumes some familiarity with FsCheck. If it is new to you, or you need to brush up on it, FsCheck – Breaking Your Code in New and Exciting Ways is an excellent place to start, and don’t forget the FsCheck documentation.)

Choose your Check

The examples in this article come from tests of new functions in FSharpx.Collections.Vector to support Vector<Vector<'T>>. Let’s look at one test implementing familiar property-based test techniques.

[<Test>]
let WindowedTest() =
    let testWindowed = 
        gen { let! windowLength = Gen.choose(1,5)
              let! source = Arb.generate<List<int>>
              return ((windowSeq windowLength source), (windowLength, source))
        }

    Check.QuickThrowOnFailure (Prop.forAll (Arb.fromGen testWindowed)
        (fun (vOfV, (windowLength, source)) -> 
            let outerLength =
                if source.Length = 0 then 1
                else int (Math.Ceiling(float source.Length / float windowLength))
            (outerLength = vOfV.Length &&
                flatten vOfV |> List.ofSeq = source)
            |> Prop.classify (source.Length > 0 && outerLength > 0) "windowLength, outerLength"
            |> Prop.classify (source.Length = 0) "empty"
            |> Prop.collect (windowLength, outerLength)
        )
    )

My go-to form of FsCheck checking is usually a derivative of Check.Quick, which checks a single property with the default configuration. Running a compiled test project through the NUnit external runner, as I usually do, calls for Check.QuickThrowOnFailure; otherwise the runner leaves your test green even though the Text Output tab reports it as falsifiable.
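The difference between the two checks can be seen in a minimal sketch. It assumes FsCheck is already referenced by the project; reverseIsIdentity is a hypothetical, deliberately false property that is not part of the Vector tests.

```fsharp
open FsCheck

// Hypothetical property that is deliberately false: reversing a list
// only leaves it unchanged when the list is a palindrome.
let reverseIsIdentity (xs: int list) = List.rev xs = xs

// Check.Quick prints the "Falsifiable" report but returns normally,
// so a test runner has nothing to turn the test red.
Check.Quick reverseIsIdentity

// Check.QuickThrowOnFailure raises an exception on falsification,
// which is what a test runner reports as a failed test.
let threw =
    try
        Check.QuickThrowOnFailure reverseIsIdentity
        false
    with _ -> true

printfn "QuickThrowOnFailure threw: %b" threw
```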

Check a falsification

Let's look at what FsCheck returns upon falsification by slipping a falsification into the property under test:

(outerLength = vOfV.Length && true = false &&

which has the desired result of displaying the generated data that failed:

*** VectorTest.WindowedTest
Falsifiable, after 1 test (0 shrinks) (StdGen (1338874294,295749962)):
(seq [seq [1]], (3, [1]))

This test data came from the return of the gen {...} GenBuilder above. To read it, it helps to understand the function under test:

Returns a vector of vectors of given length from the seq. Result may be a jagged vector.
windowSeq : int  -> seq<'T> -> Vector<Vector<'T>>

The first member of the outer tuple of test data is a Vector<Vector<'T>> produced from the parameter data in the second member, a tuple of window length and a source list. (FsCheck does not know about Vector<Vector<'T>>, so it prints it as seq [seq []].)
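To make the shape of that data concrete, here is a rough list-based stand-in for windowSeq. The names windowList, expectedOuterLength and holds are hypothetical and this is not the FSharpx implementation; it only mirrors the behavior the test expects, including one empty inner collection for an empty source.

```fsharp
// Hypothetical list-based stand-in for windowSeq; the real function
// returns Vector<Vector<'T>> and lives in FSharpx.Collections.
let windowList (windowLength: int) (source: 'T list) : 'T list list =
    if List.isEmpty source then [ [] ]          // empty input yields one empty window
    else List.chunkBySize windowLength source   // the last window may be shorter (jagged)

// The data reported above: window length 3, source [1].
let example = windowList 3 [1]

// The expected outer length, computed the same way as in the property.
let expectedOuterLength windowLength (source: 'T list) =
    if List.isEmpty source then 1
    else int (ceil (float source.Length / float windowLength))

// The two facts the property checks, restated for lists.
let holds windowLength source =
    let windows = windowList windowLength source
    List.length windows = expectedOuterLength windowLength source
    && List.concat windows = source
```

Here example evaluates to [[1]], the list counterpart of the seq [seq [1]] that FsCheck printed.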

Classify your input

Once satisfied with how FsCheck reports generated data, let's display more information about the range of generated data. Otherwise, upon success FsCheck provides only the happy but unsatisfying report:

*** VectorTest.WindowedTest
Ok, passed 100 tests.

This is where classify and collect come in, allowing us to categorize the input and satisfy ourselves that its range is reasonable:

*** VectorTest.WindowedTest
Ok, passed 100 tests.

6% (5, 2), windowLength, outerLength.
6% (5, 1), windowLength, outerLength.
5% (5, 4), windowLength, outerLength.
5% (3, 1), windowLength, outerLength.
4% (5, 1), empty.
4% (4, 1), windowLength, outerLength.
...
1% (1, 11), windowLength, outerLength.
1% (1, 1), windowLength, outerLength.
1% (1, 1), empty.

Verbosely putting it all together

Digging further, we can get FsCheck to report the data of every generated test case using Check.Verbose:

*** VectorTest.WindowedTest
0:
(seq [seq []], (4, []))

1:
(seq [seq [-1]; seq [2]; seq [2]; seq [-2]], (1, [-1; 2; 2; -2]))

2:
(seq [seq [1; -3]; seq [-3]], (2, [1; -3; -3]))

3:
(seq [seq []], (5, []))
...

99:
(seq
   [seq [-33; 8; 48]; seq [-77; -31; 10]; seq [50; -75; -29]; seq [12; 58; -69];
    ...],
 (3,
  [-33; 8; 48; -77; -31; 10; 50; -75; -29; 12; 58; -69; -27; 14; 60; -67; -32;
   31; 4; -10; -27; 60; -21; -54; 39; 9; -68; -9; -11; 83; 11; 0; -43; 60; 39;
   80; -41; -1; 41; 82; -39; 1; 43; -83; -37; 4; 44; -81; -35; 6; 46; -79; -33;
   8; 48; -77; -31; 10; -37; -64; -22; 24; 65; -62; -20; 26; 67; -54; -18; 28;
   69; -52; -12; 30; 71; -50; -10; 32; 73; -48; -7; 34; 75; -46]))

Ok, passed 100 tests.

8% (3, 1), windowLength, outerLength.
...
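Switching to verbose output is a one-function change. The sketch below assumes FsCheck is referenced by the project and uses a hypothetical stand-in property, revRevIsOriginal, rather than the Vector test itself.

```fsharp
open FsCheck

// Hypothetical stand-in property; any property from your own tests
// can be passed to the verbose checks in the same way.
let revRevIsOriginal (xs: int list) =
    (List.rev (List.rev xs) = xs)
    |> Prop.classify (List.isEmpty xs) "empty"
    |> Prop.collect (List.length xs)

// Prints every generated case, followed by the usual summary and statistics.
Check.Verbose revRevIsOriginal

// Same output, but raises on falsification so a test runner reports the failure.
Check.VerboseThrowOnFailure revRevIsOriginal
```

Verbose output grows quickly, so it is probably best treated as a temporary diagnostic rather than left on permanently.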

Model-based checking by Command

You now have full command of property-based tests. FsCheck offers another testing paradigm: progressing the object under test from one state to another by means of "commands" and checking each state against an expected model. Kurt Schelfthout shows how to use this technique at the end of the FSharpx.Collections.Deque tests.

I'm not going to fully explain how this technique works; you can read about it here and study the Deque and Vector test examples. Instead, I want to focus on transparency in stateful testing.

Out of the box a successful test gives us output like this:

*** VectorTest.Grow Vector<Vector<'T>>, check by flatten
Ok, passed 100 tests.

73% long sequnecs (>6 commands).
20% short sequences (between 1-6 commands).
1% trivial.

FsCheck.Commands.asProperty already provides statistics on the range of our generated tests, but those statistics describe sequences generated from a list of commands like this:

let ``Grow, check by flatten`` = 
    [conjInner1Elem(checkFlatten); conjInnerEmpty(checkFlatten); appendInnerMulti(checkFlatten)]

[<Test>]
let ``Grow Vector<Vector<'T>>, check by flatten``() =
    Check.QuickThrowOnFailure (asProperty (specVofV ``Grow, check by flatten``))

The sequences referred to in the output are sequences of generated commands, each potentially altering the previous state of an object under test. This opens the possibilities of

1) using commands as primitives in generating objects under test, and

2) testing multiple features in a single test.

I took advantage of both these possibilities to test the remaining new functions for Vector<Vector<'T>>.
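The idea of a command sequence can be illustrated without FsCheck at all. The following is a hand-rolled sketch, not how FsCheck.Commands is implemented: the Command record and runCommands are hypothetical, and a list of lists stands in for Vector<Vector<'T>>.

```fsharp
// "Actual" is the object under test (a list of lists standing in for
// Vector<Vector<'T>>); "Model" is a flat list that is trivially correct.
type Command =
    { Name      : string
      RunActual : int list list -> int list list
      RunModel  : int list -> int list
      Post      : int list list * int list -> bool }

// The check shared by every command: flattening the actual gives the model.
let checkFlatten (actual: int list list, model: int list) =
    List.concat actual = model

// Adds a one-element inner list to the actual and the element to the model.
let conjInner1Elem elem =
    { Name = sprintf "conjInner1Elem: elem = %i" elem
      RunActual = fun a -> a @ [ [ elem ] ]
      RunModel = fun m -> m @ [ elem ]
      Post = checkFlatten }

// Adds an empty inner list to the actual; the model is unchanged.
let conjInnerEmpty =
    { Name = "conjInnerEmpty"
      RunActual = fun a -> a @ [ [] ]
      RunModel = id
      Post = checkFlatten }

// Run a sequence of commands, checking the postcondition after each step.
let runCommands (commands: Command list) =
    let step (actual, model, ok) cmd =
        let actual' = cmd.RunActual actual
        let model' = cmd.RunModel model
        (actual', model', ok && cmd.Post (actual', model'))
    let (_, _, ok) = List.fold step ([], [], true) commands
    ok
```

FsCheck's contribution is generating and shrinking the command lists; each command both builds up the object under test and exercises a feature, which is what makes the two possibilities above available.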

Commands verbosely

So what do generated tests look like? For that, let's turn again to Check.Verbose:

*** VectorTest.Grow Vector<Vector<'T>>, check by flatten
0:
[conjInner1Elem: elem = 2; conjInner1Elem: elem = -1]

1:
[conjInner1Elem: elem = 1; conjInnerEmpty; conjInnerEmpty]

2:
[conjInnerEmpty]

3:
[]

4:
[conjInnerEmpty; conjInner1Elem: elem = 0;
 appendInnerMulti: elems = [-6; -2; 5; 6; -3; -2; 5];
 appendInnerMulti: elems = [-2; 5; 6]; appendInnerMulti: elems = [-6; 1; -5];
 conjInnerEmpty]
 ...
Ok, passed 100 tests.

75% long sequnecs (>6 commands).
16% short sequences (between 1-6 commands).
4% trivial.

It is necessary to override ToString() in the commands you write to provide as much information as necessary, otherwise the record of test generations will be considerably less informative:
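The effect of the override can be sketched without FsCheck. DemoCommand below is a hypothetical minimal abstract class standing in for FsCheck's command type, and report is a hypothetical helper that formats a command list roughly the way the verbose output does.

```fsharp
// Hypothetical minimal command type, standing in for FsCheck's command type.
[<AbstractClass>]
type DemoCommand() =
    abstract RunModel : int list -> int list

// ToString() names the command but hides the generated element.
let terse elem =
    { new DemoCommand() with
        member x.RunModel m = m @ [ elem ]
        override x.ToString() = "conjInner1Elem" }

// ToString() carries the generated element along with the command name.
let descriptive elem =
    { new DemoCommand() with
        member x.RunModel m = m @ [ elem ]
        override x.ToString() = sprintf "conjInner1Elem: elem = %i" elem }

// Format a command sequence using each command's ToString().
let report (commands: DemoCommand list) =
    commands |> List.map string |> String.concat "; " |> sprintf "[%s]"
```

With these definitions, report [descriptive 2; descriptive (-1)] yields "[conjInner1Elem: elem = 2; conjInner1Elem: elem = -1]", while the terse variant yields "[conjInner1Elem; conjInner1Elem]".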

*** VectorTest.Grow Vector<Vector<'T>>, check by flatten
0:
[conjInner1Elem; conjInner1Elem]

1:
[conjInner1Elem; conjInnerEmpty]

2:
[conjInnerEmpty]

3:
[]

4:
[conjInnerEmpty; conjInner1Elem; appendInnerMulti;
 appendInnerMulti; appendInnerMulti; conjInnerEmpty]
 ...

Falsified command

So what output will a falsified command return? To make it interesting, we will plant a little time bomb. The Pre member prevents execution of this command until the other commands have generated an object of sufficient length. Then negating the check in the Post member, where the check is performed, with not causes falsification.

let conjInnerEmpty check = 
    Gen.constant <|
        { new ICommand<Vector2Actual,VectorModel>() with
            member x.RunActual c = c |> conj empty
            member x.RunModel m = m
            member x.Pre m = (length m) > 0
            member x.Post (c,m) = not (check (c,m))
            override x.ToString() = sprintf "conjInnerEmpty"}

Resulting in this falsification output:

VectorTest.Grow Vector<Vector<'T>>, check by flatten:
System.Exception : Falsifiable, after 6 tests (2 shrinks) (StdGen (1351373830,295750028)):
[appendInnerMulti: elems = [0; -4; 0; -3]; conjInnerEmpty]
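The StdGen pair in the report is what makes a failure repeatable. The sketch below assumes an FsCheck version (such as 2.x) in which Config.Replay takes a Random.StdGen option; revRevIsOriginal is a hypothetical stand-in, and in practice you would pass the property that actually failed.

```fsharp
open FsCheck

// Hypothetical stand-in property; substitute the property that failed.
let revRevIsOriginal (xs: int list) = List.rev (List.rev xs) = xs

// Seed copied from the falsification report above. Replaying it against the
// original failing property regenerates the same sequence of test cases.
let replayConfig =
    { Config.QuickThrowOnFailure with
        Replay = Some (Random.StdGen (1351373830, 295750028)) }

Check.One (replayConfig, revRevIsOriginal)
```

A seed only reproduces a failure for the same property and generators, so it is worth recording alongside the failing test.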

Conclusion

All in all, FsCheck is a powerful tool for unit test generation that provides full visibility into generation, execution, and repeatability. All of the practices I've outlined (verbose checking, classifying and collecting, and intentional falsification) are good exercises to run through every time you write a new test.

craftyThoughts

Composability -> Expressiveness -> Correctness -> Testability -> Intention