Property Based Testing: Step by Step

Property Based Testing

What is property based testing (PBT), anyway?

The basic idea is to validate an expected behavior of a system (a property of the system) against a range of data points. This is in contrast to example-based testing, which is the basis of most unit testing and microtesting. An example-based test case is, as the name implies, a single concrete example of an expected behavior of the system under test.

Most of the time, a well-chosen set of examples will provide high confidence that the system under test behaves as expected. Sometimes, there are many combinations and permutations of inputs, and it isn’t practical to define concrete examples that cover all possible scenarios. In these cases, PBT might be useful.

I say might be useful because the tools that support PBT are not trivial to work with. If your system operates on basic types the tool already knows how to generate, then it’s easy to set up PBT cases. On the other hand, if your system is that simple then it may not be valuable to run a large number of randomly-generated cases.

Simplistic Examples Don’t Tell the Story

When you take an interest in a new subject or technique, naturally your first stop is your favorite Internet search engine. You’ll find a lot of information about PBT that way. Unfortunately, most of it is unhelpful for novices. The examples tend to be either (a) too trivial to suggest any value in PBT, or (b) too complicated to follow unless you already know about PBT.

PBT doesn’t offer any value for validating simple algorithms or for validating the behavior of simple types. Yet, most of the introductory examples show testing of theories about the properties of simple types. Types such as integer and string are unlikely to exhibit surprising behavior. An application that consumes integer and string inputs could misbehave for edge cases, but it’s easy to define example-based cases that cover those situations.

Here’s an example from the ScalaTest website that illustrates the point:

import org.junit.Test
import org.scalatest.junit.JUnitSuite
import org.scalatest.prop.Checkers
import org.scalacheck.Arbitrary._
import org.scalacheck.Prop._

class MySuite extends JUnitSuite with Checkers {
  @Test
  def testConcat() {
    check((a: List[Int], b: List[Int]) => a.size + b.size == (a ::: b).size)
  }
}

The example uses ScalaTest with its Checkers trait, which hands the property to ScalaCheck, a PBT tool. This is typical of PBT tools in all languages, as they make use of existing libraries for a good deal of their underlying functionality, especially things like assertions, assumptions, and theories.

The example is verifying that when two List objects are joined together, the length of the resulting List is the sum of the lengths of the two input Lists. The question is, do we really think that will ever be false? Is it worth the effort to set this up as a property-based test? Is it worth the runtime cost?

Other examples you will find online are similarly trivial. Here’s an example of Quick Theories for Java:

import org.junit.Test;
import static org.quicktheories.QuickTheory.qt;
import static org.quicktheories.generators.SourceDSL.*;

public class SomeTests {
  @Test
  public void addingTwoPositiveIntegersAlwaysGivesAPositiveInteger(){
    qt()
    .forAll(integers().allPositive()
          , integers().allPositive())
    .check((i,j) -> i + j > 0); 
  }
}

Yes, adding two positive integers yields a positive integer <yawn>, at least until the sum overflows the int range. But this isn’t application behavior at all; it’s just basic arithmetic. If addition didn’t “work,” it would mean the compiler was broken. It wouldn’t say anything about the application under test.
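One caveat is worth checking before dismissing this property entirely: Java int arithmetic wraps around on overflow, so the sum of two large positive values can come out negative. The following is a minimal sketch in plain Java, with no PBT library involved; the class and method names are made up for illustration.

```java
// Illustrative sketch only: plain JDK, hypothetical names.
class OverflowSketch {
    // The property from the example above, written as an ordinary method.
    static boolean sumIsPositive(int i, int j) {
        return i + j > 0;
    }

    public static void main(String[] args) {
        // Small positive values behave as expected.
        System.out.println(sumIsPositive(1, 2));
        // Integer.MAX_VALUE + 1 wraps around to Integer.MIN_VALUE.
        System.out.println(sumIsPositive(Integer.MAX_VALUE, 1));
    }
}
```

Even so, what the property exercises is the language's fixed-width arithmetic, not any behavior of an application.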

And here’s one more, demonstrating JUnit Quickcheck for Java. It tests the theory that concatenating two strings will result in a string whose length is the sum of the lengths of the original strings.

import com.pholser.junit.quickcheck.Property;
import com.pholser.junit.quickcheck.runner.JUnitQuickcheck;
import org.junit.runner.RunWith;
import static org.junit.Assert.*;

@RunWith(JUnitQuickcheck.class)
public class StringProperties {
    @Property public void concatenationLength(String s1, String s2) {
        assertEquals(s1.length() + s2.length(), (s1 + s2).length());
    }
}

A Better Example

Why do people post examples like those? They’re trying to show the basic structure of a property-based test without diving in too deep too soon. That’s a laudable goal. To generate anything beyond simple types that are built into the programming language, you have to write custom generators. Things start to get tricky from that point forward, and a person who is only looking into PBT for the first time can be overwhelmed quickly.

If your system operates on custom types and has many permutations of input values, then PBT could help you detect issues in the system that you would miss using example-based testing. However, the effort to create numerous custom generators and configure the test cases to constrain input values appropriately may be too great to justify. After all, the first principle of Lean-Agile Thinking is to take an economic view.

There is another reason to consider PBT. The tools don’t just run examples. When a generated case fails, they also try to reduce the failing input to a smaller, simpler input that still disproves the theory. If you’re working with legacy code, as most of us do most of the time, this offers a useful tool for exploring the behavior of an application without having to write a large number of example-based characterization tests.

Here’s an example from Shawn Anderson that shows how this reduction process, called shrinking, looks. The example is in Ruby and it uses the Theft gem to support PBT.

t = Theft::Runner.new autosize: true
property_sorting_should_be_stable = lambda do |generated_arg| # , other_generated_arg, etc
  sorted = Sorter.sort(generated_arg)
  sorted.chunk{|item| item.n}.each do |n, items|
    if items.size > 1
      # lazily spot check
      first = items.first
      last = items.last
      if generated_arg.index(first) > generated_arg.index(last)
        return :fail unless sorted.index(first) > sorted.index(last)
      else
        return :fail unless sorted.index(first) < sorted.index(last)
      end
    end
  end
  :pass
end
config = {
  description: "sorting should be stable",
  property: property_sorting_should_be_stable,
  arg_descriptors: [BoxedFixNumArgDescriptor],
  trials: 30
}
t.run config

I omitted the code of BoxedFixNumArgDescriptor, but you can see it on Shawn’s site if you’re interested. The more interesting point is what happens when the test runs:

ruby examples/stable_sort.rb =>
seed: 1408724650
.failed on trial: 1
[66] 26,14,63,70,23,100,38,64,41,92,74,88,28,38,79,76,3,3,90,80,81,61,54,19,39,
66,31,4,11,100,43,53,15,96,23,16,25,99,13,76,19,1,69,41,44,52,60,17,16,60,76,8,
84,76,24,16,53,51,18,0,69,79,18,88,89,59
shrunk to
[3] 76,3,3

Notice that the tool has reduced a failing input of 66 elements to a 3-element input that still disproves the theory that the application exhibits stable sorting behavior. The economic view in this case is that in exchange for a bit of up-front effort, we have learned how to verify the application’s behavior in a cost-effective way. This can be very useful when the application is sufficiently complicated that it isn’t obvious how to cover it properly.
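To make the idea of shrinking concrete, here is a deliberately naive sketch in plain Java. It is not how Theft or any other PBT tool implements shrinking (real tools use type-aware strategies); it only shows the general shape: keep trying smaller inputs and keep any that still fail. The class name, the greedy strategy, and the "no duplicates" property are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch only: a greedy list shrinker, not a real tool's algorithm.
class ShrinkSketch {
    // Repeatedly delete single elements for as long as the
    // property still fails on the smaller list.
    static List<Integer> shrink(List<Integer> failing, Predicate<List<Integer>> property) {
        List<Integer> current = new ArrayList<>(failing);
        boolean shrunk = true;
        while (shrunk) {
            shrunk = false;
            for (int i = 0; i < current.size(); i++) {
                List<Integer> candidate = new ArrayList<>(current);
                candidate.remove(i); // removes the element at index i
                if (!property.test(candidate)) { // still a counterexample
                    current = candidate;
                    shrunk = true;
                    break;
                }
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // Hypothetical property: "the list contains no duplicate values".
        Predicate<List<Integer>> noDuplicates =
            xs -> new HashSet<>(xs).size() == xs.size();
        List<Integer> failing = Arrays.asList(26, 14, 3, 70, 3, 100);
        System.out.println(shrink(failing, noDuplicates));
    }
}
```

The result is a list from which no single element can be removed without making the property pass again, which is what makes a shrunk counterexample so much easier to read than the original input.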

Data-Driven Testing

Do you need a complicated PBT tool to accomplish most of this? Not really. If your needs are more basic, you can get value from data-driven tests. Many unit testing libraries support them. In a data-driven test case, the same invocation of the application under test is done multiple times, passing different input values each time. The input data source is usually a table coded in the same programming language or an external source such as a text file or spreadsheet file.

Here’s an example using MSTest for C#, from Microsoft’s documentation (snippets only):

private TestContext testContextInstance;  
public TestContext TestContext  
{  
    get { return testContextInstance; }  
    set { testContextInstance = value; }  
}  
[DataSource(@"Provider=Microsoft.SqlServerCe.Client.4.0; Data Source=C:\Data\MathsData.sdf;", "Numbers")]  
[TestMethod()]  
public void AddIntegers_FromDataSourceTest()  
{  
    var target = new Maths();  

    // Access the data  
    int x = Convert.ToInt32(TestContext.DataRow["FirstNumber"]);  
    int y = Convert.ToInt32(TestContext.DataRow["SecondNumber"]);   
    int expected = Convert.ToInt32(TestContext.DataRow["Sum"]);  
    int actual = target.IntegerMethod(x, y);  
    Assert.AreEqual(expected, actual,  
        "x:<{0}> y:<{1}>",  
        new object[] {x, y});  

}  

The data source is a table with three columns: Two input values and the expected result.

This doesn’t give you randomly-generated data and doesn’t do any shrinking, but you may not always need those features. You could, conceivably, write your own randomizer to generate test values. Rolling your own shrinking functionality would be somewhat more challenging, and probably not worth the effort given that numerous PBT tools are available for many programming languages.
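A home-grown randomizer can be as small as a seeded loop. The sketch below is plain Java and assumes a hypothetical add() method standing in for the code under test; the property checked (commutativity) is likewise just an example. Reporting the seed is the one design choice that matters, because it lets a failing run be reproduced.

```java
import java.util.Random;

// Illustrative sketch only: a hand-rolled randomized check with a fixed seed.
class HomeGrownRandomTest {
    // Hypothetical stand-in for the code under test.
    static int add(int x, int y) {
        return x + y;
    }

    // Runs the given number of random trials. Returns null if all pass,
    // otherwise a description of the first failing case.
    static String checkAddIsCommutative(long seed, int trials) {
        Random random = new Random(seed); // same seed, same sequence of inputs
        for (int t = 0; t < trials; t++) {
            int x = random.nextInt();
            int y = random.nextInt();
            if (add(x, y) != add(y, x)) {
                return "seed=" + seed + " trial=" + t + " x=" + x + " y=" + y;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String failure = checkAddIsCommutative(42L, 1000);
        System.out.println(failure == null ? "all trials passed" : "FAILED: " + failure);
    }
}
```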

Theories

A step further than data-driven testing is the notion of theories. Libraries are available that help you test theories about the general behavior of applications. This is baked into PBT, but can also be done without the additional features offered by PBT tools. You can use the same libraries the PBT tools use independently.

Here’s a JUnit example:

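import org.junit.experimental.theories.DataPoint;
import org.junit.experimental.theories.Theories;
import org.junit.experimental.theories.Theory;
import org.junit.runner.RunWith;
import static org.hamcrest.CoreMatchers.containsString;
import static org.hamcrest.CoreMatchers.not;
import static org.junit.Assert.assertThat;
import static org.junit.Assume.assumeThat;

@RunWith(Theories.class)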
public class UserTest {
    @DataPoint
    public static String GOOD_USERNAME = "optimus";
    @DataPoint
    public static String USERNAME_WITH_SLASH = "optimus/prime";

    @Theory
    public void filenameIncludesUsername(String username) {
        assumeThat(username, not(containsString("/")));
        assertThat(new User(username).configFileName(), containsString(username));
    }
}

QuickCheck

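QuickCheck for Haskell was one of the first, if not the first, PBT tool. Here’s an example from Pedro Vasconcelos that tests a distributivity law for reverse. Note that the property as written is false, because reversing a concatenation also swaps the order of the two lists (reverse (xs++ys) equals reverse ys ++ reverse xs), so QuickCheck will report a counterexample when it runs: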

import Test.QuickCheck

prop_revapp :: [Int] -> [Int] -> Bool
prop_revapp xs ys = reverse (xs++ys) == reverse xs ++ reverse ys

main = quickCheck prop_revapp

If that example looks more concise than the previous ones, it’s because PBT came out of the functional programming community, and it’s natural to express properties in a functional language like Haskell. Even so, QuickCheck has been re-implemented in a lot of other programming languages, most of which are not functional languages. Numerous other PBT tools exist, but those based on QuickCheck are in the center of a tradition and are more likely to be known and understood by many programmers whom you might hire.

With that in mind, I wanted to choose a QuickCheck-like tool for my first venture into PBT. I decided on JUnit Quickcheck.

Getting started with JUnit Quickcheck

The first order of business was to find an application that was just complicated enough to benefit from PBT without being too complicated for a newbie. I decided an application that scores poker hands would be appropriate. The logic isn’t too simple or too complicated, and there are many possible combinations and permutations of hands; too many to cover adequately using example-based test cases alone.

I wrote a toy poker scoring application, which you can examine on GitHub. The application is more complicated than strictly necessary to score poker hands, but real-world applications are the same way. A toy application that didn’t present challenges similar to a real application would not be a useful test bed for learning PBT. On the other hand, you wouldn’t want your first application to be so full of cruft that it was incomprehensible.

JUnit Quickcheck is in Java, so I wrote the poker application in Java, too. Nothing about PBT is language-specific, however. It’s a testing approach, not merely a tool.

Generating test data

One of the aspects of PBT that distinguishes it from data-driven testing and theories is the generation of pseudo-random test data. As mentioned above, PBT tools are able to generate data for basic data types, but for custom data types we must write our own generators.

The subject of my first property-based test is the method applyScoringRules() in class FiveCardStudGame. The method knows how to compare two hands using the rules of five-card stud with no suit ranking and nothing wild. The Hand class has a method, beats(), that looks like this:

public Result beats(Hand other, Game game) {
    return game.applyScoringRules(this, other);
}

When building the poker application, I wrote microtests expressing concrete examples to drive the development of applyScoringRules(). Most of them look like this:

@Test
public void straight_flush_beats_four_of_a_kind() {
    Game game = new FiveCardStudGame();
    Hand hand1 = new Hand(
        new PlayingCard(Suit.SPADES, Rank.SIX),
        new PlayingCard(Suit.SPADES, Rank.FOUR),
        new PlayingCard(Suit.SPADES, Rank.FIVE),
        new PlayingCard(Suit.SPADES, Rank.THREE),
        new PlayingCard(Suit.SPADES, Rank.TWO));
    Hand hand2 = new Hand(
        new PlayingCard(Suit.HEARTS, Rank.ACE),
        new PlayingCard(Suit.DIAMONDS, Rank.ACE),
        new PlayingCard(Suit.CLUBS, Rank.ACE),
        new PlayingCard(Suit.SPADES, Rank.ACE),
        new PlayingCard(Suit.SPADES, Rank.TEN));
    assertEquals(Result.WIN, hand1.beats(hand2, game));
}

Looking at that sample and the rest of the test class, two things jump out. First, the examples are very repetitive. The test source code looks like wallpaper. Second, only a small subset of possible Hand combinations is checked. Yes, a straight flush beats four of a kind, but it also beats a full house, three of a kind, two pair, one pair, and a junk hand. None of those combinations is represented. Doing so would result in a very large number of very similar examples, all hard-coded. And so far, it only supports plain five-card stud. Many other poker variants exist. This is just the sort of situation where PBT can be useful.

Eventually, with a little luck and a lot of practice, I hope to be able to sit down and “just write” a property-based test class for any given piece of functionality. For my first attempt, I had to sneak up on it step by step.

One of the challenges I faced in learning from examples found online is that those examples are complete, and the accompanying explanations are poor. The authors solved all the problems they encountered along the way, and presented their finished work. I’m going to show the steps I took to learn how to write generators.

Step 1: Gutless Wonder

I started by laying out a test class that expressed the end goal, but lacked any “guts.” It’s gutless, and I wondered if it was correct.

@RunWith(JUnitQuickcheck.class)
public class HandScoringPropertyTest {
    @Property 
    public void strongerHandWins(Hand hand1, Hand hand2, Result result) {

        // This is where the guts should be    

        Game game = new FiveCardStudGame();
        assertEquals(result, hand1.beats(hand2, game));
    }
}

Notice the similarity between this test case and the MSTest/C# data-driven sample shown above. The MSTest sample takes three values from a row of a data table; two input values and the expected output value. The gutless wonder test case takes three values from data generators; two input values and the expected output value. Conceptually, not too different. The tricky part lies in gaining proficiency with PBT tools.

To fill in the guts, we need to be able to instantiate Hand and Result objects with different values. The applyScoringRules() method takes two arguments of type Hand. The Hand constructor takes an array of Card objects. For five-card stud, it’s five cards. The constructor for PlayingCard takes two arguments, a Suit and a Rank, both of which are enums.

Out of the box, Quickcheck has no idea how to generate data values for Suit, Rank, Card, and Hand. We must write a custom generator for each one of them. The test case depends on the Hand and Result generators. The Hand generator depends on the Card generator. The Card generator depends on the Suit and Rank generators. The way I approached it was to start at the bottom of the pile and build up from there.
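The dependency chain described above can be sketched without any PBT library at all. The following plain-Java sketch uses java.util.Random and hypothetical stand-ins for the poker application's types (simplified enums with no NONE or JOKER values, and a bare Card class); it is not Quickcheck's API, only an illustration of building generators from the bottom of the pile upward.

```java
import java.util.Random;
import java.util.function.Function;

// Illustrative sketch only: generators composed bottom-up, plain JDK.
class GeneratorCompositionSketch {
    // Hypothetical, simplified stand-ins for the application's types.
    enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }
    enum Rank { TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN, JACK, QUEEN, KING, ACE }

    static final class Card {
        final Suit suit;
        final Rank rank;
        Card(Suit suit, Rank rank) { this.suit = suit; this.rank = rank; }
    }

    // Bottom of the pile: generators for the two enums.
    static final Function<Random, Suit> SUITS =
        r -> Suit.values()[r.nextInt(Suit.values().length)];
    static final Function<Random, Rank> RANKS =
        r -> Rank.values()[r.nextInt(Rank.values().length)];

    // The Card generator depends on the Suit and Rank generators.
    static final Function<Random, Card> CARDS =
        r -> new Card(SUITS.apply(r), RANKS.apply(r));

    // The Hand generator depends on the Card generator.
    // Nothing here prevents the same card from appearing twice.
    static Card[] hand(Random r, int size) {
        Card[] cards = new Card[size];
        for (int i = 0; i < size; i++) {
            cards[i] = CARDS.apply(r); // a fresh card on every iteration
        }
        return cards;
    }

    public static void main(String[] args) {
        for (Card card : hand(new Random(), 5)) {
            System.out.println(card.rank + " of " + card.suit);
        }
    }
}
```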

Step 2: Create a Generator for Result

The Quickcheck documentation is both extensive and poorly-written (no offense, Paul). Here’s what I had to go on to figure out how to write custom generators for my test case: Generating values of other types. Fortunately, the source code is available.

Trying to follow the available examples, I wasn’t able to get Quickcheck to generate values for the Result enum. I found the source code for Quickcheck’s built-in EnumGenerator, which isn’t exactly what I needed but might offer some hints as to how to write a Result generator.

My first cut was this (omitting the imports):

public class Results extends Generator {
    private final Class enumType;

    public Results(Class enumType) {
        super(Enum.class);

        this.enumType = enumType;
    }

    @Override public Result generate(SourceOfRandomness random, GenerationStatus status) {
        Object[] values = enumType.getEnumConstants();
        int index = status.attempts() % values.length;
        return (Result) values[index];
    }

    @Override public boolean canShrink(Object larger) {
        return enumType.isInstance(larger);
    }
}

To see if it worked, I tweaked the test case a bit. Bear in mind this is not the finished product; I was feeling my way along in the dark. I wanted to show these intermediate stages because it seems a lot of people act as if they never do this; they just sit down and write the finished product. Well, they do this. They do it plenty. So don’t feel bad if you do it, too.

@RunWith(JUnitQuickcheck.class)
public class HandScoringPropertyTest {
    //@Property public void strongerHandWins(Hand hand1, Hand hand2, Result result) {
    @Property(trials=20)
    public void strongerHandWins(Result result) {

        System.out.println("Result: " + result);

//        Game game = new FiveCardStudGame();
//        assertEquals(result, hand1.beats(hand2));
    }
}

It produced this output:

Result: WIN
Result: LOSE
Result: TIE
Result: LOSE
Result: TIE
Result: TIE
Result: TIE
Result: WIN
Result: LOSE
Result: WIN
Result: TIE
Result: WIN
Result: WIN
Result: WIN
Result: WIN
Result: LOSE
Result: TIE
Result: WIN
Result: TIE
Result: WIN

A promising start!

Step 3: Create a Generator for Suit

Now to write the generator for Suit. It’s very similar to the one for Result.

public class Suits extends Generator {
    private final Class enumType;

    public Suits(Class enumType) {
        super(Enum.class);

        this.enumType = enumType;
    }

    @Override public Suit generate(SourceOfRandomness random, GenerationStatus status) {
        Object[] values = enumType.getEnumConstants();
        int index = status.attempts() % values.length;
        return (Suit) values[index];
    }

    @Override public boolean canShrink(Object larger) {
        return enumType.isInstance(larger);
    }
}

Tweaking the test class again:

@RunWith(JUnitQuickcheck.class)
public class HandScoringPropertyTest {
    //@Property public void strongerHandWins(Hand hand1, Hand hand2, Result result) {
    @Property(trials=20)
    public void strongerHandWins(Suit suit, Result result) {

        System.out.println("Suit: " + suit + ", Result: " + result);

//        Game game = new FiveCardStudGame();
//        assertEquals(result, hand1.beats(hand2));
    }
}

That gets us this far:

Suit: DIAMONDS, Result: TIE
Suit: HEARTS, Result: TIE
Suit: HEARTS, Result: WIN
Suit: SPADES, Result: LOSE
Suit: DIAMONDS, Result: TIE
Suit: NONE, Result: TIE
Suit: DIAMONDS, Result: TIE
...

Step 4: Filtering Generated Values

Notice the value Suit.NONE is generated. This is defined to support the Null Object pattern in the poker application. We don’t want it included in the input values for this test case.

I tried to force the generator to omit that value, to no avail. There’s probably a way. In the meantime, I used the Assume class to filter out occurrences of Suit.NONE from the generated data:

@RunWith(JUnitQuickcheck.class)
public class HandScoringPropertyTest {
    //@Property public void strongerHandWins(Hand hand1, Hand hand2, Result result) {
    @Property(trials=20)
    public void strongerHandWins(Suit suit, Result result) {
        assumeThat(suit, not(equalTo(Suit.NONE)));
        System.out.println("Suit: " + suit + ", Result: " + result);
//        Game game = new FiveCardStudGame();
//        assertEquals(result, hand1.beats(hand2));
    }
}

Now I was getting all the values of Suit that I wanted:

Suit: SPADES, Result: LOSE
Suit: CLUBS, Result: TIE
Suit: HEARTS, Result: LOSE
Suit: DIAMONDS, Result: TIE
Suit: CLUBS, Result: TIE
Suit: HEARTS, Result: TIE
Suit: CLUBS, Result: LOSE
Suit: DIAMONDS, Result: TIE
...

Writing a generator for Rank was essentially the same as writing one for Suit, so there’s no need to reiterate that here. Let’s skip ahead to the point where I have generators for Result, Suit, and Rank working individually.
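An alternative to filtering with assumptions is to never produce the unwanted value in the first place, by choosing only from the constants you actually want. The sketch below shows that selection logic in plain Java with java.util.Random; the Suit enum is a hypothetical mirror of the application's enum. The same logic could presumably live inside a custom generator's generate() method, but that wiring is an assumption and is not verified here.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: pick from an enum while skipping its null object.
class FilteredSuitSketch {
    // Hypothetical mirror of the application's enum, including the null object.
    enum Suit { NONE, CLUBS, DIAMONDS, HEARTS, SPADES }

    // Candidate values computed once, with Suit.NONE left out.
    static final List<Suit> REAL_SUITS =
        new ArrayList<>(EnumSet.complementOf(EnumSet.of(Suit.NONE)));

    // Chooses uniformly among the remaining constants.
    static Suit nextSuit(Random random) {
        return REAL_SUITS.get(random.nextInt(REAL_SUITS.size()));
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 8; i++) {
            System.out.println("Suit: " + nextSuit(random));
        }
    }
}
```

Because no generated value is discarded, every trial counts, which avoids the wasted trials that come with assumption-based filtering.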

Step 5: Create a Generator for Card

The next step is to generate values for Card. The PlayingCard constructor takes Suit and Rank arguments. That means the generator for PlayingCard must work in conjunction with those for Suit and Rank. This was new for me.

This is the part where I want to be open about the difficulty I had in getting this to work. I don’t want you to feel stupid if you have trouble with this. The examples and explanations you find online will never suggest that any of this is hard. When you’re first starting, it seems hard enough. I spent several hours trying different things to see if I could get Quickcheck to do anything at all. A generator class will compile with just about any code, but runtime is a different story.

The tool does a lot of things dynamically at runtime, and it has no way to know there’s a problem at compile time. That gives it some flexibility, but at a cost in time and effort when you’re trying to learn how it works.

I found some information on the Quickcheck documentation page for Generating Values for Complex Types. I did not find the examples to be very clear. They were more like vague hints. It looked as if I could use the gen() and constructor() methods to instantiate PlayingCard objects.

But it was not so. The arguments to the PlayingCard constructor are enums, and enums don’t have public constructors, so Quickcheck could not invoke a constructor via reflection. Rather than saying so, the tool generated an exception that said it could not find a generator for PlayingCard. I spent some time chasing that squirrel, trying to figure out why it couldn’t find the generator, and I was unable to find any obvious reason. It turns out the exception message did not describe the actual problem; and maybe it would be asking too much for it to do so, given its dynamic operation.

I switched to the type() method, and made progress. I progressed to the next exception. Long story short, I needed to tell Quickcheck which generator to use. It was unable to infer the correct generator based on the fact the argument was of type PlayingCard. Another squirrel, another hour. Eventually I found documentation for the @From annotation, and tried it. Finally, it worked.

public class PlayingCards extends Generator {
    public PlayingCards() {
        super(PlayingCard.class);
    }
    @Override
    public PlayingCard generate(SourceOfRandomness random, GenerationStatus status) {
        Suit suit = gen().type(Suit.class).generate(random, status);
        Rank rank = gen().type(Rank.class).generate(random, status);
        return new PlayingCard(suit, rank);
    }
}

...
 
@RunWith(JUnitQuickcheck.class)
public class HandScoringPropertyTest {
    @Property(trials=20)
    public void strongerHandWins(@From(PlayingCards.class) PlayingCard card, Result result) {
        assumeThat(card.suit(), not(equalTo(Suit.NONE)));
        assumeThat(card.rank(), not(equalTo(Rank.NONE)));
        System.out.println("Card: " + card.rank() + " of " + card.suit() + ", Result: " + result);
//        Game game = new FiveCardStudGame();
//        assertEquals(result, hand1.beats(hand2, game));
    }
}

...

Card: SEVEN of DIAMONDS, Result: WIN
Card: EIGHT of HEARTS, Result: LOSE
Card: ACE of HEARTS, Result: TIE
Card: JACK of CLUBS, Result: WIN
Card: FOUR of SPADES, Result: TIE
Card: FIVE of SPADES, Result: TIE
Card: FOUR of HEARTS, Result: LOSE
Card: EIGHT of SPADES, Result: WIN
Card: SEVEN of DIAMONDS, Result: LOSE
Card: JACK of SPADES, Result: LOSE
Card: ACE of CLUBS, Result: LOSE
Card: THREE of SPADES, Result: LOSE
Card: FOUR of CLUBS, Result: TIE
Card: JACK of SPADES, Result: TIE 

There’s another lesson here you need to remember. Notice that the test case specifies trials=20, but there are 14 lines of output. The assumeThat() calls filtered out 6 of 20 generated data points. When you filter out generated data points, there’s a chance the randomly-generated values will result in 0 test cases running. When that happens, the behavior of the various tools differs. For JUnit Quickcheck, expect the test run to abort with an exception. The point is, you may have to fine-tune the configuration of the generator to ensure at least one test case will run. It will depend on the characteristics of the application under test.
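A rough back-of-the-envelope calculation shows how quickly this risk grows. The sketch below assumes each generated value passes the filters independently, and it uses the 14-of-20 survival rate from the single run above as an estimate, so treat the numbers as an illustration rather than a measurement. The class and method names are made up.

```java
// Illustrative sketch only: estimating how often filtering discards every trial.
class FilterOddsSketch {
    // Pass rate for a trial that needs several independently filtered values.
    static double combinedPassRate(double perValuePassRate, int values) {
        return Math.pow(perValuePassRate, values);
    }

    // Probability that every one of the trials is discarded.
    static double chanceNothingRuns(double passRate, int trials) {
        return Math.pow(1.0 - passRate, trials);
    }

    public static void main(String[] args) {
        double oneCard = 14.0 / 20.0; // estimate from the run above
        double tenCards = combinedPassRate(oneCard, 10); // e.g. two five-card hands
        System.out.printf("one card:  pass rate %.3f, chance nothing runs %.2e%n",
            oneCard, chanceNothingRuns(oneCard, 20));
        System.out.printf("ten cards: pass rate %.3f, chance nothing runs %.3f%n",
            tenCards, chanceNothingRuns(tenCards, 20));
    }
}
```

With one filtered card per trial, losing all 20 trials is vanishingly unlikely. With ten filtered cards per trial, under these assumptions it becomes more likely than not, which is why the number of trials or the generator itself may need adjusting.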

Step 6: Create a Generator for Hand

Getting the PlayingCard generator to work had boosted my spirits. The Hand generator would be the first one that invoked generators more than one level down from itself. It looked as if each generator dealt with its own immediate dependencies, so the generation process ought to cascade down gracefully. On the other hand, the forest had been rife with squirrels today.

There were a couple of other new things with the Hand generator. First, it needed five instances of PlayingCard, not just one. Second, PlayingCard is a real class and not an enum, so the gen().constructor() pattern might actually be appropriate here. One way to find out.

The Hand constructor takes a single vararg argument of type Card. JUnit Quickcheck doesn’t work with interfaces or abstract classes, as it has to invoke a constructor. So, my first attempt at a Hand generator looked like this:

public class Hands extends Generator {
    public Hands() {
        super(Hand.class);
    }
    @Override
    public Hand generate(SourceOfRandomness random, GenerationStatus status) {
        PlayingCard playingCard = gen().constructor(PlayingCard.class).generate(random, status);
        return new Hand(playingCard, playingCard, playingCard, playingCard, playingCard);
    }
}

Before trying this, I had a couple of worries:

  • Creation of the Suit and Rank objects was now “under the covers” from the point of view of the test case. The assumeThat() calls that had filtered out Suit.NONE and Rank.NONE values would have no effect.
  • There was no code to prevent duplicate cards being “dealt” to a hand, or the same card being “dealt” to both hands. I anticipated having to figure out a practical way to control this. The poker project as it stood at the time had no concept of a Deck and no function to deal hands. That would be the obvious place to control this issue. But the obvious place did not exist in the application. A temporary workaround in the test case would probably be needed.

But first things first: Let’s see if the test case can run the new generator at all. Squirrels, you know. Squirrels everywhere.

@RunWith(JUnitQuickcheck.class)
public class HandScoringPropertyTest {
    @Property(trials=20)
    public void strongerHandWins(@From(Hands.class) Hand hand1, @From(Hands.class) Hand hand2, Result result) {
        System.out.println("Hand 1: ");
        for (Card card : hand1.show()) {
            System.out.println("\t" + card.rank() + " of " + card.suit());
        }
        System.out.println("Hand 2: ");
        for (Card card : hand2.show()) {
            System.out.println("\t" + card.rank() + " of " + card.suit());
        }
//        Game game = new FiveCardStudGame();
//        assertEquals(result, hand1.beats(hand2, game));
    }
}

This resulted in:

ReflectionException: java.lang.NoSuchMethodException: com.neopragma.poker.PlayingCard.<init>()

It was looking for a no-arg constructor, which PlayingCard does not have. By design, I don’t want it to have a no-arg constructor.

The squirrel led me to try adding class references to the constructor() call for the PlayingCard constructor arguments:

public class Hands extends Generator {
    public Hands() {
        super(Hand.class);
    }
    @Override
    public Hand generate(SourceOfRandomness random, GenerationStatus status) {
        PlayingCard playingCard = gen().constructor(PlayingCard.class, Suit.class, Rank.class).generate(random, status);
        return new Hand(playingCard, playingCard, playingCard, playingCard, playingCard);
    }
}

It worked! I was so excited. It isn’t often I can catch a squirrel. But the squirrel had friends:

Hand 1: 
    SEVEN of SPADES
    SEVEN of SPADES
    SEVEN of SPADES
    SEVEN of SPADES
    SEVEN of SPADES
Hand 2: 
    NONE of CLUBS
    NONE of CLUBS
    NONE of CLUBS
    NONE of CLUBS
    NONE of CLUBS

The NONEs were back, as expected. But it was obvious the generator was using the output from a single PlayingCard generator five times, instead of invoking the PlayingCard generator five times. In hindsight, this should have been obvious even to me. But sometimes you just can’t see the forest for the squirrels.

public class Hands extends Generator {
    public Hands() {
        super(Hand.class);
    }
    @Override
    public Hand generate(SourceOfRandomness random, GenerationStatus status) {
        PlayingCard[] playingCards = new PlayingCard[5];
        for (int i = 0 ; i < playingCards.length ; i++) {
            playingCards[i] = gen().constructor(PlayingCard.class, Suit.class, Rank.class).generate(random, status);
        }
        return new Hand(playingCards);
    }
}

And the output starts with…

Hand 1: 
    FIVE of DIAMONDS
    SEVEN of CLUBS
    ACE of CLUBS
    EIGHT of HEARTS
    ACE of CLUBS
Hand 2: 
    NONE of NONE
    ACE of DIAMONDS
    QUEEN of DIAMONDS
    THREE of DIAMONDS
    SEVEN of DIAMONDS
...

Better. But this implementation will only work for poker games that call for hands of 5 cards. Well…YAGNI for now. Just so happy to see something other than an exception.

Meanwhile, the question is: Does the NONE of NONE beat the ACE of CLUBS?

@RunWith(JUnitQuickcheck.class)
public class HandScoringPropertyTest {
    @Property(trials=20)
    public void strongerHandWins(@From(Hands.class) Hand hand1, @From(Hands.class) Hand hand2, Result result) {

        for (Hand hand : new Hand[] { hand1, hand2 }) {
            for (Card card : hand.show()) {
                assumeThat(card.rank(), not(equalTo(Rank.NONE)));
                assumeThat(card.suit(), not(equalTo(Suit.NONE)));
            }
        }

        System.out.println("Hand 1: ");
        for (Card card : hand1.show()) {
            System.out.println("\t" + card.rank() + " of " + card.suit());
        }
        System.out.println("Hand 2: ");
        for (Card card : hand2.show()) {
            System.out.println("\t" + card.rank() + " of " + card.suit());
        }
//        Game game = new FiveCardStudGame();
//        assertEquals(result, hand1.beats(hand2, game));
    }
}

The filtering resulted in:

Hand 1: 
    QUEEN of CLUBS
    TEN of DIAMONDS
    TEN of SPADES
    JACK of SPADES
    SEVEN of HEARTS
Hand 2: 
    JOKER of CLUBS
    NINE of SPADES
    QUEEN of SPADES
    SEVEN of CLUBS
    THREE of CLUBS
Hand 1: 
    THREE of SPADES
    TEN of CLUBS
    QUEEN of HEARTS
    NINE of HEARTS
    NINE of DIAMONDS
Hand 2: 
    FOUR of SPADES
    JOKER of DIAMONDS
    SIX of HEARTS
    EIGHT of HEARTS
    JOKER of HEARTS

That’s only two results out of 20 trials. Most of the generated values are filtered out. There’s also the problem of duplicated values, as shown in this sample output from another run:

Hand 1: 
    FIVE of HEARTS
    FIVE of HEARTS
    THREE of NONE
    THREE of DIAMONDS
    FOUR of NONE
Hand 2: 
    NINE of HEARTS
    TEN of SPADES
    FIVE of SPADES
    NONE of DIAMONDS
    NINE of NONE

The duplicate card problem will be solved by enhancing the poker application to have a domain concept of Deck and a function deal(). The Deck object will, of course, ensure each card occurs exactly once in a deck. The deal() operation will “know” which cards remain in the deck as it deals multiple hands. That design will model the real world more closely than the current, incomplete design. As a result, the property-based test case can be simplified.
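Here is a minimal sketch of what such a Deck might look like. Everything in it is an assumption made for illustration: the DeckSketch wrapper, the simplified Card, Suit and Rank types (without the NONE and JOKER values), and the deal(int) signature are hypothetical stand-ins, not the poker application's real code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: stand-in types, not the poker application's real classes.
class DeckSketch {
    enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }
    enum Rank { TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN, JACK, QUEEN, KING, ACE }

    static final class Card {
        final Rank rank;
        final Suit suit;
        Card(Rank rank, Suit suit) { this.rank = rank; this.suit = suit; }
        @Override public String toString() { return rank + " of " + suit; }
    }

    static final class Deck {
        private final List<Card> cards = new ArrayList<>();

        // Build every rank/suit combination exactly once, then shuffle.
        Deck(Random random) {
            for (Suit suit : Suit.values()) {
                for (Rank rank : Rank.values()) {
                    cards.add(new Card(rank, suit));
                }
            }
            Collections.shuffle(cards, random);
        }

        // Remove handSize cards from the deck, so later deals cannot repeat them.
        List<Card> deal(int handSize) {
            if (handSize < 0 || handSize > cards.size()) {
                throw new IllegalArgumentException("Cannot deal " + handSize + " cards");
            }
            List<Card> top = cards.subList(cards.size() - handSize, cards.size());
            List<Card> hand = new ArrayList<>(top);
            top.clear(); // clearing the sublist view removes those cards from the deck
            return hand;
        }

        int remaining() { return cards.size(); }
    }

    public static void main(String[] args) {
        Deck deck = new Deck(new Random());
        System.out.println("Hand 1: " + deck.deal(5));
        System.out.println("Hand 2: " + deck.deal(5));
        System.out.println("Cards left: " + deck.remaining());
    }
}
```

With something like this in place, a generator could draw both hands from one deck instead of constructing cards independently, which would rule out duplicates and placeholder values before any filtering.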

One other small thing I just noticed: The hands can include Jokers. It isn’t my intent to test that functionality yet. There’s no logic in place in applyScoringRules() to deal with wild cards. I’ll add another filter.

Step 7: Run the Test Case with the Guts Installed

For now, let’s increase the number of trials to ensure we get a meaningful number of examples, and see if the assertion will work. After all, the immediate goal isn’t to “fix” the poker application, but only to get started with JUnit Quickcheck.

By the way, this is the exception you get when you filter out everything:

AssertionError: No values satisfied property assumptions. Violated assumptions: [org.junit.AssumptionViolatedException: got: <NONE>, expected: not <NONE>
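A back-of-the-envelope calculation suggests why so few trials survive the filters. The sketch below rests on assumptions the article does not state: that Rank has 15 values (TWO through ACE, plus JOKER and NONE), that Suit has 5 (four suits plus NONE), and that the generator picks each enum value uniformly and independently. The FilterOdds class and its method are hypothetical helpers for the arithmetic only.

```java
// Hypothetical helper: estimates how many trials pass the assumeThat() filters.
class FilterOdds {

    // Probability that every card in a trial avoids the excluded ranks and suits,
    // assuming each rank and suit is drawn uniformly and independently.
    static double survivalProbability(int rankCount, int excludedRanks,
                                      int suitCount, int excludedSuits,
                                      int cardsPerTrial) {
        double rankOk = (double) (rankCount - excludedRanks) / rankCount;
        double suitOk = (double) (suitCount - excludedSuits) / suitCount;
        return Math.pow(rankOk * suitOk, cardsPerTrial);
    }

    public static void main(String[] args) {
        int cardsPerTrial = 10; // two hands of five cards each

        // Assumed enum sizes: 15 ranks and 5 suits, including the placeholder values.
        double noneFiltered = survivalProbability(15, 1, 5, 1, cardsPerTrial);
        double noneAndJokerFiltered = survivalProbability(15, 2, 5, 1, cardsPerTrial);

        System.out.printf("Filtering NONE only: %.3f%n", noneFiltered);
        System.out.printf("Filtering NONE and JOKER: %.3f%n", noneAndJokerFiltered);
    }
}
```

Under those assumptions, roughly 5% of trials survive the NONE filters, which is in line with seeing only a couple of hands out of 20. Adding the JOKER filter cuts that to about 2.6%, so even 100 trials would yield only a handful of usable examples.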

Revised test case

@RunWith(JUnitQuickcheck.class)
public class HandScoringPropertyTest {
    @Property(trials=100)
    public void strongerHandWins(@From(Hands.class) Hand hand1, @From(Hands.class) Hand hand2, Result result) {
        for (Hand hand : new Hand[] { hand1, hand2 }) {
            for (Card card : hand.show()) {
                assumeThat(card.rank(), not(equalTo(Rank.JOKER)));
                assumeThat(card.rank(), not(equalTo(Rank.NONE)));
                assumeThat(card.suit(), not(equalTo(Suit.NONE)));
            }
        }
        Game game = new FiveCardStudGame();
        assertEquals(result, hand1.beats(hand2, game));
    }
}

The triumph is that we ran the test case with its assertion, and we got a result.

java.lang.AssertionError: Property strongerHandWins falsified.expected:<WIN> but was:<LOSE>
Args: [com.neopragma.poker.Hand@76f2b07d, com.neopragma.poker.Hand@704a52ec, WIN]
Seeds: [851557390624495076, 2259324191015071640, 5899336532141574678]

An obvious problem is that we’re generating random Result values. We should intentionally generate hands with specific values, such as “full house” or “two pair,” and generate the appropriate Result for comparing hands that have those values. Using completely randomly-generated data points isn’t producing meaningful test input.

Another problem is that the output for a failed example doesn’t show us human-readable values for the inputs and results. Let’s deal with that problem first.

Step 8: Improving Output from Failed Examples

Adding a fairly elaborate failure message to the assertEquals() call, our test case looks like this:

@RunWith(JUnitQuickcheck.class)
public class HandScoringPropertyTest {
    @Property(trials=100)
    public void strongerHandWins(@From(Hands.class) Hand hand1, @From(Hands.class) Hand hand2, Result result) {
        for (Hand hand : new Hand[] { hand1, hand2 }) {
            for (Card card : hand.show()) {
                assumeThat(card.rank(), not(equalTo(Rank.JOKER)));
                assumeThat(card.rank(), not(equalTo(Rank.NONE)));
                assumeThat(card.suit(), not(equalTo(Suit.NONE)));
            }
        }
        Game game = new FiveCardStudGame();
        assertEquals(showHands(hand1, hand2), result, hand1.beats(hand2, game));
    }

    private String showHands(Hand...hands) {
        StringBuilder message = new StringBuilder();
        for (int i = 0 ; i < hands.length ; i++) {
            message.append("\nHand " + (i+1) + ":\n");
            for (Card card : hands[i].show()) {
                message.append("\t" + card.rank() + " of " + card.suit() + "\n");
            }
        }
        return message.toString();
    }
}

We can understand the reason for test failures a little more easily now:

Caused by: java.lang.AssertionError: 
Hand 1:
    THREE of SPADES
    SEVEN of CLUBS
    NINE of CLUBS
    THREE of HEARTS
    SEVEN of HEARTS

Hand 2:
    FIVE of HEARTS
    NINE of SPADES
    TWO of SPADES
    NINE of SPADES
    EIGHT of SPADES
 expected:<TIE> but was:<WIN>

Yes, two pair beats a pair. What I’d like to do is raise the level of abstraction for generating data points so that the generator produces hands with intentional values, like “full house” or “straight,” along with the corresponding Result value. We’ll leave that for another day.
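As a rough idea of where that could go, here is a minimal sketch of the first half of the problem: choosing two hand categories at random and deriving the expected Result from the standard poker ranking. The IntentionalHandsSketch class, the Category enum, and the Scenario type are hypothetical names introduced for illustration. The sketch deliberately picks two different categories, because comparing two hands of the same category depends on tie-break rules that are out of scope here, and it does not build the actual cards for each category.

```java
import java.util.Random;

// Hypothetical sketch: these names are illustrative, not the poker application's code.
class IntentionalHandsSketch {
    // TIE is listed to mirror the article's output but is never produced here.
    enum Result { WIN, LOSE, TIE }

    // Standard poker hand categories, weakest first, so enum order is strength order.
    enum Category {
        HIGH_CARD, PAIR, TWO_PAIR, THREE_OF_A_KIND, STRAIGHT,
        FLUSH, FULL_HOUSE, FOUR_OF_A_KIND, STRAIGHT_FLUSH
    }

    // One generated data point: two categories plus the result the test should expect.
    static final class Scenario {
        final Category first;
        final Category second;
        final Result expected;

        Scenario(Category first, Category second) {
            this.first = first;
            this.second = second;
            this.expected = expectedResult(first, second);
        }
    }

    // Expected outcome for the first hand when the two categories differ.
    static Result expectedResult(Category first, Category second) {
        if (first == second) {
            throw new IllegalArgumentException("Same category needs tie-break rules");
        }
        return first.compareTo(second) > 0 ? Result.WIN : Result.LOSE;
    }

    // Pick two distinct categories: the second index is offset from the first by 1..n-1.
    static Scenario nextScenario(Random random) {
        Category[] all = Category.values();
        int i = random.nextInt(all.length);
        int j = (i + 1 + random.nextInt(all.length - 1)) % all.length;
        return new Scenario(all[i], all[j]);
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int n = 0; n < 5; n++) {
            Scenario s = nextScenario(random);
            System.out.println(s.first + " vs " + s.second + " -> expect " + s.expected);
        }
    }
}
```

A real generator would still have to turn each category into five concrete cards, which is where most of the remaining work lies.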

We’ve turned a corner. We’re seeing issues with the tests that will drive improvements in the application. We’re no longer struggling with the tool. So, we’ve achieved our goal for today.

Conclusion

Some lessons I’ve learned from this exercise:

  • PBT can be useful for functionality with certain characteristics, such as numerous combinations of inputs that may lead to unexpected behaviors and that are difficult to cover through example-based testing alone.
  • The cost of setting up PBTs is significantly higher than that of setting up conventional example-based test cases, where “cost” is measured in terms of effort, amount of code, and potential for error in the test cases themselves as opposed to the application under test. (It took me nine hours to complete this learning exercise, and the result is still not entirely satisfactory. It will probably take another nine hours to wrangle it all into shape, including application enhancements. At the end of all that, there will be just one test case.) Therefore, PBT should be used prudently where it adds value, and not applied across the board as a dictated “standard practice.”
  • The algorithmic complexity of the code under test is not the decision factor for choosing to do PBT. Rather, it’s the number of combinations of input values. It would be useful for developers to cultivate skills in combinatorial testing to help them craft the right number of PBT cases covering the right conditions.
  • The shrinking feature of PBT tools is one of the most compelling reasons to learn them, especially if we must explore the functionality of unfamiliar, existing code bases.

If you had any insights from reading this piece that I overlooked, I’d love to hear them.


Comments (2)

  1. Johannes Link

    Building generators in PBT can be hard, but having them can give your testing a real boost. I think that the building blocks of jqwik.net make it much easier than junit-quickcheck does. I might translate your example to jqwik or, if you’re interested, we can do it together.
