Controlling how test data is generated in QuickChe

2019-02-03 04:17发布

问题:

I wrote an algorithm to find a solution to the subset sum problem in Haskell. The signature is

subsetSum :: (Ord a, Num a) => [a] -> a -> Maybe [a]

QuickCheck seems to be a good fit to test that. For example I here is one of the properties that I could check:

prop_sumEqualsS l s = case subsetSum l s of
                        Just solution -> (sum solution) == s
                        Nothing       -> True

The problem is that the algorithm is quite computationally intensive and running 100 tests with big input lists takes ages to run.

I tried with QuickCheck 1 and it did run quickly, but the data sets used for testing were very small. After moving to QuickCheck 2 it seems to be the opposite problem. There is a manual for QC but it seems to be old (there is no date information) and I don't know how much still applies to QC2. A Tutorial is available on the Haskell Wiki but there is not much details, just a few words on instantiating Arbitrary.

So I have two questions:

  • What changes in QuickCheck 2 make it become so much slower than QuickCheck?
  • What is the best way to control data sets creation to make them useful for a given test?

Edit: To be more specific, I would like to test my solution with a list size from 0 to 100, containing integers between -10000 and 10000.

回答1:

One thing that QuickCheck 2 introduced that could be a source of inefficiency is the shrink function. If a property fails, then the shrink function is called which gives candidates on smaller test values, and they are shrunk until they cannot give a smaller value for which the property fails. But this only happens when properties fail.

The changes for the arbitrary instance for lists has not changed much between version 1 and version 2. The difference is in wording, version 1 uses vector, and version 2 uses a list comprehension, but then vector is implemented with exactly such a list comprehension of sequenced calls to arbitrary.

Both implementations used the size parameter. In QuickCheck 1, the size of a generated value is by default div n 2 + 3, where n is the number of the test. QuickCheck 2 uses another approach, where the maximum size and the number of tests is configured. The test sizes will be distributed in that range, growing linearly in the number of tests (see computeSize' in quickCheckWithResult). This means, with the default setting of 100 tests and maximum size, the maximum size from QuickCheck 1 would be (div 100 2 + 3) = 53, and with QuickCheck 2 it would simply be 100.

As subset sum is NP-complete you probably have an exponential algorithm ;) Then the difference in running time between a list of length 50 and 100 can of course be enormous. Understandably you want small lists to test with. You have two choices: make a new data type (preferably with newtype) or make a stand-alone generator and use forAll. By using newtype you can also specify how values should be shrunk. This is an example implementation using the newtype approach:

newtype SmallIntList = SmallIntList [Int] deriving (Eq,Show)

instance Arbitrary SmallIntList where
  arbitrary = sized $ \s -> do
                 n <- choose (0,s `min` 50)
                 xs <- vectorOf n (choose (-10000,10000))
                 return (SmallIntList xs)
  shrink (SmallIntList xs) = map SmallIntList (shrink xs)

This Arbitrary instance does not make lists longer than 50 elements. The elements do not use the size parameter, so they are always in the range [-10000,10000], so there is some room for improvement. The shrink function simply uses the available shrinks for lists and Ints.

To use this instance with your property, just use a SmallIntList as an argument:

prop_sumEqualsS (SmallIntList l) s = case subsetSum l s of
                                         ...