Thursday, February 8, 2007

randomizing

Randomization has its place in automated testing, but it can also indicate that the tester failed to determine the boundary conditions.

Here's an example. Imagine you're testing a function that takes an integer as input. Any input value seems as good as another. Rather than picking a value, you generate one randomly. You run the test and it passes, so you declare that the function works.
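To make that concrete, here's a minimal sketch of such a test in Python. The clamp function is a made-up stand-in for whatever code is really under test; the only point is that the input is drawn at random instead of chosen.

    import random
    import unittest

    def clamp(value, low=0, high=100):
        """Made-up function under test: restrict value to the range [low, high]."""
        return max(low, min(high, value))

    class ClampTest(unittest.TestCase):
        def test_some_input(self):
            # Any value "seems as good as another," so draw one at random.
            value = random.randint(-1000, 1000)
            result = clamp(value)
            self.assertTrue(0 <= result <= 100)

    if __name__ == "__main__":
        unittest.main()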

Now, why would you use a random number generator rather than just picking a value? If one value is really as good as any other, wouldn't it be simpler to use a constant?

Perhaps the test logs all its inputs, and you want the values to differ so it's easier to distinguish one test from another. Fair enough; use a counter. After all, there is no guarantee that a random number generator will produce unique values, while a counter's values are both unique and predictable.
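Here's what I mean by a counter, again sketched in Python with a trivial stand-in function. Each run gets a distinct, predictable input, which is easy to pick out of the log afterward.

    import itertools

    def double(n):
        """Trivial stand-in for the code under test."""
        return n * 2

    # Successive calls draw 1, 2, 3, ... -- distinct and predictable,
    # which a random number generator does not guarantee.
    _inputs = itertools.count(1)

    def test_double():
        value = next(_inputs)
        print("test_double: input =", value)  # distinct values keep log entries apart
        assert double(value) == value + value

    if __name__ == "__main__":
        for _ in range(3):
            test_double()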

Or perhaps you never got around to figuring out the boundary conditions. You figure using a random number generator increases the odds that you'll find a bug. Well, you might, but now you have an unrepeatable test. One time you run it, it works and you declare the code is ready to ship. The next time you run it, it fails. That's a lousy test.

If you haven't figured out the boundary conditions, and you aren't in a position to try every possible value, you are better off looping over multiple values. It doesn't really matter which values you use since it's just a shot in the dark, but I suggest using a repeatable set. If you really want to use a random number generator, you can write a different program that generates a bunch of random values, and then hard-code those values into your test. But that seems like a lot of unnecessary work.
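For instance, a repeatable test along these lines runs exactly the same way every time. It uses the same made-up clamp function as above, and the specific values are purely illustrative, not real boundary conditions.

    def clamp(value, low=0, high=100):
        """Same made-up function under test as above."""
        return max(low, min(high, value))

    # A repeatable set of inputs: a shot in the dark either way, but the same
    # shot every run. You could generate these once with a separate random
    # script and paste them in, or just pick them by hand.
    INPUTS = [0, 1, -1, 7, 42, 99, 100, 101, -953, 30001]

    def test_clamp():
        for value in INPUTS:
            result = clamp(value)
            assert 0 <= result <= 100, "clamp(%d) returned %d" % (value, result)

    if __name__ == "__main__":
        test_clamp()
        print("all inputs passed")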

If you don't know the boundary conditions, don't lean on randomization. You are better off having a test that's repeatable. Later on, if someone discovers a bug that the test did not catch, you can augment the test.