TL;DR
If you have a lot of known-good inputs for your system, use them to create a baseline test suite. It helps catch unexpected changes and regressions when you make updates.
1. Why Use Good Data for Testing?
Testing with bad data (deliberately trying to break things) is important, but it doesn’t tell you whether your system still works normally. Known-good data confirms that core functionality hasn’t been accidentally broken by your changes.
- Regression Testing: Ensures new code doesn’t break existing features.
- Baseline Performance: Establishes how the system should behave, and how fast it should run, when everything is working correctly.
- Confidence in Updates: Gives you more confidence that your changes are safe to deploy.
2. Gathering Your Benign Inputs
You need a collection of inputs that you know will work correctly with your system. Where do these come from?
- Existing Data: Use real-world data that has been successfully processed before (ensure it doesn’t contain sensitive information!).
- Sample Files: Create a set of representative sample files covering different valid scenarios.
- Automated Generation: If possible, write scripts to automatically generate good inputs based on your system’s specifications.
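For example, here is a minimal sketch of such a generator. It assumes a deliberately simplified “specification” (non-empty lowercase text) and a hypothetical tests/benign_inputs/ directory; substitute your system’s real input rules and layout.

```python
import random
import string
from pathlib import Path


def generate_benign_inputs(output_dir, count=10, seed=42):
    """Write simple, valid text inputs to output_dir.

    Hypothetical generator: "valid" here just means non-empty lowercase
    words, standing in for whatever your system's real spec requires.
    """
    rng = random.Random(seed)  # fixed seed keeps the suite reproducible
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    for i in range(count):
        words = [
            ''.join(rng.choices(string.ascii_lowercase, k=rng.randint(3, 8)))
            for _ in range(rng.randint(5, 20))
        ]
        (output_dir / f'benign_{i:03d}.txt').write_text(' '.join(words))


if __name__ == '__main__':
    generate_benign_inputs('tests/benign_inputs')
```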
3. Creating Your Baseline Test Suite
Now you’ll turn those inputs into a repeatable test suite.
- Choose a Testing Framework: Select a framework appropriate for your language and system (e.g., pytest for Python, JUnit for Java).
- Write Test Cases: For each good input, write a test case that does the following (see the parametrized sketch after this list):
- Loads the input data.
- Runs it through your system.
- Verifies the expected output.
- Automate Execution: Configure your framework to run all tests automatically (e.g., as part of a build process).
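Putting those steps together, one common pattern is to keep each known-good input next to a file holding its expected output and let pytest parametrize over the pairs. The layout below (tests/benign_inputs/*.txt with matching *.expected files) and the myproject.process_data import are assumptions for illustration; process_data itself is the kind of function shown in the next section.

```python
from pathlib import Path

import pytest

from myproject import process_data  # hypothetical module under test

# Each known-good input sits next to a '<name>.expected' file holding
# the output we expect process_data to produce for it.
INPUT_DIR = Path(__file__).parent / 'benign_inputs'
INPUT_FILES = sorted(INPUT_DIR.glob('*.txt'))


@pytest.mark.parametrize('input_file', INPUT_FILES, ids=lambda p: p.name)
def test_benign_input(input_file):
    expected = input_file.with_suffix('.expected').read_text()
    actual = process_data(str(input_file))
    assert actual == expected
```

Because each input becomes its own parametrized test, a single regression shows up as one named failure rather than hiding inside a loop.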
4. Example Test Case (Python with pytest)
Let’s say you have a function process_data(input_file) that reads data from a file and returns a result.
```python
def process_data(input_file):
    # Your actual processing logic goes here.
    with open(input_file, 'r') as f:
        data = f.read()
    return data.upper()  # Example: convert to uppercase


def test_good_input(tmp_path):
    # Create a sample input file in pytest's temporary directory
    # so the test leaves nothing behind.
    input_file = tmp_path / 'test_input.txt'
    input_file.write_text('hello world')

    expected_output = 'HELLO WORLD'
    actual_output = process_data(str(input_file))

    assert actual_output == expected_output
```
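Save this in a file whose name starts with test_ (for example test_process_data.py, a name chosen here for illustration) and run pytest from the project root: by default pytest collects files matching test_*.py and functions whose names start with test, so the test is discovered and executed automatically.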
5. Running and Interpreting Results
Run your test suite regularly, ideally on every commit or at least before each release.
- All Tests Pass: Great! Your system is likely still working as expected.
- Tests Fail: Investigate immediately to find the cause of the failure. This could be a bug in new code, or an unexpected change in your environment.
6. Maintaining Your Test Suite
Good data tests aren’t ‘set it and forget it’.
- Add New Tests: As you add features, create new test cases to cover them with good inputs.
- Update Existing Tests: If the expected output changes legitimately (because of a valid feature update), update your tests, or regenerate the stored expected outputs as sketched below.
- Regular Review: Periodically review your tests to ensure they are still relevant and effective.
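One way to keep legitimate output changes from becoming a chore is to store expected outputs as files and regenerate them with a small, deliberately explicit script. This sketch assumes the same hypothetical tests/benign_inputs layout and process_data function used above; the point is that you review the diff of the regenerated files, not hand-edit every test.

```python
from pathlib import Path

from myproject import process_data  # hypothetical module under test


def regenerate_expected_outputs(input_dir='tests/benign_inputs'):
    """Re-run the system over every known-good input and rewrite the
    stored expected outputs.

    Run this only after a deliberate, reviewed behavior change, then
    review the resulting diff before committing.
    """
    for input_file in sorted(Path(input_dir).glob('*.txt')):
        expected_file = input_file.with_suffix('.expected')
        expected_file.write_text(process_data(str(input_file)))
        print(f'updated {expected_file}')


if __name__ == '__main__':
    regenerate_expected_outputs()
```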