Testing with Good Data

TL;DR

If you have a lot of known good inputs for your system, use them to create a baseline test suite. This helps catch unexpected changes and regressions when you make updates.

1. Why Use Good Data for Testing?

Testing with bad data (inputs designed to break things) is important, but on its own it doesn’t tell you whether your system still works normally. Good data confirms that core functionality hasn’t been accidentally broken by changes.

Regression Testing: Ensures new code doesn’t ruin existing features.
Baseline Performance: Establishes how quickly things should run when everything is working as expected, so later slowdowns stand out.
Confidence in Updates: Gives you more trust that your changes are safe to deploy.
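
As a rough illustration of the baseline performance idea, the sketch below times a known good input and compares the result against a previously recorded baseline. The process_data stand-in, the helper names and the 50% tolerance are all assumptions made for the example, not part of any real system.

```python
import time

def process_data(text):
    # Stand-in for the system under test (assumption): uppercase the input
    return text.upper()

def measure_seconds(func, arg, repeats=5):
    # Keep the fastest of several runs to reduce noise from other processes
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(arg)
        best = min(best, time.perf_counter() - start)
    return best

def within_baseline(elapsed, baseline, tolerance=0.5):
    # Accept runs up to `tolerance` (50% by default) slower than the baseline
    return elapsed <= baseline * (1 + tolerance)
```

A baseline recorded on one machine may not transfer to another, so a check like this is probably most useful when it always runs in the same environment.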

2. Gathering Your Benign Inputs

You need a collection of inputs that you know will work correctly with your system. Where do these come from?

Existing Data: Use real-world data that has been successfully processed before (remove or anonymize any sensitive information first).
Sample Files: Create a set of representative sample files covering different valid scenarios.
Automated Generation: If possible, write scripts to automatically generate good inputs based on your system’s specifications.
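
Automated generation might look something like the sketch below. It assumes a hypothetical specification in which a valid input is one to five lowercase ASCII words separated by single spaces; a real generator would follow your own system’s rules instead. The function name and parameters are illustrative.

```python
import random
import string

def generate_good_inputs(count, seed=0, min_words=1, max_words=5):
    # A fixed seed makes the generated set reproducible across runs
    rng = random.Random(seed)
    inputs = []
    for _ in range(count):
        n_words = rng.randint(min_words, max_words)
        words = [
            "".join(rng.choice(string.ascii_lowercase)
                    for _ in range(rng.randint(1, 8)))
            for _ in range(n_words)
        ]
        inputs.append(" ".join(words))
    return inputs
```

Seeding the generator matters here: a baseline suite is only useful if the same inputs are produced every time it runs.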

3. Creating Your Baseline Test Suite

Now you’ll turn those inputs into a repeatable test suite.

Choose a Testing Framework: Select a framework appropriate for your language and system (e.g., pytest for Python, JUnit for Java).
Write Test Cases: For each good input, write a test case that loads the input data, runs it through your system, and verifies the expected output.
Automate Execution: Configure your framework to run all tests automatically (e.g., as part of a build process).
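
One possible way to apply these steps to a whole collection of inputs is to store each good input next to a file holding its expected output. The sketch below assumes a hypothetical layout with two directories containing files of the same name, and a stand-in process_data that uppercases the file contents; both are illustrative assumptions.

```python
from pathlib import Path

def process_data(input_file):
    # Stand-in for the system under test (assumption): uppercase the file contents
    return Path(input_file).read_text(encoding="utf-8").upper()

def find_regressions(inputs_dir, expected_dir):
    # Return the names of inputs whose output is missing or no longer matches
    failures = []
    for input_path in sorted(Path(inputs_dir).glob("*.txt")):
        expected_path = Path(expected_dir) / input_path.name
        if not expected_path.exists():
            failures.append(input_path.name)  # no stored expectation for this input
            continue
        expected = expected_path.read_text(encoding="utf-8")
        if process_data(input_path) != expected:
            failures.append(input_path.name)
    return failures
```

In a pytest module, a test can then be as small as asserting that find_regressions('good_inputs', 'expected') returns an empty list. Note that an empty or missing inputs directory also yields an empty list, so it may be worth asserting that at least one input was found.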

4. Example Test Case (Python with pytest)

Let’s say you have a function process_data(input_file) that reads data from a file and returns a result.

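# No pytest import is needed here: pytest discovers test_ functions
# and injects the tmp_path fixture by name.

def process_data(input_file):
    # Your actual processing logic here
    with open(input_file, 'r', encoding='utf-8') as f:
        data = f.read()
    return data.upper()  # Example: convert to uppercase

def test_good_input(tmp_path):
    # tmp_path is a built-in pytest fixture: a per-test temporary directory,
    # so the test leaves no files behind in the working directory
    input_file = tmp_path / 'test_input.txt'
    input_file.write_text('hello world', encoding='utf-8')

    expected_output = 'HELLO WORLD'
    actual_output = process_data(input_file)
    assert actual_output == expected_output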

5. Running and Interpreting Results

Run your test suite regularly.

All Tests Pass: Great! Your system is likely still working as expected.
Tests Fail: Investigate immediately to find the cause of the failure. This could be a bug in new code, or an unexpected change in your environment.
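
If the suite is run from a script rather than by hand, the outcome can be read from pytest’s exit code. The sketch below assumes pytest is installed and that the tests live in a hypothetical tests directory; the exit code meanings are the ones pytest documents, while the helper names are made up for the example.

```python
import subprocess
import sys

# Exit codes documented by pytest
PYTEST_EXIT_MEANINGS = {
    0: "all collected tests passed",
    1: "some tests failed",
    2: "the run was interrupted by the user",
    3: "an internal pytest error occurred",
    4: "pytest was invoked incorrectly (usage error)",
    5: "no tests were collected",
}

def describe_exit_code(code):
    # Translate an exit code into a short human-readable explanation
    return PYTEST_EXIT_MEANINGS.get(code, f"unrecognised exit code {code}")

def run_suite(test_path="tests"):
    # Run pytest in a subprocess using the current Python interpreter
    result = subprocess.run([sys.executable, "-m", "pytest", "-q", test_path])
    return result.returncode, describe_exit_code(result.returncode)
```

Exit code 5 deserves attention: a run that collected no tests has not confirmed anything, so it should not be treated as a pass.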

6. Maintaining Your Test Suite

Good data tests aren’t ‘set it and forget it’.

Add New Tests: As you add features, create new test cases to cover them with good inputs.
Update Existing Tests: If the expected output changes (due to a valid feature update), modify your tests accordingly.
Regular Review: Periodically review your tests to ensure they are still relevant and effective.
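
When expected outputs are stored in files, updating them after a valid feature change can be made an explicit, opt-in step. The sketch below is one way to do that; the UPDATE_EXPECTED environment variable, the helper names and the stand-in process_data are all hypothetical.

```python
import os
from pathlib import Path

def process_data(input_file):
    # Stand-in for the system under test (assumption): uppercase the file contents
    return Path(input_file).read_text(encoding="utf-8").upper()

def update_requested():
    # Hypothetical opt-in switch: set UPDATE_EXPECTED=1 in the environment
    return os.environ.get("UPDATE_EXPECTED") == "1"

def check_or_update(input_path, expected_path, update=False):
    # Compare output with the stored expectation; rewrite it only on request
    actual = process_data(input_path)
    expected_path = Path(expected_path)
    if update:
        expected_path.write_text(actual, encoding="utf-8")
        return True
    if not expected_path.exists():
        return False
    return actual == expected_path.read_text(encoding="utf-8")
```

Regenerated expectations should still be reviewed before they are committed; otherwise a regression could be recorded as the new baseline.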
