Automated Testing Guide
Overview
Our development process needs regular, automatic regression and component tests of
partial or full pipeline recipes, and those tests must be easy to create. This guide
describes how to create new automated tests using the PipelineTester framework. These
tests run on 'main' each evening (Charlottesville time) if a change has been made to either
pipeline or CASA code. Because the tests run within a daily build process, they
should not run for more than a few hours in total.
Test Types
The pipeline testing framework supports two distinct test modes:
Regression Tests
Regression tests validate complete pipeline recipes or PPR (Pipeline Processing Request) executions. These tests ensure that the full end-to-end pipeline processing produces expected results and catches regressions in pipeline behavior.
Located in tests/regression/ directory
Organized into fast/ and slow/ subdirectories
Run complete recipes using either PPR files or recipe XML files
Compare comprehensive output metrics against versioned expected results
Component Tests
Component tests validate individual pipeline tasks or small sequences of tasks in isolation. These tests focus on specific functionality and are typically faster than full regression tests.
Located in tests/component/ directory
Execute specific tasks with controlled inputs
Test edge cases and specific functionality
Can validate task behavior without running the entire pipeline
Example component test structure:
tasks = [
    ('hifa_importdata', {'vis': 'path/to/data.ms'}),
    ('hif_selfcal', {}),
    ('hif_selfcal', {'restore_only': True}),
]
pt = PipelineTester(
    visname=['data.ms'],
    mode='component',
    tasks=tasks,
    output_dir='test_output',
    expectedoutput_dir='pl-componenttest/test_name',
)
pt.run()
Test Organization
The test suite is organized hierarchically:
tests/ - Root test directory
    component/ - Component tests
        component_test.py - Component test definitions
    regression/ - Regression tests
        fast/ - Quick regression tests (typically < 30 minutes)
            alma_if_fast_test.py - ALMA interferometry fast tests
            alma_sd_fast_test.py - ALMA single-dish fast tests
            vla_fast_test.py - VLA fast tests
            vlass_fast_test.py - VLASS fast tests
            nobeyama_sd_fast_test.py - Nobeyama single-dish fast tests
        slow/ - Longer regression tests (typically > 30 minutes)
            alma_if_slow_test.py - ALMA interferometry slow tests
            alma_sd_slow_test.py - ALMA single-dish slow tests
            vla_slow_test.py - VLA slow tests
            vlass_slow_test.py - VLASS slow tests
    testing_utils.py - Core PipelineTester framework
    test_pipeline_testing_framework.py - Unit tests for the framework itself
Pytest Markers
Tests are automatically marked based on their location and can also have manual markers. Key markers include:
regression - Regression tests (auto-applied to tests in regression/)
component - Component tests (auto-applied to tests in component/)
fast - Fast-running tests (auto-applied to tests in fast/ subdirectories)
slow - Slow-running tests (auto-applied to tests in slow/ subdirectories, requires --longtests flag)
seven - 7m array tests
twelve - 12m array tests
importdata - Tests involving data import
selfcal - Tests involving self-calibration
makeimages - Tests involving imaging
To run tests with specific markers:
pytest -m "fast and alma"
pytest -m "regression and not slow"
pytest -m "component and selfcal"
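The location-based marking described above could be implemented with a pytest collection hook. The following is only a minimal sketch, not the framework's actual conftest.py: the helper name markers_for_path and the DIRECTORY_MARKERS mapping are hypothetical and simply mirror the directory names listed in this guide.

```python
from pathlib import PurePosixPath

# Directory name -> marker name, mirroring the layout described in this guide.
# (Assumed mapping for illustration; the real conftest.py may differ.)
DIRECTORY_MARKERS = {
    'regression': 'regression',
    'component': 'component',
    'fast': 'fast',
    'slow': 'slow',
}


def markers_for_path(test_path):
    """Return the marker names implied by the directories in a test file path."""
    parts = PurePosixPath(test_path).parts[:-1]  # drop the file name itself
    return [DIRECTORY_MARKERS[part] for part in parts if part in DIRECTORY_MARKERS]


def pytest_collection_modifyitems(config, items):
    """Standard pytest hook: attach location-based markers to collected tests."""
    for item in items:
        # item.path is a pathlib.Path in pytest 7 and later
        for name in markers_for_path(item.path.as_posix()):
            item.add_marker(name)
```

Under these assumptions, a test in tests/regression/fast/alma_if_fast_test.py would receive the regression and fast markers. A real implementation would probably match only directories below tests/, so that an unrelated parent directory named fast or slow is not picked up.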
Inputs
Writing a Pipeline regression test requires the following inputs to be available in the |testdata_repo| repository. Before adding new data to the repository, please review the |testdata_repo_readme|.
Required inputs:
Input SDM(s) or MS(s)
Expected test output
PPR (optional)
These inputs should be stored in a dataset-specific directory in the pl-regressiontest/
subdirectory of the |testdata_repo| git repo. The files can have any name, but for
clarity, name expected results files using the following convention:
pl-regressiontest/
`-- <dataset basename>
    |-- <dataset basename>.<release version>.results.txt
For example:
pl-regressiontest/
`-- uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small
    |-- PPR.xml
    |-- uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small.casa-6.1.1-15-pipeline-2020.1.0.40.casa.log
    |-- uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small.casa-6.1.1-15-pipeline-2020.1.0.40.results.txt
    `-- uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small.casa-6.1.3-2-pipeline-2021.1.1.6.results.txt
Input SDM/MS
Whenever possible, we should use the smallest, fastest running test datasets we can find. The first place to look is in the |testdata_repo| repository to see if it is already available. The next place to look would be our list of Small verification test datasets.
If the desired test dataset is already stored elsewhere in the repository, there is no need to add another. For example:
pl-unittest/
|-- uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small.ms
To add a new dataset, add a directory and data to pl-regressiontest/ or pl-componenttest/
in the testdata repository, using the instructions in the repository |testdata_repo_readme|.
Expected Output
Expected output should follow the format generated by
infrastructure/renderer/regression.py, with the addition of ::: and an optional
relative tolerance value at the end of the line. For example:
s15.hifa_gfluxscaleflag.uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small.num_rows_flagged.after=268048:::
s16.hifa_gfluxscale.uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small.field_0.spw_0.I=2.2542400978094923:::1e-6
If no tolerance is supplied, the test output will be compared with expected output using a
default tolerance set in the PipelineTester.run() method; currently 1e-7.
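As an illustration of the line format above, the following sketch parses one expected-output line and compares a value against it. It is not the framework's own comparison code: the function names are hypothetical, values are assumed to be numeric, and math.isclose stands in for whatever relative comparison PipelineTester actually performs.

```python
import math

DEFAULT_RTOL = 1e-7  # default relative tolerance quoted in this guide


def parse_expected_line(line):
    """Split 'key=value:::tolerance' into (key, value, tolerance or None)."""
    body, sep, tol_text = line.strip().rpartition(':::')
    if not sep:  # no ':::' marker at all; treat the whole line as 'key=value'
        body, tol_text = line.strip(), ''
    key, _, value_text = body.rpartition('=')
    tolerance = float(tol_text) if tol_text else None
    return key, float(value_text), tolerance


def values_match(actual, expected, tolerance=None):
    """Compare two numbers using a relative tolerance (assumed semantics)."""
    rtol = DEFAULT_RTOL if tolerance is None else tolerance
    return math.isclose(actual, expected, rel_tol=rtol)


line = ('s16.hifa_gfluxscale.uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small'
        '.field_0.spw_0.I=2.2542400978094923:::1e-6')
key, expected, tolerance = parse_expected_line(line)
print(key.split('.')[1], values_match(2.25424010, expected, tolerance))
```

A line that ends in ::: with nothing after it yields a tolerance of None, so the default is used for that value.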
PPR (Optional)
If a PPR is supplied, the test framework will use executeppr for ALMA or
executevlappr for VLA. If no PPR is supplied, the tests will use recipereducer.py
and a given recipe. The test runner will create and clean up the necessary directory
structures to run the test.
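The choice of entry point described above can be summarized as a small decision function. This is a sketch under stated assumptions: select_runner and its telescope argument are hypothetical, and how the framework actually determines the telescope is not covered in this guide.

```python
def select_runner(ppr=None, telescope='alma'):
    """Return the name of the entry point this guide associates with the inputs.

    Hypothetical helper for illustration; it only returns a name and does not
    run anything.
    """
    if ppr is None:
        # No PPR supplied: fall back to recipereducer with a given recipe.
        return 'recipereducer'
    kind = telescope.lower()
    if kind == 'alma':
        return 'executeppr'
    if kind == 'vla':
        return 'executevlappr'
    raise ValueError(f'no PPR runner described for telescope {telescope!r}')


print(select_runner(ppr='PPR.xml', telescope='VLA'))
```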
Adding a New Regression Test
Regression tests use the PipelineTester framework with mode='regression' (the
default). To create a new test, add the required data to the data repository and add a new
function to the appropriate file in tests/regression/fast/ or tests/regression/slow/.
Basic Structure
Example regression test using a PPR:
@pytest.mark.seven
def test_uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small__PPR__regression():
    """Run ALMA cal+image regression on a small test dataset with a PPR file.

    PPR: pl-regressiontest/uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small/PPR.xml
    Dataset: uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small.ms
    """
    ref_directory = 'pl-regressiontest/uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small'
    pt = PipelineTester(
        visname=['uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small.ms'],
        ppr=f'{ref_directory}/PPR.xml',
        input_dir='pl-unittest',
        expectedoutput_dir=ref_directory,
    )
    pt.run()
Example regression test using a recipe:
def test_dataset__recipe_name__regression():
    """Run test with recipe XML file.

    Recipe name: procedure_hifa_image
    Dataset: uid___A002_Xef72bb_X9d29
    """
    ref_directory = 'pl-regressiontest/uid___A002_Xef72bb_X9d29'
    pt = PipelineTester(
        visname=['uid___A002_Xef72bb_X9d29'],
        recipe='procedure_hifa_image.xml',
        input_dir=ref_directory,
        expectedoutput_dir=ref_directory,
    )
    pt.run()
Adding a New Component Test
Component tests use the PipelineTester framework with mode='component' and require
a list of tasks to execute. Add new component tests to tests/component/component_test.py.
Basic Structure
Example component test:
@pytest.mark.importdata
@pytest.mark.selfcal
def test_dataset__task_sequence__component():
    """Run test of specific task sequence.

    Dataset(s): dataset_name.ms
    Task(s): hifa_importdata, hif_selfcal
    """
    data_dir = 'pl-unittest'
    visname = 'dataset_name.ms'
    tasks = [
        ('hifa_importdata', {'vis': casa_tools.utils.resolve(os.path.join(data_dir, visname)),
                             'datacolumns': {'data': 'regcal_contline'}}),
        ('hif_selfcal', {}),
        ('hif_selfcal', {'restore_only': True}),
    ]
    pt = PipelineTester(
        visname=[visname],
        mode='component',
        tasks=tasks,
        output_dir='test_output_dir',
        expectedoutput_dir='pl-componenttest/test_name',
    )
    pt.run()
The tasks parameter is a list of tuples where each tuple contains:
The task name as a string (e.g., 'hifa_importdata')
A dictionary of task parameters (can be empty: {})
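When building a task list by hand, it can help to check its shape before handing it to PipelineTester. The helper below is a hypothetical sketch, not part of the framework; it only enforces the (name, parameters) structure described above and knows nothing about which task names or parameters are valid.

```python
def validate_tasks(tasks):
    """Check that tasks is a list of (task_name, parameter_dict) tuples.

    Hypothetical helper for illustration; returns the list unchanged on success.
    """
    for index, entry in enumerate(tasks):
        if not (isinstance(entry, tuple) and len(entry) == 2):
            raise TypeError(f'tasks[{index}] must be a (name, params) tuple')
        name, params = entry
        if not isinstance(name, str) or not name:
            raise TypeError(f'tasks[{index}] task name must be a non-empty string')
        if not isinstance(params, dict):
            raise TypeError(f'tasks[{index}] parameters must be a dict (may be empty)')
    return tasks


tasks = validate_tasks([
    ('hifa_importdata', {'vis': 'path/to/data.ms'}),
    ('hif_selfcal', {}),
    ('hif_selfcal', {'restore_only': True}),
])
```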
Versioned Results Files
The PipelineTester framework automatically selects the most appropriate expected results
file based on the current CASA and Pipeline versions. This allows tests to maintain multiple
versions of expected results as software evolves.
File Naming Convention
Results files should follow this naming pattern:
<dataset_name>.casa-<CASA_version>-pipeline-<Pipeline_version>.results.txt
Examples:
uid___A002_Xc46ab2_X15ae.casa-6.5.1-15-pipeline-2023.1.0.8.results.txt
uid___A002_Xc46ab2_X15ae.casa-6.6.0-21-pipeline-2024.1.0.12.results.txt
Version Selection Logic
When expectedoutput_dir is specified instead of expectedoutput_file, the framework:
Scans the directory for all *.results.txt files
Parses CASA and Pipeline versions from filenames
Filters out files with versions exceeding the current running versions
Selects the file with versions closest to (but not exceeding) the current versions
This automatic selection ensures tests use the most relevant expected results without manual intervention when CASA or Pipeline versions change.
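The selection steps above can be sketched as follows. This is an illustration rather than the framework's implementation: the function names are hypothetical, versions are assumed to be purely numeric, and "closest" is assumed to mean the highest (CASA, Pipeline) pair, compared with CASA first, among files that do not exceed the current versions. The sketch takes a list of file names; in practice these might come from scanning expectedoutput_dir.

```python
import re

# Matches '<dataset>.casa-<CASA version>-pipeline-<Pipeline version>.results.txt'
RESULTS_PATTERN = re.compile(
    r'\.casa-(?P<casa>[\d.\-]+)-pipeline-(?P<pipeline>[\d.]+)\.results\.txt$'
)


def version_tuple(text):
    """Turn '6.5.1-15' into (6, 5, 1, 15) so versions compare numerically."""
    return tuple(int(part) for part in re.split(r'[.-]', text))


def select_results_file(filenames, current_casa, current_pipeline):
    """Pick the newest results file whose versions do not exceed the current ones."""
    casa_limit = version_tuple(current_casa)
    pipeline_limit = version_tuple(current_pipeline)
    candidates = []
    for name in filenames:
        match = RESULTS_PATTERN.search(name)
        if match is None:
            continue  # not a versioned results file
        casa = version_tuple(match['casa'])
        pipeline = version_tuple(match['pipeline'])
        if casa <= casa_limit and pipeline <= pipeline_limit:
            candidates.append(((casa, pipeline), name))
    if not candidates:
        return None
    return max(candidates)[1]  # assumed ordering: CASA first, then Pipeline


files = [
    'uid___A002_Xc46ab2_X15ae.casa-6.5.1-15-pipeline-2023.1.0.8.results.txt',
    'uid___A002_Xc46ab2_X15ae.casa-6.6.0-21-pipeline-2024.1.0.12.results.txt',
]
print(select_results_file(files, '6.6.1-17', '2024.1.1.22'))
```

If no file qualifies, the sketch returns None; how the real framework handles that case is not described here.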
Adding New Values to Compare
To add new regression values to check, add to or modify a class in infrastructure/renderer/regression.py,
for example the FluxcalflagRegressionExtractor class.
Running Tests Manually
Setup
To run these tests locally, you need to:
Check out the |testdata_repo| repository
Configure CASA to find your test data by modifying ~/.casa/config.py:
datapath = ['/path/to/pipeline-testdata']
Optionally, for larger test datasets, use the --data-directory option to specify an alternate location
Running All Tests
Run all fast tests (skips slow tests by default):
pytest tests/
Run all tests including slow tests:
pytest --longtests tests/
Running Specific Test Categories
Run only component tests:
pytest tests/component/
Run only fast regression tests:
pytest tests/regression/fast/
Run only ALMA interferometry tests:
pytest tests/regression/fast/alma_if_fast_test.py
Running with Pytest Markers
Run only fast tests (auto-marked):
pytest -m fast
Run regression tests but skip slow ones:
pytest -m "regression and not slow"
Run component tests with selfcal:
pytest -m "component and selfcal"
Running Individual Tests
Run a specific test by name:
pytest tests/regression/fast/alma_if_fast_test.py::test_uid___A002_Xc46ab2_X15ae_repSPW_spw16_17_small__PPR__regression
Or using the shorter form:
pytest -k "test_uid___A002_Xc46ab2_X15ae"
Useful Command-Line Options
-v or -vv - Verbose output (use -vv for extra detail)
--longtests - Include slow tests (required for tests marked as slow)
--compare-only - Skip pipeline execution, only compare against existing results
--remove-workdir - Clean up working directories after tests complete
--nologfile - Suppress CASA log file creation (keeps local repo clean)
--junitxml=results.xml - Generate JUnit XML report
-n <num> - Run tests in parallel using pytest-xdist (e.g., -n 4)
Example with multiple options:
pytest -vv --longtests --nologfile --junitxml=results.xml tests/regression/fast/
Compare-Only Mode
If you have already run tests and want to re-evaluate results without re-running the pipeline:
pytest --compare-only tests/regression/fast/
This is useful for:
Tweaking comparison tolerances
Updating expected results files
Debugging test comparison logic
Test Results
Build success and failure are reported in Bamboo and by email notification, but a failure will not prevent a pre-release tarball from being published for download.
A successful test shows a passing status in Bamboo, while a failed test includes details about how the computed values differed from the expected ones.
Note
Originally created by Joseph Masters, last updated by Shawn Booth on Jan 27, 2026.