Testing Guide#
Testing is a core part of Earth2Studio development and we have strict requirements for unit testing any new feature. Presently we expect new functionality to have over 90% coverage within reason. The core tool used for testing in Earth2Studio is PyTest and all unit tests can be found inside the test folder. Ensure that developer dependencies are install as outline in the developer overview section.
PyTest#
Earth2Studio uses tox to manage test environments and dependencies. Many unit tests, particularly ones for data sources, have timeout limits and are allowed to fail using the xfail decorator. It is normal for some tests to fail, particularly for machines with slower internet connections. Users should always look at the test summary to understand which tests produced an error but did not fail.
Test Environments#
Earth2Studio organizes tests into separate tox environments based on functionality:
test- Core tests that need no extra dependencies (IO, utilities, auto-models, batch)test-data- Data source and lexicon tests (requiresdataandcbottleextras)test-perturb- Perturbation method tests (requiresperturbationextra)test-stats- Statistical method tests (requiresstatisticsandutilsextras)test-px-models- Prognostic model tests (requiresallextra)test-dx-models- Diagnostic model tests (requiresallextra)test-serve- Serve/API tests (requiresserveextra)
Running Tests with Tox#
The recommended way to run tests is using the Makefile with a specific TOX_ENV:
# Run core tests
make pytest TOX_ENV=test
# Run data source tests
make pytest TOX_ENV=test-data
# Run model tests
make pytest TOX_ENV=test-px-models
make pytest TOX_ENV=test-dx-models
You can also run tox directly:
# Run a specific test environment
uvx tox -c tox.ini run -e test-data
# Run with additional pytest arguments
uvx tox -c tox.ini run -e test-data -- -v
Running Tests Directly with Pytest#
For quick iteration during development, you can run pytest directly:
# Quick test suite (skips slow tests)
pytest test/
# Standard test suite (includes slow tests)
pytest --slow test/
# Run specific test module
pytest test/data/test_gfs.py
Note
When running pytest directly, ensure you have the appropriate dependencies installed for the tests you’re running. The tox environments handle dependency management automatically.
CI Pipeline Structure#
The CI pipeline runs tests in separate jobs for each test environment, which improves debugging and allows parallel execution. Each job only runs when relevant files change:
test- Always runs (core functionality)test-data- Runs whenearth2studio/data,earth2studio/lexicon, or their tests changetest-perturb- Runs when perturbation code or tests changetest-stats- Runs when statistics code or tests changetest-px-models- Runs when prognostic model code or tests changetest-dx-models- Runs when diagnostic model code or tests changetest-serve- Runs when serve/API code or tests change
To force all tests to run regardless of changes, set the CI_PYTEST_ALL environment variable
to 1, true, yes, or on.
CI Model Package Cache#
In the CI pipeline a NFS mounted folder is used as a pre-cached store of all
pre-trained model checkpoint files (automodels) so each pipeline does not need to rerun
the download.
The cache folder is managed internally by the developers.
This should reflect a local cache if a user was to fetch and load all possible models.
The point of truth for generating this cache of model checkpoints should always be
test/models/test_auto_models.py.
This test all listed auto-models and create a ./cache folder with the following
command:
pytest --model-download -s test/models/test_auto_models.py
ls -l ./cache
Coverage#
Coverage is calculated using the coverage.py package configured in the pyproject.toml. Presently the CI pipeline will fail if unit test coverage drops below 90%.
The CI pipeline runs tests using tox environments which automatically collect coverage data. After all test jobs complete, coverage is combined and reported:
# Run tests (coverage is collected automatically by tox)
make pytest TOX_ENV=test-data
# Combine coverage from all test runs
make coverage
The make coverage command combines coverage data from all test environments and generates
a report. The CI pipeline runs this automatically after all test jobs complete.
Contributing Tests#
We expect all contributors to also provide unit tests for their feature. Tests are always evolving based on new errors discovered and features added. Right now the best way to contribute unit tests for a new feature is to modify the tests for a feature of a similar type. For example, for a new data source, copy and modify the tests for say GFS or ARCO. For additional information about the tests, reach out with an issue and the developers can provide more information.
If you are contributing a model, be sure to make the developers aware of this to properly integrate it into the CI pipeline.