How to run PUF-based tests on PRs? #2826
Replies: 6 comments
-
@MattHJensen I dropped a similar comment in @Peter-Metz's state-taxdata project, but you could read the PUF file from the OSPC AWS bucket like C/S does. To do this, you'd just need to set your AWS access variables as Github secrets. And then you can access them from a Github Actions run. At some point, I read that only users with collaborator-level access to a repository have the ability to read GH secrets. However, I'm having trouble finding that link right now. There is this quote from the Github Actions docs (emphasis is mine):
However, I'd like to test this out before making the PUF available. GitHub will redact secret values if they are printed, too. There's some more info here: https://docs.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets#naming-your-secrets |
Beta Was this translation helpful? Give feedback.
-
Could the synthetic PUF help? |
Beta Was this translation helpful? Give feedback.
-
The challenges of data privacy seem hard to overcome with outputs from tests being printed to the What is the reason behind testing with the PUF exactly? Do these tests pick up (1) errors in the logic? (2) errors related to using variables that don't exist in the PUF? (3) errors in assuming certain values for variables that might differ in the PUF vs CPS? (4) something else? I think if you identify the issues that are behind your motivation for wanting to include the PUF-based tests, you could perhaps write tests that check against such issues without using the PUF. E.g., I don't see very good testing of the internal logic of Tax-Calculator. |
Beta Was this translation helpful? Give feedback.
-
We have had this question in OG-USA both with regard to the CI computing time restraint and with regard to the PUF. And we have an additional question about storing data files used only for testing outside of the GitHub repository.
|
Beta Was this translation helpful? Give feedback.
-
@jdebacker asked why the PUF is used in testing:
The PUF tests help with (1) and (3) in that we keep track of baseline and can see deviations form them, but those purposes are also backstopped with our validation scripts against TAXSIM, the CPS-based tests, and some other unit testing. As you note, additional unit testing could be added as a further backstop, but I don't see that as a panacea because there's nothing that would prevent logic errors introduced in calcfunctions from being duplicated in the unit tests. The TAXSIM validation scripts and comparison of PUF-based results against other modeling outfits provides 3rd party validation, if incomplete and imperfect. We also use PUF for (2) to identify which parameters are "compatible" with the PUF, i.e., which parameters influence PUF-based results. See https://github.com/PSLmodels/Tax-Calculator/blob/3b9f4642ba53ed8e5d01155f543ad5d0e6f2f526/taxcalc/tests/test_compatible_data.py. Mostly we use the PUF for (4), "something else." Specifically, we keep an extensive record of baseline and reform results for Tax-Calculator when used with the puf.csv. These results give users a historical review of the Tax-Calculator + TaxData integrated capabilities and allow users to roughly compare Tax-Calculator baseline and reform against other sources like CBO. We rely on the reform comparisons regularly for development purposes when adding new parameters and reform capabilities. |
Beta Was this translation helpful? Give feedback.
-
Any reactions to this approach?
Downsides (and mitigating factors).
|
Beta Was this translation helpful? Give feedback.
-
Currently, only CPS-based tests (and other tests not requiring external tests data) are run automatically on PRs to GH.
This issue is to encourage the exploration of running PUF-based tests on PRs as well.
Running the full test suite on PRs would give contributors greater ownership of their commits, it would result in fewer problems/bugs founds during the release process, and it would be a significant step towards adopting continuous delivery or continuous deployment (to be discussed further in a forthcoming issue).
Challenges:
(1) is unlikely to present a significant problem, but it is something to be aware of.
For (2), a primary concern is that someone without access to the PUF could write a new test that prints or otherwise discloses secure data and then use the automated tests, triggered by a PR, to release those data.
One approach to resolving (2) would be that the tests only run automatically on PRs from whitelisted developers; for other developers, a reviewer would need to manually trigger the tests. We would need to find a testing service / automation service that offers this. There may be other solutions. All ideas regarding automation services or other solutions welcome!
Beta Was this translation helpful? Give feedback.
All reactions