[Do not merge] Run Eval Engine Against the existing unit test. #1376

Closed
wants to merge 7 commits into from

@yliuuuu yliuuuu commented Feb 23, 2024

Relevant Issues

  • [Closes/Related To] Issue #XXX

Description

  • This PR sets up a testing pipeline that lets us run the eval engine against the existing unit tests.
  • The purpose is to identify the gaps between the existing engines and the new engine, both bugs and intentional behavior differences.
  • Ergonomically, unit tests are also somewhat easier to work with than conformance tests, since unit tests allow running different test suites...
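
A minimal sketch of the idea in the description: run one shared list of test cases against two interchangeable engines and note where the results diverge. Everything here is hypothetical. `Case`, `legacy_engine`, `eval_engine`, and `run_against` are stand-ins invented for illustration, not the project's API, and the toy engines evaluate Python arithmetic rather than real queries.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical engine type: takes a query string and returns a result.
Engine = Callable[[str], object]


@dataclass(frozen=True)
class Case:
    name: str
    query: str
    expected: object


def legacy_engine(query: str) -> object:
    # Toy stand-in: evaluates simple arithmetic expressions.
    return eval(query, {"__builtins__": {}})


def eval_engine(query: str) -> object:
    # Toy stand-in with a deliberate gap: '%' is not supported.
    if "%" in query:
        raise NotImplementedError("modulo not supported yet")
    return eval(query, {"__builtins__": {}})


def run_against(engine: Engine, cases: List[Case]) -> Dict[str, bool]:
    """Run every case on one engine; a case passes if the result matches."""
    results: Dict[str, bool] = {}
    for case in cases:
        try:
            results[case.name] = engine(case.query) == case.expected
        except Exception:
            # Any engine error counts as a failure for that case.
            results[case.name] = False
    return results


CASES = [
    Case("add", "1 + 2", 3),
    Case("mod", "7 % 4", 3),
]

legacy_results = run_against(legacy_engine, CASES)
eval_results = run_against(eval_engine, CASES)

# Cases the legacy engine passes but the new engine does not.
gaps = sorted(n for n in legacy_results if legacy_results[n] and not eval_results[n])
```

Because the cases are shared, each entry in `gaps` has to be triaged as either a bug in the new engine or an intentional behavior difference.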

Other Information

  • Updated Unreleased Section in CHANGELOG: [YES/NO]

    • < If NO, why? >
  • Any backward-incompatible changes? [YES/NO]

    • < If YES, why? >
    • < For this purpose, we define backward-incompatible changes as changes that—when consumed—can potentially result in
      errors for users that are using our public APIs or the entities that have public visibility in our code-base. >
  • Any new external dependencies? [YES/NO]

    • < If YES, which ones and why? >
    • < In addition, please also mention any other alternatives you've considered and the reason they've been discarded >
  • Do your changes comply with the Contributing Guidelines
    and Code Style Guidelines? [YES/NO]

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

github-actions bot commented Feb 23, 2024

Conformance comparison report-Cross Engine

|  | Base (eval) | legacy | +/- |
| --- | --- | --- | --- |
| % Passing | 77.41% | 92.47% | 15.06% |
| ✅ Passing | 4504 | 5380 | 876 |
| ❌ Failing | 1314 | 438 | -876 |
| 🔶 Ignored | 0 | 0 | 0 |
| Total Tests | 5818 | 5818 | 0 |

Number passing in both: 4335

Number failing in both: 269

Number passing in eval engine but fail in legacy engine: 169

Number failing in eval engine but pass in legacy engine: 1045
⁉️ CONFORMANCE REPORT REGRESSION DETECTED ⁉️
1045 test(s) were failing in eval but now pass in legacy. Before merging, confirm they are intended to pass.
The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.
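
The four "Number ..." lines in the cross-engine report partition the test suite, so they can be cross-checked against the Passing and Total rows. The sketch below shows one way such buckets could be computed and then reconciles the reported figures. `cross_compare` and the `t1`..`t4` demo data are assumptions made for illustration, not the CI job's actual code.

```python
from typing import Dict


def cross_compare(base: Dict[str, bool], other: Dict[str, bool]) -> Dict[str, int]:
    """Bucket tests by outcome across two runs that cover the same test names."""
    if base.keys() != other.keys():
        raise ValueError("both runs must cover the same set of tests")
    counts = {"pass_both": 0, "fail_both": 0, "base_only": 0, "other_only": 0}
    for name, base_ok in base.items():
        other_ok = other[name]
        if base_ok and other_ok:
            counts["pass_both"] += 1
        elif base_ok:
            counts["base_only"] += 1
        elif other_ok:
            counts["other_only"] += 1
        else:
            counts["fail_both"] += 1
    return counts


# Tiny made-up run to show the shape of the output.
demo = cross_compare(
    {"t1": True, "t2": True, "t3": False, "t4": False},
    {"t1": True, "t2": False, "t3": True, "t4": False},
)

# Bucket counts as reported above (base = eval, other = legacy).
reported = {"pass_both": 4335, "fail_both": 269, "base_only": 169, "other_only": 1045}
eval_passing = reported["pass_both"] + reported["base_only"]
legacy_passing = reported["pass_both"] + reported["other_only"]
total_tests = sum(reported.values())
```

With the reported buckets, the sums come to 4504, 5380, and 5818, matching the Passing and Total Tests figures in the report.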

Conformance comparison report-Cross Commit-EVAL

|  | Base (f1aeb6f) | 26bf3da | +/- |
| --- | --- | --- | --- |
| % Passing | 59.42% | 77.41% | 18.00% |
| ✅ Passing | 3457 | 4504 | 1047 |
| ❌ Failing | 2361 | 1314 | -1047 |
| 🔶 Ignored | 0 | 0 | 0 |
| Total Tests | 5818 | 5818 | 0 |

Number passing in both: 3455

Number failing in both: 1312

Number passing in Base (f1aeb6f) but now fail: 2

Number failing in Base (f1aeb6f) but now pass: 1049
⁉️ CONFORMANCE REPORT REGRESSION DETECTED ⁉️. The following test(s) were previously passing but now fail:

  • PG_JOIN_04, compileOption: PERMISSIVE
  • PG_JOIN_04, compileOption: LEGACY
1049 test(s) were previously failing but now pass. Before merging, confirm they are intended to pass. The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.
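
The cross-commit report flags a regression even though the pass rate rose, because it tracks individual tests by name rather than totals. A rough sketch of that kind of check follows. `diff_runs` is a hypothetical helper, and the `HYPOTHETICAL_*` entries are invented placeholders. Only the two `PG_JOIN_04` names come from the report.

```python
from typing import Dict, List, Tuple


def diff_runs(base: Dict[str, bool], head: Dict[str, bool]) -> Tuple[List[str], List[str]]:
    """Return (regressions, fixes) by test name between two commits."""
    regressions = sorted(n for n, ok in base.items() if ok and not head.get(n, False))
    fixes = sorted(n for n, ok in base.items() if not ok and head.get(n, False))
    return regressions, fixes


# Illustrative data only; outcomes for the HYPOTHETICAL_* tests are made up.
base = {
    "PG_JOIN_04, compileOption: PERMISSIVE": True,
    "PG_JOIN_04, compileOption: LEGACY": True,
    "HYPOTHETICAL_A": False,
    "HYPOTHETICAL_B": True,
}
head = {
    "PG_JOIN_04, compileOption: PERMISSIVE": False,
    "PG_JOIN_04, compileOption: LEGACY": False,
    "HYPOTHETICAL_A": True,
    "HYPOTHETICAL_B": True,
}
regressions, fixes = diff_runs(base, head)

# Net change implied by the reported counts: newly passing minus newly failing.
net_change = 1049 - 2
```

The reported counts are consistent with each other: 1049 newly passing minus 2 newly failing gives 1047, the same as the change in the Passing row (4504 - 3457).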

Conformance comparison report-Cross Commit-LEGACY

|  | Base (f1aeb6f) | 26bf3da | +/- |
| --- | --- | --- | --- |
| % Passing | 92.47% | 92.47% | 0.00% |
| ✅ Passing | 5380 | 5380 | 0 |
| ❌ Failing | 438 | 438 | 0 |
| 🔶 Ignored | 0 | 0 | 0 |
| Total Tests | 5818 | 5818 | 0 |

Number passing in both: 5380

Number failing in both: 438

Number passing in Base (f1aeb6f) but now fail: 0

Number failing in Base (f1aeb6f) but now pass: 0

yliuuuu commented May 21, 2024

Before the 1.0 release, we will port the existing unit tests to the new eval engine.

@yliuuuu yliuuuu closed this May 21, 2024