Integration tests #149
base: main
Conversation
…ndpoint into test-with-k3s
Co-authored-by: Tom Faulhaber <[email protected]>
…nto test-with-k3s
…nto test-with-k3s
…ge-endpoint into integration-tests
…ge-endpoint into integration-tests
…ge-endpoint into integration-tests
LGTM with one spelling fix
test/integration/dog.jpg
That's a good dog! 14/10!
from model import Detector

NUM_IQS_TO_IMPROVE_MODEL = 10
ACCETABLE_TRAINED_CONFIDENCE = 0.8
- ACCETABLE_TRAINED_CONFIDENCE = 0.8
+ ACCEPTABLE_TRAINED_CONFIDENCE = 0.8
And wherever this value is used
Left various comments about small things, but overall this is great! Very excited to have this.
"--mode",
type=str,
choices=["create_detector", "initial", "improve_model", "final"],
help="Mode of operation: 'initial', 'many', or 'final'",
Should this have "create_detector", "initial", "improve_model", and "final" as the options? Or is this saying something different?
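One way to keep the help text from drifting out of sync with the choices is to derive both from the same list. A minimal sketch of that idea (the `MODES` variable name is hypothetical, not from the PR):

```python
import argparse

# Hypothetical sketch: build the help string from the same list as
# `choices`, so the two can never disagree.
MODES = ["create_detector", "initial", "improve_model", "final"]

parser = argparse.ArgumentParser()
parser.add_argument(
    "--mode",
    type=str,
    choices=MODES,
    help=f"Mode of operation: {', '.join(repr(m) for m in MODES)}",
)

args = parser.parse_args(["--mode", "initial"])
print(args.mode)  # -> initial
```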
def create_cat_detector() -> str:
    """Create the intial cat detector that we use for the integration tests. We create
    a new one each time."""
    random_number = random.randint(0, 9999)
Is it worth increasing the range here just to make it even more unlikely that we get a collision? You could also generate a ksuid like we do here to ensure there are no issues.
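The reviewer's suggestion could look something like this sketch: a ksuid-style suffix (timestamp plus random characters) instead of a 4-digit number, making collisions between test runs effectively impossible. Names here are hypothetical; a real ksuid library gives stronger guarantees.

```python
import random
import string
import time


def unique_suffix() -> str:
    """Return a ksuid-style suffix: a hex timestamp prefix plus 8 random
    characters. Sketch only; the real code might use an actual ksuid library."""
    timestamp = format(int(time.time()), "x")
    rand = "".join(random.choices(string.ascii_lowercase + string.digits, k=8))
    return f"{timestamp}{rand}"


# Hypothetical detector-name pattern using the suffix.
name = f"cat-detector-{unique_suffix()}"
```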
# a bit dependent on the current default model,
# but that one always defaults to 0.5 confidence at first.
assert iq_yes.result.confidence == 0.5
assert iq_no.result.confidence == 0.5
At some point we're planning to make the default edge binary pipeline be our normal default binary pipeline, which does make actual zeroshot predictions (which are still close to 0.5, but not exactly 0.5). Maybe this should check that the confidence is in a slightly wider range? I'm worried we won't remember to update this when we change the default edge pipeline.
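The reviewer's range-check suggestion could be sketched like this: accept any near-0.5 "don't know" confidence rather than requiring exactly 0.5, so the assertion survives a switch to a zeroshot default pipeline. The helper name and bounds are illustrative, not from the PR.

```python
def assert_unconfident(confidence: float, low: float = 0.35, high: float = 0.65) -> None:
    """Sketch: assert the prediction is unconfident (near 0.5) without
    pinning it to exactly 0.5. Bounds are hypothetical."""
    assert low <= confidence <= high, (
        f"Expected an unconfident (near-0.5) prediction, got {confidence}"
    )


# Passes for the current exact-0.5 default and for near-0.5 zeroshot outputs.
assert_unconfident(0.5)
assert_unconfident(0.52)
```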
from groundlight import Groundlight, GroundlightClientError
from model import Detector

NUM_IQS_TO_IMPROVE_MODEL = 10
This name is slightly misleading because I think double this amount of IQs actually get submitted? Maybe it should be renamed to something like NUM_IQS_PER_CLASS_TO_IMPROVE_MODEL, though I don't think it's too important either way.
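The point of the rename can be sketched in two lines: with a per-class constant, the total number of submitted IQs is explicit at the call site rather than implicit. The names here are hypothetical.

```python
# Hypothetical sketch: one YES and one NO submission per iteration means
# the total is the per-class count times the number of classes.
NUM_IQS_PER_CLASS_TO_IMPROVE_MODEL = 10
CLASSES = ["YES", "NO"]

total_iqs = NUM_IQS_PER_CLASS_TO_IMPROVE_MODEL * len(CLASSES)
print(total_iqs)  # -> 20
```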
time_difference=$((current_time - pod_creation_time_seconds))

# Check if the pod was created within 1.1 times the refresh rate
Should say 3 times the refresh rate here I think.
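The check being discussed can be sketched in Python to make the "3 times the refresh rate" bound explicit (a sketch; the refresh rate value and function name are hypothetical, and the real check is in bash):

```python
import time

REFRESH_RATE_SECONDS = 60  # hypothetical value; the real one comes from config


def pod_recently_refreshed(pod_creation_time_seconds: int, now=None) -> bool:
    """Return True if the pod was (re)created within 3x the refresh rate,
    matching the reviewer's suggested bound."""
    if now is None:
        now = time.time()
    time_difference = now - pod_creation_time_seconds
    return time_difference <= 3 * REFRESH_RATE_SECONDS


print(pod_recently_refreshed(0, now=100))  # 100s <= 180s -> True
print(pod_recently_refreshed(0, now=200))  # 200s >  180s -> False
```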
These tests expand on the basic integration test we had for k3s. They are a bit unusual in that they weave Python and bash together. We test that we can set up the inference pods, submit to the edge and get a low-confidence answer at first, submit to the edge and escalate to the cloud to train the model, pull the updated model from the cloud back down to the edge into a new inference pod, and then make a confident edge prediction.