Edge only inference with cloud training (#101)
[[COM-1567](https://positronixcorp.atlassian.net/browse/COM-1567)] This
PR adds an `edge_only_inference` mode which allows the model to escalate
to the cloud for future training, while still always returning the edge
inference answer for fast results.

Tested with 3 binary detectors:
1. `edge_only` enabled -- nothing escalated to cloud
2. `edge_only_inference` enabled -- low confidence IQs escalated to
cloud, edge answer returned
3. Neither `edge_only` nor `edge_only_inference` enabled -- low
confidence edge IQs escalated to cloud and cloud answer returned.
4. `edge_only` and `edge_only_inference` enabled -- pods do not launch,
logs show the config validation error

Currently not adding many unit tests as they aren't set up to test
`post_image_query`, and we want to get this change out quickly
(discussed with Tyler).

This is a first step -- in the future we may want to add:
1. A task queueing system, instead of using FastAPI background tasks
2. Rate limiting for IQs sent to the cloud (either add an element of
randomness for whether a query is sent to the cloud, or limit escalation
to a certain number of queries over a period of time)
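
The second idea could be sketched as a simple probabilistic gate (illustrative only, not part of this PR; the `should_escalate` helper and `escalation_probability` knob are assumptions):

```python
import random

def should_escalate(confidence: float, confidence_threshold: float,
                    escalation_probability: float = 0.1) -> bool:
    """Decide whether a low-confidence image query should go to the cloud.

    Confident answers are never escalated; low-confidence ones are
    escalated only a fraction of the time, capping cloud traffic.
    """
    if confidence >= confidence_threshold:
        return False  # confident enough -- no training signal needed
    return random.random() < escalation_probability
```

A token bucket counting escalations over a time window would be the other option mentioned above.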

---------

Co-authored-by: f-wright <[email protected]>
Co-authored-by: Auto-format Bot <[email protected]>
3 people authored Oct 8, 2024
1 parent af20f53 commit 3755ba4
Showing 6 changed files with 74 additions and 8 deletions.
15 changes: 12 additions & 3 deletions README.md
@@ -50,23 +50,32 @@ print(f"The answer is {image_query.result}")
See the [SDK's getting started guide](https://code.groundlight.ai/python-sdk/docs/getting-started) for more info.

### Experimental: getting only edge model answers
If you only want to receive answers from the edge model for a detector, you can enable edge-only mode for it. To do this, edit the detector's configuration in the [edge config file](./configs/edge-config.yaml) like so:
If you only want to receive answers from the edge model for a detector, you can enable edge-only mode for it. This will prevent the edge endpoint from sending image queries to the cloud API. If you want fast edge answers regardless of confidence but still want the edge model to improve, you can enable edge-only inference for that detector. This mode will always return the edge model's answer, but it will also submit low confidence image queries to the cloud API for training.

To do this, edit the detector's configuration in the [edge config file](./configs/edge-config.yaml) like so:
```yaml
detectors:
- detector_id: 'det_xyz'
motion_detection_template: "disabled"
local_inference_template: "default"
edge_only: true
- detector_id: 'det_ijk'
motion_detection_template: "disabled"
local_inference_template: "default"
edge_only_inference: true
- detector_id: 'det_abc'
motion_detection_template: "default"
local_inference_template: "default"
```
In this example, `det_xyz` will have edge-only mode enabled because `edge_only` is set to `true`. If `edge_only` is not specified, it defaults to false, so `det_abc` will have edge-only mode disabled.
In this example, `det_xyz` will have edge-only mode enabled because `edge_only` is set to `true`. `det_ijk` will have edge-only inference enabled because `edge_only_inference` is set to `true`. If `edge_only` or `edge_only_inference` are not specified, they default to false, so `det_abc` will have edge-only mode disabled. Only one of `edge_only` or `edge_only_inference` can be set to `true` for a detector.

With edge-only mode enabled for a detector, when you make requests to it, you will only receive answers from the edge model (regardless of the confidence). Additionally, note that no image queries submitted this way will show up in the web app or be used to train the model. This option should therefore only be used if you don't need the model to improve and only want fast answers from the edge model.

If edge-only mode is enabled on a detector and the edge inference model for that detector is not available, attempting to send image queries to that detector will return a 500 error response.
With edge-only inference enabled for a detector, when you make requests to it, you will only receive answers from the edge model (regardless of the confidence). However, image queries submitted this way with confidences below the threshold will be escalated to the cloud and used to train the model. This option should be used when you want fast edge answers (regardless of confidence) but still want the model to improve.

If edge-only or edge-only inference mode is enabled on a detector and the edge inference model for that detector is not available, attempting to send image queries to that detector will return a 500 error response.

This feature is currently not fully compatible with motion detection. If motion detection is enabled, some image queries may still be sent to the cloud API.
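
Taken together, the three modes can be summarized with a small sketch (illustrative only; the actual logic lives in `app/api/routes/image_queries.py` and the function below is a simplification):

```python
def answer_source(confidence: float, confidence_threshold: float,
                  edge_only: bool, edge_only_inference: bool) -> str:
    """Return which answer the caller gets and whether the query escalates.

    One of: 'edge', 'edge+escalate' (edge answer returned, query also sent
    to the cloud for training), or 'cloud'.
    """
    if edge_only and edge_only_inference:
        # Mirrors the config validation: the two modes are mutually exclusive.
        raise ValueError("'edge_only' and 'edge_only_inference' cannot both be True")
    confident = confidence >= confidence_threshold
    if edge_only:
        return "edge"  # never contact the cloud, regardless of confidence
    if edge_only_inference:
        return "edge" if confident else "edge+escalate"
    return "edge" if confident else "cloud"
```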

35 changes: 32 additions & 3 deletions app/api/routes/image_queries.py
@@ -3,7 +3,7 @@
from typing import Optional

import numpy as np
from fastapi import APIRouter, Depends, HTTPException, Query, Request, status
from fastapi import APIRouter, BackgroundTasks, Depends, HTTPException, Query, Request, status
from groundlight import Groundlight
from model import (
Detector,
@@ -75,6 +75,7 @@ async def post_image_query( # noqa: PLR0913, PLR0915, PLR0912
want_async: Optional[str] = Query(None),
gl: Groundlight = Depends(get_groundlight_sdk_instance),
app_state: AppState = Depends(get_app_state),
background_tasks: BackgroundTasks = BackgroundTasks(),
):
"""
Submit an image query for a given detector.
@@ -120,11 +121,14 @@ async def post_image_query( # noqa: PLR0913, PLR0915, PLR0912

detector_config = app_state.edge_config.detectors.get(detector_id, None)
edge_only = detector_config.edge_only if detector_config is not None else False
edge_only_inference = detector_config.edge_only_inference if detector_config is not None else False

# TODO: instead of just forwarding want_async calls to the cloud, facilitate partial
# processing of the async request on the edge before escalating to the cloud.
_want_async = want_async is not None and want_async.lower() == "true"
if _want_async and not edge_only: # If edge-only mode is enabled, we don't want to make cloud API calls
if _want_async and not (
edge_only or edge_only_inference
): # If edge-only mode is enabled, we don't want to make cloud API calls
logger.debug(f"Submitting ask_async image query to cloud API server for {detector_id=}")
return safe_call_sdk(
gl.submit_image_query,
@@ -183,11 +187,25 @@ async def post_image_query( # noqa: PLR0913, PLR0915, PLR0912
results = edge_inference_manager.run_inference(detector_id=detector_id, image=image)
confidence = results["confidence"]

if edge_only or _is_confident_enough(confidence=confidence, confidence_threshold=confidence_threshold):
if (
edge_only
or edge_only_inference
or _is_confident_enough(
confidence=confidence,
confidence_threshold=confidence_threshold,
)
):
if edge_only:
logger.debug(
f"Edge-only mode enabled - will not escalate to cloud, regardless of confidence. {detector_id=}"
)
elif edge_only_inference:
logger.debug(
"Edge-only inference mode is enabled on this detector. The edge model's answer will be "
"returned regardless of confidence, but will still be escalated to the cloud if the confidence "
"is not high enough. "
f"{detector_id=}"
)
else:
logger.debug(f"Edge detector confidence is high enough to return. {detector_id=}")

@@ -207,6 +225,13 @@ async def post_image_query( # noqa: PLR0913, PLR0915, PLR0912
text=results["text"],
)
app_state.db_manager.create_iqe_record(iq=image_query)

if edge_only_inference and not _is_confident_enough(
confidence=confidence,
confidence_threshold=confidence_threshold,
):
logger.info("Escalating to the cloud API server for future training due to low confidence.")
background_tasks.add_task(safe_call_sdk, gl.ask_async, detector=detector_id, image=image)
else:
logger.info(
f"Edge-inference is not confident, escalating to cloud. ({confidence} < thresh={confidence_threshold})"
@@ -224,6 +249,10 @@ async def post_image_query( # noqa: PLR0913, PLR0915, PLR0912
# Fail if edge inference is not available and edge-only mode is enabled
if edge_only:
raise RuntimeError("Edge-only mode is enabled on this detector, but edge inference is not available.")
elif edge_only_inference:
raise RuntimeError(
"Edge-only inference mode is enabled on this detector, but edge inference is not available."
)

# Finally, fall back to submitting the image to the cloud
if not image_query:
14 changes: 13 additions & 1 deletion app/core/configs.py
@@ -2,6 +2,7 @@
from typing import Dict, Optional

from pydantic import BaseModel, Field, model_validator
from typing_extensions import Self

logger = logging.getLogger(__name__)

@@ -51,8 +52,19 @@ class DetectorConfig(BaseModel):
local_inference_template: str = Field(..., description="Template for local edge inference.")
motion_detection_template: str = Field(..., description="Template for motion detection.")
edge_only: bool = Field(
False, description="Whether the detector should be in edge-only mode or not. Optional; defaults to False."
default=False,
description="Whether the detector should be in edge-only mode or not. Optional; defaults to False.",
)
edge_only_inference: bool = Field(
default=False,
description="Whether the detector should be in edge-only inference mode or not. Optional; defaults to False.",
)

@model_validator(mode="after")
def validate_edge_modes(self) -> Self:
if self.edge_only and self.edge_only_inference:
raise ValueError("'edge_only' and 'edge_only_inference' cannot both be True")
return self


class RootEdgeConfig(BaseModel):
2 changes: 2 additions & 0 deletions configs/edge-config.yaml
@@ -38,3 +38,5 @@ detectors:
- detector_id: ''
motion_detection_template: "default"
local_inference_template: "default"
edge_only: false
edge_only_inference: false
2 changes: 1 addition & 1 deletion deploy/README.md
@@ -82,7 +82,7 @@ image to ECR see [Pushing/Pulling Images from ECR](#pushingpulling-images-from-e
We currently have a hard-coded docker image in our k3s deployment, which is not ideal.
If you're testing things locally and want to use a different docker image, you can do so
by first creating a docker image locally, pushing it to ECR, retrieving the image ID and
then using that ID in the [edge_deployment](/edge-endpoint/deploy/k3s/edge_deployment.yaml) file.
then using that ID in the [edge_deployment](k3s/edge_deployment/edge_deployment.yaml) file.

Follow these steps:

14 changes: 14 additions & 0 deletions test/core/test_configs.py
@@ -0,0 +1,14 @@
import pytest

from app.core.configs import DetectorConfig


def test_detector_config_both_edge_modes():
with pytest.raises(ValueError):
DetectorConfig(
detector_id="det_xyz",
local_inference_template="default",
motion_detection_template="default",
edge_only=True,
edge_only_inference=True,
)
