[RSDK-9931] Let the config have an optional maximum bounding box (#31)

The intention here is that if the camera provides a corrupted frame and everything suddenly goes gray for a moment, that shouldn't count as motion. This also means that if someone bumps the camera and the entire scene shifts to the left, that also shouldn't count as motion.

Tried on an Orin Nano: seems to work! If I don't set `max_box_size` in the config, the behavior seems unchanged, and if I do set it, large bounding boxes do not appear, while small ones still do. I was unable to get a corrupted image, so I can't confirm that those are no longer counted as motion.

I've also made some other cleanup changes:

- Remove trailing whitespace. Ideally we'd never have any to begin with.
- Don't raise raw `Exception`s, because the only way to catch them is to catch all exceptions. Instead, raise something specific (in these cases, I went with `ValueError`, but could be convinced of other types, too).

My hope is that reviewing commit-by-commit is easy, and makes it clear why each change occurred.

* remove trailing whitespace
* raise ValueError instead of raw Exceptions
* add validation for optional max_box_size in config
* store max_box_size as a field in the class
* flip if statement, and fix likely off-by-one error
* pull out temporary value to a variable
* check the max box size in addition to the min
* get tests to pass
* add new passing tests
* remove print statements: they just print a useless '<CaptureAllResult object at 0xffff99489f30>'
* getMD in the tests is a static method
* refactor shared code to helper function, tests still pass
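
For readers skimming the description, the size check can be sketched on its own, outside the module. This is a minimal illustration under stated assumptions, not the module's code: the helper names `validate_max_box_size`, `keep_box`, and `filter_boxes` are hypothetical, and boxes are plain `(xmin, ymin, xmax, ymax)` tuples rather than detection dicts. The area formula and the comparisons mirror the ones in the diff to `src/motion_detector.py`.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax); hypothetical shape


def validate_max_box_size(max_box_size: Optional[float]) -> None:
    """Reject a non-positive maximum. None means no upper limit was configured."""
    if max_box_size is not None and max_box_size <= 0:
        raise ValueError(
            "Maximum bounding box size, if present, must be a positive integer"
        )


def keep_box(area: float, min_box_size: float,
             max_box_size: Optional[float] = None) -> bool:
    """Return True when the area is within [min_box_size, max_box_size]."""
    if area < min_box_size:
        return False
    if max_box_size is not None and area > max_box_size:
        return False
    return True


def filter_boxes(boxes: List[Box], min_box_size: float,
                 max_box_size: Optional[float] = None) -> List[Box]:
    """Drop boxes whose area is too small or, if a maximum is set, too large."""
    kept = []
    for xmin, ymin, xmax, ymax in boxes:
        area = (ymax - ymin) * (xmax - xmin)
        if keep_box(area, min_box_size, max_box_size):
            kept.append((xmin, ymin, xmax, ymax))
    return kept


# Areas here are 100, 10000, and 307200.
boxes = [(0, 0, 10, 10), (0, 0, 100, 100), (0, 0, 640, 480)]
without_max = filter_boxes(boxes, min_box_size=1000)
with_max = filter_boxes(boxes, min_box_size=1000, max_box_size=50000)
```

With no maximum, only the smallest box is dropped. With a maximum of 50000, the frame-sized box is dropped too, which is the case a bumped camera or a grayed-out frame would presumably produce. Raising `ValueError` also lets a caller catch the bad-config case specifically instead of catching every exception.
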
viam-modules · Feb 13, 2025 · 17a28a5
1 parent 36b20f6
Showing 6 changed files with 96 additions and 56 deletions.
diff --git a/README.md b/README.md
@@ -8,14 +8,14 @@ To transform your camera into a motion detecting camera, configure this vision s
 
 Start by [configuring a camera](https://docs.viam.com/components/camera/webcam/) on your robot. Remember the name you give to the camera, it will be important later.
 
-> [!NOTE]  
+> [!NOTE]
 > Before configuring your camera or vision service, you must [create a robot](https://docs.viam.com/manage/fleet/robots/#add-a-new-robot).
 
 ## Configuration
 
 Navigate to the **Config** tab of your robot’s page in [the Viam app](https://app.viam.com/). Click on the **Services** subtab and click **Create service**. Select the `vision` type, then select the `motion-detector` model. Enter a name for your service and click **Create**.
 
-On the new component panel, copy and paste the following attribute template into your base’s **Attributes** box. 
+On the new component panel, copy and paste the following attribute template into your base’s **Attributes** box.
 ```json
 {
   "cam_name": "myCam",
@@ -26,7 +26,7 @@ On the new component panel, copy and paste the following attribute template into
 
 Edit the attributes as applicable.
 
-> [!NOTE]  
+> [!NOTE]
 > For more information, see [Configure a Robot](https://docs.viam.com/manage/configuration/).
 
 ### Attributes
@@ -70,7 +70,7 @@ The following attributes are available for `viam:vision:motion-detector` vision
 
 ### Usage
 
-This module is made for use with the following methods of the [vision service API](https://docs.viam.com/services/vision/#api): 
+This module is made for use with the following methods of the [vision service API](https://docs.viam.com/services/vision/#api):
 - [`GetClassifications()`](https://docs.viam.com/services/vision/#getclassifications)
 - [`GetClassificationsFromCamera()`](https://docs.viam.com/services/vision/#getclassificationsfromcamera)
 - [`GetDetections()`](https://docs.viam.com/services/vision/#getdetections)
@@ -79,13 +79,13 @@ This module is made for use with the following methods of the [vision service AP
 
 The module behavior differs slightly for classifications and detections.
 
-When returning classifications, the module will always return a single classification with the `class_name` "motion". 
+When returning classifications, the module will always return a single classification with the `class_name` "motion".
 The `confidence` of the classification will be a percentage equal to the percentage of the image that moved (more than a threshold determined by the sensitivity attribute).
 
-When returning detections, the module will return a list of detections with bounding boxes that encapsulate the movement. 
-The `class_name` will be "motion" and the `confidence` will always be 0.5. 
+When returning detections, the module will return a list of detections with bounding boxes that encapsulate the movement.
+The `class_name` will be "motion" and the `confidence` will always be 0.5.
 
-## Visualize 
+## Visualize
 
 Once the `viam:vision:motion-detector` modular service is in use, configure a [transform camera](https://docs.viam.com/components/camera/transform/) to see classifications or detections appear in your robot's field of vision.
 

diff --git a/build.sh b/build.sh
@@ -12,7 +12,7 @@ fi
 python_version=$(python3 -c 'import sys; print(".".join(map(str, sys.version_info[:2])))')
 
 if command -v apt-get; then
-    $SUDO apt-get -y install python3-venv 
+    $SUDO apt-get -y install python3-venv
     if dpkg -l python3-venv; then
         echo "python3-venv is installed, skipping setup"
     else

diff --git a/meta.json b/meta.json
@@ -10,10 +10,10 @@
       }
     ],
     "build": {
-      "setup": "make setup", 
+      "setup": "make setup",
       "build": "make dist/archive.tar.gz",
-      "path": "dist/archive.tar.gz", 
-      "arch": ["linux/amd64", "linux/arm64"] 
+      "path": "dist/archive.tar.gz",
+      "arch": ["linux/amd64", "linux/arm64"]
   },
   "entrypoint": "dist/main"
-  }
+}
diff --git a/src/motion_detector.py b/src/motion_detector.py
@@ -57,13 +57,16 @@ def new_service(
     def validate_config(cls, config: ServiceConfig) -> Sequence[str]:
         source_cam = config.attributes.fields["cam_name"].string_value
         if source_cam == "":
-            raise Exception("Source camera must be provided as 'cam_name'")
+            raise ValueError("Source camera must be provided as 'cam_name'")
         min_boxsize = config.attributes.fields["min_box_size"].number_value
         if min_boxsize < 0:
-            raise Exception("Minimum bounding box size should be a positive integer")
+            raise ValueError("Minimum bounding box size should be a positive integer")
         sensitivity = config.attributes.fields["sensitivity"].number_value
         if sensitivity < 0 or sensitivity > 1:
-            raise Exception("Sensitivity should be a number between 0 and 1")
+            raise ValueError("Sensitivity should be a number between 0 and 1")
+        max_box_size = config.attributes.fields.get("max_box_size")
+        if max_box_size is not None and max_box_size.number_value <= 0:
+            raise ValueError("Maximum bounding box size, if present, must be a positive integer")
         return [source_cam]
 
     # Handles attribute reconfiguration
@@ -76,6 +79,9 @@ def reconfigure(
         if self.sensitivity == 0:
             self.sensitivity = 0.9
         self.min_box_size = config.attributes.fields["min_box_size"].number_value
+        self.max_box_size = config.attributes.fields.get("max_box_size")
+        if self.max_box_size is not None:
+            self.max_box_size = self.max_box_size.number_value
 
     # This will be the main method implemented in this module.
     # Given a camera. Perform frame differencing and return how much of the image is moving
@@ -91,15 +97,15 @@ async def get_classifications(
         # Grab and grayscale 2 images
         input1 = await self.camera.get_image(mime_type=CameraMimeType.JPEG)
         if input1.mime_type not in [CameraMimeType.JPEG, CameraMimeType.PNG]:
-            raise Exception(
+            raise ValueError(
                 "image mime type must be PNG or JPEG, not ", input1.mime_type
             )
         img1 = pil.viam_to_pil_image(input1)
         gray1 = cv2.cvtColor(np.array(img1), cv2.COLOR_BGR2GRAY)
 
         input2 = await self.camera.get_image()
         if input2.mime_type not in [CameraMimeType.JPEG, CameraMimeType.PNG]:
-            raise Exception(
+            raise ValueError(
                 "image mime type must be PNG or JPEG, not ", input2.mime_type
             )
         img2 = pil.viam_to_pil_image(input2)
@@ -117,7 +123,7 @@ async def get_classifications_from_camera(
         **kwargs,
     ) -> List[Classification]:
         if camera_name != self.cam_name:
-            raise Exception(
+            raise ValueError(
                 "Camera name passed to method:",
                 camera_name,
                 "is not the configured 'cam_name'",
@@ -138,15 +144,15 @@ async def get_detections(
         # Grab and grayscale 2 images
         input1 = await self.camera.get_image(mime_type=CameraMimeType.JPEG)
         if input1.mime_type not in [CameraMimeType.JPEG, CameraMimeType.PNG]:
-            raise Exception(
+            raise ValueError(
                 "image mime type must be PNG or JPEG, not ", input1.mime_type
             )
         img1 = pil.viam_to_pil_image(input1)
         gray1 = cv2.cvtColor(np.array(img1), cv2.COLOR_BGR2GRAY)
 
         input2 = await self.camera.get_image()
         if input2.mime_type not in [CameraMimeType.JPEG, CameraMimeType.PNG]:
-            raise Exception(
+            raise ValueError(
                 "image mime type must be PNG or JPEG, not ", input2.mime_type
             )
         img2 = pil.viam_to_pil_image(input2)
@@ -163,7 +169,7 @@ async def get_detections_from_camera(
         **kwargs,
     ) -> List[Detection]:
         if camera_name != self.cam_name:
-            raise Exception(
+            raise ValueError(
                 "Camera name passed to method:",
                 camera_name,
                 "is not the configured 'cam_name':",
@@ -207,7 +213,7 @@ async def capture_all_from_camera(  # pylint: disable=too-many-positional-argume
     ) -> CaptureAllResult:
         result = CaptureAllResult()
         if camera_name not in (self.cam_name, ""):
-            raise Exception(
+            raise ValueError(
                 "Camera name passed to method:",
                 camera_name,
                 "is not the configured 'cam_name':",
@@ -276,17 +282,23 @@ def detections_from_gray_imgs(self, gray1, gray2):
             xs = [pt[0][0] for pt in c]
             ys = [pt[0][1] for pt in c]
             xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)
-            # Add to list of detections if big enough
-            if (ymax - ymin) * (xmax - xmin) > self.min_box_size:
-                detections.append(
-                    {
-                        "confidence": 0.5,
-                        "class_name": "motion",
-                        "x_min": int(xmin),
-                        "y_min": int(ymin),
-                        "x_max": int(xmax),
-                        "y_max": int(ymax),
-                    }
-                )
+
+            # Ignore this detection if it's the wrong size
+            area = (ymax - ymin) * (xmax - xmin)
+            if area < self.min_box_size:
+                continue
+            if self.max_box_size is not None and area > self.max_box_size:
+                continue
+
+            detections.append(
+                {
+                    "confidence": 0.5,
+                    "class_name": "motion",
+                    "x_min": int(xmin),
+                    "y_min": int(ymin),
+                    "x_max": int(xmax),
+                    "y_max": int(ymax),
+                }
+            )
 
         return detections
diff --git a/tests/fakecam.py b/tests/fakecam.py
@@ -18,14 +18,14 @@ def __init__(self, name: str):
         self.images = [img1, img2]
 
     async def get_image(self, mime_type: str = "") -> Coroutine[Any, Any, ViamImage]:
-        self.count +=1 
+        self.count +=1
         return pil.pil_to_viam_image(self.images[self.count%2], CameraMimeType.JPEG)
-    
+
     async def get_images(self) -> Coroutine[Any, Any, Tuple[List[NamedImage] | ResponseMetadata]]:
         raise NotImplementedError
 
     async def get_properties(self) -> Coroutine[Any, Any, GetPropertiesResponse]:
         raise NotImplementedError
-    
+
     async def get_point_cloud(self) -> Coroutine[Any, Any, Tuple[bytes | str]]:
-        raise NotImplementedError
+        raise NotImplementedError
diff --git a/tests/test_motiondetector.py b/tests/test_motiondetector.py
@@ -17,19 +17,33 @@ def make_component_config(dictionary: Mapping[str, Any]) -> ComponentConfig:
         struct.update(dictionary=dictionary)
         return ComponentConfig(attributes=struct)
 
-
 
 
 class TestMotionDetector:
 
-    def getMD(self):
+    @staticmethod
+    def getMD():
         md = MotionDetector("test")
         md.sensitivity = 0.9
         md.min_box_size = 1000
+        md.max_box_size = None
         md.cam_name = "test"
         md.camera = FakeCamera("test")
         return md
-
+
+    @staticmethod
+    async def get_output(md):
+        out = await md.capture_all_from_camera("test",return_image=True,
+                                                return_classifications=True,
+                                                return_detections=True,
+                                                return_object_point_clouds=True)
+        assert isinstance(out, CaptureAllResult)
+        assert out.image is not None
+        assert out.classifications is not None
+        assert len(out.classifications) == 1
+        assert out.classifications[0]["class_name"] == "motion"
+        return out
+
 
     def test_validate(self):
         md = self.getMD()
@@ -52,7 +66,7 @@ def test_classifications(self):
         classifications = md.classification_from_gray_imgs(gray1, gray2)
         assert len(classifications) == 1
         assert classifications[0]["class_name"] == "motion"
-        
+
 
     def test_detections(self):
         img1 = Image.open("tests/img1.jpg")
@@ -74,23 +88,37 @@ async def test_properties(self):
         assert props.detections_supported == True
         assert props.object_point_clouds_supported == False
 
-    
+
     @pytest.mark.asyncio
     async def test_captureall(self):
         md = self.getMD()
-        out = await md.capture_all_from_camera("test",return_image=True, 
-                                                return_classifications=True,
-                                                return_detections=True,
-                                                return_object_point_clouds=True)
-        assert isinstance(out, CaptureAllResult)
-        print(out)
-        assert out.image is not None 
-        assert out.classifications is not None 
-        assert len(out.classifications) == 1
-        assert out.classifications[0]["class_name"] == "motion"
-        assert out.detections is not None 
+        out = await self.get_output(md)
+        assert out.detections is not None
         assert out.detections[0]["class_name"] == "motion"
-        assert out.objects is None 
+        assert out.objects is None
+
 
+    @pytest.mark.asyncio
+    async def test_captureall_not_too_large(self):
+        md = self.getMD()
+        md.max_box_size = 1000000000
+        out = await self.get_output(md)
+        assert out.detections is not None
+        assert out.detections[0]["class_name"] == "motion"
+        assert out.objects is None
 
 
+    @pytest.mark.asyncio
+    async def test_captureall_too_small(self):
+        md = self.getMD()
+        md.min_box_size = 1000000000
+        out = await self.get_output(md)
+        assert out.detections == []
+
+
+    @pytest.mark.asyncio
+    async def test_captureall_too_large(self):
+        md = self.getMD()
+        md.max_box_size = 5
+        out = await self.get_output(md)
+        assert out.detections == []