Merge pull request #220 from mhwasil/melodic
[WIP] Updated perception doc and tutorial
mhwasil authored Oct 16, 2021
2 parents d68296d + 88198da commit a723cfb
Showing 7 changed files with 334 additions and 40 deletions.
2 changes: 0 additions & 2 deletions docs/source/mir_perception/camera.rst
@@ -46,7 +46,6 @@ How to use the RealSense2 camera
.. code-block:: bash

   cd *catkin workspace*/src/mas_perception/mcr_scene_segmentation/ros/config
   gedit scene_segmentation_constraints.yaml
4. Setup Base Frame

@@ -67,7 +66,6 @@ How to use the RealSense2 camera
.. code-block:: bash

   cd *catkin workspace*/src/mas_perception/mcr_scene_segmentation/ros/launch
   gedit scene_segmentation.launch
Change the value of "dataset_collection" from "false" to "true". Change the value of "logdir" from "/temp/"
to the path on your computer where you want to save the files.
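
These edits can also be scripted. A minimal sketch with ``sed``, assuming the two parameters appear on their own lines in the launch file (the target path is illustrative):

.. code-block:: bash

   cd *catkin workspace*/src/mas_perception/mcr_scene_segmentation/ros/launch
   # on the line containing dataset_collection, flip false to true
   sed -i '/dataset_collection/s/false/true/' scene_segmentation.launch
   # on the line containing logdir, point it to your own directory
   sed -i '/logdir/s|/temp/|/home/robocup/dataset|' scene_segmentation.launch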
80 changes: 55 additions & 25 deletions docs/source/mir_perception/dataset.rst
@@ -13,17 +13,17 @@ Dataset collection
3D dataset collection
^^^^^^^^^^^^^^^^^^^^^^

We use a rotating table to collect point cloud data.
Objects are placed on the rotating table so that the camera can capture them from
different angles. Alternatively, this can be done on a normal table by changing
the object orientation manually.

.. note::

   This only works with a single object.

Setup:

1. Using external camera

* Launch the camera

@@ -37,7 +37,7 @@ Setup:

The passthrough filter will not work if the camera is not parallel to the ground.

* Launch multimodal object recognition

.. code-block:: bash
@@ -46,49 +46,79 @@ Setup:
.. note::

   Dataset collection requires the node to run in *debug_mode*. You can also
   point to a specific logdir to save the data, e.g. logdir:=/home/robocup/cloud_dataset.

* Start collecting the dataset

  .. code-block:: bash

     rostopic pub /mir_perception/multimodal_object_recognition/event_in std_msgs/String e_start

2. Using robot arm camera

* Bring up the robot

* Start `multimodal_object_recognition` and continue with the steps described previously.

.. note::

   The segmented point clouds are saved in the `logdir`.
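
Putting the steps together, a typical collection session could look like the sketch below. The launch file name and its arguments are assumptions based on the notes above; adjust them to your setup:

.. code-block:: bash

   # launch in debug mode and point to the directory where clouds are saved
   roslaunch mir_object_recognition multimodal_object_recognition.launch \
       debug_mode:=true logdir:=/home/robocup/cloud_dataset

   # in a second terminal: start collecting
   rostopic pub /mir_perception/multimodal_object_recognition/event_in std_msgs/String e_start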

.. _2d_dataset_collection:

2D dataset collection
^^^^^^^^^^^^^^^^^^^^^^

Images can be collected using the robot camera or an external camera.
They can also be collected using the `easy augment tool <https://github.com/santoshreddy254/easy_augment>`_,
which uses an Intel RealSense D435 camera to capture the images and automatically
annotate them for 2D object detection.
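
One simple way to gather raw images is to record the camera topic with ``rosbag`` and extract the frames offline. The topic name below is an assumption; check ``rostopic list`` on your robot:

.. code-block:: bash

   # record the rgb stream to a bag file for offline extraction and labeling
   rosbag record -O rgb_images.bag /arm_cam3d/rgb/image_raw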


.. _dataset_preprocessing:

Dataset preprocessing
-----------------------

Before training the model, the data should be preprocessed. This includes,
but is not limited to, *removing bad data*, *normalization*, and converting
the data to the required format, such as *h5* for point clouds and *VOC* or *KITTI* for
images.
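
For the *removing bad data* step, a quick sanity check on the raw point clouds can help. A minimal sketch, assuming empty or corrupt captures show up as unusually small files:

.. code-block:: bash

   # list point clouds smaller than 1 KiB (likely empty or corrupt captures)
   find b-it-bots_atwork_dataset -name '*.pcd' -size -1k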

.. _3d_dataset_preprocessing:

3D dataset preprocessing
^^^^^^^^^^^^^^^^^^^^^^^^

An example of the data directory structure:

.. code-block:: bash

   b-it-bots_atwork_dataset
   ├── train
   │   ├── AXIS
   │   │   ├── axis_0001.pcd
   │   │   ├── ...
   │   ├── ...
   ├── test
   │   ├── AXIS
   │   │   ├── axis_0001.pcd
   │   │   ├── ...
   │   ├── ...
The dataset preprocessing can be found in `this notebook
<https://github.com/mhwasil/pointcloud_classification/blob/master/dataset/b-it-bots_dataset_preprocessing.ipynb>`_.

It will generate `pgz` files, each containing a dictionary of objects consisting of `x y z r g b` values and a label.


.. _2d_dataset_preprocessing:

2D dataset preprocessing
^^^^^^^^^^^^^^^^^^^^^^^^^^

* Object detection

  * Create semantic labels using `labelme <https://github.com/wkentaro/labelme>`_.
  * Convert the semantic labels using `labelme2voc <https://github.com/mhwasil/labelme/blob/master/examples/bbox_detection/labelme2voc.py>`_ (a usage sketch follows below).
  * If a KITTI dataset is required, convert the VOC dataset to KITTI using
    `vod-converter <https://github.com/umautobots/vod-converter>`_.
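
As a rough sketch of the conversion step (the folder names and labels file are placeholders; see the `labelme2voc` example for exact usage):

.. code-block:: bash

   # data_annotated/: labelme JSON annotations; labels.txt: one class name per line
   python labelme2voc.py data_annotated data_dataset_voc --labels labels.txt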
7 changes: 6 additions & 1 deletion docs/source/mir_perception/index.rst
@@ -3,7 +3,12 @@
Perception
##########

Robot perception architecture

.. figure:: images/perception_architecture.png
   :align: center

   Robot perception architecture

.. toctree::

102 changes: 96 additions & 6 deletions docs/source/mir_perception/object_recognition.rst
Expand Up @@ -3,14 +3,104 @@
Object recognition
==================

Our object recognition combines two modalities: 3D-based object recognition, and
2D object detection and recognition.

.. _3d_object_recognition_model:

3D object recognition models
----------------------------

`Our 3D object recognition node <https://github.com/b-it-bots/mas_industrial_robotics/blob/melodic/mir_perception/mir_object_recognition/ros/script/pc_object_recognizer_node>`_
uses segmented point clouds described in :ref:`3d_object_segmentation` as the input
to the models. These segmented point clouds are published from
`mir_object_recognition node <https://github.com/b-it-bots/mas_industrial_robotics/blob/melodic/mir_perception/mir_object_recognition/ros/src/multimodal_object_recognition_node.cpp>`_.

The tutorial for training the model is described in :ref:`training`.

We use two models for 3D object recognition, namely:

* Random forest with radial density distribution and 3D modified Fisher vectors
  (3DmFV) as features, as described in `our paper <https://link.springer.com/chapter/10.1007/978-3-030-35699-6_48>`_.
* `Dynamic Graph CNN <https://github.com/WangYueFt/dgcnn>`_: an end-to-end point
  cloud classifier. In addition to the points themselves, we also incorporate colors as inputs.

You can change the classifier in the launch file:

.. literalinclude:: ../../../mir_perception/mir_object_recognition/ros/launch/pc_object_recognition.launch
:language: xml
:lineno-start: 0
:linenos:

Where:

* `model`: whether it is CNN based (`cnn_based`) or a traditional ML estimator (`feature_based`)
* `model_id`: the actual name of the model; available model ids:

  * `cnn_based`: `dgcnn`
  * `feature_based`: `fvrdd`

* `dataset`: the name of the dataset the model was trained on
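
For example, switching to the feature-based classifier could be done by overriding these values on the command line, assuming the launch file exposes them as roslaunch arguments:

.. code-block:: bash

   # model and model_id as described above; adjust to your setup
   roslaunch mir_object_recognition pc_object_recognition.launch \
       model:=feature_based model_id:=fvrdd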

.. _2d_object_recognition_model:

2D object recognition models
----------------------------

We use `squeezeDet <https://github.com/BichenWuUCB/squeezeDet>`_ for our 2D object detection model.
It is a lightweight, one-shot object detection and classification model.
The model can be changed in `rgb_object_recognition.launch`:

.. literalinclude:: ../../../mir_perception/mir_object_recognition/ros/launch/rgb_object_recognition.launch
:language: xml
:lineno-start: 0
:linenos:

Where:

* `classifier`: the model used to detect and classify objects
* `dataset`: the dataset used to train the model
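
As with the 3D models, these values could be overridden at launch time, assuming the launch file exposes them as roslaunch arguments; the values below are illustrative:

.. code-block:: bash

   roslaunch mir_object_recognition rgb_object_recognition.launch \
       classifier:=squeezedet dataset:=atwork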

.. _multimodal_object_recognition:

Multimodal object recognition
-----------------------------

`multimodal_object_recognition_node <https://github.com/b-it-bots/mas_industrial_robotics/blob/melodic/mir_perception/mir_object_recognition/ros/src/multimodal_object_recognition_node.cpp>`_
coordinates the whole perception pipeline as described in the following items:

* Subscribes to rgb and point cloud topics
* Transforms the point cloud to the target frame
* Finds 3D object clusters in the point cloud using `mir_object_segmentation`
* Sends the 3D clusters to the point cloud object recognizer (`pc_object_recognizer_node`)
* Sends the image to the rgb object detection and recognition node (`rgb_object_recognizer_node`)
* Waits until it gets results from both classifiers or the timeout is reached
* Post-processes the recognized objects

  * Applies filters to the objects
  * Sends the object_list to object_list_merger

**Trigger multimodal_object_recognition**

.. code-block:: bash

   rostopic pub /mir_perception/multimodal_object_recognition/event_in std_msgs/String e_start

**Outputs**

.. code-block:: bash

   /mcr_perception/object_detector/object_list
   /mir_perception/multimodal_object_recognition/output/workspace_height
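
To inspect the results, echo the recognized object list after triggering the pipeline:

.. code-block:: bash

   rostopic echo /mcr_perception/object_detector/object_list
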
**Visualization outputs**

.. code-block:: bash

   /mir_perception/multimodal_object_recognition/output/bounding_boxes
   /mir_perception/multimodal_object_recognition/output/debug_cloud_plane
   /mir_perception/multimodal_object_recognition/output/pc_labels
   /mir_perception/multimodal_object_recognition/output/pc_object_pose_array
   /mir_perception/multimodal_object_recognition/output/rgb_labels
   /mir_perception/multimodal_object_recognition/output/rgb_object_pose_array
   /mir_perception/multimodal_object_recognition/output/tabletop_cluster_pc
   /mir_perception/multimodal_object_recognition/output/tabletop_cluster_rgb