Merge pull request #220 from mhwasil/melodic
[WIP] Updated perception doc and tutorial
mhwasil authored Oct 16, 2021
2 parents d68296d + 88198da commit a723cfb
Showing 7 changed files with 334 additions and 40 deletions.
2 changes: 0 additions & 2 deletions docs/source/mir_perception/camera.rst
@@ -46,7 +46,6 @@ How to use the RealSense2 camera
.. code-block:: bash

   cd *catkin workspace*/src/mas_perception/mcr_scene_segmentation/ros/config
   gedit scene_segmentation_constraints.yaml
4. Setup Base Frame

@@ -67,7 +66,6 @@ How to use the RealSense2 camera
.. code-block:: bash

   cd *catkin workspace*/src/mas_perception/mcr_scene_segmentation/ros/launch
   gedit scene_segmentation.launch
Change the value of "dataset_collection" from "false" to "true". Change the value of "logdir" from "/temp/"
to the path on your computer where you want to save the files.
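
These edits can also be scripted. A minimal sketch with ``sed``, assuming the two parameters appear on their own lines in the launch file (the target path is illustrative):

.. code-block:: bash

   cd *catkin workspace*/src/mas_perception/mcr_scene_segmentation/ros/launch
   # on the line containing dataset_collection, flip false to true
   sed -i '/dataset_collection/s/false/true/' scene_segmentation.launch
   # on the line containing logdir, point it to your own directory
   sed -i '/logdir/s|/temp/|/home/robocup/dataset|' scene_segmentation.launch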
80 changes: 55 additions & 25 deletions docs/source/mir_perception/dataset.rst
@@ -13,17 +13,17 @@ Dataset collection
3D dataset collection
^^^^^^^^^^^^^^^^^^^^^^

We use a rotating table to collect point cloud data.
Objects are placed on the rotating table so that the camera can capture them from
different angles. Alternatively, this can be done on a normal table by changing
the object orientation manually.

.. note::

   This only works with a single object.

Setup:

1. Using external camera

* Launch the camera

@@ -37,7 +37,7 @@ Setup:

The passthrough filter will not work if the camera is not parallel to the ground.

* Launch multimodal object recognition

.. code-block:: bash
@@ -46,49 +46,79 @@ Setup:
.. note::

   Dataset collection requires the node to run in *debug_mode*. You can also
   point to a specific logdir to save the data, e.g. logdir:=/home/robocup/cloud_dataset.

* Start collecting the dataset

  .. code-block:: bash

     rostopic pub /mir_perception/multimodal_object_recognition/event_in std_msgs/String e_start

2. Using robot arm camera

* Bring up the robot

* Start `multimodal_object_recognition` and continue with the steps described previously.

.. note::

   The segmented point clouds are saved in the `logdir`.
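
Putting the steps together, a typical collection session could look like the sketch below. The launch file name and its arguments are assumptions based on the notes above; adjust them to your setup:

.. code-block:: bash

   # launch in debug mode and point to the directory where clouds are saved
   roslaunch mir_object_recognition multimodal_object_recognition.launch \
       debug_mode:=true logdir:=/home/robocup/cloud_dataset

   # in a second terminal: start collecting
   rostopic pub /mir_perception/multimodal_object_recognition/event_in std_msgs/String e_start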

.. _2d_dataset_collection:

2D dataset collection
^^^^^^^^^^^^^^^^^^^^^^

Images can be collected using the robot camera or an external camera.
They can also be collected using the `easy augment tool <https://github.com/santoshreddy254/easy_augment>`_,
which uses an Intel RealSense D435 camera to capture the images and automatically
annotate them for 2D object detection.
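
One simple way to gather raw images is to record the camera topic with ``rosbag`` and extract the frames offline. The topic name below is an assumption; check ``rostopic list`` on your robot:

.. code-block:: bash

   # record the rgb stream to a bag file for offline extraction and labeling
   rosbag record -O rgb_images.bag /arm_cam3d/rgb/image_raw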


.. _dataset_preprocessing:

Dataset preprocessing
-----------------------

Before training the model, the data should be preprocessed. This includes,
but is not limited to, *removing bad data*, *normalization*, and converting
the data to the required format, such as *h5* for point clouds and *VOC* or *KITTI* for
images.
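
For the *removing bad data* step, a quick sanity check on the raw point clouds can help. A minimal sketch, assuming empty or corrupt captures show up as unusually small files:

.. code-block:: bash

   # list point clouds smaller than 1 KiB (likely empty or corrupt captures)
   find b-it-bots_atwork_dataset -name '*.pcd' -size -1k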

.. _3d_dataset_preprocessing:

3D dataset preprocessing
^^^^^^^^^^^^^^^^^^^^^^^^

An example of the data directory structure:

.. code-block:: bash

   b-it-bots_atwork_dataset
   ├── train
   │   ├── AXIS
   │   │   ├── axis_0001.pcd
   │   │   ├── ...
   │   ├── ...
   ├── test
   │   ├── AXIS
   │   │   ├── axis_0001.pcd
   │   │   ├── ...
   │   ├── ...
The dataset preprocessing can be found in `this notebook
<https://github.com/mhwasil/pointcloud_classification/blob/master/dataset/b-it-bots_dataset_preprocessing.ipynb>`_.

It will generate `pgz` files, each containing a dictionary of objects consisting of `x y z r g b` values and a label.


.. _2d_dataset_preprocessing:

2D dataset preprocessing
^^^^^^^^^^^^^^^^^^^^^^^^^^

* Object detection

  * Create semantic labels using `labelme <https://github.com/wkentaro/labelme>`_.
  * Convert the semantic labels using `labelme2voc <https://github.com/mhwasil/labelme/blob/master/examples/bbox_detection/labelme2voc.py>`_ (a usage sketch follows below).
  * If a KITTI dataset is required, convert the VOC dataset to KITTI using
    `vod-converter <https://github.com/umautobots/vod-converter>`_.
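
As a rough sketch of the conversion step (the folder names and labels file are placeholders; see the `labelme2voc` example for exact usage):

.. code-block:: bash

   # data_annotated/: labelme JSON annotations; labels.txt: one class name per line
   python labelme2voc.py data_annotated data_dataset_voc --labels labels.txt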
7 changes: 6 additions & 1 deletion docs/source/mir_perception/index.rst
@@ -3,7 +3,12 @@
Perception
##########

Robot perception architecture

.. figure:: images/perception_architecture.png
   :align: center

   Robot perception architecture

.. toctree::

102 changes: 96 additions & 6 deletions docs/source/mir_perception/object_recognition.rst
Expand Up @@ -3,14 +3,104 @@
Object recognition
==================

Our object recognition combines two modalities: 3D-based object recognition, and
2D object detection and recognition.

.. _3d_object_recognition_model:

3D object recognition models
----------------------------

`Our 3D object recognition node <https://github.com/b-it-bots/mas_industrial_robotics/blob/melodic/mir_perception/mir_object_recognition/ros/script/pc_object_recognizer_node>`_
uses segmented point clouds described in :ref:`3d_object_segmentation` as the input
to the models. These segmented point clouds are published from
`mir_object_recognition node <https://github.com/b-it-bots/mas_industrial_robotics/blob/melodic/mir_perception/mir_object_recognition/ros/src/multimodal_object_recognition_node.cpp>`_.

The tutorial for training the model is described in :ref:`training`.

We use two models for 3D object recognition, namely:

* Random forest with radial density distribution and 3D modified Fisher vectors
  (3DmFV) as features, as described in `our paper <https://link.springer.com/chapter/10.1007/978-3-030-35699-6_48>`_.
* `Dynamic Graph CNN <https://github.com/WangYueFt/dgcnn>`_: an end-to-end point
  cloud classifier. In addition to the points themselves, we also incorporate colors as inputs.

You can change the classifier in the launch file:

.. literalinclude:: ../../../mir_perception/mir_object_recognition/ros/launch/pc_object_recognition.launch
:language: xml
:lineno-start: 0
:linenos:

Where:

* `model`: whether it is CNN based (`cnn_based`) or a traditional ML estimator (`feature_based`)
* `model_id`: the actual name of the model; available model ids:

  * `cnn_based`: `dgcnn`
  * `feature_based`: `fvrdd`

* `dataset`: the name of the dataset the model was trained on
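
For example, switching to the feature-based classifier could be done by overriding these values on the command line, assuming the launch file exposes them as roslaunch arguments:

.. code-block:: bash

   # model and model_id as described above; adjust to your setup
   roslaunch mir_object_recognition pc_object_recognition.launch \
       model:=feature_based model_id:=fvrdd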

.. _2d_object_recognition_model:

2D object recognition models
----------------------------

We use `squeezeDet <https://github.com/BichenWuUCB/squeezeDet>`_ for our 2D object detection model.
It is a lightweight, one-shot object detection and classification model.
The model can be changed in `rgb_object_recognition.launch`:

.. literalinclude:: ../../../mir_perception/mir_object_recognition/ros/launch/rgb_object_recognition.launch
:language: xml
:lineno-start: 0
:linenos:

Where:

* `classifier`: the model used to detect and classify objects
* `dataset`: the dataset used to train the model
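
As with the 3D models, these values could be overridden at launch time, assuming the launch file exposes them as roslaunch arguments; the values below are illustrative:

.. code-block:: bash

   roslaunch mir_object_recognition rgb_object_recognition.launch \
       classifier:=squeezedet dataset:=atwork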

.. _multimodal_object_recognition:

Multimodal object recognition
-----------------------------

`multimodal_object_recognition_node <https://github.com/b-it-bots/mas_industrial_robotics/blob/melodic/mir_perception/mir_object_recognition/ros/src/multimodal_object_recognition_node.cpp>`_
coordinates the whole perception pipeline as described in the following items:

* Subscribes to rgb and point cloud topics
* Transforms the point cloud to the target frame
* Finds 3D object clusters in the point cloud using `mir_object_segmentation`
* Sends the 3D clusters to the point cloud object recognizer (`pc_object_recognizer_node`)
* Sends the image to the rgb object detection and recognition node (`rgb_object_recognizer_node`)
* Waits until it gets results from both classifiers or the timeout is reached
* Post-processes the recognized objects

  * Applies filters to the objects
  * Sends the object_list to object_list_merger

**Trigger multimodal_object_recognition**

.. code-block:: bash

   rostopic pub /mir_perception/multimodal_object_recognition/event_in std_msgs/String e_start

**Outputs**

.. code-block:: bash

   /mcr_perception/object_detector/object_list
   /mir_perception/multimodal_object_recognition/output/workspace_height
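
To inspect the results, echo the recognized object list after triggering the pipeline:

.. code-block:: bash

   rostopic echo /mcr_perception/object_detector/object_list
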
**Visualization outputs**

.. code-block:: bash

   /mir_perception/multimodal_object_recognition/output/bounding_boxes
   /mir_perception/multimodal_object_recognition/output/debug_cloud_plane
   /mir_perception/multimodal_object_recognition/output/pc_labels
   /mir_perception/multimodal_object_recognition/output/pc_object_pose_array
   /mir_perception/multimodal_object_recognition/output/rgb_labels
   /mir_perception/multimodal_object_recognition/output/rgb_object_pose_array
   /mir_perception/multimodal_object_recognition/output/tabletop_cluster_pc
   /mir_perception/multimodal_object_recognition/output/tabletop_cluster_rgb