Segmentation and depth-alignment of geological core sample image columns via Mask-RCNN

CoreBreakout

Requirements, installation, and contribution guidelines can be found below. Our full usage and API documentation can be found at: corebreakout.readthedocs.io

Overview

corebreakout is a Python package built around matterport/Mask_RCNN for the segmentation and depth-alignment of geological core sample images. It provides utilities and an API to enable the workflow depicted in the figure below, as well as a CoreColumn data structure to manage and manipulate the resulting depth-registered image data:

We are currently using this package to enable research on Lithology Prediction of Slabbed Core Photos Using Machine Learning Models, and are working on getting a DOI for the project through the Journal of Open Source Software.

Getting Started

Target Platform

This package was developed on Linux (Ubuntu, PopOS), and has also been tested on OS X. It may work on other platforms, but we make no guarantees.

Requirements

In addition to Python>=3.6, the packages listed in requirements.txt are required. There is one notable exception to that list:

The TensorFlow requirement is not explicitly listed in requirements.txt due to the ambiguity between tensorflow and tensorflow-gpu in versions <=1.14. The latter is almost certainly required for training new models, although it may be possible to perform inference with saved models on CPU, and use of the CoreColumn data structure does not require a GPU.

Note that TensorFlow GPU capabilities are implemented with CUDA, which requires a supported NVIDIA GPU.

Additional (Optional) Requirements

Optionally, jupyter is required to run demo and test notebooks, and pytest is required to run unit tests. Both of these should be manually installed if you plan to modify or contribute to the package source code.

We also provide a script for extraction of top/base depths from core image text using pytesseract. After installing the Tesseract OCR Engine on your machine, you can install the pytesseract package with conda or pip.
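
As a rough illustration of the idea (not the script shipped with this repository), depth extraction can be split into an OCR step and a parsing step. The function names `parse_depths` and `read_depths` are hypothetical, and the assumption that the first two numbers in the recognized text are the top and base depths is ours; real core image labels may need a stricter pattern.

```python
import re


def parse_depths(text):
    """Return (top, base) from the first two numbers in OCR text, or None.

    Assumption: the first two numbers found are the top and base depths.
    """
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    if len(numbers) < 2:
        return None
    top, base = sorted(numbers[:2])
    return top, base


def read_depths(image_path):
    """OCR an image with pytesseract and parse depths from the result."""
    # Imported lazily so parse_depths works without the OCR stack installed.
    import pytesseract
    from PIL import Image

    text = pytesseract.image_to_string(Image.open(image_path))
    return parse_depths(text)
```

Keeping the parsing separate from the OCR call makes it possible to test the pattern on plain strings before running Tesseract on real images.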

Download code

$ git clone --recurse-submodules https://github.com/rgmyr/corebreakout.git
$ cd corebreakout

Download data (optional)

To make use of the provided dataset and model, or to train a new model starting from the pretrained COCO weights, you will need to download the assets.zip folder from the v0.2 Release.

Unzip and place this folder in the root directory of the repository (its contents will be ignored by git -- see the .gitignore). If you would like to place it elsewhere, you should modify the paths in corebreakout/defaults.py to point to your preferred location.

The current version of assets/data has JSON annotation files which include an imageData field representing the associated images as strings. For now you can delete this field and reduce the size of the data with scripts/prune_imageData.py:

$ python scripts/prune_imageData.py assets/
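
For readers curious what this pruning amounts to, here is a minimal sketch of the same idea. It is not the contents of scripts/prune_imageData.py; the function name `prune_image_data` is hypothetical, and we assume each annotation file is a JSON object with a top-level imageData key.

```python
import json
from pathlib import Path


def prune_image_data(root):
    """Delete the 'imageData' field from every JSON file under root.

    Returns the number of files that were rewritten.
    """
    changed = 0
    for path in Path(root).rglob("*.json"):
        with open(path) as f:
            record = json.load(f)
        # Only rewrite files that actually carry the embedded image string.
        if isinstance(record, dict) and "imageData" in record:
            del record["imageData"]
            with open(path, "w") as f:
                json.dump(record, f, indent=2)
            changed += 1
    return changed
```

Calling `prune_image_data("assets/")` would rewrite the annotation files in place, so it is worth keeping the original assets.zip until you have confirmed the result.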

Installation

We recommend installing corebreakout and its dependencies in an isolated environment, and further recommend the use of conda. See Conda: Managing environments.


To create a new conda environment called corebreakout-env and activate it:

$ conda create -n corebreakout-env python=3.6 tensorflow-gpu=1.14
$ conda activate corebreakout-env

Note: If you want to try a CPU-only installation, then replace tensorflow-gpu with tensorflow. You may also lower the version number if you are on a machine with CUDA<10.0 (required for TensorFlow>=1.13). See TensorFlow GPU requirements for more compatibility details.


Then install the rest of the required packages into the environment:

$ conda install --file requirements.txt

Finally, install mrcnn and corebreakout using pip. Develop mode installation (-e) is recommended (but not required) for corebreakout, since many users will want to change some of the default parameters to suit their own data without having to reinstall afterward:

$ pip install ./Mask_RCNN
$ pip install -e .
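
After these steps, a quick sanity check is to confirm that the expected packages are importable from the active environment. The helper below is a small sketch, not part of corebreakout; the name `check_install` is hypothetical, and it only tests that each package can be located, not that it works (for example, it does not verify GPU support).

```python
from importlib.util import find_spec


def check_install(packages=("tensorflow", "mrcnn", "corebreakout")):
    """Map each top-level package name to True if it can be found."""
    # find_spec locates a top-level package without importing it.
    return {name: find_spec(name) is not None for name in packages}
```

Running `check_install()` inside corebreakout-env should report True for all three names if the installation succeeded; a False entry points to the step that needs repeating.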

Usage

Please refer to our readthedocs page for full documentation!

Development and Community Guidelines

Submit an Issue

  • Navigate to the repository's issue tab
  • Search for existing related issues
  • If necessary, create and submit a new issue

Contributing

Testing

  • Most corebreakout functionality not requiring trained model weights can be verified with pytest:
$ cd <root_directory>
$ pytest .
  • Model usage via the CoreSegmenter class can be verified by running tests/notebooks/test_inference.ipynb (requires saved model weights)
  • Plotting of CoreColumns can be verified by running tests/notebooks/test_plotting.ipynb
