Object Detection

Object Segmentation is a method of categorizing or classifying each pixel of a given image, or a video into a particular classes. For example, this allows the computer to differentiate mouth from an eye and land from the sky.

For Object Segmentation task, we use the pre-built segNet program. The program takes an input (an image, a video or a live camera) and performs the inference using the pretrained networks, then outputs per-pixel classification mask overlay.

There are many different network models trained with various dataset. Using different dataset allows us to recognize and classify different scenarios.

Network	Resolution	CLI Argument
Cityscapes	512x256	fcn-resnet18-cityscapes-512x256
Cityscapes	1024x512	fcn-resnet18-cityscapes-1024x512
Cityscapes	2048x1024	fcn-resnet18-cityscapes-2048x1024
DeepScene	576x320	fcn-resnet18-deepscene-576x320
DeepScene	864x480	fcn-resnet18-deepscene-864x480
Multi-Human	512x320	fcn-resnet18-mhp-512x320
Multi-Human	640x360	fcn-resnet18-mhp-640x360
Pascal VOC	320x320	fcn-resnet18-voc-320x320
Pascal VOC	512x320	fcn-resnet18-voc-512x320
SUN RGB-D	512x400	fcn-resnet18-sun-512x400
SUN RGB-D	640x512	fcn-resnet18-sun-640x512

Launching the Program

The segNet program is a python based program. The program may be ran directly on the Command Line Interface or through our pre-built script ran on the Jupyter Notebook environment.

These are the different parameters that can the adjusted to the users need. (Note) The input and output information must be given.

The network name that will be used for the inference
The visualization method. Whether to have the classified mask by itself or overlay it on the original image, video or camera. The default is set to overlay.
The alpha value. How much blending to be done on the overlay, if the overlay setting is set. Default is 120.
The filter mode. Sets the sampling method as either linear or point. Default value is linear.
The input source (file path if it is an image(s) or a video(s))
The output method (file path if it is an image(s) or a video(s))

./segnet.py --networks=<network name> --visualize=<visual method> --alpha=<alpha value> --filter-mode=<filter value> <input source> <output method>

The visualize, alpha, and filter-mode parameters are optional.

Examples through Jupyter Notebook

The program launching process along with parameter settings are all simplified and set up on the Jupyter Notebook Environment.

(The Jetson Board used for these examples are => Jetson Nano)

Object Segmentation through a Camera

For this example we will use fcn-resnet18-cityscapes-1024x512 model trained with Cityscapes

1. 카메라로 객체 분할.ipynb
Running the cell code

Ctrl + Enter

Import the subprocess module to run the example scripts (i.e. show.sh, kill.sh)

import subprocess

Using the below code, activate a camera window. The segmentation will automatically happen.

# Object Segmentation with Raspberry Pi Camera
detect_command_segment = 'bash ~/ai_example/detect.sh cam_segment'
subprocess.call((detect_command_segment.split('\n')), shell=True)

After testing the detection program terminate the camera window

# terminating the process
kill_command_segment = 'bash ~/ai_example/kill.sh camera'
subprocess.call((kill_command_segment.split('\n')), shell=True)