Deep Orientation: Fast and Robust Upper Body Orientation Estimation for Mobile Robotic Applications
TensorFlow implementation (Python 3.6) of the IROS 2019 paper cited below.
Clone repository:
git clone https://github.com/tui-nicr/deep-orientation.git
Set up environment and install dependencies:
# create Python 3.6 environment
conda create --name env_deep_orientation python=3.6
conda activate env_deep_orientation
# install dependencies
# GPU version:
pip install -r /path/to/this/repository/requirements_gpu.txt [--user]
# CPU version:
pip install -r /path/to/this/repository/requirements_cpu.txt [--user]
# optional dependencies for plotting models, see src/plot_models.py
conda install graphviz pydot
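To verify the setup before continuing, a quick sanity check (a minimal sketch; tf.test.is_gpu_available matches the TF 1.x-era API that fits a Python 3.6 stack):

# sanity check for the fresh environment; for the GPU version the last
# line should report True
import tensorflow as tf

print(tf.__version__)
print(tf.test.is_gpu_available())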
For network training: download the NICR RGB-D Orientation Data Set and install the accompanying dataset package.
Change directory to src
cd /path/to/this/repository/src
Apply the best-performing modified Beyer architecture (depth input, biternion output, mean absolute error of 5.28° on the test set):
# single prediction (without dropout sampling)
python inference.py \
beyer_mod_relu \
../trained_networks/beyer_mod_relu__depth__126x48__biternion__0_030000__1/weights_valid_0268.hdf5 \
../nicr_rgb_d_orientation_data_set_examples/small_patches \
--input_type depth \
--input_preprocessing standardize \
--input_height 126 \
--input_width 48 \
--output_type biternion \
--n_samples 1 \
[--cpu]
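For reference, the biternion output is a unit (cos, sin) pair; a minimal sketch of the standard conversion back to an angle in degrees (the helper name is ours, not part of the repository):

import numpy as np

def biternion_to_degrees(cos_sin):
    # map a biternion (cos, sin) output to an orientation in [0, 360)
    cos, sin = cos_sin
    return np.degrees(np.arctan2(sin, cos)) % 360.0

print(biternion_to_degrees((0.0, 1.0)))  # -> 90.0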
# multiple predictions to estimate uncertainty (with dropout sampling)
python inference.py \
beyer_mod_relu \
../trained_networks/beyer_mod_relu__depth__126x48__biternion__0_030000__1/weights_valid_0268.hdf5 \
../nicr_rgb_d_orientation_data_set_examples/small_patches \
--input_type depth \
--input_preprocessing standardize \
--input_height 126 \
--input_width 48 \
--output_type biternion \
--n_samples 25 \
[--cpu]
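With --n_samples 25, the network runs 25 times with dropout active, and the spread of the sampled biternions indicates uncertainty. A minimal aggregation sketch (`samples` is a random placeholder, not actual network output):

import numpy as np

samples = np.random.randn(25, 2)  # placeholder for 25 biternion samples
samples /= np.linalg.norm(samples, axis=1, keepdims=True)

# circular mean via the mean resultant vector
mean_vec = samples.mean(axis=0)
mean_angle = np.degrees(np.arctan2(mean_vec[1], mean_vec[0])) % 360.0

# mean resultant length R in [0, 1]: near 1 -> samples agree (confident),
# near 0 -> samples scatter (uncertain)
R = np.linalg.norm(mean_vec)
print(f"estimate: {mean_angle:.1f} deg, dispersion (1 - R): {1.0 - R:.3f}")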
Apply the best-performing MobileNet v2 architecture (depth input, biternion output, mean absolute error of 5.17° on the test set):
# single prediction (without dropout sampling)
python inference.py \
mobilenet_v2 \
../trained_networks/mobilenet_v2_1_00__depth__96x96__biternion__0_001000__2/weights_valid_0268.hdf5 \
../nicr_rgb_d_orientation_data_set_examples/small_patches \
--input_type depth \
--input_preprocessing scale01 \
--input_height 96 \
--input_width 96 \
--output_type biternion \
--mobilenet_v2_alpha 1.00 \
--n_samples 1 \
[--cpu]
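Note that the preprocessing differs between architectures: the modified Beyer network uses standardize, MobileNet v2 uses scale01. A minimal sketch of what these per-patch modes are assumed to compute (not copied from the repository):

import numpy as np

def standardize(patch):
    # zero mean, unit variance per patch
    return (patch - patch.mean()) / max(patch.std(), 1e-7)

def scale01(patch):
    # min-max scaling to [0, 1] per patch
    return (patch - patch.min()) / max(patch.max() - patch.min(), 1e-7)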
Apply the best-performing MobileNet v2 architecture with RGB input (biternion output, mean absolute error of 7.98° on the test set):
# single prediction (without dropout sampling)
python inference.py \
mobilenet_v2 \
../trained_networks/mobilenet_v2_1_00__rgb__96x96__biternion__0_001000__0/weights_valid_0134.hdf5 \
../nicr_rgb_d_orientation_data_set_examples/small_patches \
--input_type rgb \
--input_preprocessing scale01 \
--input_height 96 \
--input_width 96 \
--output_type biternion \
--mobilenet_v2_alpha 1.00 \
--n_samples 1 \
[--cpu]
Change directory to src
cd /path/to/this/repository/src
Extract patches
python extract_patches.py --dataset_basepath /path/to/nicr_rgb_d_orientation_data_set
Train (GPU only) multiple neural networks with the same hyperparameters but different run IDs (a convenience launcher sketch follows the three commands):
# best performing configuration and hyperparameters from paper
python train.py \
beyer_mod_relu \
--dataset_basepath /path/to/nicr_rgb_d_orientation_data_set \
--output_basepath /path/where/to/store/training/output/files \
--input_type depth \
--input_preprocessing standardize \
--input_height 126 \
--input_width 48 \
--output_type biternion \
--learning_rate 0.03 \
--run_id 0
python train.py \
beyer_mod_relu \
--dataset_basepath /path/to/nicr_rgb_d_orientation_data_set \
--output_basepath /path/where/to/store/training/output/files \
--input_type depth \
--input_preprocessing standardize \
--input_height 126 \
--input_width 48 \
--output_type biternion \
--learning_rate 0.03 \
--run_id 1
python train.py \
beyer_mod_relu \
--dataset_basepath /path/to/nicr_rgb_d_orientation_data_set \
--output_basepath /path/where/to/store/training/output/files \
--input_type depth \
--input_preprocessing standardize \
--input_height 126 \
--input_width 48 \
--output_type biternion \
--learning_rate 0.03 \
--run_id 2
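Since the three commands differ only in --run_id, they can also be launched sequentially from Python (a convenience sketch using the same placeholder paths as above):

import subprocess

for run_id in range(3):
    subprocess.run(
        ["python", "train.py", "beyer_mod_relu",
         "--dataset_basepath", "/path/to/nicr_rgb_d_orientation_data_set",
         "--output_basepath", "/path/where/to/store/training/output/files",
         "--input_type", "depth",
         "--input_preprocessing", "standardize",
         "--input_height", "126",
         "--input_width", "48",
         "--output_type", "biternion",
         "--learning_rate", "0.03",
         "--run_id", str(run_id)],
        check=True)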
For further details and parameters, see:
python train.py --help
usage: train.py [-h] [-o OUTPUT_BASEPATH] [-db DATASET_BASEPATH]
[-ds {small,large}] [-ts TRAINING_SET]
[-vs VALIDATION_SETS [VALIDATION_SETS ...]]
[-it {depth,rgb,depth_and_rgb}] [-iw INPUT_WIDTH]
[-ih INPUT_HEIGHT] [-ip {standardize,scale01,none}]
[-ot {regression,classification,biternion}]
[-lr LEARNING_RATE] [-lrd {poly}] [-m MOMENTUM] [-ne N_EPOCHS]
[-es EARLY_STOPPING] [-b BATCH_SIZE]
[-vb VALIDATION_BATCH_SIZE] [-nc N_CLASSES] [-k KAPPA]
[-opt {sgd,adam,rmsprop}] [-naug] [-rid RUN_ID]
[-ma {0.35,0.5,0.75,1.0}] [-d DEVICES] [-v]
{beyer,beyer_mod,beyer_mod_relu,mobilenet_v2,beyer_mod_relu_sep}
Train neural network for orientation estimation
positional arguments:
{beyer,beyer_mod,beyer_mod_relu,mobilenet_v2,beyer_mod_relu_sep}
Model to use. One of ('beyer', 'beyer_mod', 'beyer_mod_relu', 'mobilenet_v2', 'beyer_mod_relu_sep')
optional arguments:
-h, --help show this help message and exit
-o OUTPUT_BASEPATH, --output_basepath OUTPUT_BASEPATH
Path where to store output files, default: '/results/rotator'
-db DATASET_BASEPATH, --dataset_basepath DATASET_BASEPATH
Path to downloaded dataset (default: '/datasets/rotator')
-ds {small,large}, --dataset_size {small,large}
Dataset image size to use. One of ('small', 'large'), default: small
-ts TRAINING_SET, --training_set TRAINING_SET
Set to use for training, default: training
-vs VALIDATION_SETS [VALIDATION_SETS ...], --validation_sets VALIDATION_SETS [VALIDATION_SETS ...]
Sets to use for validation, default: [validation, test]
-it {depth,rgb,depth_and_rgb}, --input_type {depth,rgb,depth_and_rgb}
Input type. One of ('depth', 'rgb', 'depth_and_rgb'), default: depth
-iw INPUT_WIDTH, --input_width INPUT_WIDTH
Patch width to use, default: 96
-ih INPUT_HEIGHT, --input_height INPUT_HEIGHT
Patch height to use, default: 96
-ip {standardize,scale01,none}, --input_preprocessing {standardize,scale01,none}
Preprocessing to apply. One of ('standardize', 'scale01', 'none'), default: standardize
-ot {regression,classification,biternion}, --output_type {regression,classification,biternion}
Output type. One of ('regression', 'classification', 'biternion'), default: biternion
-lr LEARNING_RATE, --learning_rate LEARNING_RATE
(Base) learning rate, default: 0.01
-lrd {poly}, --learning_rate_decay {poly}
Learning rate decay to use, default: poly
-m MOMENTUM, --momentum MOMENTUM
Momentum to use, default: 0.9
-ne N_EPOCHS, --n_epochs N_EPOCHS
Number of epochs to train, default: 800
-es EARLY_STOPPING, --early_stopping EARLY_STOPPING
Number of epochs with no improvement after which training will be stopped, default: 100. To disable early stopping, use -1.
-b BATCH_SIZE, --batch_size BATCH_SIZE
Batch size to use, default: 128
-vb VALIDATION_BATCH_SIZE, --validation_batch_size VALIDATION_BATCH_SIZE
Batch size to use for validation, default: 512
-nc N_CLASSES, --n_classes N_CLASSES
Number of classes when output_type is classification, default: 8
-k KAPPA, --kappa KAPPA
Kappa to use when output_type is biternion or regression, default: biternion: 1.0, regression: 0.5
-opt {sgd,adam,rmsprop}, --optimizer {sgd,adam,rmsprop}
Optimizer to use, default: sgd
-naug, --no_augmentation
Disable augmentation
-rid RUN_ID, --run_id RUN_ID
Run ID (default: 0)
-ma {0.35,0.5,0.75,1.0}, --mobilenet_v2_alpha {0.35,0.5,0.75,1.0}
Alpha value for MobileNet v2 (default: 1.0)
-d DEVICES, --devices DEVICES
GPU device id(s) to train on. (default: 0)
-v, --verbose Enable verbose output
Evaluate trained networks
# creates a JSON file containing a deeper analysis of the trained networks
python eval.py \
biternion \
--dataset_basepath /path/to/nicr_rgb_d_orientation_data_set \
--set test \
--output_path /path/where/to/store/evaluation/output/files \
--training_basepath /path/where/to/store/training/output/files
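The 5.28°, 5.17°, and 7.98° figures quoted above are mean absolute angular errors on the test set; a minimal sketch of how such a circular error can be computed (our helper, not part of eval.py):

import numpy as np

def mean_absolute_angular_error(pred_deg, gt_deg):
    # absolute angular difference wrapped to [0, 180] degrees
    diff = np.abs(np.asarray(pred_deg) - np.asarray(gt_deg)) % 360.0
    return np.mean(np.minimum(diff, 360.0 - diff))

print(mean_absolute_angular_error([359.0, 10.0], [1.0, 350.0]))  # -> 11.0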
For further details and parameters, see:
python eval.py --help
usage: eval.py [-h] [-o OUTPUT_PATH] [-tb TRAINING_BASEPATH]
[-s {validation,test}] [-ss {training,validation,test}]
[-db DATASET_BASEPATH] [-ds {small,large}] [-v]
{regression,classification,biternion}
Evaluate trained neural networks for orientation estimation
positional arguments:
{regression,classification,biternion}
Output type. One of ('regression', 'classification', 'biternion') (default: biternion)
optional arguments:
-h, --help show this help message and exit
-o OUTPUT_PATH, --output_path OUTPUT_PATH
Path where to store created output files, default: '../eval_outputs/' relative to the location of this script
-tb TRAINING_BASEPATH, --training_basepath TRAINING_BASEPATH
Path to training outputs (default: '/results/rotator')
-s {validation,test}, --set {validation,test}
Set to use for evaluation, default: test
-ss {training,validation,test}, --selection_set {training,validation,test}
Set to use for deriving the best epoch, default: validation
-db DATASET_BASEPATH, --dataset_basepath DATASET_BASEPATH
Path to downloaded dataset (default: '/datasets/rotator')
-ds {small,large}, --dataset_size {small,large}
Dataset image size to use. One of ('small', 'large'), default: small
-v, --verbose Enable verbose output
The source code is published under the BSD 3-Clause license; see the license file for details.
If you use the source code or the network weights, please cite the following paper:
Lewandowski, B., Seichter, D., Wengefeld, T., Pfennig, L., Drumm, H., Gross, H.-M.
Deep Orientation: Fast and Robust Upper Body Orientation Estimation for Mobile Robotic Applications.
In: IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), Macau, pp. 441-448, IEEE 2019.
@InProceedings{Lewandowski-IROS-2019,
author = {Lewandowski, Benjamin and Seichter, Daniel and Wengefeld, Tim and Pfennig, Lennard and Drumm, Helge and Gross, Horst-Michael},
title = {Deep Orientation: Fast and Robust Upper Body Orientation Estimation for Mobile Robotic Applications},
booktitle = {IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), Macau},
year = {2019},
pages = {441--448},
publisher = {IEEE},
}