Project author: choasup

Project description:
CVPR 2018: Structure Inference Net for Object Detection
Primary language: Python
Repository: git://github.com/choasup/SIN.git
Created: 2018-03-14T13:11:31Z
Project page: https://github.com/choasup/SIN


SIN

Structure Inference Net: Object Detection Using Scene-level Context and Instance-level Relationships. In CVPR 2018. (http://vipl.ict.ac.cn/uploadfile/upload/2018041318013480.pdf)

Requirements: software

  1. TensorFlow 1.3.0 and its requirements (see: TensorFlow)

  2. Python packages you might not have: cython, python-opencv, easydict
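
Before building, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the SIN repository: the helper name `find_missing_modules` is hypothetical, and the mapping from package names to import names (`cython` to `Cython`, `python-opencv` to `cv2`) is an assumption based on how those packages are usually imported.

```python
import importlib.util

# Import names assumed for the packages listed above (not taken from the SIN repo).
REQUIRED_MODULES = ["tensorflow", "Cython", "cv2", "easydict"]


def find_missing_modules(names):
    """Return the top-level module names that cannot be located on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only checks that the modules can be found; it does not verify that the installed TensorFlow version is 1.3.0.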

Installation (sufficient for the demo)

  1. Clone the SIN repository

    # Make sure to clone with --recursive
    git clone --recursive https://github.com/choasUp/SIN.git
  2. Build the Cython modules

    cd $SIN_ROOT/lib
    make

Demo

After successfully completing basic installation, you’ll be ready to run the demo.

Wait …

Training Model

  1. Download the training, validation, test data and VOCdevkit

    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
  2. Extract all of these tars into one directory named VOCdevkit

    tar xvf VOCtrainval_06-Nov-2007.tar
    tar xvf VOCtest_06-Nov-2007.tar
    tar xvf VOCdevkit_08-Jun-2007.tar
  3. It should have this basic structure

    $VOCdevkit/ # development kit
    $VOCdevkit/VOCcode/ # VOC utility code
    $VOCdevkit/VOC2007 # image sets, annotations, etc.
    # ... and several other directories ...
  4. Create symlinks for the PASCAL VOC dataset

    cd $SIN_ROOT/data
    ln -s $VOCdevkit VOCdevkit
  5. Download the pre-trained ImageNet models [Google Drive] [Dropbox]

    mv VGG_imagenet.npy $SIN_ROOT/data/pretrain_model/VGG_imagenet.npy
  6. [optional] Set the learning rate and the maximum number of iterations

    vim experiments/scripts/faster_rcnn_end2end.sh # ITERS
    vim lib/fast/config.py # LR
    cd lib # if you edited the code, it is best to rebuild
    make
  7. Set your GPU id, then run the script to train and test the model

    cd $SIN_ROOT
    export CUDA_VISIBLE_DEVICES=0
    ./train.sh
  8. Test on your dataset

    ./test_all.sh
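
The data-preparation steps above can be sanity-checked before launching training. The sketch below assumes only the paths named in the steps (`data/VOCdevkit/VOCcode`, `data/VOCdevkit/VOC2007`, and `data/pretrain_model/VGG_imagenet.npy`, all relative to `$SIN_ROOT`); the helper `missing_training_inputs` is a hypothetical name and is not part of the SIN code.

```python
import os

# Paths relative to $SIN_ROOT, taken from the steps above.
REQUIRED_PATHS = (
    os.path.join("data", "VOCdevkit", "VOCcode"),
    os.path.join("data", "VOCdevkit", "VOC2007"),
    os.path.join("data", "pretrain_model", "VGG_imagenet.npy"),
)


def missing_training_inputs(sin_root):
    """Return the required relative paths that do not exist under sin_root.

    os.path.exists follows symlinks, so the data/VOCdevkit symlink
    created in step 4 is resolved to the real dataset directory.
    """
    return [
        rel for rel in REQUIRED_PATHS
        if not os.path.exists(os.path.join(sin_root, rel))
    ]


if __name__ == "__main__":
    # Fall back to the current directory when SIN_ROOT is not set.
    root = os.environ.get("SIN_ROOT", ".")
    missing = missing_training_inputs(root)
    for rel in missing:
        print("missing:", rel)
    if not missing:
        print("All training inputs are in place.")
```

A broken symlink is reported as missing, which catches the common case where `$VOCdevkit` was unset when `ln -s` was run.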

The result of testing on PASCAL VOC 2007 (VGG net)

    AP for aeroplane = 0.7853
    AP for bicycle = 0.8045
    AP for bird = 0.7456
    AP for boat = 0.6657
    AP for bottle = 0.6144
    AP for bus = 0.8424
    AP for car = 0.8663
    AP for cat = 0.8894
    AP for chair = 0.5803
    AP for cow = 0.8466
    AP for diningtable = 0.7171
    AP for dog = 0.8578
    AP for horse = 0.8626
    AP for motorbike = 0.7802
    AP for person = 0.7857
    AP for pottedplant = 0.4869
    AP for sheep = 0.7599
    AP for sofa = 0.7351
    AP for train = 0.8199
    AP for tvmonitor = 0.7683
    Mean AP = 0.7607
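
The reported Mean AP is the unweighted average of the 20 per-class AP values. As a small illustration, the sketch below recomputes it from the numbers listed above; the parsing helper `parse_ap_lines` is a hypothetical name and assumes each line has the form "AP for <class> = <value>".

```python
# Per-class results copied from the list above.
AP_LINES = """
AP for aeroplane = 0.7853
AP for bicycle = 0.8045
AP for bird = 0.7456
AP for boat = 0.6657
AP for bottle = 0.6144
AP for bus = 0.8424
AP for car = 0.8663
AP for cat = 0.8894
AP for chair = 0.5803
AP for cow = 0.8466
AP for diningtable = 0.7171
AP for dog = 0.8578
AP for horse = 0.8626
AP for motorbike = 0.7802
AP for person = 0.7857
AP for pottedplant = 0.4869
AP for sheep = 0.7599
AP for sofa = 0.7351
AP for train = 0.8199
AP for tvmonitor = 0.7683
"""


def parse_ap_lines(text):
    """Parse lines of the form 'AP for <class> = <value>' into a dict."""
    aps = {}
    for line in text.strip().splitlines():
        left, value = line.split("=")
        class_name = left.replace("AP for", "").strip()
        aps[class_name] = float(value)
    return aps


aps = parse_ap_lines(AP_LINES)
mean_ap = sum(aps.values()) / len(aps)
print("Mean AP = %.4f" % mean_ap)  # Mean AP = 0.7607
```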

References

Faster R-CNN caffe version

Faster R-CNN tf version

Citation

Yong Liu, Ruiping Wang, Shiguang Shan, and Xilin Chen. Structure Inference Net: Object Detection Using Scene-level Context and Instance-level Relationships. In CVPR 2018.