Project author: endernewton

Project description:
Code for Iterative Reasoning Paper (CVPR 2018)
Language: Python
Repository: git://github.com/endernewton/iter-reason.git
Created: 2018-03-27T04:18:30Z
Project community: https://github.com/endernewton/iter-reason

License: MIT License

Iterative Visual Reasoning Beyond Convolutions

By Xinlei Chen, Li-Jia Li, Li Fei-Fei and Abhinav Gupta.

Disclaimer

  • This is the authors’ implementation of the system described in the paper, not an official Google product.
  • Right now:
    • The available reasoning module is based on convolutions and spatial memory.
    • For simplicity, the released code uses TensorFlow's default crop_and_resize operation rather than the customized one reported in the paper (I find the default one is actually ~1% better); a minimal usage sketch follows below.
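
A minimal sketch of how the default op can be used for ROI-style cropping is shown below; the shapes, box values, and the 7x7 crop size are illustrative and not taken from this repository.

    # Minimal sketch of TensorFlow's default crop_and_resize (TF 1.x API).
    # All shapes and box coordinates are illustrative only.
    import numpy as np
    import tensorflow as tf

    features = tf.placeholder(tf.float32, [1, 38, 50, 1024])       # backbone feature map
    boxes = tf.constant([[0.1, 0.2, 0.6, 0.8]], dtype=tf.float32)  # normalized [y1, x1, y2, x2]
    box_ind = tf.constant([0], dtype=tf.int32)                     # image index for each box

    # Bilinearly crop each box to a fixed 7x7 grid, as is commonly done for ROI pooling.
    crops = tf.image.crop_and_resize(features, boxes, box_ind, crop_size=[7, 7])

    with tf.Session() as sess:
        out = sess.run(crops, feed_dict={features: np.random.rand(1, 38, 50, 1024)})
        print(out.shape)  # (1, 7, 7, 1024)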

Prerequisites

  1. TensorFlow, tested with version 1.6 on Ubuntu 16.04, installed with:

    1. pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.6.0-cp27-none-linux_x86_64.whl
  2. Other packages needed can be installed with pip:

    1. pip install Cython easydict matplotlib opencv-python Pillow pyyaml scipy
  3. For running COCO, the API can be installed globally (a quick import check follows this list):

    1. # any path is okay
    2. mkdir ~/install && cd ~/install
    3. git clone https://github.com/cocodataset/cocoapi.git cocoapi
    4. cd cocoapi/PythonAPI
    5. python setup.py install --user
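
After the install, a quick import check like the sketch below confirms the API is available; the annotation path is a placeholder for wherever your COCO annotations live, not a file shipped with this repository.

    # Quick sanity check that pycocotools imports and can load an annotation file.
    # The path below is a placeholder -- point it at your own COCO annotations.
    from pycocotools.coco import COCO

    ann_file = 'data/coco/annotations/instances_train2014.json'  # placeholder path
    coco = COCO(ann_file)
    print('categories:', len(coco.getCatIds()), 'images:', len(coco.getImgIds()))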

Setup and Running

  1. Clone the repository.

    1. git clone https://github.com/endernewton/iter-reason.git
    2. cd iter-reason
  2. Set up the data; here we use ADE20K as an example.

    1. mkdir -p data/ADE
    2. cd data/ADE
    3. wget -v http://groups.csail.mit.edu/vision/datasets/ADE20K/ADE20K_2016_07_26.zip
    4. unzip ADE20K_2016_07_26.zip
    5. mv ADE20K_2016_07_26/* ./
    6. rmdir ADE20K_2016_07_26
    7. # then get the train/val/test split
    8. wget -v http://xinleic.xyz/data/ADE_split.tar.gz
    9. tar -xzvf ADE_split.tar.gz
    10. rm -vf ADE_split.tar.gz
    11. cd ../..
  3. Set up pre-trained ImageNet models. This is done in the same way as in tf-faster-rcnn. By default we use ResNet-50 as the backbone:

    1. mkdir -p data/imagenet_weights
    2. cd data/imagenet_weights
    3. wget -v http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz
    4. tar -xzvf resnet_v1_50_2016_08_28.tar.gz
    5. mv resnet_v1_50.ckpt res50.ckpt
    6. cd ../..
  4. Compile the library (for computing bounding box overlaps).

    1. cd lib
    2. make
    3. cd ..
  5. Now you are ready to run! For example, to train and test the baseline:

    1. ./experiments/scripts/train.sh [GPU_ID] [DATASET] [NET] [STEPS] [ITER]
    2. # GPU_ID is the GPU you want to test on
    3. # DATASET in {ade, coco, vg} is the dataset to train/test on, defined in the script
    4. # NET in {res50, res101} is the backbone network to use
    5. # STEPS (x10K) is the number of iterations before the learning rate is reduced; multiple steps can be given, separated by the character 'a' (the format is illustrated just after this list)
    6. # ITER (x10K) is the total number of iterations to run
    7. # Examples:
    8. # train on ADE20K for 320K iterations, reducing learning rate at 280K.
    9. ./experiments/scripts/train.sh 0 ade 28 32
    10. # train on COCO for 720K iterations, reducing at 320K and 560K.
    11. ./experiments/scripts/train.sh 1 coco 32a56 72
  6. To train and test the reasoning modules (based on ResNet-50):

    1. ./experiments/scripts/train_memory.sh [GPU_ID] [DATASET] [MEM] [STEPS] [ITER]
    2. # MEM in {local} is the type of reasoning module to use (a rough sketch of the spatial-memory idea also follows this list)
    3. # Examples:
    4. # train on ADE20K on the local spatial memory.
    5. ./experiments/scripts/train_memory.sh 0 ade local 28 32
  7. Once training is done, you can test the models separately with test.sh and test_memory.sh. We also provide a separate set of scripts for testing on larger image inputs.

  8. You can use TensorBoard to visualize and track progress, for example:

    1. tensorboard --logdir=tensorboard/res50/ade_train_5/ --port=7002 &
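
For clarity, the STEPS/ITER convention shared by the training scripts above can be written out as a tiny helper; this is only an illustration of the argument format, not code from the repository.

    # Illustrates the [STEPS]/[ITER] convention of train.sh and train_memory.sh:
    # both are given in multiples of 10K, and STEPS may hold several values joined by 'a'.
    def parse_schedule(steps, iters):
        step_iters = [int(s) * 10000 for s in steps.split('a')]  # learning-rate drop points
        total_iters = int(iters) * 10000                          # total training iterations
        return step_iters, total_iters

    # './experiments/scripts/train.sh 1 coco 32a56 72' corresponds to:
    print(parse_schedule('32a56', '72'))  # ([320000, 560000], 720000)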

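As background for the local reasoning module, the rough idea behind a convolutional spatial memory is sketched below. This is a conceptual illustration only, not the implementation in this repository; the shapes, the hard write, and the mean filter are simplifications of the learned operations described in the papers referenced next.

    # Conceptual sketch of a spatial memory: region features are written into a 2D map
    # aligned with the image, and convolutions over that map aggregate context.
    # NOT this repository's implementation; shapes and update rules are illustrative.
    import numpy as np

    H, W, C = 38, 50, 256                          # memory height/width/channels (illustrative)
    memory = np.zeros((H, W, C), dtype=np.float32)

    def write_region(memory, box, feat):
        """Write a region's pooled feature vector into the cells its box covers."""
        y1, x1, y2, x2 = box
        memory[y1:y2, x1:x2, :] = feat             # a real module uses a learned, soft update
        return memory

    memory = write_region(memory, (5, 10, 15, 25), np.random.rand(C).astype(np.float32))
    memory = write_region(memory, (20, 30, 30, 45), np.random.rand(C).astype(np.float32))

    # Gather context by convolving over the memory (a crude 3x3 mean filter stands in
    # for the learned convolutions).
    context = np.zeros_like(memory)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            context += np.roll(np.roll(memory, dy, axis=0), dx, axis=1)
    context /= 9.0
    print(context.shape)  # (38, 50, 256)
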
References

  1. @inproceedings{chen18iterative,
  2. author = {Xinlei Chen and Li-Jia Li and Li Fei-Fei and Abhinav Gupta},
  3. title = {Iterative Visual Reasoning Beyond Convolutions},
  4. booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  5. year = {2018}
  6. }

The idea of spatial memory was developed in:

  1. @inproceedings{chen2017spatial,
  2. author = {Xinlei Chen and Abhinav Gupta},
  3. title = {Spatial Memory for Context Reasoning in Object Detection},
  4. booktitle = {Proceedings of the International Conference on Computer Vision},
  5. year = {2017}
  6. }