Project author: ayziksha

Project description:
Deep Image Compression using Decoder Side Information (ECCV 2020)
Primary language: Python
Repository: git://github.com/ayziksha/DSIN.git
Created: 2020-01-13T12:48:03Z
Project page: https://github.com/ayziksha/DSIN

License: Other


Deep Image Compression using Decoder Side Information

DSIN (Decoder Side Information Network) is the TensorFlow implementation of Deep Image Compression using Decoder Side Information, published in ECCV 2020.

[Paper]

Citation

If you find our work useful in your research, please cite:

  @inproceedings{ayzikA2020dsin,
    author    = {Sharon Ayzik and Shai Avidan},
    title     = {Deep Image Compression Using Decoder Side Information},
    booktitle = {Computer Vision - {ECCV} 2020 - 16th European Conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part {XVII}},
    volume    = {12362},
    pages     = {699--714},
    year      = {2020}
  }

Abstract

We present a Deep Image Compression neural network that relies on side information, which is only available to the decoder. We base our algorithm on the assumption that the image available to the encoder and the image available to the decoder are correlated, and we let the network learn these correlations in the training phase.

Then, at run time, the encoder side encodes the input image without knowing anything about the decoder side image and sends it to the decoder. The decoder then uses the encoded input image and the side information image to reconstruct the original image.

This problem is known as Distributed Source Coding in Information Theory, and we discuss several use cases for this technology. We compare our algorithm to several image compression algorithms and show that adding decoder-only side information does indeed improve results.
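
The idea of decoding with side information that the encoder never sees can be illustrated with a toy example. The sketch below is not the DSIN network and is not taken from the paper. It is a classic modulo (coset) scheme from Distributed Source Coding, and the names `MODULUS`, `encode`, and `decode`, as well as the noise range, are assumptions chosen for illustration.

```python
import random

MODULUS = 8  # assumption: the encoder keeps 3 bits out of each 8-bit value


def encode(x):
    """Encoder: sees only x and sends its residue modulo MODULUS."""
    return x % MODULUS


def decode(code, y):
    """Decoder: return the value congruent to `code` that lies closest to y."""
    diff = (code - y) % MODULUS   # offset from y to a matching value, in [0, MODULUS)
    if diff >= MODULUS // 2:
        diff -= MODULUS           # wrap the offset into [-MODULUS/2, MODULUS/2)
    return y + diff


random.seed(0)
xs = [random.randrange(256) for _ in range(1000)]   # encoder-side values
ys = [x + random.randint(-3, 3) for x in xs]        # correlated decoder-side values
decoded = [decode(encode(x), y) for x, y in zip(xs, ys)]
print(decoded == xs)  # True, because |x - y| is always below MODULUS / 2
```

Recovery is exact here only because the two sources never differ by `MODULUS / 2` or more. A learned approach such as the one described above replaces this hand-picked rule with correlations learned during training.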

Prerequisites

  • Python 3.5.2
  • Installation of all packages specified in requirements.txt (pip install -r requirements.txt)

Dataset

Training and testing were performed on the KITTI dataset. If you wish to use it, please download
KITTI 2012 and KITTI 2015.

Weights

Pre-trained models for KITTI Stereo and KITTI General (as referred to in the paper) can be downloaded here.
Please place the weights under the src/weights folder in the project.

Inference

To run inference only, open the ae_config file and change the following lines:

  crop_size = (320,1224) # we used this crop size for our inference
  train_model = False
  test_model = True
  root_data = '/put/path/to/directory/containing/downloaded/data/folders'
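
To check that a chosen crop_size fits your images, the crop box can be computed by hand. The sketch below assumes that crop_size is ordered as (height, width) and that the crop is centered. Neither assumption is confirmed by this README, and `center_crop_box` is a hypothetical helper, not part of the project. The 375 x 1242 frame size is likewise an assumed example of a KITTI image.

```python
def center_crop_box(image_size, crop_size):
    """Return (top, left, bottom, right) of a centered crop.

    Both arguments are (height, width) tuples. That ordering is an
    assumption about how crop_size is interpreted in ae_config.
    """
    height, width = image_size
    crop_h, crop_w = crop_size
    if crop_h > height or crop_w > width:
        raise ValueError(
            "crop_size %r does not fit in an image of size %r" % (crop_size, image_size)
        )
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return top, left, top + crop_h, left + crop_w


# Assumed example frame of 375 x 1242 pixels with the inference crop size.
box = center_crop_box((375, 1242), (320, 1224))
print(box)  # (27, 9, 347, 1233)
```

If you use your own data, the crop has to fit inside the smallest image in the set, otherwise the helper raises a ValueError.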

For KITTI Stereo:

  load_model_name = 'KITTI_stereo_target_bpp0.02' # Model name
  file_path_train = 'KITTI_stereo_train.txt' # Name of train.txt file
  file_path_val = 'KITTI_stereo_val.txt' # Name of validation.txt file
  file_path_test = 'KITTI_stereo_test.txt' # Name of test.txt file

  • Each file_path_... entry names a text file containing the relative paths to the correlated image pairs (one below the other).
  • If you wish to train/test on your own data, create txt files with the same structure (relative paths to the correlated image pairs, one below the other) under the data_path folder and put their names in the relevant fields of the ae_config file.
  • For KITTI General, change all stereo to general.

When done, run the following command from the DSIN/src folder:

  python main.py

All output images will be saved under the src/images/model_name/ folder, which will be created.
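
A list file for your own data can be generated with a few lines of Python. The sketch below assumes the layout described above, one relative path per line with the two images of a pair on consecutive lines. The helper names, the example paths, and the output file name are hypothetical. Which image of a pair should come first is not stated here, so compare the result with the provided KITTI_stereo_*.txt files before use.

```python
import os
import tempfile


def write_pair_list(pairs, list_path):
    """Write one relative path per line, the two images of a pair on consecutive lines."""
    with open(list_path, "w") as handle:
        for first, second in pairs:
            handle.write(first + "\n")
            handle.write(second + "\n")


def read_pair_list(list_path):
    """Read the file back and group consecutive lines into (first, second) tuples."""
    with open(list_path) as handle:
        lines = [line.strip() for line in handle if line.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected an even number of paths, got %d" % len(lines))
    return list(zip(lines[0::2], lines[1::2]))


# Hypothetical relative paths; replace them with paths that exist under root_data.
pairs = [
    ("my_data/left/000000.png", "my_data/right/000000.png"),
    ("my_data/left/000001.png", "my_data/right/000001.png"),
]

# Written to a temporary folder here; the real file belongs in the data_path folder.
list_path = os.path.join(tempfile.gettempdir(), "my_data_train.txt")
write_pair_list(pairs, list_path)
print(read_pair_list(list_path) == pairs)
```

Reading the file back and comparing it with the input is a cheap way to catch an odd number of lines before starting a long run.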

Train

Open the ae_config file and set train_model = True.
It is also recommended to adjust the following fields:

  crop_size = (320,960) # we used this crop size for our training
  load_model = True/False # Boolean
  lr_initial = <your_learning_rate>

Weights will be saved under the src/weights folder in the project. In addition, a config file (containing the ae_config and pc_config parameters) and the last saved files will be created.

License

This project is licensed under the MIT License. See the LICENSE file for details.