Project author: demonpratapdemon

Project description:
Encoder Decoder Model for Image Captioning
Primary language: Jupyter Notebook
Project URL: git://github.com/demonpratapdemon/Image-Captioning.git
Created: 2021-05-22T05:07:32Z
Project community: https://github.com/demonpratapdemon/Image-Captioning

License: GNU General Public License v3.0



Image-Captioning

Encoder Decoder Model for Image Captioning

Encoder Model

A CNN-based model extracts the feature vectors from the image. PyTorch transfer learning with a pretrained GoogLeNet model is used for this purpose. The size of the model is 49 MB.
The feature vector is then converted to a fixed embedding size for input to the Decoder model.

Decoder Model

Once the feature vectors have been converted to the embedding size, they are fed together with the actual captions into an LSTM model for training. The captions also need to be converted to the embedding size so that they match the input size of the LSTM.
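
A minimal sketch of such a decoder follows. The class name DecoderRNN, the hidden size, and the convention of feeding the image feature as the first LSTM time step are assumptions (a common pattern for this kind of model), not details confirmed by the project.

```python
import torch
import torch.nn as nn


class DecoderRNN(nn.Module):
    """LSTM decoder over [image feature, embedded caption tokens].

    Hypothetical sketch; names and sizes are not taken from the project.
    """

    def __init__(self, embed_size, hidden_size, vocab_size, num_layers=1):
        super().__init__()
        # Maps token ids to vectors of embed_size so they match the LSTM input size.
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True)
        # Maps each hidden state to scores over the vocabulary.
        self.linear = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # features: (batch, embed_size), captions: (batch, seq_len) of token ids.
        # The last token is dropped because nothing has to be predicted after it.
        embeddings = self.embed(captions[:, :-1])
        # Assumption: the image feature acts as the first input step of the sequence.
        inputs = torch.cat([features.unsqueeze(1), embeddings], dim=1)
        hiddens, _ = self.lstm(inputs)
        # Output: (batch, seq_len, vocab_size), one prediction per caption position.
        return self.linear(hiddens)
```

During training, the output scores can be compared against the caption token ids with a cross-entropy loss, so that each position learns to predict the corresponding caption token.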

Model

The Encoder-Decoder model was trained for 15 epochs and has about 12 million parameters.
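
A parameter count like the one quoted above can be obtained for any PyTorch module with a small helper. The function name count_parameters is hypothetical, and the sketch does not claim to reproduce the 12 million figure, since whether that number includes frozen layers is not stated.

```python
import torch.nn as nn


def count_parameters(model, trainable_only=False):
    """Return the number of scalar parameters in a PyTorch module."""
    return sum(
        p.numel()
        for p in model.parameters()
        # Optionally skip frozen parameters (requires_grad=False).
        if p.requires_grad or not trainable_only
    )
```

For example, nn.Linear(10, 5) has 10 * 5 weights plus 5 biases, so the helper returns 55 for it.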