Fridge demo reasoning model
This is the code used to train models for smart fridge paper and demo:
Smart Home Appliances: Chat with Your Fridge
This project forks MAC project and modifies it for smart fridge application. You can use this code to reproduce our models.
requirements.txt
for the required python3 packages and run pip3 install -r requirements.txt
to install them.Before training the model, we first have to generate the FRIDGR dataset and extract features for the images:
To generate dataset, use FRIDGR dataset repository:
mv FRIDGR_SPLIT_scenes.json ./FRIDGR_v0.1/scenes/
mv FRIDGR_SPLIT_questions.json ./FRIDGR_v0.1/data/
mv images/* ./FRIDGR_v0.1/images/SPLIT/
The final command moves the generated dataset into the proper data
directory, where we will put all the data files we use during training.
Extract ResNet-101 features for the FRIDGR train, val, and test images with the following commands:
CUDA_VISIBLE_DEVICES=0, python3 extract_features.py --input_image_dir ./images/train --output_h5_file ./data/train.h5 --batch_size 100
CUDA_VISIBLE_DEVICES=1, python3 extract_features.py --input_image_dir ./images/val --output_h5_file ./data/val.h5 --batch_size 100
CUDA_VISIBLE_DEVICES=2, python3 extract_features.py --input_image_dir ./images/test --output_h5_file ./data/test.h5 --batch_size 100
To train the model, run the following command:
python3 main.py --expName "fridgr_FeatCls_EmbRandom_CfgArgs0" --train --batchSize 64 --testedNum 10000 --epochs 25 --netLength 4 --gpus 0 @configs/args.txt
python3 main.py --expName "fridgr_FeatCls_EmbRandom_CfgArgs2" --train --batchSize 64 --testedNum 10000 --epochs 40 --netLength 6 --gpus 1 @configs/args2.txt
python3 main.py --expName "fridgr_FeatCls_EmbRandom_CfgArgs3" --train --batchSize 64 --testedNum 10000 --epochs 40 --netLength 6 --gpus 2 @configs/args3.txt
python3 main.py --expName "fridgr_FeatCls_EmbRandom_CfgArgs4" --train --batchSize 64 --testedNum 10000 --epochs 40 --netLength 6 --gpus 3 @configs/args4.txt
First, the program preprocesses the FRIDGR questions. It tokenizes them and maps them to integers to prepare them for the network. It then stores a JSON with that information about them as well as word-to-integer dictionaries in the ./FRIDGR_v0.1/data
directory.
Then, the program trains the model. Weights are saved by default to ./weights/{expName}
and statistics about the training are collected in ./results/{expName}
, where expName
is the name we choose to give to the current experiment.
--trainedNum
and --testedNum
respectively.-r
flag to restore and continue training a previously pre-trained model. --netLength
option to explore different lengths of reasoning processes.We have explored several variants of our model. We provide a few examples in configs/args2-4.txt
. For instance, you can run the first by:
python3 main.py --expName "experiment1" --train --batchSize 64 --testedNum 10000 --epochs 40 --netLength 6 @configs/args2.txt
args2
uses a non-recurrent variant of the control unit that converges faster.args3
incorporates self-attention into the write unit.args4
adds control-based gating over the memory.See config.py
for further available options (Note that some of them are still in an experimental stage).
To evaluate the trained model, and get predictions and attention maps, run the following:
python3 main.py --expName "fridgr_FeatCls_EmbRandom_CfgArgs0" --finalTest --batchSize 64 --testedNum 10000 --netLength 4 --gpus 3 --getPreds --getAtt -r @configs/args.txt
python3 main.py --expName "fridgr_FeatCls_EmbRandom_CfgArgs2" --finalTest --batchSize 64 --testedNum 10000 --netLength 6 --gpus 4 --getPreds --getAtt -r @configs/args2.txt
python3 main.py --expName "fridgr_FeatCls_EmbRandom_CfgArgs3" --finalTest --batchSize 64 --testedNum 10000 --netLength 6 --gpus 5 --getPreds --getAtt -r @configs/args3.txt
python3 main.py --expName "fridgr_FeatCls_EmbRandom_CfgArgs4" --finalTest --batchSize 64 --testedNum 10000 --netLength 6 --gpus 6 --getPreds --getAtt -r @configs/args4.txt
The command will restore the model we have trained, and evaluate it on the validation set. JSON files with predictions and the attention distributions resulted by running the model are saved by default to ./preds/{expName}
.
--getAtt
), and to avoid having large prediction files, we advise you to limit the number of examples evaluated to 5,000-20,000.After we evaluate the model with the command above, we can visualize the attention maps generated by running:
python3 visualization.py --expName "fridgr_FeatCls_EmbRandom_CfgArgs4" --tier val
python3 visualization.py --expName "fridgr_FeatCls_EmbRandom_CfgArgs4" --tier test --imageBasedir ./images --dataBasedir ./FRIDGR_v0.1
(Tier can be set to train
or test
as well). The script supports filtering of the visualized questions by various ways. See visualization.py
for further details.