Project author: mtkwT
Project description:
Implementation of Adaptive Hessian-free optimization.
Language: Python
Project URL: git://github.com/mtkwT/adaptive-hessian-free-optimization.git
Adaptive Hessian-free Optimization
Required
Training model using Adaptive-HF
Example
$ (adaptive-hf) cd /code/adaptive-hessian-free-optimization
$ (adaptive-hf) python exp/training.py
Arguments
- --arch: model architecture, default is 'LeNet'.
- --gpu-num: GPU device number, default is 1.
- --seed: random seed for training the model, default is 1.
- --batch-size: input batch size for training the model, default is 128.
- --epochs: number of training epochs, default is 10.
- --lr: learning rate, default is 0.001.
- --damping: damping rate used to construct a positive-definite Hessian matrix, default is 10.
- --beta2: hyperparameter of Adam, default is 1e-8.
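The damping argument can be illustrated with a small, dependency-free sketch. It assumes the usual Hessian-free form of damping, where the damping value times the identity is added to the Hessian before a conjugate gradient solve. The names `damped_hvp`, `conjugate_gradient`, and `toy_hvp` are hypothetical and do not come from this repository, whose solver may differ.

```python
def damped_hvp(hvp, damping):
    """Wrap a Hessian-vector product so it returns (H + damping * I) v."""
    def wrapped(v):
        hv = hvp(v)
        return [h + damping * x for h, x in zip(hv, v)]
    return wrapped


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=50):
    """Solve A x = b for a symmetric positive-definite A, given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, starting from x = 0
    p = list(r)  # initial search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old < tol:  # squared residual norm is small enough
            break
        ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def toy_hvp(v):
    """Toy indefinite Hessian [[1, 2], [2, 1]] with eigenvalues -1 and 3."""
    return [1.0 * v[0] + 2.0 * v[1], 2.0 * v[0] + 1.0 * v[1]]


gradient = [1.0, -1.0]
# With damping = 10 the eigenvalues become 9 and 13, so the system is positive definite.
step = conjugate_gradient(damped_hvp(toy_hvp, damping=10.0), [-g for g in gradient])
```

Without damping, the toy Hessian has a negative eigenvalue and conjugate gradient is not guaranteed to behave well. With damping, the solve returns a step along the negative gradient direction.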
The following three parameters are hyperparameters unique to Adaptive-HF.
- --cg-epsilon: default is 1e-3.
- --cg-sigma: the upper bound of the conjugate gradients, default is 50.
- --cg-L-smoothness: the lower bound of L-smoothness, default is 100.
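The flags and defaults listed above can be summarized as a command-line parser. This is a minimal sketch that assumes the training script uses `argparse`. The function name `build_parser` and the argument types are assumptions made for illustration, and the actual `exp/training.py` may define its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (types are assumed)."""
    parser = argparse.ArgumentParser(
        description="Adaptive-HF training options (illustrative sketch)"
    )
    parser.add_argument("--arch", type=str, default="LeNet",
                        help="model architecture")
    parser.add_argument("--gpu-num", type=int, default=1,
                        help="GPU device number")
    parser.add_argument("--seed", type=int, default=1,
                        help="random seed for training")
    parser.add_argument("--batch-size", type=int, default=128,
                        help="input batch size for training")
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of training epochs")
    parser.add_argument("--lr", type=float, default=0.001,
                        help="learning rate")
    parser.add_argument("--damping", type=float, default=10.0,
                        help="damping rate for a positive-definite Hessian")
    parser.add_argument("--beta2", type=float, default=1e-8,
                        help="hyperparameter of Adam")
    # The three hyperparameters unique to Adaptive-HF.
    parser.add_argument("--cg-epsilon", type=float, default=1e-3,
                        help="Adaptive-HF hyperparameter")
    parser.add_argument("--cg-sigma", type=float, default=50.0,
                        help="upper bound of the conjugate gradients")
    parser.add_argument("--cg-L-smoothness", type=float, default=100.0,
                        help="lower bound of L-smoothness")
    return parser


# Parsing an empty argument list yields the documented defaults.
args = build_parser().parse_args([])
# Overrides are passed as on the command line.
custom = build_parser().parse_args(["--epochs", "20", "--damping", "5"])
```

`argparse` converts dashes in flag names to underscores, so `--cg-L-smoothness` is read back as `args.cg_L_smoothness`.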
Comparison between Original-HF and Adaptive-HF

