How to Configure Algorithms

List of supported algorithms for neural architecture search

This page describes neural architecture search (NAS) algorithms that Katib supports and how to configure them.

NAS Algorithms

Katib currently supports several search algorithms for NAS:

Efficient Neural Architecture Search (ENAS)

The algorithm name in Katib is enas.

The ENAS example, enas-gpu.yaml, attempts to show all possible operations. Due to the large search space, the example is unlikely to generate a good result.

Katib supports the following algorithm settings for ENAS:

| Setting Name | Type | Default value | Description |
| --- | --- | --- | --- |
| controller_hidden_size | int | 64 | RL controller LSTM hidden size. Value must be >= 1. |
| controller_temperature | float | 5.0 | RL controller temperature for the sampling logits. Value must be > 0. Set the value to "None" to disable it in the controller. |
| controller_tanh_const | float | 2.25 | RL controller tanh constant to prevent premature convergence. Value must be > 0. Set the value to "None" to disable it in the controller. |
| controller_entropy_weight | float | 1e-5 | RL controller weight for the entropy applied to the reward. Value must be > 0. Set the value to "None" to disable it in the controller. |
| controller_baseline_decay | float | 0.999 | RL controller baseline factor. Value must be > 0 and <= 1. |
| controller_learning_rate | float | 5e-5 | RL controller learning rate for the Adam optimizer. Value must be > 0 and <= 1. |
| controller_skip_target | float | 0.4 | RL controller probability that represents the prior belief of a skip connection being formed. Value must be > 0 and <= 1. |
| controller_skip_weight | float | 0.8 | RL controller weight of the skip penalty loss. Value must be > 0. Set the value to "None" to disable it in the controller. |
| controller_train_steps | int | 50 | Number of RL controller training steps after each candidate runs. Value must be >= 1. |
| controller_log_every_steps | int | 10 | Number of RL controller training steps before logging it. Value must be >= 1. |
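
As a rough sketch of how these settings are passed to Katib, an Experiment spec selects the algorithm by name and lists settings as name/value string pairs. The experiment name and the subset of settings below are illustrative; a complete NAS Experiment also needs an objective, a NAS config, and a trial template, which are omitted here:

```yaml
apiVersion: kubeflow.org/v1beta1
kind: Experiment
metadata:
  name: enas-example            # illustrative name, not from the docs
spec:
  algorithm:
    algorithmName: enas
    algorithmSettings:
      # Values are strings; these simply restate the defaults listed above.
      - name: controller_hidden_size
        value: "64"
      - name: controller_temperature
        value: "5.0"
      - name: controller_train_steps
        value: "50"
```

Any setting you omit from algorithmSettings falls back to its default value from the table.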

Differentiable Architecture Search (DARTS)

The algorithm name in Katib is darts.

The DARTS example is darts-gpu.yaml.

Katib supports the following algorithm settings for DARTS:

| Setting Name | Type | Default value | Description |
| --- | --- | --- | --- |
| num_epochs | int | 50 | Number of epochs to train the model. |
| w_lr | float | 0.025 | Initial learning rate for training model weights. This learning rate is annealed down to w_lr_min following a cosine schedule without restart. |
| w_lr_min | float | 0.001 | Minimum learning rate for training model weights. |
| w_momentum | float | 0.9 | Momentum for training model weights. |
| w_weight_decay | float | 3e-4 | Training model weight decay. |
| w_grad_clip | float | 5.0 | Max norm value for clipping the gradient norm of training model weights. |
| alpha_lr | float | 3e-4 | Initial learning rate for alphas weights. |
| alpha_weight_decay | float | 1e-3 | Alphas weight decay. |
| batch_size | int | 128 | Batch size for the dataset. |
| num_workers | int | 4 | Number of subprocesses to download the dataset. |
| init_channels | int | 16 | Initial number of channels. |
| print_step | int | 50 | Number of training or validation steps before logging it. |
| num_nodes | int | 4 | Number of DARTS nodes. |
| stem_multiplier | int | 3 | Multiplier for initial channels. It is used in the first stem cell. |
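
Similarly, a minimal sketch of the algorithm section of a DARTS Experiment is shown below; the values simply restate a few of the defaults from the table, and the rest of the Experiment spec is omitted:

```yaml
algorithm:
  algorithmName: darts
  algorithmSettings:
    # Settings are optional; any setting not listed keeps its default value.
    - name: num_epochs
      value: "50"
    - name: w_lr
      value: "0.025"
    - name: batch_size
      value: "128"
```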

Next steps
