How to Configure Early Stopping
This guide shows how to use early stopping to reduce the cost of your Katib Experiments. Early stopping helps you avoid overfitting when you train your model during Katib Experiments. It also saves computing resources and reduces Experiment execution time by stopping an Experiment’s Trials when the target metric(s) no longer improve, before the training process is complete.
The major advantage of using early stopping in Katib is that you don’t need to modify your training container package. All you have to do is make the necessary changes to your Experiment’s YAML file.
Early stopping works in the same way as Katib’s metrics collector. It analyses the required metrics from StdOut or from an arbitrary output file, and an early stopping algorithm decides whether the Trial needs to be stopped. Currently, early stopping works only with the StdOut or File metrics collectors.
Note: Your training container must print training logs with a timestamp, because early stopping algorithms need to know the sequence of reported metrics. Check the PyTorch example to learn how to add a date format to your logs.
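As a minimal sketch of the note above, the following Python snippet configures the standard logging module so that every log line starts with a UTC timestamp. The helper build_logger, the logger name train, the name=value metric layout, and the exact date format are illustrative assumptions, not requirements stated in this guide; match them to your own metrics collector configuration.

```python
import logging
import sys
import time


def build_logger(stream=sys.stdout):
    """Return a logger that prefixes every line with a UTC timestamp."""
    formatter = logging.Formatter(
        fmt="%(asctime)s %(levelname)-8s %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%SZ",  # assumed format, e.g. 2024-01-31T12:00:00Z
    )
    # Render timestamps in UTC so the trailing "Z" is accurate.
    formatter.converter = time.gmtime

    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)

    logger = logging.getLogger("train")  # hypothetical logger name
    logger.handlers = [handler]  # replace handlers so repeated calls do not duplicate lines
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


logger = build_logger()

# Hypothetical per-epoch metrics; a real training loop would compute these.
for epoch, (accuracy, loss) in enumerate([(0.71, 0.92), (0.80, 0.63)], start=1):
    logger.info("epoch=%d accuracy=%.4f loss=%.4f", epoch, accuracy, loss)
```

Because each line carries its own timestamp, the order of the reported metrics can be recovered from the log output alone.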
Configure the Experiment with early stopping
As a reference, you can use the YAML file of the early stopping example.
Follow the guide to configure your Katib Experiment.
Next, to apply early stopping for your Experiment, specify the .spec.earlyStopping parameter, similar to .spec.algorithm:
- .earlyStopping.algorithmName - the name of the early stopping algorithm.
- .earlyStopping.algorithmSettings - the settings for the early stopping algorithm.
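The two fields above can be sketched as a small Python helper that builds the .spec.earlyStopping fragment and prints it as JSON, which Kubernetes tooling generally accepts alongside YAML. The helper early_stopping_spec is hypothetical, and the name/value layout of each setting (with string values) is an assumption modeled on how .spec.algorithm settings are commonly written; check it against the early stopping example YAML before relying on it.

```python
import json


def early_stopping_spec(algorithm_name, **settings):
    """Build a dict mirroring the .spec.earlyStopping section of an Experiment."""
    return {
        "earlyStopping": {
            "algorithmName": algorithm_name,
            # Assumption: each setting is a name/value pair with a string value.
            "algorithmSettings": [
                {"name": name, "value": str(value)}
                for name, value in settings.items()
            ],
        }
    }


# Values taken from the defaults listed for the median stopping rule in this guide.
spec_fragment = early_stopping_spec("medianstop", min_trials_required=3, start_step=4)
print(json.dumps(spec_fragment, indent=2))
```

The resulting earlyStopping object would sit inside the Experiment's spec, next to algorithm.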
Here is what happens: your Experiment’s Suggestion produces new Trials. After that, the early stopping algorithm generates early stopping rules for the created Trials. Once a Trial satisfies all of the rules, it is stopped and its status changes to EarlyStopped. Then, Katib calls the Suggestion again to ask for new Trials.
Early Stopping Algorithms
Katib currently supports the following algorithms for early stopping:
- Median Stopping Rule
More algorithms are under development.
Median Stopping Rule
The early stopping algorithm name in Katib is medianstop.
The median stopping rule stops a pending Trial X at step S if the Trial’s best objective value by step S is worse than the median value of the running averages of all completed Trials’ objectives reported up to step S.
To learn more about it, check Google Vizier: A Service for Black-Box Optimization.
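The rule as worded above can be sketched in a few lines of Python. This illustrates the description only and is not Katib's actual implementation: the function names, the maximize flag, and the way min_trials_required and start_step gate the decision are assumptions made for the sketch.

```python
from statistics import median


def running_average(values, step):
    """Mean of the objective values reported in the first `step` steps."""
    window = values[:step]
    return sum(window) / len(window)


def should_stop(trial_values, completed_trials, step,
                maximize=True, min_trials_required=3, start_step=4):
    """Return True if the pending Trial should be stopped at `step`.

    trial_values: objective values reported so far by the pending Trial.
    completed_trials: one list of reported objective values per completed Trial.
    """
    completed = [values for values in completed_trials if values]
    history = trial_values[:step]

    # Assumed gating: wait for enough reports and enough completed Trials.
    if step < start_step or len(completed) < min_trials_required or not history:
        return False

    best = max(history) if maximize else min(history)
    threshold = median(running_average(values, step) for values in completed)

    # "Worse" means lower when maximizing and higher when minimizing.
    return best < threshold if maximize else best > threshold


completed = [
    [0.50, 0.60, 0.70, 0.80],  # running average at step 4: about 0.65
    [0.40, 0.50, 0.60, 0.70],  # about 0.55
    [0.60, 0.70, 0.80, 0.90],  # about 0.75
]
print(should_stop([0.30, 0.35, 0.40, 0.45], completed, step=4))  # True
print(should_stop([0.55, 0.65, 0.75, 0.85], completed, step=4))  # False
```

In the first call the Trial's best value (0.45) is below the median running average (about 0.65), so the Trial is stopped; in the second call its best value (0.85) is above the median, so it keeps running.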
Katib supports the following early stopping settings:
Setting Name | Description | Default Value |
---|---|---|
min_trials_required | Minimal number of successful Trials to compute median value | 3 |
start_step | Number of reported intermediate results before stopping the Trial | 4 |
Next steps
How to use Katib Experiment Trial templates (/docs/components/katib/user-guides/trial-template).