How to Fine-Tune LLMs with Kubeflow
Warning
This feature is in alpha stage and the Kubeflow community is looking for your feedback. Please share your experience via the #kubeflow-training Slack channel or the Kubeflow Training Operator GitHub.

This page describes how to use the train API from the Training Python SDK, which simplifies fine-tuning LLMs with distributed PyTorchJob workers.
If you want to learn more about how the fine-tuning API fits into the Kubeflow ecosystem, head to the explanation guide.
Prerequisites
You need to install the Training Python SDK with fine-tuning support to run this API.
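For example, the SDK with HuggingFace support is typically installed from PyPI with the extra shown below. Treat the exact package extra as an assumption and check the SDK documentation for your release:

pip install -U "kubeflow-training[huggingface]"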
How to use the Fine-Tuning API?
You need to provide the following parameters to use the train API:
- Pre-trained model parameters.
- Dataset parameters.
- Trainer parameters.
- Number of PyTorch workers and resources per worker.
For example, you can use the train API to fine-tune the BERT model with the Yelp Review dataset from HuggingFace Hub using the code below:
import transformers
from peft import LoraConfig

from kubeflow.training import TrainingClient
from kubeflow.storage_initializer.hugging_face import (
    HuggingFaceModelParams,
    HuggingFaceTrainerParams,
    HuggingFaceDatasetParams,
)

TrainingClient().train(
    name="fine-tune-bert",
    # BERT model URI and type of Transformer to train it.
    model_provider_parameters=HuggingFaceModelParams(
        model_uri="hf://google-bert/bert-base-cased",
        transformer_type=transformers.AutoModelForSequenceClassification,
    ),
    # Use 3000 samples from the Yelp Review dataset.
    dataset_provider_parameters=HuggingFaceDatasetParams(
        repo_id="yelp_review_full",
        split="train[:3000]",
    ),
    # Specify HuggingFace Trainer parameters. In this example, we skip evaluation and model checkpoints.
    trainer_parameters=HuggingFaceTrainerParams(
        training_parameters=transformers.TrainingArguments(
            output_dir="test_trainer",
            save_strategy="no",
            evaluation_strategy="no",
            do_eval=False,
            disable_tqdm=True,
            log_level="info",
        ),
        # Set LoRA config to reduce the number of trainable model parameters.
        lora_config=LoraConfig(
            r=8,
            lora_alpha=8,
            lora_dropout=0.1,
            bias="none",
        ),
    ),
    num_workers=4,  # nnodes parameter for the torchrun command.
    num_procs_per_worker=2,  # nproc-per-node parameter for the torchrun command.
    resources_per_worker={
        "gpu": 2,
        "cpu": 5,
        "memory": "10G",
    },
)
After you execute train, the Training Operator will orchestrate the appropriate PyTorchJob resources to fine-tune the LLM.
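You can then follow the run with the same TrainingClient. The sketch below assumes the job name used above ("fine-tune-bert") and uses the SDK's get_job_logs and is_job_succeeded helpers; method names and signatures can vary between SDK versions, so check the client reference for your release.

from kubeflow.training import TrainingClient

client = TrainingClient()

# Stream training logs from the master pod of the created PyTorchJob.
client.get_job_logs(name="fine-tune-bert", follow=True)

# Check whether the fine-tuning job finished successfully.
if client.is_job_succeeded(name="fine-tune-bert"):
    print("Fine-tuning job succeeded")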
Next Steps
- Run the example to fine-tune the TinyLlama LLM.
- Check this example to compare the create_job and the train Python APIs for fine-tuning the BERT LLM.
- Understand the architecture behind the train API.