# Developer Guide
## Clone the Repository

Clone the Spark operator repository and change to the directory:

```bash
git clone git@github.com:kubeflow/spark-operator.git
cd spark-operator
```
## (Optional) Configure Git Pre-Commit Hooks

Git hooks are useful for identifying simple issues before submission to code review. We run hooks on every commit to automatically generate the Helm chart `README.md` file from the `README.md.gotmpl` file. Before you can run the git hooks, you need to have the pre-commit package manager installed:

```bash
# Using pip
pip install pre-commit

# Using conda
conda install -c conda-forge pre-commit

# Using Homebrew
brew install pre-commit
```

To set up the pre-commit hooks, run the following commands:

```bash
pre-commit install
pre-commit install-hooks
```
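Once installed, the hooks run automatically on every `git commit`. You can also trigger them manually; here is a minimal sketch using the standard pre-commit CLI (the hook id shown is an assumption, check `.pre-commit-config.yaml` for the actual ids):

```bash
# Run all configured hooks against every file in the repository
pre-commit run --all-files

# Run a single hook by its id (hypothetical id; see .pre-commit-config.yaml)
pre-commit run helm-docs --all-files
```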
## Use Makefile

We use a Makefile to automate common tasks. For example, to build the operator, run the `build-operator` target as follows; the `spark-operator` binary will be built and placed in the `bin` directory:

```bash
make build-operator
```

Dependencies are automatically downloaded to the `bin` directory as needed. For example, if you run the `make manifests` target, the `controller-gen` tool will be downloaded with `go install`, renamed to `controller-gen-vX.Y.Z`, and placed in the `bin` directory.
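You can also fetch a dependency explicitly through its dedicated target (see the Dependencies section of `make help` below); the version in the sample output is illustrative:

```bash
# Download controller-gen into ./bin if it is not already present
make controller-gen

# The tool is stored with a version suffix (sample output, version may differ)
ls bin
# controller-gen-v0.14.0
```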
To see the full list of available targets, run the following command:

```console
$ make help

Usage:
  make <target>

General
  help                          Display this help.
  version                       Print version information.

Development
  manifests                     Generate CustomResourceDefinition, RBAC and WebhookConfiguration manifests.
  generate                      Generate code containing DeepCopy, DeepCopyInto, and DeepCopyObject method implementations.
  update-crd                    Update CRD files in the Helm chart.
  go-clean                      Clean up caches and output.
  go-fmt                        Run go fmt against code.
  go-vet                        Run go vet against code.
  lint                          Run golangci-lint linter.
  lint-fix                      Run golangci-lint linter and perform fixes.
  unit-test                     Run unit tests.
  e2e-test                      Run the e2e tests against a Kind k8s instance that is spun up.

Build
  build-operator                Build Spark operator.
  build-sparkctl                Build sparkctl binary.
  install-sparkctl              Install sparkctl binary.
  clean                         Clean spark-operator and sparkctl binaries.
  build-api-docs                Build api documentation.
  docker-build                  Build docker image with the operator.
  docker-push                   Push docker image with the operator.
  docker-buildx                 Build and push docker image for the operator for cross-platform support.

Helm
  detect-crds-drift             Detect CRD drift.
  helm-unittest                 Run Helm chart unittests.
  helm-lint                     Run Helm chart lint test.
  helm-docs                     Generates markdown documentation for helm charts from requirements and values files.

Deployment
  kind-create-cluster           Create a kind cluster for integration tests.
  kind-load-image               Load the image into the kind cluster.
  kind-delete-cluster           Delete the created kind cluster.
  install-crd                   Install CRDs into the K8s cluster specified in ~/.kube/config.
  uninstall-crd                 Uninstall CRDs from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
  deploy                        Deploy controller to the K8s cluster specified in ~/.kube/config.
  undeploy                      Undeploy controller from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.

Dependencies
  kustomize                     Download kustomize locally if necessary.
  controller-gen                Download controller-gen locally if necessary.
  kind                          Download kind locally if necessary.
  envtest                       Download setup-envtest locally if necessary.
  golangci-lint                 Download golangci-lint locally if necessary.
  gen-crd-api-reference-docs    Download gen-crd-api-reference-docs locally if necessary.
  helm                          Download helm locally if necessary.
  helm-unittest-plugin          Download helm unittest plugin locally if necessary.
  helm-docs-plugin              Download helm-docs plugin locally if necessary.
```
## Develop with Spark Operator

### Build the Binary

To build the operator, run the following command:

```bash
make build-operator
```
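You can then run the resulting binary for a quick smoke test; this assumes the binary prints usage information when given `--help` (a common CLI convention, not verified here):

```bash
# Print the available flags and subcommands of the freshly built binary
./bin/spark-operator --help
```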
### Build the Docker Image

If you want to build the operator from source, e.g., to test a fix or a feature you have written, you can do so by following the instructions below.

The easiest way to build the operator without worrying about its dependencies is to build an image using the Dockerfile:

```bash
make docker-build IMAGE_TAG=<image-tag>
```

The operator image is built upon a base Spark image that defaults to `spark:3.5.2`. If you want to use your own Spark image (e.g., an image with a different version of Spark or some custom dependencies), specify the `SPARK_IMAGE` build argument as the following example shows:

```bash
docker build --build-arg SPARK_IMAGE=<your Spark image> -t <image-tag> .
```
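For example, to build against a custom Spark image and tag the result for a local registry (both names below are placeholders, not images published by the project):

```bash
docker build \
  --build-arg SPARK_IMAGE=my-registry/spark:3.5.2-custom \
  -t my-registry/spark-operator:dev .
```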
### Update the API Definition

If you have updated the API definition, then you also need to update the auto-generated code. To update the auto-generated code containing the DeepCopy, DeepCopyInto, and DeepCopyObject method implementations, run the following command:

```bash
make generate
```

To update the auto-generated CustomResourceDefinition (CRD), RBAC and WebhookConfiguration manifests, run the following command:

```bash
make manifests
```

After updating the CRD files, run the following command to copy the CRD files to the Helm chart directory:

```bash
make update-crd
```

Besides, the API specification documentation `docs/api-docs.md` also needs to be updated. To update it, run the following command:

```bash
make build-api-docs
```
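Putting these steps together, a typical workflow after editing the API types looks like this (the targets are the ones described above, run in the same order):

```bash
make generate        # regenerate DeepCopy, DeepCopyInto and DeepCopyObject code
make manifests       # regenerate CRD, RBAC and WebhookConfiguration manifests
make update-crd      # copy the updated CRD files into the Helm chart
make build-api-docs  # rebuild docs/api-docs.md
```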
### Run Unit Tests

To run unit tests, run the following command:

```bash
make unit-test
```
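To iterate on a single package or test instead of the whole suite, you can also invoke `go test` directly; the package path below is a placeholder, adjust it to the package you are working on:

```bash
# Run the tests of one package with verbose output (path is illustrative)
go test -v ./internal/controller/...

# Run a single test selected by a regular-expression match on its name
go test -v -run TestReconcile ./internal/controller/...
```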
### Run E2E Tests

To run the e2e tests, run the following commands:

```bash
# Create a kind cluster
make kind-create-cluster

# Build docker image
make docker-build IMAGE_TAG=local

# Load docker image into the kind cluster
make kind-load-image

# Run e2e tests
make e2e-test

# Delete the kind cluster
make kind-delete-cluster
```
## Develop with the Helm Chart

### Run Helm Chart Lint Tests

To run Helm chart lint tests, run the following command:

```console
$ make helm-lint
Linting charts...
------------------------------------------------------------------------------------------------------------------------
 Charts to be processed:
------------------------------------------------------------------------------------------------------------------------
 spark-operator => (version: "1.2.4", path: "charts/spark-operator-chart")
------------------------------------------------------------------------------------------------------------------------

Linting chart "spark-operator => (version: \"1.2.4\", path: \"charts/spark-operator-chart\")"
Checking chart "spark-operator => (version: \"1.2.4\", path: \"charts/spark-operator-chart\")" for a version bump...
Old chart version: 1.2.1
New chart version: 1.2.4
Chart version ok.
Validating /Users/user/go/src/github.com/kubeflow/spark-operator/charts/spark-operator-chart/Chart.yaml...
Validation success! 👍
Validating maintainers...

Linting chart with values file "charts/spark-operator-chart/ci/ci-values.yaml"...

==> Linting charts/spark-operator-chart
[INFO] Chart.yaml: icon is recommended

1 chart(s) linted, 0 chart(s) failed
------------------------------------------------------------------------------------------------------------------------
 ✔︎ spark-operator => (version: "1.2.4", path: "charts/spark-operator-chart")
------------------------------------------------------------------------------------------------------------------------
All charts linted successfully
```
### Run Helm Chart Unit Tests

For detailed information about how to write Helm chart unit tests, please refer to helm-unittest. To run the Helm chart unit tests, run the following command:

```console
$ make helm-unittest

### Chart [ spark-operator ] charts/spark-operator-chart

 PASS  Test controller deployment             charts/spark-operator-chart/tests/controller/deployment_test.yaml
 PASS  Test controller pod disruption budget  charts/spark-operator-chart/tests/controller/poddisruptionbudget_test.yaml
 PASS  Test controller rbac                   charts/spark-operator-chart/tests/controller/rbac_test.yaml
 PASS  Test controller deployment             charts/spark-operator-chart/tests/controller/service_test.yaml
 PASS  Test controller service account        charts/spark-operator-chart/tests/controller/serviceaccount_test.yaml
 PASS  Test prometheus pod monitor            charts/spark-operator-chart/tests/prometheus/podmonitor_test.yaml
 PASS  Test Spark RBAC                        charts/spark-operator-chart/tests/spark/rbac_test.yaml
 PASS  Test spark service account             charts/spark-operator-chart/tests/spark/serviceaccount_test.yaml
 PASS  Test webhook deployment                charts/spark-operator-chart/tests/webhook/deployment_test.yaml
 PASS  Test mutating webhook configuration    charts/spark-operator-chart/tests/webhook/mutatingwebhookconfiguration_test.yaml
 PASS  Test webhook pod disruption budget     charts/spark-operator-chart/tests/webhook/poddisruptionbudget_test.yaml
 PASS  Test webhook rbac                      charts/spark-operator-chart/tests/webhook/rbac_test.yaml
 PASS  Test webhook service                   charts/spark-operator-chart/tests/webhook/service_test.yaml
 PASS  Test validating webhook configuration  charts/spark-operator-chart/tests/webhook/validatingwebhookconfiguration_test.yaml

Charts:      1 passed, 1 total
Test Suites: 14 passed, 14 total
Tests:       137 passed, 137 total
Snapshot:    0 passed, 0 total
Time:        477.748ms
```
### Build the Helm Docs

The Helm chart `README.md` file is generated by the helm-docs tool. If you want to update the Helm docs, remember to modify `README.md.gotmpl` rather than `README.md`, then run `make helm-docs` to generate the `README.md` file:

```console
$ make helm-docs
INFO[2024-04-14T07:29:26Z] Found Chart directories [charts/spark-operator-chart]
INFO[2024-04-14T07:29:26Z] Generating README Documentation for chart charts/spark-operator-chart
```

Note that if git pre-commit hooks are set up, `helm-docs` will run automatically before committing any changes. If there are any changes to the `README.md` file, the commit process will be aborted.
## Sign Off Your Commits

After you have made changes to the code, please sign off your commits with the `-s` or `--signoff` flag so that the DCO check CI will pass:

```bash
git commit -s -m "Your commit message"
```
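The flag appends a `Signed-off-by` trailer, taken from your git `user.name` and `user.email` configuration, which is what the DCO check looks for:

```bash
git commit -s -m "Your commit message"

# Inspect the resulting commit message; the trailer looks like this:
git log -1 --format=%B
# Your commit message
#
# Signed-off-by: Your Name <you@example.com>
```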