Google Summer of Code 2024
The Kubeflow community is excited to announce that we have been selected as an organization to participate in Google Summer of Code 2024. This page aims to help you participate in the Kubeflow organization for GSoC 2024.
How can I participate in Kubeflow GSoC?
Please go to Google Summer of Code 2024 and sign up as a contributor. Next, look at the projects below to decide which ones you are interested in. Note that you must submit your proposal through the GSoC website, and your proposal must be selected for you to participate.
Contributor applications open on March 18th, 2024 and close on April 2nd, 2024. For more information, see the GSoC website and/or reach out to the GSoC organizers. Please only contact mentors about projects, not the program itself.
Slack
Please join the Kubeflow Slack.
Please do not reach out privately to mentors. Instead, start a thread in the #gsoc-participants channel so others can see the response. Be kind to our mentors: please search first to see whether your question has already been answered.
Meetings
You may wish to attend the next community meeting for the group that is leading your chosen project. Please see the calendar below for more information.
What if my proposal is not chosen?
Please understand that not everyone can be selected for Google Summer of Code (GSoC); there are many possible candidates for each project.
However, we still want to encourage you to participate in the Kubeflow project! Get started by attending working group meetings for components you want to help with, and reading our contributing guide.
Project Ideas for 2024 GSoC
Project 1: Kubeflow Notebooks 2.0
Kubeflow Notebooks is a widely used component of Kubeflow that allows Data Scientists and ML Engineers to run web-based IDEs (JupyterLab, VSCode, RStudio) on Kubernetes clusters.
There is currently an effort to create the next major version of Kubeflow Notebooks.
The main idea is to change the Kubeflow Notebook CRD so that it is no longer just a wrapper around a Kubernetes PodSpec.
This foundational change enables users to:
- Update existing notebooks after spawning, to change their “pod config” (CPU/GPU/RAM), “volumes” (storage), and “image” (what packages are installed) from options that are defined by their admin.
- Make spawning notebooks less confusing for end-users. Pod configs stop being about specific parts of the PodSpec (e.g. tolerations, requests, limits), and become a drop-down list of user-friendly names (e.g. “Big GPU Notebook - A100 - 128GB”), similar to cloud “instance types”.
- Give admins more control over how workspaces are spawned, and over the lifecycle of the “options” which are available to users. For example, admins can now “redirect” existing image/pod configs to new ones, but delay the application of these updates until the next pod restart (until then, the interface will display a warning to users that a change is pending).
- Support new web-based IDEs without needing to specifically integrate with them. Cluster admins can define a custom “kind” for their internal app, or even make “flavors” of existing apps (like Jupyter and VSCode) with the packages and pod-sizes required for specific teams in their organization.
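As a rough illustration of the “options” and “redirect” ideas above, the following Python sketch models admin-defined pod configs as named entries that can point to a replacement. It is not the Notebooks 2.0 CRD or API: the `PodConfigOption` class, its fields, and the `resolve` and `has_pending_change` helpers are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PodConfigOption:
    """Hypothetical admin-defined option that users pick from a drop-down."""
    id: str
    display_name: str
    cpu: str
    memory: str
    gpus: int = 0
    redirect_to: Optional[str] = None  # id of the option that replaces this one


def resolve(options: Dict[str, PodConfigOption], option_id: str) -> PodConfigOption:
    """Follow redirects until reaching an option that is not redirected."""
    seen = set()
    current = options[option_id]
    while current.redirect_to is not None:
        if current.id in seen:
            raise ValueError(f"redirect cycle at {current.id!r}")
        seen.add(current.id)
        current = options[current.redirect_to]
    return current


def has_pending_change(options: Dict[str, PodConfigOption], option_id: str) -> bool:
    """True if the option would change at the next restart (assumed semantics)."""
    return resolve(options, option_id).id != option_id


# Example data: an admin redirects the old GPU option to a newer one.
OPTIONS = {
    "small-cpu": PodConfigOption("small-cpu", "Small CPU Notebook", "2", "8Gi"),
    "big-gpu-v1": PodConfigOption(
        "big-gpu-v1", "Big GPU Notebook - V100", "16", "128Gi", gpus=1,
        redirect_to="big-gpu-v2",
    ),
    "big-gpu-v2": PodConfigOption(
        "big-gpu-v2", "Big GPU Notebook - A100 - 128GB", "16", "128Gi", gpus=1,
    ),
}
```

In this sketch, a notebook still running with `big-gpu-v1` keeps its current settings, while `has_pending_change(OPTIONS, "big-gpu-v1")` signals that the interface could warn the user about a pending update.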
You would be part of the larger effort and involved in one or more code deliverables. For background, see:
- Kubeflow Notebooks docs: https://www.kubeflow.org/docs/components/notebooks/overview/
- Kubeflow Notebooks 2.0 GitHub proposal: https://github.com/kubeflow/kubeflow/issues/7156
- Kubeflow Notebooks 2.0 design document: https://docs.google.com/document/d/1_zk06zebbaTBdJ8TdU07Ibky25hqHGARXjVcsp2qEnU/edit
Skills required: Kubernetes Controllers (Golang - Kubebuilder) AND/OR Web Development (JS - Angular, Python - Flask)
Difficulty: medium/high
Length: 350 hrs
Mentors: Mathew Wicks, Kimonas Sitorchos, Julius von Kohout
Component: Notebooks
Project 2: Rootless Kubeflow Container Images (Istio Ambient Mesh)
Kubeflow uses Istio as a service mesh, which by default requires “root level” network permissions for its init-containers. We want to reduce the number of privileged containers required to run Kubeflow, so we are investigating using the Istio CNI, and eventually the Istio Ambient mesh.
You would be involved in testing and investigating the impacts of these changes, and helping push the integration forward.
See the proposal for more information: https://github.com/kubeflow/manifests/blob/master/proposals/20200913-rootlessKubeflow.md
Skills required: Istio, Kubernetes, YAML
Difficulty: medium
Length: 175 hrs
Mentors: Kimonas Sitorchos, Julius von Kohout
Component: Notebooks
Project 3: Triage and Categorize Kubeflow GitHub Issues & PRs
The Kubeflow project needs help to triage, categorize, and highlight important Issues/PRs from the https://github.com/kubeflow/kubeflow GitHub repo. There are around 200 open Issues and 200 open PRs, in addition to many Issues/PRs that have been lost to time (closed automatically due to inactivity).
Specifically, your goal would be to:
- Decide which Issues/PRs are still relevant
- Categorize Issues/PRs by type
- De-duplicate multiple Issues for the same request
- Suggest which ones are the most important
- Help find “good first issues” for new members
- Review which PRs are likely safe to merge (especially dependabot ones)
Skills required: GitHub, Kubernetes, YAML, Python, Go, JS
Difficulty: medium
Length: 175 hrs
Mentors: Mathew Wicks, Kimonas Sitorchos, Julius von Kohout
Component: Notebooks/General
Project 4: Implement LLM Tuning API for Katib
Recently, we implemented a new train Python SDK API in the Kubeflow Training Operator to easily fine-tune LLMs on multiple GPUs with a predefined dataset provider, model provider, and HuggingFace trainer.
To continue our roadmap around LLMOps in Kubeflow, we want to give users the ability to tune the HyperParameters of LLMs using simple Python SDK APIs. This requires changes to the Katib Python SDK that allow users to set the model, dataset, and HyperParameters that they want to optimize for an LLM.
Skills required: Kubernetes, YAML, Python
Difficulty: medium
Length: 350 hrs
Mentors: Andrey Velichkevich, Johnu George, Yuan (Terry) Tang, Yuki Iwai
Component: Katib
Project 5: Support Distributed Jax for Training Operator
Open issue: https://github.com/kubeflow/training-operator/issues/1619
We want to integrate Jax in Training Operator to run distributed training and fine-tuning jobs on Kubernetes using the Jax ML framework. We need to create a new Kubernetes Custom Resource for Jax (e.g. JaxJob) and update the Training Operator controller to support it. Potentially, we can integrate Jax with the Training Operator Python SDK to give Data Scientists simple APIs to create JaxJob on Kubernetes.
Skills required: Kubernetes, Go, YAML, Python
Difficulty: medium
Length: 350 hrs
Mentors: Andrey Velichkevich, Johnu George, Yuan (Terry) Tang, Yuki Iwai
Component: Training Operator
Project 6: Push-based metrics collection for Katib
Open issue: https://github.com/kubeflow/katib/issues/577
Katib implements its Metrics Collector as a sidecar container that collects training metrics from the Trials. The Metrics Collector waits until the training container completes, then parses the training logs for metrics such as accuracy or loss, which serve as evaluation results for the HyperParameter tuning algorithm.
The sidecar approach does not work for every user, for example when their Trial resources executor doesn’t support sidecar containers. For such use cases, we want to add a new API to the Katib Python SDK that allows users to push metrics directly from their training scripts to the Katib DB.
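To make the difference between the two approaches concrete, here is a minimal Python sketch contrasting log parsing with a direct push. It is not the Katib SDK: the `name=value` log format is an assumption made for this example, and `parse_metrics_from_logs`, `InMemoryMetricsStore`, and `push_metrics` are hypothetical names, with the in-memory store standing in for the Katib DB.

```python
import re
from typing import Dict, List, Tuple

# Assumed log format: one "name=value" pair per line, e.g. "accuracy=0.93".
METRIC_LINE = re.compile(r"^\s*(?P<name>[\w-]+)=(?P<value>-?\d+(?:\.\d+)?)\s*$")


def parse_metrics_from_logs(log_text: str, metric_names: List[str]) -> Dict[str, List[float]]:
    """Pull style: scan finished training logs for the requested metrics."""
    found: Dict[str, List[float]] = {name: [] for name in metric_names}
    for line in log_text.splitlines():
        match = METRIC_LINE.match(line)
        if match and match.group("name") in found:
            found[match.group("name")].append(float(match.group("value")))
    return found


class InMemoryMetricsStore:
    """Hypothetical stand-in for the Katib DB; it only keeps rows in a list."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, str, float]] = []

    def insert(self, trial_name: str, metric_name: str, value: float) -> None:
        self.rows.append((trial_name, metric_name, value))


def push_metrics(store: InMemoryMetricsStore, trial_name: str, metrics: Dict[str, float]) -> None:
    """Push style: the training script reports values itself, no log parsing."""
    for metric_name, value in metrics.items():
        store.insert(trial_name, metric_name, float(value))
```

With the push style, a training loop would call something like `push_metrics(store, "trial-1", {"accuracy": 0.95})` after each evaluation step, so no sidecar needs to read the container's logs.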
Skills required: Kubernetes, Go, YAML, Python
Difficulty: medium
Length: 175 hrs
Mentors: Andrey Velichkevich, Johnu George, Yuan (Terry) Tang, Yuki Iwai
Component: Katib
Project 7: Automate docs generation for Kubeflow Python SDKs
Open issue: https://github.com/kubeflow/katib/issues/2081
The Training Operator and Katib SDKs have a valid docstring for each public API that users call. We want to automatically generate documentation for Kubeflow users from these docstrings, so users don’t need to read source code to understand API parameters.
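The general idea of generating documentation from docstrings can be sketched with Python's standard `inspect` module. This is only an illustration under stated assumptions: `render_api_docs` and `ExampleClient` are hypothetical names, the Markdown-like output format is arbitrary, and a real solution would more likely build on an established tool such as Sphinx autodoc.

```python
import inspect


def render_api_docs(cls) -> str:
    """Render the public methods of a class, with signatures and docstrings."""
    lines = [f"# {cls.__name__}", ""]
    class_doc = inspect.getdoc(cls)
    if class_doc:
        lines += [class_doc, ""]
    for name, member in inspect.getmembers(cls, predicate=inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private and dunder methods
        lines.append(f"## {name}{inspect.signature(member)}")
        lines.append("")
        lines.append(inspect.getdoc(member) or "(no docstring)")
        lines.append("")
    return "\n".join(lines)


class ExampleClient:
    """Hypothetical client used only to demonstrate the renderer."""

    def tune(self, name: str, max_trials: int = 10) -> None:
        """Start a tuning run.

        Args:
            name: Name of the run.
            max_trials: Upper bound on the number of trials.
        """

    def _internal(self) -> None:
        """Not part of the public API."""


if __name__ == "__main__":
    print(render_api_docs(ExampleClient))
```

Because the signature and the docstring are read from the live objects, the rendered text stays in sync with the code whenever the docs are regenerated.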
Skills required: Python
Difficulty: medium
Length: 90 hrs
Mentors: Andrey Velichkevich, Johnu George, Shivay Lamba, Yuan (Terry) Tang, Yuki Iwai
Component: Katib/Training Operator
Project 8: Support various parameter distributions like log-uniform in Katib
Open issue: https://github.com/kubeflow/katib/pull/2059
We need to enhance the Katib Experiment APIs to support various parameter distributions, such as uniform, log-uniform, and qlog-uniform, to bring Katib closer to other HyperParameter tuning frameworks like Hyperopt. Currently, Katib supports only the uniform distribution for integer, float, and categorical HyperParameters.
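For readers unfamiliar with these distributions, the sketch below shows one common way to sample them using only the Python standard library. It is not Katib or Hyperopt code: the function names are hypothetical, the bounds are given as actual values rather than logarithms, and clamping the result to the bounds is a choice made for this example.

```python
import math
import random


def sample_uniform(rng: random.Random, low: float, high: float) -> float:
    """Every value in [low, high] is equally likely."""
    return rng.uniform(low, high)


def sample_log_uniform(rng: random.Random, low: float, high: float) -> float:
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    if low <= 0 or high <= 0:
        raise ValueError("log-uniform bounds must be positive")
    value = math.exp(rng.uniform(math.log(low), math.log(high)))
    # Clamp to guard against floating-point drift at the edges.
    return min(max(value, low), high)


def sample_qlog_uniform(rng: random.Random, low: float, high: float, q: float) -> float:
    """Log-uniform draw rounded to the nearest multiple of q, then clamped."""
    value = sample_log_uniform(rng, low, high)
    quantized = round(value / q) * q
    return min(max(quantized, low), high)


if __name__ == "__main__":
    rng = random.Random(0)
    # A learning rate is a typical log-uniform parameter: each order of
    # magnitude between 1e-5 and 1e-1 is sampled about equally often.
    print([sample_log_uniform(rng, 1e-5, 1e-1) for _ in range(3)])
    print([sample_qlog_uniform(rng, 1, 100, 5) for _ in range(3)])
```

A plain uniform draw over the same learning-rate range would put about 90% of samples between 1e-2 and 1e-1, which is why log-scaled sampling is usually preferred for parameters that span several orders of magnitude.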
Skills required: Kubernetes, Python, Go, YAML
Difficulty: medium
Length: 350 hrs
Mentors: Andrey Velichkevich, Johnu George, Yuan (Terry) Tang, Yuki Iwai
Component: Katib
Project 9: PostgreSQL integration in Kubeflow Pipelines
Open issue: https://github.com/kubeflow/pipelines/issues/9813
Kubeflow Pipelines must store information about pipelines, experiments, runs, and artifacts in a database. Currently, the only database it supports is MySQL/MariaDB.
We plan to support PostgreSQL as an alternative to MySQL/MariaDB so users will be able to reuse existing databases, and PostgreSQL will be a good use case for supporting multiple databases.
Skills required: Kubernetes, Python, Go, YAML
Difficulty: medium
Length: 175 hrs
Mentors: Ricardo Martinelli, Shivay Lamba
Component: Pipelines
Project 10: Enhancing KF Model Registry Python client for seamless ML imports from alternative registries
We aim to extend the capabilities of the KF Model Registry Python client by enabling smooth imports from various machine learning registries. While import from HuggingFace is already implemented (and can be used as a basis), we seek to integrate support for MLflow and other popular registry formats.
Skills required: Python, ML model serialization formats, YAML, Kubernetes/Kubeflow as a plus
Difficulty: medium
Length: 175 hrs
Mentors: Matteo Mortari, Andrea Lamparelli
Component: Model Registry