Kubeflow is an open-source platform for machine learning and MLOps on Kubernetes introduced by Google. The different stages in a typical machine learning lifecycle are represented with different software components in Kubeflow, including model development (Kubeflow Notebooks[4]), model training (Kubeflow Pipelines,[5] Kubeflow Training Operator[6]), model serving (KServe[a][7]), and automated machine learning (Katib[8]).
Original author(s) | |
---|---|
Developer(s) | Kubeflow Contributors[1] - AWS, Bloomberg, Google, IBM, NVIDIA, Nutanix, Red Hat, Arrikto, and others |
Initial release | April 5, 2018[2] |
Stable release | 1.8[3] / November 1, 2023 |
Repository | github |
Written in | Go, Python |
Platform | Kubernetes |
Type | Machine Learning Platform |
License | Apache License 2.0 |
Website | kubeflow |
Each component of Kubeflow can be deployed separately, and it is not a requirement to deploy every component.[9]
The Kubeflow project was first announced at KubeCon + CloudNativeCon North America 2017 by Google engineers David Aronchick, Jeremy Lewi, and Vishnu Kannan[10] to address a perceived lack of flexible options for building production-ready machine learning systems.[11] The project has also stated it began as a way for Google to open-source how they ran TensorFlow internally.[12]
The first release of Kubeflow (Kubeflow 0.1) was announced at KubeCon + CloudNativeCon Europe 2018.[13][14] Kubeflow 1.0 was announced in March 2020 in a public blog post stating that many Kubeflow components were graduating to a "stable status", indicating they were now ready for production usage.[15]
In October 2022, Google announced that the Kubeflow project had applied to join the Cloud Native Computing Foundation.[16][17] In July 2023, the foundation voted to accept Kubeflow as an incubating stage project.[18][19]
Machine learning models are developed in the notebooks component called Kubeflow Notebooks. The component runs web-based development environments inside a Kubernetes cluster, with native support for Jupyter Notebook, Visual Studio Code, and RStudio.[20]
Once developed, models are trained in the Kubeflow Pipelines component. The component acts as a platform for building and deploying portable, scalable machine learning workflows based on Docker containers.[21] Google Cloud Platform has adopted the Kubeflow Pipelines DSL within its Vertex AI Pipelines product.[22]
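The workflow idea described above can be sketched in plain Python. This is an illustration only and does not use the Kubeflow Pipelines SDK: the step names, the `example.com` image references, and the `execution_order` helper are hypothetical. The sketch assumes that a pipeline is a set of containerized steps with dependencies between them, and computes an order in which the steps could run.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each step names a container image and the steps
# that must finish before it starts. None of these images exist.
steps = {
    "preprocess": {"image": "example.com/preprocess:1.0", "after": []},
    "train": {"image": "example.com/train:1.0", "after": ["preprocess"]},
    "evaluate": {"image": "example.com/evaluate:1.0", "after": ["train"]},
}


def execution_order(pipeline_steps):
    """Return step names in an order that respects every dependency."""
    graph = {name: set(spec["after"]) for name, spec in pipeline_steps.items()}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(steps)
for name in order:
    print(f"run container {steps[name]['image']} for step {name}")
```

In a real deployment the pipeline would be written with the Kubeflow Pipelines DSL and scheduled on the cluster; the sketch only shows why expressing steps and their dependencies explicitly makes a workflow portable.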
For certain machine learning models and libraries, the Kubeflow Training Operator component provides support through Kubernetes custom resources. The component runs distributed or non-distributed TensorFlow, PyTorch, Apache MXNet, XGBoost, and MPI training jobs on Kubernetes.[6]
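As a rough illustration of what such a custom resource can look like, the following sketch builds a `PyTorchJob` manifest as a Python dictionary and prints it as JSON, which Kubernetes accepts alongside YAML. The field names reflect the Training Operator `kubeflow.org/v1` API as best recalled and should be checked against the current documentation; the job name, the image reference, and the `pytorch_job` helper are hypothetical.

```python
import json


def pytorch_job(name, image, workers):
    """Build a PyTorchJob manifest with one master and `workers` worker replicas."""

    def replica(count):
        # Every replica runs the same (hypothetical) training image.
        return {
            "replicas": count,
            "restartPolicy": "OnFailure",
            "template": {
                "spec": {"containers": [{"name": "pytorch", "image": image}]}
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "PyTorchJob",
        "metadata": {"name": name},
        "spec": {
            "pytorchReplicaSpecs": {
                "Master": replica(1),
                "Worker": replica(workers),
            }
        },
    }


job_manifest = pytorch_job("mnist-example", "example.com/mnist-train:latest", workers=2)
print(json.dumps(job_manifest, indent=2))
```

The printed JSON could be saved to a file and submitted with `kubectl apply -f`, assuming the Training Operator is installed in the cluster.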
The KServe component (previously named KFServing[23]) provides Kubernetes custom resources for serving machine learning models on arbitrary frameworks including TensorFlow, XGBoost, scikit-learn, PyTorch, and ONNX.[24] KServe was developed collaboratively by Google, IBM, Bloomberg, NVIDIA, and Seldon.[23] Publicly disclosed adopters of KServe include Bloomberg,[25] Gojek,[26] and others.[27]
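A minimal sketch of such a serving resource, again built as a Python dictionary and emitted as JSON, might look like the following. The structure follows the KServe `serving.kserve.io/v1beta1` `InferenceService` schema as best recalled; the service name, the storage URI, and the `inference_service` helper are hypothetical, and the exact fields should be verified against the KServe documentation.

```python
import json


def inference_service(name, model_format, storage_uri):
    """Build an InferenceService manifest for a single predictor."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                "model": {
                    # The model format tells KServe which serving runtime to select.
                    "modelFormat": {"name": model_format},
                    # Hypothetical location of the stored model artifact.
                    "storageUri": storage_uri,
                }
            }
        },
    }


isvc_manifest = inference_service(
    "sklearn-iris", "sklearn", "gs://example-bucket/models/iris"
)
print(json.dumps(isvc_manifest, indent=2))
```

Changing `model_format` to another supported framework name is, under these assumptions, the only change needed to serve a model from a different library.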
Lastly, Kubeflow includes Katib, a component for automated training and development of machine learning models. It is described as a Kubernetes-native project and features hyperparameter tuning, early stopping, and neural architecture search.[28]
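To make hyperparameter tuning and early stopping concrete, the following plain Python sketch runs a random search over a learning rate and stops unpromising trials early. It does not use Katib and is not Katib's implementation: the objective function is a toy stand-in for a real training run, and the stopping rule is a simplified median-style rule (stop a trial whose current metric falls below the median of earlier trials at the same step), chosen here only for illustration.

```python
import random
from statistics import median


def toy_objective(lr, step):
    """Hypothetical validation accuracy after `step` steps; peaks when lr is near 0.1."""
    quality = max(0.0, 1.0 - abs(lr - 0.1) * 5)
    return quality * step / (step + 2)


def random_search(num_trials=8, max_steps=10, seed=0):
    """Random search over lr with a simplified median-style early stopping rule."""
    rng = random.Random(seed)
    history = []  # one list of intermediate metrics per finished or stopped trial
    best = (None, float("-inf"))  # (learning rate, last observed metric)
    for _ in range(num_trials):
        lr = rng.uniform(0.001, 0.5)
        curve = []
        for step in range(1, max_steps + 1):
            curve.append(toy_objective(lr, step))
            # Metrics that earlier trials reported at this same step.
            peers = [h[step - 1] for h in history if len(h) >= step]
            # Stop early once enough peers exist and this trial lags their median.
            if len(peers) >= 3 and curve[-1] < median(peers):
                break
        history.append(curve)
        if curve[-1] > best[1]:
            best = (lr, curve[-1])
    return best, history


best_trial, trial_history = random_search()
stopped = sum(1 for curve in trial_history if len(curve) < 10)
print(f"best lr={best_trial[0]:.3f}, metric={best_trial[1]:.3f}, stopped early: {stopped}")
```

The first three trials always run to completion because the rule needs at least three earlier curves to compare against; later trials may be cut short, which is the saving that early stopping is meant to provide.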
Version | Release Date | Release Information | Release Blog |
---|---|---|---|
Kubeflow 0.1 | 5 April 2018[2] | - | https://kubernetes.io/blog/2018/05/04/announcing-kubeflow-0.1/ |
Kubeflow 0.2 | 2 July 2018[29] | - | https://medium.com/kubeflow/kubeflow-0-2-offers-new-components-and-simplified-setup-735e4c56988d |
Kubeflow 0.3 | 5 October 2018[30] | - | https://medium.com/kubeflow/kubeflow-0-3-simplifies-setup-improves-ml-development-98b8ca10bd69 |
Kubeflow 0.4 | 8 January 2019[31] | - | https://medium.com/kubeflow/kubeflow-0-4-release-enhancements-for-machine-learning-productivity-d77c54df07a9 |
Kubeflow 0.5 | 9 April 2019[32] | - | https://medium.com/kubeflow/kubeflow-v0-5-simplifies-model-development-with-enhanced-ui-and-fairing-library-78e19cdc9f50 |
Kubeflow 0.6 | 19 July 2019[33] | https://www.kubeflow.org/docs/releases/kubeflow-0.6/ | https://medium.com/kubeflow/kubeflow-v0-6-a-robust-foundation-for-artifact-tracking-data-versioning-multi-user-support-9896d329412c |
Kubeflow 0.7 | 17 October 2019[34] | https://www.kubeflow.org/docs/releases/kubeflow-0.7/ | https://medium.com/kubeflow/kubeflow-v0-7-delivers-beta-functionality-in-the-leadup-to-v1-0-1e63036c07b8 |
Kubeflow 1.0 | 20 February 2020[35] | https://www.kubeflow.org/docs/releases/kubeflow-1.0/ | https://blog.kubeflow.org/releases/2020/03/02/kubeflow-1-0-cloud-native-ml-for-everyone |
Kubeflow 1.1 | 31 July 2020[36] | https://www.kubeflow.org/docs/releases/kubeflow-1.1/ | https://blog.kubeflow.org/release/official/2020/07/31/kubeflow-1.1-blog-post |
Kubeflow 1.2 | 18 November 2020[37] | https://www.kubeflow.org/docs/releases/kubeflow-1.2/ | https://blog.kubeflow.org/release/official/2020/11/18/kubeflow-1.2-blog-post |
Kubeflow 1.3 | 23 April 2021[38] | https://www.kubeflow.org/docs/releases/kubeflow-1.3/ | https://blog.kubeflow.org/kubeflow-1.3-release/ |
Kubeflow 1.4 | 12 October 2021[39] | https://www.kubeflow.org/docs/releases/kubeflow-1.4/ | https://blog.kubeflow.org/kubeflow-1.4-release/ |
Kubeflow 1.5 | 10 March 2022[40] | https://www.kubeflow.org/docs/releases/kubeflow-1.5/ | https://blog.kubeflow.org/kubeflow-1.5-release/ |
Kubeflow 1.6 | 7 September 2022[41] | https://www.kubeflow.org/docs/releases/kubeflow-1.6/ | https://blog.kubeflow.org/kubeflow-1.6-release/ |
Kubeflow 1.7 | 29 March 2023[42] | https://www.kubeflow.org/docs/releases/kubeflow-1.7/ | https://blog.kubeflow.org/kubeflow-1.7-release/ |
Kubeflow 1.8 | 1 November 2023[3] | https://www.kubeflow.org/docs/releases/kubeflow-1.8/ | https://blog.kubeflow.org/kubeflow-1.8-release/ |