Questions and Answers
Hello everyone, we’re glad to be here. We are happy to talk about the book if you have questions. Thanks to Francis Terence Amit for the opportunity.
Thanks for joining us!
Just curious, why should we do ML in Kubernetes at all? It seems more difficult than using a more specialized solution like SageMaker.
K8s provides a compatibility layer that lets you move across clouds. And if you are on-premises (because of regulations, for example), K8s brings cloud-like scaling benefits to on-premises workloads.
Good question. Apart from cost and vendor lock-in, Kubernetes allows you to use the same environment for development, experiments, and production. This means a unified experience and standardized deployment and operational tasks. You can take advantage of the open source observability tools that work well on Kubernetes. Kubernetes runs on edge devices as well, whether that’s a vehicle or a satellite.
There are products that can manage multiple Kubernetes instances and Kubernetes deployments in one place. Traditional software engineering teams are already gaining benefits from this through DevOps. Why not bring these benefits to data science and ML workloads?
What do you think of KServe and other tools on top of Kubernetes? Are they making our life easier or actually only complicating it?
I can’t comment much on KServe as I have not personally used it. It looks promising, but I guess it’s too early to say. We use Seldon Core for model serving in the book.
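To give a rough feel for what serving with Seldon Core looks like, here is a minimal sketch (not taken from the book) of calling a deployed model over the Seldon v1 REST protocol. The deployment name, namespace, and base URL are illustrative assumptions; the exact URL depends on how the deployment is exposed.

```python
# Minimal sketch: calling a Seldon Core v1 prediction endpoint.
# Assumes a SeldonDeployment named "iris-model" already running in the
# "models" namespace, exposed (ingress or port-forward) at BASE_URL.
import requests

BASE_URL = "http://localhost:8080"  # illustrative; depends on your ingress setup
ENDPOINT = f"{BASE_URL}/seldon/models/iris-model/api/v1.0/predictions"

payload = {"data": {"ndarray": [[5.1, 3.5, 1.4, 0.2]]}}  # one feature row

response = requests.post(ENDPOINT, json=payload, timeout=10)
response.raise_for_status()
print(response.json())  # prediction comes back in the same Seldon "data" format
```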
Should data scientists learn about Kubernetes? 😉 If so, how much should they learn, only the high level?
Data scientists need not learn the details of how Kubernetes works, but they should at least know how to use it. The ML platform that we use in the book allows data scientists and data engineers to take advantage of the self-service features of Kubernetes: they can create Jupyter notebooks on demand, schedule jobs, and manage versions of experiments, parameters, and models.
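As a hedged sketch of what "schedule jobs" can look like programmatically (not the book's own example), the snippet below submits a one-off training Job with the official `kubernetes` Python client. The image name, namespace, and command are placeholder assumptions.

```python
# Minimal sketch: submitting a one-off training Job via the Kubernetes Python client.
# Image, namespace, and command are hypothetical placeholders.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in a pod

container = client.V1Container(
    name="train",
    image="registry.example.com/ml/train:latest",  # hypothetical image
    command=["python", "train.py", "--epochs", "10"],
)

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="train-job"),
    spec=client.V1JobSpec(
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(containers=[container], restart_policy="Never")
        ),
        backoff_limit=1,
    ),
)

client.BatchV1Api().create_namespaced_job(namespace="data-science", body=job)
```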
Is there a way to auto-scale GPU load on Kubernetes? Say, for transformer inference?
There are some solutions. The NVIDIA GPU Operator is available and helps you virtualise GPUs on Kubernetes.
Run.ai is doing some fantastic work in this domain. Check it out
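For a concrete (and hypothetical) starting point: once the GPU Operator has set up the drivers and device plugin, workloads request GPU capacity through the `nvidia.com/gpu` resource, and autoscaling can be layered on top separately (for example, an HPA or a queue-based scaler adjusting the replica count). A minimal sketch using the Kubernetes Python client, with illustrative image and names:

```python
# Minimal sketch: a Deployment whose pods each request one NVIDIA GPU.
# Image and names are illustrative; autoscaling would be configured separately.
from kubernetes import client, config

config.load_kube_config()

container = client.V1Container(
    name="transformer-inference",
    image="registry.example.com/ml/transformer-serve:latest",  # hypothetical image
    resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "1"}),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="transformer-inference"),
    spec=client.V1DeploymentSpec(
        replicas=1,  # an autoscaler can adjust this based on load
        selector=client.V1LabelSelector(match_labels={"app": "transformer-inference"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "transformer-inference"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="inference", body=deployment)
```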
Does the book cover the cloud versions of Kubernetes (AWS EKS, GCP GKE)?
We used Minikube in the book, which means it will work on any Kubernetes flavor: EKS, GKE, AKS, OpenShift, etc. Some network configurations may be a little different, though.
Thanks for your reply!
Hi Ross Brigoli and Machine Learning in Production, I’ve got a few questions:
- I’d like to know if your book touches on using tools like Terraform.
- Does the book touch on registering Kubernetes clusters on cloud platforms, particularly AWS and maybe GCP?
- This may be a bit off, but can you point to reasons why one would use a Kubernetes cluster + containers over tools like AWS Lambda, which might seem easier and faster?
- Finally, what workflow tools do you recommend for deploying ML workflows on Kubernetes, and does your book touch on that? I know there are a few, but I’d like to know the pros and cons and your best recommendations.
Thank you!
Great questions, onyeka okonji.
1- The book does not use Terraform. Kubernetes is our abstraction layer for this book.
2- The book does not cover provisioning K8s in the cloud. We want to focus on the ML toolset and not the underlying platform. If you want to know more about that, check out my other book at https://www.packtpub.com/product/the-kubernetes-workshop/9781838820756
3- K8s provides a common abstraction layer that helps you run your solution on-premises or on any major cloud vendor. This portability was the main reason we focused on K8s.
4- The book covers Airflow. Airflow provides the workflow component required for data pipelines, the model lifecycle, and more (a minimal DAG sketch follows after this answer).
Let us know if you have any further comments.
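To make item 4 a bit more concrete, here is a minimal, hedged sketch of an Airflow DAG; it is not lifted from the book, and the task names and functions are illustrative placeholders.

```python
# Minimal sketch of an Airflow DAG: extract -> train -> evaluate.
# Task logic is a placeholder; in a Kubernetes setup each step would typically
# run in its own container (e.g. via the KubernetesPodOperator).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pulling raw data")  # placeholder


def train():
    print("training model")  # placeholder


def evaluate():
    print("evaluating model")  # placeholder


with DAG(
    dag_id="ml_pipeline_sketch",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_train = PythonOperator(task_id="train", python_callable=train)
    t_evaluate = PythonOperator(task_id="evaluate", python_callable=evaluate)

    t_extract >> t_train >> t_evaluate
```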
Thank you for the reply, Machine Learning in Production. Your response to Q4 got me excited, as Airflow + K8s is something I’m interested in.
I’ll also be looking out for the other book you recommended. Thanks again 🙏
Q/ Ross Brigoli, do you think K8s should be the primary tool for those who are exploring the FinOps (optimising cloud costs) space? What advantages does K8s provide over alternative tools?
Some Kubernetes distros have cost management features. For example, OpenShift has a Cost Management service: https://www.redhat.com/en/about/videos/overview-cost-management
However, this is not exclusive to Kubernetes clusters. Each cloud vendor has its own features for cloud cost optimization, so I personally don’t think that Kubernetes has a significant advantage in the FinOps space.
We have proposed a platform that covers data pipelines, model training, analytics, and model execution. These components are generally required to convert raw data into models. If you only need to deploy a single model without worrying about the data lifecycle and how teams collaborate, Kubernetes may not be the right choice.
Got it. Thank you for your reply!
Hi, Ross Brigoli and Machine Learning in Production! I have a question. It seems that Google Cloud Run is good enough to run a simple ML service. Are there any criteria for deciding when it’s time to use Kubernetes?