Properly tuning Kubernetes microservice applications is a daunting task even for experienced performance engineers and SREs, often leaving companies facing reliability and performance issues, as well as unexpected costs.
In this session, we first explain Kubernetes resource management and autoscaling mechanisms, and why properly setting pod resources and autoscaling policies is critical to avoiding over-provisioning and protecting the bottom line.
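For context, the pod resource settings and autoscaling policies discussed here are declared in Kubernetes manifests. A minimal sketch of the two knobs involved, container resource requests/limits and a Horizontal Pod Autoscaler, follows; all names and values are illustrative and not taken from the session:

```yaml
# Hypothetical Deployment: requests are what the scheduler reserves,
# limits are hard caps enforced at runtime (CPU throttling, OOM kills).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: app
          image: demo-app:latest
          resources:
            requests:
              cpu: "500m"
              memory: "256Mi"
            limits:
              cpu: "1"
              memory: "512Mi"
---
# Hypothetical HPA scaling the Deployment on average CPU utilization,
# measured against the CPU *request* above.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Note that because HPA utilization targets are computed relative to requests, the resource settings and the autoscaling policy interact: tuning one without the other is a common source of over-provisioning.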
We discuss a real-world case of a digital provider of accounting & invoicing services for SMB clients. We demonstrate how ML-based optimization techniques allowed the SRE team and service architects to automatically tune the pod configuration and dramatically reduce the associated Kubernetes cost. We also describe the results of incorporating resilience-related objectives into the optimization goals.
Finally, we propose a general approach to tune pods and autoscaling policies for Kubernetes applications.
Stefano Doni is a recognized performance expert and frequent speaker at related industry venues, including the Computer Measurement Group, the Facebook Performance Summit, the Neotys Performance Advisory Council, and MongoDB Performance Tech Talks. In 2015, he received the CMG Best Paper Award for work on performance optimization of Java applications. Stefano has more than 15 years of hands-on experience in performance tuning and capacity planning, starting in 2006 with a master's thesis on queueing network modeling. Through many international capacity and performance projects, he made major contributions to the development of one of the leading capacity management products (later acquired by BMC).
In 2019, Stefano co-founded Akamas, a software company devoted to automating performance optimization by leveraging ML techniques, where he leads development and research as CTO (see the VLDB 2021 paper: http://vldb.org/pvldb/vol14/p1401-cereda.pdf).
From Stefano: “I’ve always loved sharing my findings and insights with the wider performance engineering community! I recently started a series of blog posts on the Akamas website (see https://www.akamas.io/category/java-tuning-insights/ and https://www.akamas.io/kubernetesoptimization-costs-slo/).”
IMPACT 2022 Session Video:
To view the video you must have a CMG membership or have registered as an IMPACT 2022 attendee. Sign up today! For existing members, sign in here.