How did we minimize the risk of outages during the k8s upgrade in a system that contained over 100 microservices?

Increased Efficiency and Productivity

GCP & AWS

Tooling

Golang, Terraform, eks-clt

Improved Customer Satisfaction Rates

6000 production containers

Cost Savings Achieved

GKE / EKS

Enhanced Security Measures Implemented

4 Engineers

Benefits

Maintenance cost reduction

Summary:

To minimize the risk of outages during k8s upgrades or maintenance, it's best to run multiple Kubernetes clusters in production. Even with a sizable number of nodes, relying on a single cluster is risky, because a failed upgrade or maintenance step affects every workload running on it. To ensure high availability, we adopted the paradigm of immutable infrastructure and established a fleet of independent Kubernetes clusters.
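
As a rough illustration of what immutable infrastructure means for a fleet of clusters, the sketch below plans a version change by replacing outdated clusters instead of upgrading them in place. It is a minimal sketch under assumptions: the Cluster type, the PlanRotation function, the cluster names, and the version strings are hypothetical and do not come from the project itself.

```go
package main

import "fmt"

// Cluster is a hypothetical record for one member of the fleet.
type Cluster struct {
	Name    string
	Version string // Kubernetes minor version, e.g. "1.29"
}

// PlanRotation returns the ordered steps for moving the fleet to the target
// version without upgrading any cluster in place. Each outdated cluster gets
// a freshly built replacement and is retired only after traffic has moved.
func PlanRotation(fleet []Cluster, target string) []string {
	var steps []string
	for _, c := range fleet {
		if c.Version == target {
			continue // already on the target version, nothing to do
		}
		replacement := c.Name + "-" + target
		steps = append(steps,
			"create "+replacement,
			"shift traffic "+c.Name+" -> "+replacement,
			"retire "+c.Name,
		)
	}
	return steps
}

func main() {
	fleet := []Cluster{
		{Name: "prod-a", Version: "1.28"},
		{Name: "prod-b", Version: "1.29"},
	}
	for _, step := range PlanRotation(fleet, "1.29") {
		fmt.Println(step)
	}
}
```

Because the old cluster stays untouched until its replacement is serving traffic, a problem with the new version can be handled by keeping the old cluster in rotation.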

Challenges:

  • The production system contained over 100 microservices

  • During peak hours there are over 6000 containers in the cluster

  • Standard auto-discovery is too slow to keep up with 3000 changes

Solution:

  • Streamlined cluster creation with an internal CLI (cli eks create [role])

  • Automated deployment with a GitOps model to get clusters up and running

  • Autoscaling capabilities to ensure clusters can independently handle traffic

  • Redesigned service discovery across the infrastructure to avoid 502 errors while sunsetting clusters
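
The page does not describe how service discovery was redesigned. One common way to avoid 502 errors while a cluster is being sunset is to stop routing new requests to it first and delete it only after its in-flight requests have finished. The sketch below shows that idea under assumptions: the Backend type, Pick, SafeToRemove, and the cluster names are hypothetical and are not the project's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a hypothetical entry in a cross-cluster routing table.
type Backend struct {
	Cluster  string
	Draining bool // true once the cluster is scheduled for sunset
	InFlight int  // requests currently being served by this cluster
}

// ErrNoBackend is returned when every cluster is draining.
var ErrNoBackend = errors.New("no cluster accepts new requests")

// Pick returns the non-draining backend with the fewest in-flight requests.
// Draining clusters never receive new requests.
func Pick(backends []*Backend) (*Backend, error) {
	var best *Backend
	for _, b := range backends {
		if b.Draining {
			continue
		}
		if best == nil || b.InFlight < best.InFlight {
			best = b
		}
	}
	if best == nil {
		return nil, ErrNoBackend
	}
	return best, nil
}

// SafeToRemove reports whether a cluster can be deleted without cutting off
// requests that are still being served.
func SafeToRemove(b *Backend) bool {
	return b.Draining && b.InFlight == 0
}

func main() {
	oldCluster := &Backend{Cluster: "prod-old", Draining: true, InFlight: 2}
	newCluster := &Backend{Cluster: "prod-new", InFlight: 5}

	target, err := Pick([]*Backend{oldCluster, newCluster})
	if err != nil {
		fmt.Println("routing failed:", err)
		return
	}
	fmt.Println("route new request to:", target.Cluster)
	fmt.Println("old cluster safe to remove:", SafeToRemove(oldCluster))
}
```

The key design choice is separating "stop sending traffic" from "remove from the fleet", so a cluster is never deleted while it still holds open requests.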

Address:

Let's Go DevOps Sp z o.o.
Zamknięta Str. 10/1.5
30-554 Cracow, Poland

We’re your partner in expert DevOps

Let’s Talk

How can we optimize your costs?

© 2024 Let’s Go DevOps. All rights reserved.
