Designing a Well-Architected Amazon EKS (Kubernetes) Cluster: Best Practices and Insights

Since 2012, AWS architects and engineers have developed the Well-Architected Framework, a collection of standardized guidelines and best practices designed to help customers assess and enhance the design and performance of their workloads on AWS Cloud infrastructure. These principles are broadly applied across AWS’s global infrastructure, tools, and services—ranging from EC2, ECS, S3, and Lambda to Elastic Beanstalk, Amazon DynamoDB, RDS, and over 200 other services within the AWS ecosystem.

The Pillars

The AWS Well-Architected Framework is made up of six pillars:

  • Reliability Pillar
  • Security Pillar
  • Cost Pillar
  • Performance Efficiency Pillar
  • Operational Excellence Pillar
  • Sustainability Pillar

This article was authored by my friend Veliswa Boya, who has conducted an in-depth exploration of the AWS Well-Architected Framework and how its various pillars can be effectively applied to your infrastructure.

Apply Well-Architected to EKS

Reliability Pillar

First, let us define the Reliability Pillar. This pillar ensures systems recover from failures, meet operational demands, and perform correctly over time through fault tolerance and recovery planning.

In Kubernetes, deploying and effectively using the following tools will improve the reliability of your workloads:

A backup tool like Velero lets you back up cluster resources and restore them after a failure, so the cluster can recover when something goes wrong. With autoscaling, the cluster can scale up during periods of high operational demand and scale down when demand is low.
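To make the scale-up and scale-down behavior concrete, here is a minimal sketch of the replica calculation documented for the Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance of 1. The function name, the default bounds, and the sample numbers are assumptions for illustration; the real controller also handles stabilization windows and missing metrics, which this sketch ignores.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation (illustrative helper, not the controller)."""
    ratio = current_metric / target_metric
    # When the metric is close enough to the target, keep the replica count as is.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical numbers: 4 replicas at 90% average CPU against a 60% target.
print(desired_replicas(4, 90, 60))  # 6
```

With the same target, 4 replicas at 30% average CPU would scale down to 2, which is the "low demand" half of the behavior described above.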

Security Pillar

First, let us define the Security Pillar. This pillar emphasizes protecting data, systems, and assets through risk assessment, proper identity management, encryption, and security monitoring.

In Kubernetes, security starts with controlling access to the API server, ideally by making its endpoint private. The following security standards and tools will further improve the security of your workloads:

  • Falco — https://falco.org/: helps to detect security threats in real-time
  • Amazon GuardDuty on EKS: AWS-managed threat detection for your EKS cluster.
  • Ensure Secure Configurations of Workloads.
  • Tigera: Security and observability for containers on Kubernetes.
  • Deploy Service Mesh to improve internal microservices communication security.
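As an illustration of what checking for secure workload configurations can look like, the sketch below inspects a pod spec (as a plain dictionary) for a few commonly recommended container `securityContext` settings. The field names are standard Kubernetes fields; the function name and the choice of checks are assumptions for this example. It only looks at container-level settings and is not a replacement for a policy engine or admission control.

```python
def insecure_settings(pod_spec):
    """Return a list of findings for containers with risky securityContext values."""
    findings = []
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext") or {}
        if sc.get("privileged", False):
            findings.append(f"{name}: privileged is true")
        # Treated as risky unless explicitly disabled.
        if sc.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: allowPrivilegeEscalation is not false")
        if not sc.get("runAsNonRoot", False):
            findings.append(f"{name}: runAsNonRoot is not true")
        if not sc.get("readOnlyRootFilesystem", False):
            findings.append(f"{name}: readOnlyRootFilesystem is not true")
    return findings


# Hypothetical pod spec with no securityContext at all.
for finding in insecure_settings({"containers": [{"name": "web"}]}):
    print(finding)
```

A check like this could run in CI against rendered manifests before they reach the cluster.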

Cost Pillar

First, let us define the Cost Pillar. This focuses on managing costs, avoiding unnecessary expenses, and maximizing return on investment by utilizing cost-effective resources.

Beyond the cloud provider's bill, the Kubernetes cluster itself can be over-provisioned with compute resources, which calls for continuous optimization to avoid waste. The following processes and tools can help optimize cost:

  • OpenCost/Kubecost: monitor and optimize your spending across all Kubernetes workloads.
  • Karpenter: optimize and right-size worker nodes for optimal cost.
  • Allocate resources based on each application's actual needs rather than arbitrary values.
  • Use tools like Robusta KRR and Goldilocks that help with resource recommendations for workloads.
  • Horizontal Pod Autoscaler: to help optimize resource consumption and thereby reduce the cost of running workloads.
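The idea of spotting over-provisioned workloads can be sketched in a few lines: compare what each workload requests with what it actually uses, and flag the ones with low utilization. The record layout (`cpu_request_m`, `cpu_used_m` in millicores), the 50% threshold, and the sample data are all hypothetical; the recommendation tools listed above rely on much richer usage history.

```python
def overprovisioned(workloads, threshold=0.5):
    """Return (name, utilization) for workloads using less than `threshold` of their CPU request."""
    flagged = []
    for w in workloads:
        requested = w["cpu_request_m"]
        if requested <= 0:
            continue  # no request set, nothing to compare against
        utilization = w["cpu_used_m"] / requested
        if utilization < threshold:
            flagged.append((w["name"], round(utilization, 2)))
    return flagged


# Hypothetical usage figures in millicores.
sample = [
    {"name": "api", "cpu_request_m": 1000, "cpu_used_m": 200},
    {"name": "worker", "cpu_request_m": 500, "cpu_used_m": 400},
]
print(overprovisioned(sample))  # [('api', 0.2)]
```

Here `api` uses only a fifth of what it requests, so its request is a candidate for lowering.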

Performance Efficiency Pillar

First, let us define the performance efficiency pillar. This involves using computing resources efficiently to meet system requirements, leveraging scalable solutions, and monitoring performance.

In Kubernetes, observability is key to ensuring that workloads behave efficiently. The following tools and processes can be used to ensure efficient performance of your Kubernetes cluster and workloads:

  • Container Insights on Amazon EKS: get insights on all components running in the EKS cluster
  • Service Mesh: with tools like AWS App Mesh, Istio, and Linkerd, you can observe the behavior of services running within the cluster and monitor traffic.
  • Experiment with new tools in releases from the Kubernetes.io community with a focus on security.

Operational Excellence Pillar

First, let us define the Operational Excellence Pillar. This focuses on running systems effectively, improving processes, and achieving business value through continuous improvement.

Operational Excellence in Kubernetes focuses on how to manage processes and improve collaboration and continuity of cluster operations. Some of the processes and tools that are essential to keep your cluster well-architected in this pillar are:

  • Implement IaC for end-to-end cluster management (CloudFormation, Terraform, Pulumi, etc.).
  • Use Kubernetes manifests for all forms of deployments, with Helm or Kustomize to manage deployment workflows.
  • Use Git as a single source of truth for all deployments and cluster management.
  • Create documentation and architecture diagrams that explain internal workings, tools deployed, and how they all tie together.
  • Full-stack observability with the LGTM stack (Loki, Grafana, Tempo, Mimir): observe the logs, metrics, and traces of pods and workloads running within your Kubernetes cluster.
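Treating Git as the single source of truth implies being able to tell when the cluster no longer matches what Git declares. Below is a minimal sketch of that comparison over flat dictionaries; the function name and the sample fields are assumptions, and real manifests are nested, so an actual check would need a recursive diff.

```python
def drift(desired, live):
    """Return {field: (desired_value, live_value)} for every field that differs."""
    keys = sorted(set(desired) | set(live))
    return {
        key: (desired.get(key), live.get(key))
        for key in keys
        if desired.get(key) != live.get(key)
    }


# Hypothetical values: Git declares 3 replicas, the cluster reports 5.
declared_in_git = {"replicas": 3, "image": "app:1.2"}
reported_by_cluster = {"replicas": 5, "image": "app:1.2"}
print(drift(declared_in_git, reported_by_cluster))  # {'replicas': (3, 5)}
```

An empty result means the live state matches the declared state for the fields being compared.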

Sustainability Pillar

First, let us define the Sustainability Pillar. This pillar focuses on minimizing the environmental impact of workloads by reducing energy consumption and improving resource efficiency.

Applying this to Kubernetes brings together Cost Optimization and Operational Excellence: allocating the optimal resources each application needs reduces the energy consumed by running workloads and improves their efficiency.

Conclusion

The AWS Well-Architected Framework can be applied to systems beyond AWS services. Applying it to Kubernetes can unlock a lot of value for your applications and your business, and it supports the proper management and stability of your solutions.

