Kashish Lakhara
AWS SAA Certified DevOps Engineer, managing multiple production Kubernetes clusters across AWS, GCP, Azure, and on-premise environments, building resilient infrastructure with GitOps, observability pipelines, and cloud-native tooling.
Passionate about distributed systems, infrastructure reliability, and the open-source cloud-native ecosystem.
About
AWS SAA Certified DevOps Engineer managing 15+ production Kubernetes clusters across AWS EKS, GCP GKE, Azure AKS, and on-premise environments including air-gapped clusters bootstrapped with kubeadm. Strong focus on infrastructure automation with Terraform and Ansible, GitOps workflows with FluxCD, and observability pipelines using Prometheus, Grafana, and OpenSearch.
Comfortable working deep in the stack, from bare-metal cluster bootstrapping and etcd operations to Kafka, VerneMQ MQTT, and Longhorn distributed storage. Currently learning Go with a focus on contributing to the OpenTelemetry collector-contrib project.
name: Kashish Lakhara
role: DevOps Engineer @ Cloud Solitaire Technologies
location: Ahmedabad, Gujarat, India
aws_cert: Solutions Architect Associate
experience:
- Multi-cloud Kubernetes
- Prometheus · Grafana · Alertmanager
- Longhorn · CrateDB · Patroni
- FluxCD · Terraform · Ansible
- HAProxy · MetalLB · Proxmox VE
Experience
DevOps Engineer
Cloud Solitaire Technologies · Ahmedabad, Gujarat
- Automated bare-metal Kubernetes cluster setup end-to-end with Ansible and kubeadm, covering control-plane bootstrap, etcd init, worker join, CNI install, and kube-reserved memory and eviction thresholds for production-grade on-premise deployments.
- Resolved etcd disk exhaustion in production via defragmentation, then rolled out automated compaction across all on-prem clusters, eliminating the issue before it could cause an outage.
- Deployed HAProxy + Keepalived for bare-metal HA with virtual IPs across master nodes; validated failover end-to-end on Proxmox.
- Diagnosed KubeAPIErrorBudgetBurn alerts and traced them to etcd I/O errors via smartctl; identified a failing 8-year-old HDD, coordinated live SSD replacement, and migrated all workloads without downtime.
- Created a Kubernetes CronJob to auto-renew API server certificates before expiry, preventing cluster authentication failures in on-premise environments where managed cert rotation isn't available.
- Set up Azure infrastructure from scratch with Terraform (AKS, Application Gateway for Containers, Azure Front Door) and wrote Azure DevOps pipelines deploying to AKS via self-hosted runners.
- Migrated production infrastructure from on-premise to AWS EKS, implemented Karpenter for node autoscaling, and analysed EC2 usage patterns to evaluate Savings Plans post-migration.
- Built centralised logging on GKE with OpenSearch and Fluent Bit, using namespace-isolated indexes for Kubernetes events, application logs, and kube-system logs.
- Configured Alertmanager routing to Microsoft Teams with alert severity segregation across 15+ clusters, giving clients a single pane of glass for infrastructure health.
Stack
Kubernetes & Containers
Cloud
Observability
GitOps & CI/CD
Storage & Data
Networking
Infrastructure
Get In Touch
Open to collaborating on interesting projects and technical challenges.
Email me directly at kashishlakhara04@gmail.com