SRE AKS

PwC Acceleration Centers

Bengaluru, Karnataka, India

Site Reliability Engineer (SRE), Azure Kubernetes Service (AKS) Specialist


We are looking for a skilled Site Reliability Engineer (SRE) specializing in Azure Kubernetes Service (AKS) to join our Cloud Operations team. The ideal candidate will focus on application deployments, code promotions, and AKS resource troubleshooting and administration, while also supporting a diverse IT stack that includes Azure Virtual Desktop/Intune, M365, Kafka on HDInsight, and MS SQL MI.

The candidate will help to ensure the uptime, reliability, and performance of our applications. This involves implementing comprehensive monitoring solutions and maintaining observability using tools like Azure Application Insights. To prepare applications for varying levels of demand, the candidate will design and deploy automated scaling solutions that dynamically adjust resources based on workload requirements. Additionally, they will leverage automation to streamline incident response, conduct thorough root cause analyses, and implement robust, long-term solutions to prevent recurring issues.

Essential Functions

  • Manage and automate application deployments and code promotions across environments.
  • Apply container orchestration and containerization technologies (Docker, Kubernetes) and infrastructure as code (IaC) tools such as Terraform or Bicep.
  • Troubleshoot and administer Azure Kubernetes Service (AKS) resources to ensure high availability and performance.
  • Collaborate with development teams to enhance CI/CD pipelines using tools like GitHub Actions and Helm.
  • Monitor and optimize system reliability, utilizing Azure Monitor and other observability tools.
  • Implement and manage network security and access control, using tools such as Palo Alto VM-Series VMSS.
  • Support incident response and root cause analysis for production issues.
  • Ensure compliance with security standards and best practices in cloud environments.
  • Participate in on-call rotations to provide 24/7 support for critical systems.


Minimum Requirements

  • Proven experience in a Site Reliability Engineer role with a focus on Azure AKS.
  • Strong understanding of container orchestration and management with Kubernetes.
  • Proficiency in CI/CD tools and practices, including GitHub Actions and Helm.
  • Experience with monitoring and observability tools like Azure Monitor.
  • Familiarity with network security and access management solutions.
  • Knowledge of scripting and automation using PowerShell or similar languages.
  • Excellent problem-solving skills and ability to work in a fast-paced environment.
  • Strong communication skills for effective collaboration with cross-functional teams.
  • Azure certifications (e.g., Azure Administrator, Azure DevOps Engineer) are a plus.
