Senior Site Reliability Engineer

Staffing Lab LLC

Charlotte Metro

Staffing Lab is looking for a Senior Site Reliability Engineer for one of its clients. This position could be contract-to-hire or full-time. No C2C, no recruiters, no third parties.


We are seeking a Senior Site Reliability Engineer (SRE) with deep expertise in AWS networking, infrastructure automation, and production system reliability. This role demands a strong grasp of observability, operational excellence, and the ability to drive the adoption of DevOps/SRE best practices across engineering teams. You will be instrumental in shaping SLIs/SLOs, defining our DevOps maturity roadmap, and building robust, scalable infrastructure using Terraform, Lambda, Step Functions, and more.


You’ll be leading a team of SREs and collaborating closely with DevOps, Security, and Application teams to ensure reliable delivery and availability of services.


Key Responsibilities:

  • Lead and mentor a team of SREs in developing scalable infrastructure and operational processes.
  • Design and implement SLIs, SLOs, and Error Budgets across critical services and evangelize them across product teams.
  • Architect and manage AWS networking environments including VPCs, Transit Gateways, Route 53, VPNs, NACLs, and Security Groups.
  • Manage and monitor Palo Alto and FortiGate firewalls, and integrate them with cloud environments for hybrid network visibility.
  • Define and evolve DevOps maturity models, guiding teams toward higher automation and reliability.
  • Build and manage observability dashboards using Grafana, CloudWatch, and Datadog to track application and infrastructure health.
  • Implement and maintain Infrastructure as Code (IaC) using Terraform to automate cloud deployments across environments.
  • Develop and maintain serverless applications using AWS Lambda and Step Functions to support platform automation and operations.
  • Collaborate with developers to define GitLab CI/CD pipelines and streamline the build, test, and deployment lifecycle.
  • Champion incident response, blameless postmortems, and continuous improvement initiatives.
  • Write scripts in Python or Bash to automate tasks and integrate systems.


Required Qualifications:

  • 7+ years in SRE, DevOps, or Systems Engineering roles with increasing responsibility.
  • Proven experience managing AWS production environments with a focus on networking.
  • In-depth knowledge of Palo Alto and/or FortiGate firewall management and troubleshooting.
  • Expertise in monitoring and observability tools, including Grafana and Datadog.
  • Hands-on experience with Terraform in managing cloud infrastructure at scale.
  • Experience building and deploying serverless architectures using Lambda and Step Functions.
  • Demonstrated understanding of SLI/SLO design, error budgets, and reliability metrics.
  • Strong understanding of CI/CD principles and tools like GitLab CI/CD.
  • Proficiency in scripting using Python or Bash.


Preferred Qualifications:

  • AWS certifications (e.g., Solutions Architect, Advanced Networking, DevOps Engineer).
  • Familiarity with DevOps/SRE maturity models and implementing organizational transformation.
  • Experience with compliance frameworks (SOC 2, ISO 27001, etc.) as they pertain to infrastructure reliability.
  • Familiarity with container orchestration is a plus.


Soft Skills:

  • Strong leadership and mentoring capabilities.
  • Ability to translate complex technical problems into actionable initiatives.
  • Excellent communication and cross-functional collaboration skills.
  • Bias for automation and continuous improvement.
