Senior Site Reliability Engineer

Optomi, in partnership with one of our premier clients, is seeking a Senior Site Reliability Engineer (SRE) to join our Data Platform team within a leading global media organization. In this mission-critical role, you’ll design, scale, and maintain the infrastructure powering data products and real-time insights across digital and physical experiences. This position sits at the intersection of DevOps, data engineering, and platform reliability, and works closely with cross-functional teams to ensure the scalability, observability, and reliability of high-throughput data systems.

You’ll drive innovation across petabyte-scale data pipelines using automation, infrastructure-as-code, and cloud-native technologies—reducing operational overhead, improving incident response, and unlocking greater velocity for data-driven products.

What the right candidate will enjoy

A mission-critical role shaping the backbone of real-time data products at a global media leader
Full remote flexibility with a high-impact team working on cutting-edge infrastructure
Hands-on work with petabyte-scale pipelines and cloud-native tooling
A solid runway—initial 6-month contract with likely long-term extension

Required Qualifications

6+ years in software engineering focused on SRE, DevOps, or platform infrastructure
Fluent in Python and one statically typed language (e.g., Go, Java, TypeScript)
Deep AWS experience: Lambda, ECS/EKS, Kinesis, S3, IAM, SNS/SQS, API Gateway
Strong background in distributed systems at scale
Expert in observability: metrics, logs, traces, and system health design
Skilled with Terraform, AWS CDK, and CI/CD automation
Comfortable working with SQL/NoSQL data systems and understanding architectural trade-offs
History of managing SLAs, SLOs, SLIs, and leading incident response
Strong communication skills across teams and functions

Nice to Have

Real-time data infra or analytics pipeline experience
Datadog and serverless observability chops
Experience with performance tuning, distributed tracing, and post-incident retros
Proven impact on system reliability metrics (MTTR, MTTD, deployment cadence)
Understanding of cloud data compliance and security best practices
Bonus points for media, streaming, or high-availability consumer tech backgrounds
