Senior Site Reliability Engineer

Glocomms

San Jose, CA

$200,000.00 - $315,000.00

Site Reliability Engineer

At the intersection of machine learning and large-scale infrastructure, the SRE team for our Applied Machine Learning group is redefining how intelligent systems operate at global scale. We blend the principles of software engineering with systems reliability to keep our AI and recommendation systems resilient, high-performing, and ever-evolving.

As a Site Reliability Engineer on this team, you'll be hands-on with some of the most advanced AI technologies, helping architect, maintain, and scale machine learning platforms that serve millions, if not billions, of users. You'll also play a critical role in optimizing system performance, making hardware and capacity recommendations, and automating everything possible.

What You'll Do:

  • Ensure our ML systems run smoothly, efficiently, and reliably, no matter how complex or large they get.

  • Dive deep into the guts of distributed systems to identify and resolve bottlenecks before they become outages.

  • Contribute to and lead the automation of infrastructure, pipelines, and operational routines.

  • Collaborate with engineering and hardware teams on capacity planning, architecture choices, and performance tuning.

What You Bring:

  • Deep knowledge of distributed systems and the experience to troubleshoot them with precision.

  • A Bachelor's or Master's degree in Computer Science or a closely related field focused on software development or systems engineering.

  • Solid programming chops in at least one of the following: Python, C/C++, or Go.

  • Strong foundation in algorithms, data structures, and computer science fundamentals.

Preferred Extras:

  • Experience designing and operating high-scale, high-availability systems.

  • Passion for writing clean, optimized code and automating away manual tasks.

  • Prior SRE experience in large distributed production environments.
