About Unstructured

Unstructured builds open-source and commercial tools that enable developers to preprocess and transform unstructured data — PDFs, HTML, Word docs, images, and more — for AI/ML pipelines. Our solutions power production-grade, scalable generative AI use cases at leading enterprises.

We’re a team of builders obsessed with performance, simplicity, and reliability. If you’re excited by complex systems, cutting-edge ML infrastructure, and high-impact problems, we’d love to meet you.

The Role

We’re looking for a Site Reliability Engineer to help us scale our infrastructure, automate deployments, and ensure the reliability and performance of our systems as we grow. This role is critical to the health of our platform: you will work closely with Engineering, Product, and Customer teams to deliver resilient and efficient software systems for enterprise deployments.

You'll have the opportunity to work across a modern stack (Python, Kubernetes, Helm, CI/CD with GitHub Actions, etc.), influence infrastructure decisions from day one, and help shape reliability culture across the company.

What You'll Do

Design and implement highly available, scalable, and observable systems across our platform
Automate infrastructure with tools like Terraform and Pulumi, and build reusable CI/CD pipelines
Maintain and optimize Kubernetes clusters, container orchestration, and service mesh configurations
Set up and manage monitoring and alerting for performance, reliability, and uptime (e.g., Elastic, Prometheus, Grafana, Datadog)
Improve developer velocity through tooling, automation, and infrastructure improvements
Lead or support incident response, root cause analysis, and blameless postmortems
Partner with engineering teams on production readiness, capacity planning, and rollout strategies

What We're Looking For

4+ years of experience in an SRE, DevOps, or Infrastructure Engineering role
Deep expertise in cloud platforms (AWS, GCP, or Azure)
Hands-on experience with Kubernetes, Docker, and container orchestration
Strong skills in Linux systems, networking, and scripting (e.g., Bash, Python, Go)
Proficiency with Infrastructure-as-Code (Terraform, CloudFormation, Ansible, etc.)
Familiarity with monitoring, logging, and observability practices and tools
Experience supporting production systems and operating in high-scale environments

Bonus Points

Experience with machine learning infrastructure or data pipeline systems
Exposure to serverless or event-driven architectures
Contributions to open source projects or DevOps communities
Familiarity with security best practices for cloud-native environments

Why Join Us?

Remote-first team with flexible work style and async collaboration
Opportunity to own critical infrastructure at a fast-growing company
Work on impactful problems at the intersection of data and AI
Competitive salary, equity, and benefits package
Supportive, high-performance team culture
