DevOps or Reliability Engineer

Largeton Group

Minneapolis, MN

Job Summary - DevOps/Reliability Engineer

  • Support non-production environments by establishing and maintaining monitoring, alerting, and strict SLAs.
  • Set up and manage observability tools, including APM, logs, monitoring, alerts, and dashboards.
  • Utilize Datadog, AppDynamics, and Splunk for full-stack monitoring and logging.
  • Use ServiceNow for incident and problem record review and reporting.
  • Respond promptly to system issues and engage relevant teams for resolution.
  • Provide analytics and reporting on incident rates, SLA compliance, and system noise.
  • Participate in on-call rotations to ensure environment reliability.
  • Analyze and research incidents to identify trends, root causes, and areas for improvement.
  • Collaborate closely with cross-functional teams to resolve issues and improve system reliability.
  • Monitor and troubleshoot batch processes to ensure performance and uptime of scheduled jobs.
  • Ensure adherence to tight SLAs and escalate issues as required.
  • Work onsite at one of the following locations: Minneapolis, MN; Dallas, TX; Brookfield, WI.

Required Skills/Experience

  • 3+ years of experience with each of the following: Splunk, AppDynamics, Datadog, ServiceNow, and batch monitoring.
  • Strong experience in observability, system monitoring, and incident management.
  • Excellent analytical, communication, and problem-solving abilities.
