Total Years of Experience: 6+ years

Relevant Years of Experience: 5+ years

Good To Have

Reviewing system performance metrics and addressing any anomalies.
Leading incident response calls and coordinating with relevant teams.
Meeting with stakeholders to discuss reliability goals and progress.
Developing scripts and automation tools for system maintenance tasks.
Conducting training sessions for team members on best practices.
Planning and executing system upgrades and infrastructure improvements.

Detailed Job Description

Monitoring and Performance: Setting up and maintaining monitoring tools and dashboards to track system performance and detect issues proactively.
Team Leadership: Leading and mentoring the SRE team, ensuring they have the resources and guidance needed to perform their roles effectively.
System Design and Architecture: Overseeing the design and architecture of reliable systems, ensuring scalability, fault tolerance, and high availability.
Incident Management: Coordinating response to incidents, conducting post-mortems, and implementing measures to prevent recurrence.
Automation: Developing and promoting automation for repetitive tasks to reduce human error and improve efficiency.
Stakeholder Management: Meeting regularly with stakeholders to discuss reliability goals, project progress, and challenges faced in achieving high system reliability.
Training & Development: Conducting training sessions for team members on best practices, new tools, and techniques to enhance their skill sets.

Skills: architecture, management, automation, training, training and development, dashboards, system design, automation tools, monitoring tools, incident response, stakeholder management, reliability engineering, reliability


