Performance & Reliability Engineer (Senior, Lead, Principal & Manager)

Hybrid

Location: Pune, Chennai, Bangalore & Gurgaon

Immediate joiners only

Job description

Role: Performance & Reliability Engineer

Job Location: Gurgaon, Chennai, Pune, Bangalore

Job Overview:

We are seeking a highly skilled and motivated Performance & Reliability Engineer to join our team. In this role, you will be responsible for the reliability, scalability, and performance of our systems and applications. You will use tools such as Dynatrace, CloudWatch, and Python to monitor and optimize system performance, troubleshoot issues, and improve the overall reliability of our infrastructure in line with SRE best practices.

Key Responsibilities:

Performance Monitoring & Optimization:
Use Dynatrace and CloudWatch to monitor system performance and availability.
Implement performance tuning techniques to ensure high availability and optimal system performance.
Identify performance bottlenecks and optimize applications and infrastructure for scalability.
System Observability:
Work with AppDynamics and monitoring dashboards to keep systems observable.
Collaborate with development and operations teams to troubleshoot incidents and provide recommendations for performance improvements.
Proactively identify areas of risk and implement preventive measures.
Automation & Scripting:
Develop automation scripts in Python to enhance monitoring, incident response, and reporting processes.
Write and maintain Python-based tools for proactive monitoring, alerting, and issue resolution.
Cloud Monitoring & Alerts:
Configure CloudWatch for real-time monitoring and alerting of cloud infrastructure.
Develop and manage dashboards to visualize system health and performance metrics.
Prepare and present performance reports, incident post-mortems, and improvement recommendations to senior leadership.
Chaos Engineering:
Vulnerability identification, failure simulation, and stress management.

Required Skills and Experience:

Strong experience with Dynatrace for application performance monitoring and root cause analysis.
Proficiency in CloudWatch for monitoring AWS cloud infrastructure, configuring alerts, and visualizing metrics.
Solid understanding of Python for automating tasks, building performance tools, and writing scripts to enhance operations.
Experience in analyzing system logs, troubleshooting performance issues, and providing technical recommendations.
Hands-on experience with cloud environments (AWS preferred), including development knowledge.
Experience with load testing and performance benchmarking.

About Xebia: https://xebia.com/

https://www.linkedin.com/company/xebia/about/
