We are seeking an experienced Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a strong background in software engineering and extensive experience using Terraform for infrastructure as code (IaC) to provision and manage cloud resources on AWS.
Responsibilities
- Design, implement, and manage cloud infrastructure using Terraform and other IaC tools
- Proactively monitor, troubleshoot, and optimize the performance of cloud environments to maintain high availability and efficiency
- Implement and maintain CI/CD pipelines for automated code deployment and infrastructure changes
- Develop and manage the data stack, including its infrastructure resources, implementation, and data lake setup
- Respond to and manage critical alerts and incidents, coordinating swift response efforts to mitigate impact and downtime
- Perform thorough root cause analysis (RCA) for incidents to identify and address underlying issues, developing solutions that prevent future occurrences
- Document and maintain comprehensive guides for system configurations and operational procedures, promoting knowledge sharing and operational excellence
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field
- Solid background in software engineering with advanced proficiency in a high-level programming language such as JavaScript or Python. Experience with Node.js is a plus
- Expertise in AWS cloud services, with a keen understanding of architecture, security principles, and cost optimization strategies
- Strong experience in Linux system administration, including setup, configuration, and maintenance tasks
- Proficiency in containerization with Docker and orchestration with Kubernetes is highly valued
- Experience with scripting and automation languages (Bash, Python, Go) and familiarity with Git workflows
- Demonstrated ability to automate and manage infrastructure using Terraform and other IaC tools in cloud environments such as AWS or Azure
- Understanding of microservices architecture, serverless technologies, and both SQL and NoSQL database systems
- Exceptional problem-solving abilities, alongside excellent teamwork and communication skills. Adaptability and a commitment to continuous learning in the fast-evolving cloud technology landscape are essential