Our customer is a multinational corporation with more than a century of history and offices in over 180 countries. Their most ambitious current goal is to introduce a range of Reduced-Risk Products (RRPs). The target audience is more than 1 billion consumers around the globe.
Must-Have Capabilities:
Intermediate understanding of SRE principles and practices.
Ability to handle more complex tasks and contribute to the improvement of processes.
Intermediate troubleshooting and problem-solving skills.
Intermediate knowledge of the following technologies:
New Relic: Advanced monitoring and alerting setup, including custom dashboards.
ELK: Advanced log management, analysis, and visualization.
Opsgenie: Advanced alert management, including integration with other tools.
Terraform/Terraform Enterprise: Advanced IaC tasks, including module creation and management.
Bitbucket/GitHub: Advanced version control, including branching strategies and code reviews.
Python: Advanced scripting and automation, including API integrations.
JavaScript: Advanced scripting for automation tasks and tool integrations.
Jenkins: Advanced CI/CD pipeline setup, including complex workflows and integrations.
AWS: Intermediate understanding of cloud platforms and services.
Should-Have Capabilities:
Ability to mentor junior engineers and share knowledge.
Strong communication and collaboration skills.
Nice-to-Have Capabilities:
Understanding of Node.js.
Familiarity with container technologies (e.g., Docker, Kubernetes).
Familiarity with Ansible.
Responsibilities:
Design, build, and maintain software delivery pipelines and infrastructure that support continuous integration, delivery, and deployment.
Collaborate with development and operations teams to ensure that software is delivered with high quality, speed, and reliability.
Automate manual processes, such as testing, deployment, and monitoring, to improve efficiency and reduce errors.
Develop and maintain monitoring and alerting systems to proactively identify and address issues in production environments.
Troubleshoot production issues, conduct root cause analysis, and implement remediation plans.
Manage and scale infrastructure resources, such as servers, databases, and cloud services, to ensure optimal performance and cost-effectiveness.
Implement security best practices and ensure compliance with industry standards and regulations.
Continuously learn and keep up to date with new technologies and industry trends to improve system performance, security, and efficiency.