Client: Wealth Management / Investment Firm
Position: Site Reliability Engineer Lead
Pay Rate: $170K - $190K
Locations: ON SITE Plano, TX or Camas, WA
The Opportunity:
The main accountabilities for this candidate are setting up Splunk/Dynatrace monitoring, working with the team to onboard the applications (onboarding has been under way for the last 2 years), working with the Solutions Architect on Ansible automation, and mentoring junior team members
Education: Bachelor's degree in MIS, computer science, math, or other science field required; advanced degree in a related field preferred
MUST Have Experience:
- MUST BE USC or GC holder!!
- 10+ years overall experience in application engineering
- 5+ years of SRE experience (architect or engineer); experience with the Dynatrace and Splunk toolsets is a MUST
- Experience in automation leveraging Ansible and Python; the team started this work about 2 months ago and is moving in that direction
Other Experience Desired:
- 3+ years of experience monitoring applications using various SDLC methodologies, preferably Agile
- 3+ years of technology design expertise which includes Performance, Security, Availability, as well as Operations, Monitoring and Support
- 2+ years of experience with SQL and database management systems such as MSSQL, MySQL, PostgreSQL, or MongoDB
- 2+ years of experience in a scripting language such as Unix shell, Python, or PowerShell
- 2+ years of technology design experience, including Containerization, Performance, Security, Availability, Operations, Monitoring, and Support
- Experience in Systems Architecture; in-depth knowledge of SRE, IT Operations, and Cloud; coding and scripting experience with Java, JavaScript, Python, and .NET; understanding of AI/ML
- Experience in a regulated industry; financial services experience ideal
Responsibilities:
- Design, configure, and set up observability platform tools (Splunk and Dynatrace), both on-premises and in the cloud, to guide application development efficiencies and improve the operational stability of the applications
- Work with the Observability Manager and Architect to develop the monitoring capabilities strategy and roadmaps and accomplish agreed-upon priorities
- Develop tooling and processes to increase automation of monitoring and adherence to security and audit systems and controls
- Integrate and configure additional tools/frameworks to support and enable automation of various monitoring activities across the enterprise
- Perform analytics on incidents and usage patterns to better predict issues and take proactive actions
- Collaborate across departments to gauge the effectiveness and efficiency of existing systems
- Foster the adoption of Observability tools and capabilities across Technology groups
- Partner with Service owners to implement Service Level Metrics & Service Level Objectives that act as service level health indicators
- Measure, communicate, and deliver on enterprise platform stability and scalability and on the technology organization's maturity in DevOps
- Resolve issues, alerts, and incidents based on predefined service level agreements regarding system availability, performance, and service levels
- Analyze the monitoring requirements in close collaboration with the architect and translate them into tasks for engineers to develop
- Deliver presentations to managers and other technology and business partners
- Be a mentor to engineers, providing assistance, guidance and training
Benefits:
100% paid medical, dental and vision premiums for you and your qualifying dependents
A 50% 401(k) match, up to the IRS maximum
20 days of PTO, plus 10 paid holidays
Family Support programs including 8-week Paid Primary Caregiver Leave; $10,000 in fertility, family forming, and hormonal health assistance; and back-up child, adult, and elder care