Enterprise wide-area networking is primed for a new paradigm: software-defined networking architecture that delivers agility, performance, services, and software innovation. Graphiant is changing the networking industry, and you will help drive that change. You will collaborate with industry-leading engineers to build the development and deployment infrastructure for the best product portfolio in the industry.
Reports To
Director of Site Reliability Engineering
The Team
As part of the Graphiant engineering team, you will be responsible for developing Graphiant’s offerings and executing on the Product and Portfolio strategy.
The Work
Your primary responsibility in this role will be to monitor and support the Graphiant service, including both the Graphiant Portal and the Graphiant Backbone network. You will:
Take ownership of the underlying infrastructure that supports the production backbone, including compute, storage, networking, and platform layers
Ensure that servers, networks, and distributed systems are architected and managed to deliver maximum uptime, resiliency, and operational efficiency
Monitor and manage production workloads, ensuring that software and services run smoothly with minimal disruption
Quickly diagnose, triage, and resolve incidents, while performing root cause analyses to prevent recurrence
Design and implement observability solutions (logging, monitoring, alerting) to proactively detect and respond to performance issues or failures
Collaborate closely with other SREs, DevOps, and engineering teams to define and enforce SLAs, SLOs, and SLIs, and drive continuous improvement in system reliability
Contribute to automation efforts, including infrastructure-as-code (IaC), CI/CD pipelines, and self-healing capabilities for production environments
Competencies
Expertise in Programming: Essential for automating tasks and designing resilient systems.
Understanding of IT Operations: Vital to manage infrastructure, diagnose issues, and keep services running.
Leadership Skills: Necessary for guiding teams and influencing tech strategy.
Strategic Vision: Enables the SRE to anticipate challenges and steer the company towards reliability and scalability.
Experience
Bachelor’s Degree, or higher, in Computer Science or related technical field, or equivalent experience
1-3+ years of software development experience
Experience with Kubernetes
Proficiency in C, C++, Java, Go, Node.js, PHP, Scala, Python, or a similar language
1-3+ years of experience deploying and running services in a public or private cloud: AWS, Azure, GCP, etc.
1-3+ years of experience with service discovery tools: Kubernetes, ZooKeeper, HashiCorp Consul, or similar software
1-3+ years of experience with RPC technologies and messaging systems: Google Protocol Buffers (protobuf), Apache Thrift, ZeroMQ, RabbitMQ, Kafka, or similar
1-3+ years of experience with SQL and NoSQL datastores: MySQL, MongoDB, Elasticsearch, InfluxDB, Redis, DynamoDB, Cassandra, or similar
Graphiant is an Affirmative Action and Equal Opportunity Employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis.