
DevOps Engineer

Meril

Chennai, Tamil Nadu, India

Job Title: DevOps Engineer – AI/ML Infrastructure


Location: Meril Healthcare Pvt. Ltd., IITM Research Park, Chennai. Parent company: Meril (https://www.merillife.com/).


Shift: General shift, Monday to Saturday (9.30 am to 6.00 pm).


Summary

We are seeking a skilled DevOps Engineer with expertise in managing cloud-based AI/ML infrastructure, automation, CI/CD pipelines, and containerized deployments. The ideal candidate will work on AWS-based AI model deployment, database management, API integrations, and scalable infrastructure for AI inference workloads. Experience in ML model serving (MLflow, TensorFlow Serving, Triton Inference Server, BentoML) and on-prem/cloud DevOps will be highly valued.


Key Responsibilities

Cloud & Infrastructure Management


  • Manage and optimize cloud infrastructure on AWS (SageMaker, EC2, Lambda, RDS, DynamoDB, S3, CloudFormation).
  • Design, implement, and maintain highly available, scalable AI/ML model deployment pipelines.
  • Set up Infrastructure as Code (IaC) using Terraform, CloudFormation, or Ansible.
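By way of illustration of the Infrastructure-as-Code item above, templates can even be generated programmatically; a minimal, hypothetical sketch that emits a CloudFormation template from Python (the bucket name and logical ID are invented for this example, not taken from the posting):

```python
# Hedged sketch: generating a minimal CloudFormation template in Python.
# "ModelArtifactBucket" and the bucket name are illustrative placeholders.
import json


def make_template(bucket_name: str) -> str:
    """Return a minimal CloudFormation template (JSON) declaring one S3 bucket."""
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "ModelArtifactBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {"BucketName": bucket_name},
            }
        },
    }
    return json.dumps(template, indent=2)


if __name__ == "__main__":
    print(make_template("example-model-artifacts"))
```

In practice a tool such as Terraform or the CDK would own this template; the sketch only shows the shape of the artifact being managed.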


CI/CD & Automation


  • Develop and manage CI/CD pipelines using GitLab CI/CD, Jenkins, and AWS CodeBuild.
  • Automate deployment of AI models and applications using Docker and Kubernetes (EKS).
  • Write automation scripts in Bash, Python, or PowerShell for system tasks.
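As a hedged example of the kind of routine automation script mentioned above (the log directory, the `*.log` pattern, and the 7-day retention window are all hypothetical), a small Python housekeeping task might look like:

```python
# Hedged sketch of a routine system-automation task: pruning old log files.
# Retention window and file pattern are illustrative, not from the posting.
import time
from pathlib import Path


def prune_old_logs(log_dir: str, max_age_days: int = 7) -> list[str]:
    """Delete *.log files older than max_age_days; return the deleted paths."""
    cutoff = time.time() - max_age_days * 86400
    deleted = []
    for path in Path(log_dir).glob("*.log"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(str(path))
    return deleted
```

The same task could equally be written in Bash or PowerShell, as the bullet above notes.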


APIs & AI Model Deployment


  • Deploy and manage Flask/FastAPI-based APIs for AI inference.
  • Optimize ML model serving using MLflow, TensorFlow Serving, Triton Inference Server, and BentoML.
  • Implement monitoring for AI workloads to ensure inference reliability and performance.
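To illustrate the request/response shape such an inference API might expose (the `"inputs"` field and the stand-in model are invented for this sketch, not Meril's actual API), the core of a Flask/FastAPI route boils down to validate-then-predict:

```python
# Hedged sketch of an inference endpoint's core logic, framework-free.
# The "inputs" contract and dummy_model are illustrative placeholders.
import json


def dummy_model(inputs):
    """Stand-in for a served ML model: returns the sum of the inputs."""
    return sum(inputs)


def handle_predict(body: bytes) -> tuple[int, str]:
    """Validate a JSON request body and return (status_code, JSON response)."""
    try:
        payload = json.loads(body)
        inputs = payload["inputs"]
        prediction = dummy_model(inputs)
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "expected JSON body with an 'inputs' list"})
    return 200, json.dumps({"prediction": prediction})
```

In a real deployment this logic would sit behind a FastAPI route and a model server such as MLflow or Triton; the sketch only shows the validation and response contract.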


Security, Monitoring & Logging


  • Implement AWS security best practices (IAM, VPC, Security Groups, access controls).
  • Monitor infrastructure using Prometheus, Grafana, CloudWatch, or ELK Stack.
  • Set up backup and disaster recovery strategies for databases, storage, and models.


Database & Storage Management


  • Maintain and optimize relational (MySQL on RDS) and NoSQL (MongoDB, DynamoDB) databases.
  • Handle structured (RDS) and unstructured (S3, DynamoDB) AI data storage.
  • Improve data synchronization between AI models, applications, and web services.


On-Prem & Hybrid Cloud Integration (Optional)


  • Manage on-prem AI workloads with GPU acceleration.
  • Optimize AI workloads across cloud and edge devices.


Required Skills and Qualifications

  • 3 to 5 years of experience in DevOps, Cloud Infrastructure, or AI/ML Ops.
  • Expertise in AWS (SageMaker, EC2, Lambda, RDS, DynamoDB, S3).
  • Experience with Docker & Kubernetes (EKS) for container orchestration.
  • Proficiency in CI/CD tools (Jenkins, GitLab CI/CD, AWS CodeBuild).
  • Strong scripting skills in Bash, Python, or PowerShell.
  • Knowledge of Linux ecosystem (Ubuntu, RHEL, CentOS).


  • Hands-on experience with ML model deployment (MLflow, TensorFlow Serving, Triton, BentoML).
  • Strong understanding of networking, security, and monitoring.
  • Experience with database management (MySQL, PostgreSQL, MongoDB).



Preferred Skills

  • AWS Certified DevOps Engineer, CKA (Kubernetes), or Terraform certification.
  • Experience with hybrid cloud (AWS + on-prem GPU servers).
  • Knowledge of edge AI deployment and real-time AI inference optimization.


Interested? Please share your resume with [email protected]
