SAP
Bengaluru, Karnataka, India
Posted 1mo
Indegene
Poland
Posted 1mo
Readiness IT LATAM - una empresa CONKORD
Providencia, Santiago Metropolitan Region, Chile
Posted 1mo
Xero
Seattle, WA
Posted 1mo
Xero
Denver, CO
Posted 1mo
Akamai Technologies
United States
Posted 1mo
Hirenza
United States
Posted 1mo
Sanderson Government & Defence
England, United Kingdom (Remote)
$65,000.00 - $75,000.00
Posted 1mo
Cloudbeds
Romania
Posted 1mo
Informatech Pty Ltd
Canberra, Australian Capital Territory, Australia
$160,000.00 - $200,000.00
Posted 1mo
Job Summary:
We are seeking a skilled and motivated Cloud Engineer with a focus on AI workloads to design, implement, and maintain cloud infrastructure optimized for AI and generative AI. This role involves provisioning cloud resources, automating deployments, integrating cloud-native AI services, and ensuring secure, scalable, and observable AI/ML environments across AWS, Azure, and GCP platforms.
Key Responsibilities:
Cloud Platform & AI Services Management:
Administer and troubleshoot AI/ML services across AWS and Azure.
Apply best practices for managing cloud-native AI services such as Azure OpenAI and Amazon SageMaker.
Support hybrid and multi-cloud environments for AI workloads.
AI/ML Platform Engineering:
Deploy and manage secure, scalable AI/ML workloads in the cloud.
Integrate vector databases and similarity search services into AI pipelines.
Infrastructure as Code (IaC):
Provision AI-ready infrastructure using Terraform, Bicep, and CloudFormation.
Maintain reusable IaC modules for consistent and automated deployments.
API Management & Integration:
Design and maintain API gateways (e.g., Azure API Management) for AI-powered applications.
Ensure secure and scalable API integrations for ML services.
DevOps & CI/CD for AI Pipelines:
Build and maintain CI/CD workflows for ML model training, deployment, and retraining.
Integrate these workflows with tools such as GitHub Actions or Azure DevOps.
Scripting & Automation:
Develop automation scripts in Python, Bash, or PowerShell for provisioning, data preparation, and operational tasks.
Container Orchestration:
Deploy and manage containerized AI workloads using Kubernetes.
Secure runtime environments and manage resource scaling.
Security & Compliance:
Implement encryption, access controls, and compliance policies for cloud-based LLMs and AI services.
Collaborate with InfoSec teams to enforce governance standards.
Monitoring & Observability:
Set up metrics, logging, and alerting for AI model performance and infrastructure health using tools like Prometheus and ELK.
Required Skills and Qualifications: