Cloud Engineer

Dicetek LLC

Dubai, United Arab Emirates

Job Summary:

We are seeking a skilled and motivated Cloud Engineer to design, implement, and maintain cloud infrastructure optimized for AI and Generative AI workloads. The role involves provisioning cloud resources, automating deployments, integrating cloud-native AI services, and ensuring secure, scalable, and observable AI/ML environments across AWS, Azure, and GCP.


Key Responsibilities:


Cloud Platform & AI Services Management:

Administer and troubleshoot AI/ML services across AWS and Azure.

Apply best practices for managing cloud-native AI services such as Azure OpenAI and AWS SageMaker.

Support hybrid and multi-cloud environments for AI workloads.


AI/ML Platform Engineering:

Deploy and manage secure, scalable AI/ML workloads in the cloud.

Integrate vector databases and similarity search services into AI pipelines.


Infrastructure as Code (IaC):

Provision AI-ready infrastructure using Terraform, Bicep, and CloudFormation.

Maintain reusable IaC modules for consistent and automated deployments.


API Management & Integration:

Design and maintain API gateways (e.g., Azure API Management) for AI-powered applications.

Ensure secure and scalable API integrations for ML services.


DevOps & CI/CD for AI Pipelines:

Build and maintain CI/CD workflows for ML model training, deployment, and retraining.

Integrate these workflows with tools such as GitHub Actions or Azure DevOps.


Scripting & Automation:

Develop automation scripts in Python, Bash, or PowerShell for provisioning, data preparation, and operational tasks.


Container Orchestration:

Deploy and manage containerized AI workloads using Kubernetes.

Secure runtime environments and manage resource scaling.


Security & Compliance:

Implement encryption, access controls, and compliance policies for cloud-based LLMs and AI services.

Collaborate with InfoSec teams to enforce governance standards.


Monitoring & Observability:

Set up metrics, logging, and alerting for AI model performance and infrastructure health using tools such as Prometheus and the ELK Stack.


Required Skills and Qualifications:

  • 5+ years of experience managing cloud platforms (AWS, Azure, GCP), including AI/ML services.
  • Hands-on experience with Azure OpenAI, AWS SageMaker, or similar platforms.
  • Proficiency in Infrastructure as Code tools (Terraform, Bicep, CloudFormation).
  • Experience with API management tools (e.g., Azure API Management).
  • Strong scripting skills in Python, Bash, or PowerShell.
  • Experience deploying and managing Kubernetes clusters.
  • Knowledge of cloud-native vector databases and similarity search services.
  • Understanding of cloud security principles and compliance for AI/ML.
  • Familiarity with monitoring tools like Prometheus, Grafana, and ELK.
