We are looking for a highly skilled Cloud Engineer with strong expertise in Azure Machine Learning and DevOps practices to join our team. In this role, you will design, develop, and maintain scalable cloud-based machine learning solutions and implement DevOps pipelines to support continuous integration and deployment (CI/CD) across AI models and cloud infrastructure. You will work closely with data scientists, software engineers, and DevOps teams to streamline machine learning workflows and ensure efficient cloud operations.
Key Responsibilities:
- Cloud & Machine Learning:
- Design and manage cloud-based machine learning environments using Azure services like Azure Machine Learning Studio, Azure Databricks, and Azure Kubernetes Service (AKS).
- Develop, deploy, and monitor machine learning models on Azure, ensuring scalability, security, and availability.
- Collaborate with data science teams to operationalize machine learning models and optimize cloud infrastructure for AI workloads.
- Cloud & DevOps Best Practices:
- Optimize cloud infrastructure for cost, performance, and security.
- Ensure compliance with cloud security standards and data governance within the Azure environment.
- Collaborate with cross-functional teams to implement DevOps best practices, improve workflows, and make AI/ML model deployment more efficient.
- Troubleshoot and resolve issues in cloud infrastructure, DevOps pipelines, and machine learning deployments.
Key Qualifications:
- Cloud & ML Experience:
- Proven experience as a Cloud Engineer with expertise in Microsoft Azure.
- Strong experience with Azure Machine Learning and related services (Azure ML Studio, Azure Databricks, AKS, etc.) and with managing AI/ML workloads in the cloud.
- Proficiency in Python and experience with ML libraries such as scikit-learn, TensorFlow, and PyTorch.
- DevOps Expertise:
- Deep understanding of DevOps practices, including experience building and managing CI/CD pipelines.
- Hands-on experience with Azure DevOps, GitHub Actions, or similar tools for automation.
- Familiarity with containerization (Docker) and orchestration (Kubernetes) in a cloud environment.
- Experience with Infrastructure as Code (IaC) tools like Terraform or Azure Resource Manager (ARM) templates.
- Additional Skills:
- Strong understanding of cloud security, networking, and monitoring in Azure environments.
- Knowledge of MLOps and the integration of DevOps practices into the machine learning lifecycle.
- Excellent problem-solving and troubleshooting skills, with the ability to collaborate in a fast-paced environment.
- Experience with Snowflake.
#SeniorLevel