Senior Cloud Engineer (Cloud & AI Infrastructure)
Plano, TX (onsite only, no remote option; local candidates only)
## Job Description
We are seeking a Senior Cloud Engineer to lead the technical implementation of our Azure Enterprise Landing Zones and AI-ready infrastructure. You will bridge the gap between core cloud architecture and MLOps, ensuring that our AI/ML workloads, from Azure OpenAI to custom models, are deployed onto a secure, high-performance, and fully automated foundation.
## Key Responsibilities
1. Azure Architecture & Landing Zones
- Landing Zone Implementation: Deploy and manage scalable Azure Landing Zones, ensuring enterprise-grade governance, subscription organization, and resource hierarchy.
- Networking & Security: Architect secure Azure networking (VNets, VNet peering, Private Link, hub-and-spoke topologies) and implement robust security guardrails via Azure Policy and Microsoft Entra ID (formerly Azure Active Directory).
2. Containerization & Orchestration
- AKS & Kubernetes: Act as the subject matter expert for Azure Kubernetes Service (AKS), managing cluster lifecycles, namespaces, and pod security standards.
- Docker Expertise: Build, optimize, and secure Docker images for microservices and AI model serving.
- Helm Mastery: Use Helm charts for consistent, version-controlled application deployments.
3. Infrastructure as Code (IaC) & Automation
- Terraform Mastery: Develop and maintain modular, enterprise-scale Terraform code to ensure "Everything as Code" for both IaaS (VMs, networking) and PaaS (APIM, Event Hubs).
- CI/CD Governance: Build and optimize pipelines using Azure DevOps and GitHub Actions, integrating security scanning and automated testing.
4. AI & MLOps Integration
- AI Workloads: Provision and scale infrastructure for Azure Machine Learning and Azure OpenAI services, specifically managing GPU node pools and model monitoring.
- MLOps Pipelines: Implement deployment workflows for AI models, focusing on model performance tracking and automated drift detection.
5. Observability & Operations
- Monitoring: Lead the instrumentation of all environments using Azure Monitor, Log Analytics, and Application Insights.
- FinOps: Monitor and optimize cloud spend with custom cost tracking and alerting for high-compute AI resources.
## Technical Requirements
- 6+ Years in DevOps/Cloud: Deep experience with Azure IaaS and PaaS.
- IaC Specialist: Advanced proficiency in Terraform for multi-region deployments.
- K8s Expert: Hands-on experience with Docker, Kubernetes (AKS), and ingress controllers.
- Automation Lead: Expert in Azure DevOps and/or GitHub Actions for CI/CD.
- Networking Guru: Strong understanding of Azure VNet, Azure Firewall, and load balancing.
- AI Aware: Exposure to deploying and managing AI/ML workloads on Azure.