Senior Cloud Engineer (Cloud & AI Infrastructure) :: Plano, TX (Onsite, Local Only)

Talent Movers

Plano, TX

JOB DETAILS
SKILLS
Artificial Intelligence (AI), Automation, Cloud Architecture, Cloud Computing, Continuous Deployment/Delivery, Continuous Integration, DevOps, Docker, Environmental Monitoring, Firewalls, GPU (Graphics Processing Unit), GitHub, Government Organizations, Hubs, Infrastructure as a Service (IaaS), Instrumentation, Load Balancing, Machine Learning, Microservices, Microsoft Active Directory, Microsoft Windows Azure, Network Architecture/Engineering, Network Security, Performance Analysis, Performance Modeling, Platform as a Service (PaaS), Security Architecture, Software Engineering, Test Automation, VMS Operating System, Virtual Machine (VM)
LOCATION
Plano, TX
POSTED
7 days ago

Senior Cloud Engineer (Cloud & AI Infrastructure)

Plano, TX (onsite, local candidates only; no remote option)

Job Description:

We are seeking a Senior Cloud Engineer to lead the technical implementation of our Azure Enterprise Landing Zones and AI-ready infrastructure. You will bridge the gap between core cloud architecture and MLOps, ensuring that our AI/ML workloads, from Azure OpenAI to custom models, are deployed onto a secure, high-performance, and fully automated foundation.

## Key Responsibilities

1. Azure Architecture & Landing Zones

- Landing Zone Implementation: Deploy and manage scalable Azure Landing Zones, ensuring enterprise-grade governance, subscription organization, and resource hierarchy.

- Networking & Security: Architect secure Azure networking (VNets, VNet peering, Private Link, hub-and-spoke topologies) and implement robust security guardrails via Azure Policy and Microsoft Entra ID (formerly Azure Active Directory).

2. Containerization & Orchestration

- AKS & Kubernetes: Act as the subject matter expert for Azure Kubernetes Service (AKS), managing cluster lifecycles, namespaces, and pod security standards.

- Docker Expertise: Build, optimize, and secure Docker images for microservices and AI model serving.

- Helm Mastery: Use Helm charts for consistent, version-controlled application deployments.

3. Infrastructure as Code (IaC) & Automation

- Terraform Mastery: Develop and maintain modular, enterprise-scale Terraform code to ensure "Everything as Code" for both IaaS (VMs, Network) and PaaS (APIM, Event Hubs).

- CI/CD Governance: Build and optimize sophisticated pipelines using Azure DevOps and GitHub Actions, integrating security scanning and automated testing.

4. AI & MLOps Integration

- AI Workloads: Provision and scale infrastructure for Azure Machine Learning and Azure OpenAI services, including managing GPU node pools and model monitoring.

- MLOps Pipelines: Implement deployment workflows for AI models, focusing on model performance tracking and automated drift detection.

5. Observability & Operations

- Monitoring: Lead environment instrumentation using Azure Monitor, Log Analytics, and Application Insights.

- FinOps: Monitor and optimize cloud spend with custom cost tracking and alerting for high-compute AI resources.

## Technical Requirements

- 6+ Years in DevOps/Cloud: Deep experience with Azure IaaS and PaaS.

- IaC Specialist: Advanced proficiency in Terraform for multi-region deployments.

- K8s Expert: Hands-on experience with Docker, Kubernetes (AKS), and ingress controllers.

- Automation Lead: Expert in Azure DevOps and/or GitHub Actions for CI/CD.

- Networking Guru: Strong understanding of Azure VNet, Firewall, and Load Balancing.

- AI Aware: Exposure to deploying and managing AI/ML workloads on Azure.

About the Company

Talent Movers