Staff Engineer, Cloud & Platform Engineering (Application Modernization - Agentic AI)
Location: Onsite in Grapevine, TX (5 days/week)
Employment Type: Full-Time
Staff: 9 engineers
About the Role
We are seeking a Staff Engineer, Cloud & Platform Engineering to lead and influence the modernization of our technology platform by migrating legacy applications to a cloud-native architecture on AWS. This role is both hands-on and strategic, with a strong focus on application modernization, containerization, and cloud-native infrastructure.
As a Staff Engineer, you will play a critical role in modernizing legacy Windows-based applications, building and evolving our application deployment pipelines, and leading migrations to Linux-based, containerized workloads running on AWS EKS. You will partner closely with application teams, DevOps, and leadership to ensure platforms are scalable, reliable, secure, and cost-efficient.
Key Responsibilities
Platform & Infrastructure Leadership
- Lead the design, optimization, and evolution of AWS-native infrastructure, ensuring scalability, reliability, and security.
- Serve as a technical authority for cloud and platform decisions, aligning infrastructure strategy with business objectives.
Agentic AI & Infrastructure Automation
- Design, implement, and maintain agentic AI systems for self-healing workflows, intelligent capacity planning, automated incident response, and infrastructure-as-code generation.
Application Modernization & Migration
- Lead and support application modernization efforts, including refactoring and replatforming legacy applications.
- Drive migrations from Windows-based environments to Linux-based, containerized architectures.
- Partner with application teams to migrate legacy runtimes (including older .NET frameworks) into container-based deployments on AWS EKS.
Kubernetes & Cloud Operations
- Design, manage, and optimize AWS EKS clusters, including capacity management, performance tuning, and cost optimization.
- Implement and maintain scalable deployment patterns using Kubernetes, autoscaling, and modern cloud-native practices.
CI/CD & Automation
- Design and evolve CI/CD pipelines using GitLab and related tooling to support modern application delivery.
- Leverage Terraform and Infrastructure as Code (IaC) to automate provisioning and ensure consistency across environments.
- Utilize Python and other scripting languages to improve automation and operational efficiency.
Incident Management & Reliability
- Lead investigation and resolution of complex cloud and platform incidents.
- Ensure systems meet or exceed performance, reliability, and SLA expectations.
Cost Optimization & Governance
- Partner cross-functionally to identify and implement cloud cost optimization strategies, including use of cloud financial management tools.
- Ensure platform designs follow operational best practices and support long-term maintainability.
Collaboration & Influence
- Work independently while collaborating across engineering, application, and leadership teams.
- Mentor engineers and influence best practices across cloud, DevOps, and platform engineering functions.
Required Qualifications
- Senior/Staff-level experience designing and managing large-scale IT infrastructure in cloud environments.
- Agentic AI Experience: Hands-on experience with agentic AI systems for infrastructure automation (AI-driven runbook execution, LLM-based ops tooling, autonomous remediation pipelines).
- Minimum of 3 years of experience with production-grade GenAI and agentic AI implementations across cloud, DevOps, SRE, and platform engineering.
- Strong hands-on experience with AWS, including services such as EKS, Lambda, and supporting AWS-native tooling.
- Deep knowledge of Kubernetes, container orchestration, and cloud-native deployment patterns.
- Proven experience with application modernization and migration projects, particularly replatforming legacy applications.
- Strong experience with CI/CD pipelines, GitLab, and DevOps methodologies.
- Experience using Terraform and Infrastructure as Code to manage cloud environments.
- Solid understanding of Linux-based environments, or strong Windows-to-cloud migration experience with the ability to operate on Linux-based platforms.
- Experience driving cost optimization initiatives using data and cloud financial tools.
- Excellent communication and decision-making skills, with the ability to influence across teams.
- Prior experience supporting retail or e-commerce platforms is strongly preferred.
Preferred / Nice-to-Have Skills
- Experience migrating applications from Windows to Linux/container-based environments.
- Exposure to .NET application modernization, including legacy frameworks.
- Experience with Karpenter, cloud capacity optimization, and autoscaling strategies.
- Familiarity with InOps / FinOps practices.
- Strong scripting experience (Python preferred).
Why This Role
- Direct ownership and influence over a major enterprise modernization initiative.
- Opportunity to shape how applications are built, deployed, and operated in a cloud-native environment.
- High-impact role working with modern AWS and Kubernetes technologies at scale.