Director-Infrastructure Engineering

American Express Co

Phoenix, AZ

JOB DETAILS
SKILLS
Analysis Skills, Application Programming Interface (API), Automation, Business Strategy, Career Development, Change Management, Chargebacks, Cloud Applications, Cloud Computing, Communication Skills, Computer Science, Computer Security, Configuration Management, Continuous Deployment/Delivery, Continuous Improvement, Continuous Integration, Cost Control, Cost Forecasting, Cross-Functional, Customer Support/Service, DevOps, Disaster Recovery, Elasticsearch, Emerging Technology, Enterprise Applications, Enterprise Protection, Financial Management, High Availability, Identify Issues, Leadership, Machine Tool, Mentoring, Middleware, Network Security, Open Source, Operational Improvement, Operational Support, Operations Management, Performance Management, Performance Metrics, Private Cloud, Problem Solving Skills, Production Systems, Redis, Relationship Management, Reliability Engineering, Risk, Sales Pipeline, Scripting (Scripting Languages), Security Architecture, Service Level Agreement (SLA), Software Administration, Software Engineering, Splunk, Strategic Planning, Succession Planning, Team Lead/Manager, Team Player
LOCATION
Phoenix, AZ
POSTED
8 days ago

As part of our diverse tech team, you can architect, code and ship software that makes us an essential part of our customers' digital lives. Here, you can work alongside talented engineers in an open, supportive, inclusive environment where your voice is valued, and you make your own decisions on what tech to use to solve challenging problems. American Express offers a range of opportunities to work with the latest technologies and encourages you to back the broader engineering community through open source. And because we understand the importance of keeping your skills fresh and relevant, we give you dedicated time to invest in your professional development. Find your place in technology on #TeamAmex.

The American Express Platform Services team is looking for innovators to help us build world-class applications, cloud platforms, and infrastructure supported by integrated CI/CD, observability, and security capabilities.

The Director Infrastructure Engineering - Head of Private Cloud (OpenShift, IaC, Data Middleware Services) Operations - US is responsible for leading the strategy, execution, and continuous improvement of cloud operations across OpenShift, Redis, Kafka, Elasticsearch, Terraform, and other platform services. This role ensures secure, reliable, scalable, and cost-effective cloud environments that support enterprise applications and digital transformation initiatives.

The ideal candidate combines strong technical depth in cloud infrastructure with operational excellence, financial governance (FinOps), DevOps, Site Reliability Engineering, automation leveraging GenAI/AgenticAI, and people leadership.

At American Express, our culture is built on a 175-year history of innovation, shared values and Leadership Behaviors, and an unwavering commitment to back our customers, communities, and colleagues. From delivering differentiated products to providing world-class customer service, we operate with a strong risk mindset, ensuring we continue to uphold our brand promise of trust, security, and service.

As part of Team Amex, you'll experience our powerful backing with comprehensive support for your holistic well-being and many opportunities to learn new skills, develop as a leader, and grow your career. Here, your voice and ideas matter, your work makes an impact, and together, you will help us define the future of American Express.

  • Bachelor's degree in Computer Science, Engineering, or related field (Master's preferred).

  • 8+ years of experience in Platform Engineering & Operations, API Support or Site Reliability Engineering (SRE), with a proven track record of leading teams in managing large-scale cloud infrastructure with a focus on reliability and resilience.

  • Deep hands-on experience with any Kubernetes platform (multi-cloud preferred).

  • Strong experience with:

      • Infrastructure as Code (Terraform, CloudFormation, ARM)

      • Container platforms (OpenShift/Kubernetes)

      • Monitoring tools (Prometheus, OpenTelemetry, Loki)

      • CI/CD pipelines (Jenkins, GitHub Actions)

  • Strong understanding of cloud networking, security, and architecture.

  • Experience managing large-scale, mission-critical production environments.

  • Proven experience in financial management and cloud cost optimization.

  • Relevant certifications preferred.

  • Experience with DevOps practices and methodologies, including CI/CD pipelines, configuration management, and infrastructure as code.

  • Experience leveraging GenAI and agentic AI for platform automation and self-healing.

  • Experience with observability tools such as Prometheus, Splunk, ELK, Dynatrace.

  • Strong analytical and problem-solving skills, with the ability to troubleshoot complex issues and drive resolution in a fast-paced environment.

  • Excellent communication and leadership skills, with the ability to effectively collaborate with cross-functional teams and influence decision-making at all levels of the organization.

Employment eligibility to work with American Express in the United States is required as the company will not pursue visa sponsorship for these positions.

Key Responsibilities:

Cloud Operations Leadership

  • Lead and manage Private Cloud operations for production and non-production environments.
  • Establish and enforce operational standards, SLAs, and SLOs.
  • Drive incident, problem, and change management processes.
  • Ensure high availability, performance, and resilience of cloud platforms.

Cloud Infrastructure & Reliability

  • Oversee infrastructure design, deployment, monitoring, and optimization.
  • Implement Infrastructure as Code (IaC) using Terraform.
  • Drive SRE principles, including reliability engineering and automation.
  • Manage disaster recovery and business continuity strategies.

Automation & DevOps Enablement

  • Champion automation-first operational models.
  • Leverage GenAI and agentic AI to automate common platform operations, including customer support.
  • Integrate CI/CD pipelines with cloud infrastructure.
  • Reduce manual operational overhead through scripting and tooling.
  • Enable platform engineering capabilities for internal teams.

Financial Governance

  • Own cloud cost management, forecasting, and optimization.
  • Implement tagging standards and chargeback/showback models.
  • Drive cost-efficiency initiatives across workloads.

Vendor & Stakeholder Management

  • Manage relationships with service providers.
  • Collaborate with application teams, architecture, security, and enterprise IT.
  • Support cloud migration and modernization programs.

Team Leadership & Development

  • Build, mentor, and retain high-performing cloud operations teams.
  • Define hiring strategy and succession planning.
  • Establish performance metrics and career development plans.
  • Foster a culture of accountability, innovation, and continuous improvement.

About the Company

American Express Co