General Information
Req # WD00096060
Career area: Software Engineering
Country/Region: United States of America
State: North Carolina
City: Morrisville
Date: Friday, March 6, 2026
Working time: Full-time
Additional Locations:
- United States of America - Illinois - Chicago
Why Work at Lenovo
We are Lenovo. We do what we say. We own what we do. We WOW our customers. Lenovo is a US$69 billion revenue global technology powerhouse, ranked #196 in the Fortune Global 500, and serving millions of customers every day in 180 markets. Focused on a bold vision to deliver Smarter Technology for All, Lenovo has built on its success as the world's largest PC company with a full-stack portfolio of AI-enabled, AI-ready, and AI-optimized devices (PCs, workstations, smartphones, tablets), infrastructure (server, storage, edge, high performance computing and software defined infrastructure), software, solutions, and services. Lenovo's continued investment in world-changing innovation is building a more equitable, trustworthy, and smarter future for everyone, everywhere. Lenovo is listed on the Hong Kong stock exchange under Lenovo Group Limited (HKSE: 992) (ADR: LNVGY). This transformation together with Lenovo's world-changing innovation is building a more inclusive, trustworthy, and smarter future for everyone, everywhere. To find out more, visit www.lenovo.com, and read about the latest news via our StoryHub.
Description and Requirements
About Our Team
Lenovo is building Quantum, a next-generation hybrid AI platform that spans Windows, Android, and cloud. As part of this vision, we are expanding the reliability engineering organization powering Qira, Lenovo's cross-device Personal AI that operates seamlessly across Lenovo and Motorola products. We are hiring a Senior Manager, AI Reliability Engineering to lead the engineering teams responsible for Qira's foundational reliability capabilities - including system-level observability, telemetry, performance engineering, resiliency architecture, and the reliability of Qira's hybrid edge/cloud AI service. This is a high-impact leadership role shaping how we measure, operate, and improve reliability across one of Lenovo's most ambitious AI initiatives.
Location: Open to remote work in the US. The preferred work location is Chicago, IL.
What You'll Do
Engineering Leadership
- Lead and grow multiple engineering teams focused on reliability, observability, and system performance across Qira's hybrid AI ecosystem.
- Define strategy, roadmaps, and priorities to improve reliability, insight, and operational readiness across device, edge, and cloud systems.
- Champion reliability as an engineering discipline through design patterns, best practices, and a culture of continuous improvement.
Observability & Telemetry
- Own the systems that deliver metrics, logs, distributed traces, AI-specific signals, dashboards, and alerting.
- Drive the adoption of unified telemetry standards and instrumentation across all Qira components.
- Ensure engineers have actionable insight into performance, reliability, cost, and AI behavior.
Service Reliability & Performance Engineering
- Lead engineering efforts to improve the reliability, performance, and scalability of Qira's service architecture - including inference, retrieval, data pipelines, and hybrid edge/cloud workflows.
- Drive the design and adoption of resilience patterns such as graceful degradation, fallback paths, bulkheads, and rate-limiting strategies.
- Oversee capacity planning, cost optimization, and performance tuning for high-throughput AI systems.
System Design & Architectural Influence
- Work with cross-functional engineering teams to embed reliability early in the design process ("shift left").
- Guide architectural decisions to ensure Qira's engineering foundations remain stable, observable, and predictable at scale.
- Set service readiness standards for new components entering production.
Cross-Functional Collaboration
- Partner with Applied AI/ML Engineering, Platform Engineering, Firmware, Product, and Security to align reliability goals with Qira's broader roadmap.
- Collaborate closely with the incident management and operations teams to ensure strong signal quality, runbook depth, and operational tooling.
- Act as a reliability engineering representative in executive and engineering leadership forums.
Team & Talent Development
- Hire and develop world-class engineers across observability, reliability, and performance domains.
- Provide coaching, mentorship, and clear technical and leadership career paths.
- Foster a culture of ownership, operational craftsmanship, and data-driven engineering.
Basic Qualifications
- 12+ years of experience in Site Reliability Engineering, Observability Engineering, Platform Engineering, or large-scale distributed systems, including 5+ years leading engineering teams.
- Bachelor's degree in Computer Science, Engineering, or a related technical field.
- Engineering experience in several of the following:
  - Observability systems (OpenTelemetry, metrics/logs/traces)
  - Distributed systems reliability and performance
  - Cloud infrastructure (Azure preferred)
  - Kubernetes and containerized environments
  - CI/CD pipelines and deployment workflows
  - Infrastructure-as-Code (Terraform, Bicep, etc.)
- Deep understanding of Linux systems, networking, scalability, and system performance fundamentals.
- Proven ability to lead engineering teams and drive cross-organizational initiatives.
Preferred Qualifications
- Experience building or operating large-scale telemetry and observability platforms.
- Hands-on experience with Grafana, Prometheus, Loki, Tempo, or similar tooling.
- Experience supporting AI/ML inference systems, vector databases, or GPU-accelerated compute.
- Background in hybrid systems spanning device, edge, and cloud.
- Experience implementing resilience patterns and reliability frameworks.
- Experience with SLOs, SLIs, error budgets, and reliability governance.
- Passion for building scalable reliability engineering teams and systems.
Why This Role Matters
Qira's reliability is mission-critical to delivering a safe, fast, and trustworthy AI experience to millions of users. In this role, you will:
- Build the telemetry and reliability insights that power Qira
- Architect the service-level reliability patterns that keep Qira stable at scale
- Lead the engineering teams that ensure Qira performs predictably across devices, edge, and cloud
- Shape how reliability engineering is practiced across Lenovo's AI ecosystem
This is a rare opportunity to define the engineering foundation of a next-generation global AI platform.
The base salary budgeted range for this position is $190K - $230K. Individuals may also be considered for bonus and/or commission. Lenovo's various benefits can be found on www.lenovobenefits.com.
We are an Equal Opportunity Employer and do not discriminate against any employee or applicant for employment because of race, color, sex, age, religion, sexual orientation, gender identity, national origin, status as a veteran, disability, or any federal, state, or local protected class.