101 Results for

Reliability Engineer Jobs in Plano, TX

Jobs

New!

Irving, TX

Today

$130,000–$156,000 Per Year

As the SRE Team Lead, you will be responsible for the technical leadership of a talented team of site reliability engineers dedicated to maintaining and improving the reliability, scalability, and performance of our critical systems and services. You will serve as a technical leader and mentor, driving strategic initiatives around automation, incident management, observability and system design while collaborating closely with engineering, operations, and product teams.

New!

Addison, TX

6 days ago

$60–$70 Per Hour

Our services include business and technology consulting, data and artificial intelligence, industry solutions, as well as the development, implementation and management of applications, infrastructure and connectivity. NTT DATA's Client is currently seeking an SRE Engineer / Site Reliability Engineer Specialist to join their team in Plano, TX, Addison, TX or Montpelier, VT.

Westlake, TX

19 days ago

$65–$71 Per Hour

We are seeking a highly skilled Site Reliability Engineer (SRE) with strong expertise in Terraform, Cloud Infrastructure, DevOps, Automation and Load Balancing. The ideal candidate will be responsible for ensuring the reliability, scalability, performance, and availability of critical enterprise applications across hybrid and multi-cloud environments.

Required Skills & Qualifications:

5+ years of experience in Site Reliability Engineering, DevOps Engineering, Platform Engineering, or related disciplines (understanding reliability engineering principles, SLIs, SLOs, error budgets, and operational excellence).

Plano, TX

30+ days ago

$70–$73.68 Per Hour

In terms of professional development, Apex hosts an on-demand training program, provides access to certification prep and a library of technical and leadership courses/books/seminars once you have 6+ months of tenure, and certification discounts and other perks to associations that include CompTIA and IIBA. Experience: A minimum of 5 years of experience is required in supporting enterprise solutions, including enterprise security, orchestration, workflow automation, CI/CD pipelines, and cloud platforms.

Plano, TX

30+ days ago

$65–$70 Per Hour

In terms of professional development, Everforth Apex hosts an on-demand training program, provides access to certification prep and a library of technical and leadership courses/books/seminars once you have 6+ months of tenure, and certification discounts and other perks to associations that include CompTIA and IIBA. Everforth Apex also offers an HSA (Health Savings Account on the HDHP plan), a SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions, a corporate discount savings program and other discounts.

Plano, TX

30+ days ago

$68–$73.68 Per Hour

Dallas, TX

30+ days ago

The platforms we offer include central logging, monitoring, agents and alerting and we provide tools to drive adoption and improvements to capacity planning, operational readiness assessments, production incident postmortems, SLIs / SLOs, and deployment automation including canary releases.

Experience: Minimum of 6+ years of hands-on experience in Site Reliability Engineering, with a proven track record in architecting, designing, building, and maintaining highly available, scalable, and fault-tolerant systems at an enterprise level.

Richardson, TX

30+ days ago

Experience: Minimum of 6+ years of hands-on experience in Site Reliability Engineering, with a proven track record in architecting, designing, building, and maintaining highly available, scalable, and fault-tolerant systems at an enterprise level.

Irving, TX

30+ days ago

The role will also be supportive of overall Cloud Transformation initiatives designed to meet key goals in creating a service-driven culture through performance and delivery of SaaS, PaaS, and IaaS solutions by public cloud vendors such as Azure and AWS. Knowledge of Public Cloud Governance frameworks, architectures, configurations, services, and solutions, specifically within Microsoft Azure, but may also include AWS and GCP.

New!

Dallas, TX

1 day ago

The Senior Principal Reliability Engineer will be expected to have deep knowledge and experience with a variety of reliability engineering sub-disciplines including Failure Modes Effects Criticality Analysis (FMECA), Environmental Stress Screening (ESS) process optimization, Reliability Predictions, Derating Analysis and overseeing Failure Reporting and Corrective Action (FRACAS) practitioners. More information about Security Clearances can be found on the US Department of State government website here:

Tucson, AZ:

As part of our commitment to maintaining a secure hiring process, candidates may be asked to attend select steps of the interview process in-person at one of our office locations, regardless of whether the role is designated as on-site, hybrid or remote. The salary range for this role is 132,400 USD - 251,600 USD.

Irving, TX

26 days ago

$60–$65 Per Hour

In terms of professional development, Everforth Apex hosts an on-demand training program, provides access to certification prep and a library of technical and leadership courses/books/seminars once you have 6+ months of tenure, and certification discounts and other perks to associations that include CompTIA and IIBA. Review and analyze complex multi-faceted, larger scale or longer-term Systems Operations Engineering challenges that require in-depth evaluation of multiple factors including intangibles or unprecedented factors.

Dallas, TX

26 days ago

This is also a hands-on technologist role requiring exposure to SRE and DevOps technology stacks and strong understanding of application support processes, including monitoring and addressing incidents/alerts across engineering applications and ensuring effective coordination and handoffs with vendors, partners and internal Synchrony teams. Role Summary/Purpose: The Reliability Engineer - OnePay plays a pivotal technical role within Synchrony Financial to ensure high availability of our applications to enhance and maintain customer experiences for OnePay integrations while providing operational excellence and adherence to program SLAs.

Southlake, TX

11 days ago

5+ years of experience leading the implementation and scaling of reliability engineering practices such as service level objectives, monitoring strategies, incident reviews, and automation-driven improvements. You will lead efforts to elevate production operations through modern Site Reliability Engineering practices, shaping how engineering teams design, build, and operate resilient systems at scale.

New!

Plano, TX

4 days ago

Write and maintain scripts and automation workflows to reduce manual toil and streamline operational tasks (e.g., provisioning, configuration management, log rotation, disk cleanup, service restarts).

New!

Plano, TX

4 days ago

Toyota is proud to have 10+ different Business Partnering Groups across 100 different North American chapter locations that support team members' efforts to dream, do and grow without questioning that they belong. You must have the right to work in the United States and not require Toyota support or sponsorship for immigration-related employment (e.g., H-1B, O-1, E-3, H-1B1, TN, F-1 OPT, F-1 STEM OPT, F-1 CPT, 'job flexibility benefits' (also known as I-140 or Adjustment of Status portability), etc.).

Dallas, Texas

15 days ago

As part of our journey from traditional operations toward a mature SRE model, the Senior SRE will partner with product engineering, platform teams, and the Command Center including Service Desk and Major Incident Command (MIC) to deliver measurable improvements in service reliability.

Deep knowledge of:

Azure: AKS, App Services, Functions, VMSS, Storage, Front Door, API Management, Load Balancers, Monitor, Log Analytics, App Insights, Key Vault, Policy, Defender.

Dallas, MN

30+ days ago

We are seeking a highly skilled Site Reliability Engineer (SRE) to support and enhance the reliability, scalability, and performance of enterprise applications and infrastructure. The ideal candidate will have strong experience in cloud environments, automation, monitoring, and production support.

TX

26 days ago

Support reliability test methods including thermal cycling, thermal shock, high-temperature exposure, humidity, corrosion, pressure cycling, leak testing, coolant compatibility, mechanical fatigue, and bond/interface reliability.

Lead and manage reliability test planning and execution for semiconductor packaging, liquid cold plates, T800 Thermadite, CVD diamond, thermal spreaders, embedded cooling structures, and related thermal assemblies.

New!

Plano, TX

4 days ago

As a Lead Site Reliability Engineer at JPMorgan Chase within the Infrastructure Platforms, Web Hosting team, you hold a leadership role in your team, demonstrate strong knowledge across multiple technical domains, and advise others on the technical and business issues facing them.

Leads reuse-first adoption of AI-assisted reliability workflows across SDLC/toolchain practices (e.g., CI/CD quality checks, test/validation automation, and operational readiness), ensuring traceability/auditability, resiliency, and security controls.

Dallas, TX

17 days ago

As a Senior Lead Site Reliability Engineer at JPMorgan Chase within the Chief Data & Analytics Office (CDAO) AI/ML & Data Platforms team, you work with your fellow stakeholders to define non-functional requirements (NFRs) and availability targets for services supporting large-scale data platforms and data lake ecosystems. You will ensure those NFRs are embedded into product design and testing phases, that service level indicators effectively measure customer and data platform performance, and that service level objectives are defined with stakeholders and implemented in production to support secure, scalable, and high-performing analytics and AI/ML workloads.

New!

Dallas, TX

6 days ago

$60–$65

Determining compensation for this role (and others) at Vaco by Highspring depends upon a wide array of factors including but not limited to:

the individual’s skill sets, experience and training;
licensure and certification requirements;
office location and other geographic considerations;
other business and organizational needs.

Dallas, TX

24 days ago

Participate in incident management and on-call rotation, providing technical support for SRE tools, troubleshooting production issues, and collaborating with teams to reduce incident recurrence through proactive detection and pattern analysis. Build and optimize Infrastructure as Code (IaC) using Terraform to manage AWS resources related to SRE solutions, incorporating cost-efficient design principles.

Irving, Texas

30+ days ago

This role is ideal for someone who enjoys working directly in Azure, improving production systems, troubleshooting issues across infrastructure and application layers, and building practical monitoring and alerting solutions that help teams respond faster and operate more confidently.

Wellfit is the dental industry’s fintech solution, breaking down financial barriers so patients, providers, employers, and payors can all access better care.

Plano, TX

30+ days ago

Full-time

As a Lead Site Reliability Engineer at JPMorgan Chase within the Infrastructure & Production Management sector of Consumer & Community Banking, you hold a leadership role in your team, demonstrate strong knowledge across multiple technical domains, and advise others on the technical and business issues facing them. Take lead and conduct resiliency design reviews, break up complex problems into digestible work for other engineers, act as a technical lead for medium to large-sized products, and provide advice and mentoring to other engineers.

New!

Southlake, TX

6 days ago

This requires an oversight of all routine and strategic infrastructure initiatives, including operating system upgrades, patching, EOL remediation, infrastructure changes, middleware and database activities, cloud technologies and readiness, tooling modernization, and automation at scale. You will lead ongoing improvements in automation, resilience engineering, disaster recovery readiness, and operational maturity, creating repeatable, well-engineered processes that support rapid change with minimal risk.

New!

Dallas, TX

6 days ago

We build and operate a suite of platforms and applications that prevent, detect, and mitigate regulatory and reputational risk across the firm. We have access to the latest technology and to massive amounts of structured and unstructured data, and we leverage modern frameworks to build responsive and intuitive front end and Big Data applications. We're committed to fostering and advancing diversity and inclusion in our own workplace and beyond by ensuring every individual within our firm has a number of opportunities to grow professionally and personally, from our training and development opportunities and firmwide networks to benefits, wellness and personal finance offerings and mindfulness programs.

Dallas, Texas

30+ days ago

Core Responsibilities:

Enterprise Architecture: Lead the design, governance, and rollout of Dynatrace observability for distributed microservices, serverless workloads, and multi-region cloud environments. This is a high-impact role designed for a technical leader with nearly a decade of specialization in Dynatrace SaaS, tasked with architecting and automating large-scale monitoring solutions across complex AWS and Azure environments.

New!

Plano, TX

4 days ago

$96,800–$145,200 Per Year

NTT DATA recruiters will never ask for payment or banking information and will only use @nttdata.com, @nttdatafed.com and @talent.nttdataservices.com email addresses. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please contact us at https://us.nttdata.com/en/contact-us.

Plano, TX

30+ days ago

As a Lead Site Reliability Engineer at JPMorgan Chase within the Corporate sector, Enterprise technology team, you hold a leadership role in your team, demonstrate strong knowledge across multiple technical domains, and advise others on the technical and business issues facing them. Take lead and conduct resiliency design reviews, break up complex problems into digestible work for other engineers, act as a technical lead for medium to large-sized products, and provide advice and mentoring to other engineers.

Arlington, TX

30+ days ago

Work closely with data scientists, data architects, data engineers, ETL developers, cybersecurity, network, Linux, other IT counterparts, and business partners to design and set up the environments to manage the ingested and processed datasets from the external sources, internal systems, and the data warehouse to extract features of interest. Solid experience in High Availability and distributed systems, Linux, Data and SAN Storage Networks, NAS and Networking, leveraging tools to instrument and automate proactively and eventually predictive availability solutions.

New!

TX

2 days ago

$83,538–$137,241 Per Year

Protocol Expertise: Mastery of DNS-specific protocols including DNSSEC, DoT, and DoH, with a firm grasp of underlying transport layers (UDP/TCP) and dual-stack (IPv4/IPv6) networking. You will combine deep IP networking and DNS expertise with modern security protocols to ensure our platforms remain resilient against evolving threats and perform at the highest level for millions of users.

New!

Frisco, TX

6 days ago

$107,300–$193,500 Per Year

To find the pay range for this role based on hiring location, visit https://paylookup.t-mobile.com/paylookup?reqID=REQ356753&paradox=1.

The Senior Site Reliability Engineer leverages automation, CI/CD practices, scripting, observability, and incident management expertise to improve reliability, scalability, and operational efficiency across a complex technology environment.

Plano, Texas

30+ days ago

The ideal candidate will bring deep expertise in distributed systems, cloud-native infrastructure, SaaS application support and DevOps/SRE principles, along with strong leadership and collaboration skills to influence cross-functional engineering and Production management teams and drive continuous improvement in service reliability. This includes our commitment to being an inclusive workplace, attracting and developing exceptional talent, supporting our teammates’ physical, emotional, and financial wellness, recognizing and rewarding performance, and how we make an impact in the communities we serve.

Irving, TX

30+ days ago

$130,000–$150,000 Per Year

You'll combine hands-on Azure experience with code-level debugging, observability best practices, and automation to prevent issues before they occur, drive down MTTD/MTTR, and deliver an exceptional experience for patients and providers.

Make an Impact: Your work will directly shape the financial backbone of one of the most innovative healthcare fintech companies in the U.S.
Work Flexibly: Hybrid model based in Dallas with 3 days/week in-office.

Dallas, TX

30+ days ago

With offices in Toronto, San Francisco, Dallas, and Pittsburgh, Waabi is growing quickly and looking for diverse, innovative and collaborative candidates who want to impact the world in a positive way.

Experience with established methodologies (e.g., 5 Whys, Fishbone diagrams, Fault Tree Analysis, or FMEA) to support the responsibility of conducting thorough root cause analyses on recurring issues.

Plano, TX

30+ days ago

We are hiring a Site Reliability Engineer (SRE) with Oracle DBA expertise to join our client's Data Center Engineering team. This role focuses on ensuring uptime, scalability, and resilience of mission‑critical Oracle database infrastructure.

Dallas, TX

30+ days ago

Our holistic approach to decisioning is powered by our industry-leading platform and team of experts, who help leaders make better decisions, faster - unlocking business growth and creating powerful customer connections.

With clients in 50+ countries and global offices across New York City, Miami, Dallas, Dublin, London, Paris, Singapore, Shanghai, Munich, Poznan, Sydney, Melbourne, Charlottesville and Denver, we're growing fast.

Plano, TX

11 days ago

$117,000–$209,330 Per Year

The ideal candidate has deep experience operating production systems at scale, an automation-first mindset, and the ability to improve reliability through engineering practices such as SLOs/SLIs, production readiness, incident management, observability, resilience testing, and toil reduction.

As part of a new SRE team supporting Autodesk GovCloud, you will have a unique opportunity to help shape how Autodesk deploys, runs, and improves production services in restricted cloud environments.

Plano, TX

30+ days ago

Full-time

As a Site Reliability Engineer III at JPMorgan Chase within the Consumer and Community Banking, you will serve as an experienced member of an agile team, focusing on designing and delivering trusted, market-leading technology products that are secure, stable, and scalable. Experience in observability such as white and black box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others.

Plano, TX

30+ days ago

Full-time

As a Senior Lead Site Reliability Engineer at JPMorgan Chase within the Consumer and Community Banking team, you will set clear quality gates across requirements, design, secure coding, testing, releases, and post-production monitoring to ensure reliability, performance, security, and observability. Lead and participate in major incident response (including outside business hours when needed), run post-incident reviews, and drive improvements against KPIs like availability, MTTR, and change failure rate.

Dallas, TX

30+ days ago

Building, using, and automating monitoring systems such as NewRelic, DataDog, SignalFX, Kibana,

Hands-on experience deploying, operating, and monitoring production-grade AI/ML microservices (e.g., RAG pipelines, agentic systems) on cloud platforms like AWS Fargate/ECS.

Hands-on experience building and operating distributed systems in a public cloud environment (preferably AWS), using CI/CD to deploy, manage and operate production systems, focusing on tooling and automation using tools such as maven and Jenkins.

Dallas, TX

13 days ago

$145,000–$217,000 Per Year

The Technology & Operational Risk department within the Multifamily (MF) division is seeking a Site Reliability Engineer (SRE) who will blend software engineering with IT operations to ensure the reliability, availability, scalability, and performance of key systems, services, and environments.

Qualifications:

Proven expertise in designing, developing, and maintaining automation frameworks for application operations, including infrastructure provisioning, deployment pipelines, monitoring, and incident response, using tools such as Ansible, Terraform, Jenkins, and related technologies.

TX

30+ days ago

$104,000–$166,000 Per Year

The AWS Site Reliability Engineer (SRE) will collaborate closely with cross-functional teams, including development, quality assurance, and operations, to ensure seamless software releases and continuous improvement of our release processes.

What you will do:

Infrastructure Automation: Design, implement, and manage infrastructure as code (IaC) solutions using tools like AWS CloudFormation, Terraform or Helm Charts to automate continuous database deployment and scaling processes.

New!

Plano, Texas

5 days ago

Toyota is proud to have 10+ different Business Partnering Groups across 100 different North American chapter locations that support team members’ efforts to dream, do and grow without questioning that they belong.

Dallas, TX

30+ days ago

$81,456–$137,490 Per Year

In this role, you will contribute to environmental and stress testing efforts, support failure analysis investigations, and help analyze test and field data to identify potential reliability risks. You'll work closely with design, manufacturing, and supplier teams to help implement design-for-reliability best practices and assist with reliability verification activities from concept through production.

New!

Plano, TX

4 days ago

$96,800–$145,200 Per Year

New!

TX

3 days ago

Remote

$110,000–$137,485 Per Year

Founded in 2009, we continue to be recognized for our intentional culture and tremendous growth (Best Place to Work in Fintech; Best & Brightest to Work For Nationally; and Comparably's Best Company Culture, Best Career Growth, Best Engineering Team, and Best Places to Work in Dallas, among others).

The Sr Site Reliability Engineer, Release will prototype, write, maintain, and test code in multiple stages of the release process and in multiple environments in order to rapidly deliver automated solutions to our application releases.

Plano, TX

30+ days ago

$50–$55 Per Hour

Most recently, we were recognized as Stevie Employer of the Year 2025, SIA Best Staffing Firm to work for 2025, Inc 5000 Best Workspaces in US (2025 & 2024) and Glassdoor's Best Places to Work (2023 & 2022)! Primary Skills: Scripting (Expert), Java (Expert), Monitoring (Expert), Cloud Platforms (Intermediate), Database Technologies (Intermediate).

Dallas, TX

30+ days ago

This is a high-impact role designed for a technical leader with nearly a decade of specialization in Dynatrace SaaS, tasked with architecting and automating large-scale monitoring solutions across complex AWS and Azure environments. AI-Driven Insights: Harness Davis AI for causal analysis and root cause identification; develop custom dashboards, alerting profiles, and auto-remediation workflows to minimize MTTR.

New!

Plano, Texas

5 days ago

An important part of the Toyota family is Toyota Financial Services (TFS), the finance and insurance brand for Toyota and Lexus in North America.

Reliability Engineer Jobs in Plano, TX

Site Reliability Engineering (SRE) Team Lead OneMain Financial

SRE Engineer / Site Reliability Engineer Specialist NTT DATA

Site Reliability Engineer NTT DATA

Site Reliability Engineer III StratAcuity Staffing Partners Inc

Site Reliability Engineer (SRE) StratAcuity Staffing Partners Inc

Senior Site Reliability Engineer (SRE) StratAcuity Staffing Partners Inc

Engineering - SRE Platforms - Site Reliability Engineer - Vice President - Dallas The Goldman Sachs Group Inc

Asset & Wealth Management - Site Reliability Engineer - Vice President - Richardson The Goldman Sachs Group Inc

Site Reliability Engineer I General Motors Financial Company, Inc.

Senior Principal Reliability Engineer Raytheon

Senior Site Reliability Engineer (SRE) - NC, TX StratAcuity Staffing Partners Inc

Reliability Engineer - OnePay Synchrony Financial

Senior Site Reliability Engineer The Charles Schwab Corp

Site Reliability Engineer - Platforms Toyota Motor Corp

Senior Site Reliability Engineer - Database Services Toyota Motor Corp

Senior Site Reliability Engineer Las Vegas Sands

Site Reliability Engineer eTeam Inc.

Reliability Engineer - Advanced Thermal Management Coherent Corp

Lead Site Reliability Engineer JPMorgan Chase & Co

Senior Lead Site Reliability Engineer - AI/ML and Data Platforms JPMorgan Chase & Co

Site Reliability Engineer III Vaco LLC

Cloud Site Reliability Engineer Stefanini International Holdings Ltd

Platform Reliability Engineer, Azure Wellfit Technologies

Lead Site Reliability Engineer JPMorgan Chase Bank, N.A.

Sr. Infrastructure Site Reliability Engineer The Charles Schwab Corp

Compliance Engineering, Site Reliability Engineer SRE, Associate, Dallas The Goldman Sachs Group Inc

Senior SRE (Site Reliability Engineer) Retail Industry

Site Reliability Engineer (Onsite Hybrid) NTT DATA Group Corp

Lead Site Reliability Engineer (GTAM) JPMorgan Chase & Co

Lead Site Reliability Engineer General Motors Financial Company, Inc.

Site Reliability Engineer, DNS Optimum Communications Inc

Sr. Engineer, Site Reliability - Retail Mobility Engineering T-Mobile US Inc

Site Reliability Engineer Lead Bank of America

Site Reliability Engineer, Azure Wellfit Technologies Inc

Vehicle Reliability Engineer Waabi

Site Reliability Engineer – Oracle DB Glint Tech Solutions LLC

Site Reliability Engineer Analytic Partners Inc

Senior Site Reliability Engineer Autodesk Inc

Site Reliability Engineer III JPMorgan Chase Bank, N.A.

Senior Lead Site Reliability Engineer JPMorgan Chase Bank, N.A.

Senior Site Reliability Engineer Navan Inc

Site Reliability Engineer Tech Lead Federal Home Loan Mortgage Corp

Senior AWS Cloud Site Reliability Engineer (SRE) with AWS Database experience Peraton Inc

Site Reliability Engineer - Platforms TCC Toyota Motor Credit Corporation Company

Hardware Reliability Engineer II (R4675) Shield AI Inc

Site Reliability Engineer (Onsite Hybrid) NTT DATA Services, LLC

Sr Site Reliability Engineer - Release Alkami Technology Inc

Site Reliability Engineer: 26-01136 Akraya Inc.

Senior SRE (Site Reliability Engineer) Vytwo

Senior Site Reliability Engineer - Database Services TCC Toyota Motor Credit Corporation Company
