136 Results for

Reliability Engineer Jobs in Bastrop, TX

Hybrid - Ability to work remotely part of the week

Travel %.

Individual Contributor

Flexible Work Option.

TX

We provide reliable and fast internet to millions of users worldwide, including populations with little or no connectivity, rural communities, aircraft, watercraft, and places where existing services are unreliable, too expensive, or disconnected by natural disasters.

As a Hardware Reliability Engineer (PCB), you will contribute to Starlink's in-house printed circuit board (PCB) production line (the largest in North America) by supporting quality and reliability efforts from material selection through end-user performance.

TX
Remote

You will work closely with infrastructure, Engineering, DevOps, and security teams to build robust systems, automate operations, and implement best practices for incident response, monitoring, and disaster recovery. Unless explicitly requested or approached by SS&C Technologies, Inc. or any of its affiliated companies, the company will not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services.

The Charles Schwab Corp logo

Austin, TX

Roles & Responsibilities:

  • Lead automation-first initiatives to eliminate toil and manual interventions, defining and executing the strategic roadmap for reliability, observability, and self-healing systems across AI.x platforms.
  • Implement comprehensive observability frameworks for real-time monitoring of AI services, including metrics, logs, and traces, with intelligent alerting and automated diagnostics to minimize MTTD and MTTR.

What We Offer:
Competitive salary and equity options.
Comprehensive benefits package, including health, dental, and retirement plans.
A dynamic work environment that fosters creativity and innovation.
Opportunities for professional growth and development in a rapidly evolving industry.
Relocation assistance.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses and identifying potential inconsistencies or verification signals in application materials based on available information. You’ll qualify suppliers, drive root-cause investigations, and build the quality systems that let us scale — while simultaneously defining reliability requirements, running life-test programs, and ensuring every unit we ship meets the demanding uptime expectations of AI infrastructure customers.

Austin, Texas

As a globally distributed, fast-growing startup, we’re committed to building a diverse and inclusive team that reflects the wide range of perspectives needed to solve the world’s hardest connectivity problems. You'll collaborate with product engineering teams to improve system resilience, lead and develop incident management processes and build observability solutions for our unique distributed architecture.

Austin, TX

You'll play a critical role in building scalable infrastructure patterns, advancing observability, improving incident response, and partnering with engineering teams to embed reliability into system design and delivery. At CertifID, we take this threat seriously and provide a secure platform that verifies the identities of parties involved in transactions, authenticates wire transfer instructions, and detects potential fraud attempts.

Austin, Texas

Job Posting Start Date: 05-26-2026. Job Posting End Date: 06-30-2026.

Flex is the diversified manufacturing partner of choice that helps market-leading brands design, build and deliver innovative products that improve the world.

This role provides technical leadership for inputs, data structure, and reports & analysis within the Corrective Maintenance Management System (CMMS) and other systems used to calculate OEE, Downtime, Line & Machine Utilization metrics.

TX

PREFERRED SKILLS AND EXPERIENCE:

• Master's degree in electrical engineering, computer engineering, physics, or other STEM discipline
• Experience with both electronics and mechanical design and analysis including knowledge of RF principles
• Experience analyzing circuits and PCBAs, and developing functional test plans
• Experience with test equipment and measurement techniques to verify and validate product requirements (oscilloscope, multimeter, electronic load, spectrum analyzer, network analyzer, vector signal generator, source measurement unit, etc.)
• Experience in environmental testing such as HALT/HASS, thermal, humidity, shock & vibration
• Strong understanding of computers and programming languages (Python, C/C++)
• Thorough understanding of electronics reliability, manufacturing, and failure mechanisms
• Knowledge of quality tools such as Lean principles, Six Sigma, and root cause analysis methods
• Thorough understanding of metrology, sources of measurement error, and uncertainty analysis.

As an Electrical Test and Reliability Engineer on Starlink, you'll contribute to the design, build, and test of electronics for the world's largest network of ground stations, which function as an intermediary between the satellites and our end-users to ensure smooth data transfer and transmission.

TX

ITAR REQUIREMENTS:

  • To conform to U.S. Government export regulations, applicant must be a (i) U.S. citizen or national, (ii) U.S. lawful, permanent resident (aka green card holder), (iii) Refugee under 8 U.S.C. § 1157, or (iv) Asylee under 8 U.S.C. § 1158, or be eligible to obtain the required authorizations from the U.S. Department of State.
  • Allocate resources appropriately to ensure successful day-to-day operations to process customer returns, troubleshoot failures and refurbish units and support production quality milestones.

TX

As an Electrical Test and Reliability Engineer on Starlink, you'll contribute to the design, build, and test of electronics for the world's largest network of ground stations, which function as an intermediary between the satellites and our end-users to ensure smooth data transfer and transmission.

Experience with test equipment and measurement techniques to verify and validate product requirements (oscilloscope, multimeter, electronic load, spectrum analyzer, network analyzer, vector signal generator, source measurement unit, etc.).

Rocket Companies Inc logo

TX
Remote
  • $180,100–$278,700 / year

This role requires depth in design, collaboration with internal teams, a proactive approach to problem solving, and the ability to share complex ideas with senior leadership and secure their support.

  • You have a proven history in architecting, building, scaling, and supporting cloud infrastructure technologies, specializing in database and storage services and can communicate the direct business impact of this work.

  • Austin, TX
    • $111,600–$186,000 / year

    Cox Automotive employees get to work on iconic consumer brands like Autotrader and Kelley Blue Book and industry-leading dealer-facing companies like vAuto and Manheim, all while enjoying the people-centered atmosphere that is central to our life at Cox.

    Through groundbreaking technology and a commitment to stellar experiences for drivers and dealers alike, Cox Automotive employees are transforming the way the world buys, owns, sells - or simply uses - cars.

    Apple Inc logo

    TX

    Define and track key service level indicators (SLIs) and service level objectives (SLOs) to measure and improve service reliability. 3+ years of experience in a Site Reliability Engineering, DevOps, or related role, supporting large-scale, enterprise-level services. You will collaborate with Engineers, Data Engineers, DBAs, and network specialists to proactively identify and resolve potential issues, automate repetitive tasks, and drive continuous improvement initiatives.

    Austin, TX
    • $113,300–$205,520 / year

    With Jamf, customers are able to confidently automate Mac, iPad, iPhone and Apple TV deployment, management, and security - anytime, anywhere - to protect the data and applications used by employees in the workplace, students learning in the classroom, and streamline communications in healthcare between patients and providers.

  • Set the conditions for AI agents to do reliable work in our environment, including repository context, well-specified tasks, integrations such as MCP servers that give AI safe access to the systems it needs, and the tests and guardrails needed for AI-authored change to be trusted.

  • Manager - Non People Leader

    Flexible Work Option.

  • Must have 6 years of experience in each of the following:
    • Working with observability platforms, including NewRelic, Cloudwatch, Grafana, or DataDog;
    • Working in AWS and CI/CD tooling and capabilities;
    • Using Infrastructure as Code with Terraform or AWS CloudFormation;
    • Working in one of the following programming languages: C#, Java, or Python; and.

  • Austin, TX

    You will remain hands-on with production systems while setting direction, driving operational excellence, and fostering a strong team culture focused on ownership, reliability, and continuous improvement.

    TeamViewer provides a leading Digital Workplace platform that connects people with technology, enabling, improving and automating digital processes to make work work better.

    The Charles Schwab Corp logo

    Austin, TX

    5+ years of experience leading the implementation and scaling of reliability engineering practices such as service level objectives, monitoring strategies, incident reviews, and automation-driven improvements. You will lead efforts to elevate production operations through modern Site Reliability Engineering practices, shaping how engineering teams design, build, and operate resilient systems at scale.

    Austin, TX
    • $98,583–$138,016 / year

    You will collaborate with DevOps, development, and infrastructure teams to resolve moderately complex issues, propose improvements, and strengthen the reliability, scalability, and security of our SaaS platform.

  • Experience with cloud platforms (Azure or AWS), including services such as AKS, ECS/EKS, Functions/Lambda, S3, and Blob storage.

  • Visa Inc logo

    Austin, TX
    Remote
    • $131,600–$210,300 / year

    The ideal candidate will be capable of leading the design and delivery of enterprise-grade cloud solutions using Azure-native and hybrid-cloud patterns, and of driving best practices for reliability, security, and operational excellence across the data platform.

    This role requires demonstrated experience architecting, implementing, and optimizing Azure-based platforms and services, including cloud networking, compute, storage, identity and access management, observability, and container orchestration.

    New

    Austin, TX

    You will work on critical platform systems including EKS infrastructure, Skyway (CI/CD), Frontdoor (Tyk API Gateway), Pantheon (Apollo GraphQL Federation), and our observability stack, while contributing to chaos engineering practices and cost optimization initiatives with measurable ROI. Through its robust suite of tools, Realtor.com not only makes a significant impact on the real estate industry at large, but for consumers, navigating the biggest purchase they will make in their life, by providing a user experience that is easy to use, easy to understand, and most of all, easy to make decisions.

    Austin, TX
    Remote
    • $100,000–$120,000 / year

    • Identify and eliminate toil through automation, tooling, and improved workflows
    • Partner with product and platform teams on architecture decisions, production readiness, and designing systems that recover from failure
    • Build reusable systems and "paved roads" that make it easier for teams to operate their services reliably
    • Mentor other engineers and raise the overall operational maturity of the organization

    Qualifications:
    • 6 - 10+ years of experience in SRE, infrastructure, or backend systems engineering
    • Demonstrated experience of owning reliability outcomes for complex, distributed systems
    • Strong experience with cloud infrastructure (AWS, GCP, or Azure) and production-scale systems
    • Deep understanding of observability, incident management, and system performance
    • Proficiency in at least one programming language (e.g., Go, Python, Java) with a focus on automation and tooling
    • Able to change how other teams work without having managerial authority over them
    • Strong competency in making clear decisions during incidents by following a defined process without reacting emotionally.

    Responsibilities:
    • Lead efforts to improve system reliability, scalability, and performance across critical services
    • Define and implement SLIs/SLOs and error budgets, and use them to guide engineering priorities
    • Design and develop observability systems (metrics, logging, tracing, alerting) that produce actionable alerts and data with minimal noise
    • Lead complex incident response, acting as incident commander when needed
    • Conduct postmortems focused on systemic causes rather than individual fault, and ensure corrective actions from those reviews are completed.

    Austin, TX
    • $70,480–$179,090 / year

    Samsung Austin Semiconductor is looking for a BEOL reliability engineer with technical expertise, who can drive and ensure the robustness of reliability for advanced CMOS technology.

    Here's What You'll Be Responsible For:

    • Provide technical leadership for reliability improvement activities, collaborating closely with cross-functional teams.

    The Charles Schwab Corp logo

    Austin, TX

    This role requires a balance of strategic thinking and hands-on problem-solving to optimize systems, reduce operational toil, and improve key metrics such as MTTD and MTTR, ultimately ensuring a seamless and reliable experience for clients. As a Sr Specialist - Site Reliability Engineer (SRE) within Client Data Technology, you will play a critical role in ensuring the availability, performance, and resiliency of highly visible cloud-based platforms and applications.

    The Charles Schwab Corp logo
    New

    Austin, TX

    As a Senior Reliability Engineer, you'll play a pivotal role in shaping the reliability and scalability of our mission-critical applications, collaborating across teams to deliver solutions that matter.

  • Collaborate with Engineering, Scrum, and Operations teams to provide technical expertise and support key initiatives for system availability and reliability.

  • New

    TX

    Work closely with other SpaceX engineers to gather requirements, research, evaluate, design, plan, deploy, and support software platforms and related technologies running in Kubernetes within a world-class environment that meets the needs of the demanding SpaceX engineering teams.

    ITAR REQUIREMENTS:

    • To conform to U.S. Government export regulations, applicant must be a (i) U.S. citizen or national, (ii) U.S. lawful, permanent resident (aka green card holder), (iii) Refugee under 8 U.S.C. § 1157, or (iv) Asylee under 8 U.S.C. § 1158, or be eligible to obtain the required authorizations from the U.S. Department of State.

    Apple Inc logo

    Austin, TX

    In this highly visible position, you will:
    • Innovate, architect, build, and document highly available, scalable, reliable, secure infrastructure
    • Troubleshoot application-specific, network, system & performance issues
    • Build and maintain CI/CD infrastructure to enable fast delivery cycles for software engineering teams
    • Envision and build automation tools to deliver infrastructure services reliably and in a repeatable fashion
    • Collaborate with other site reliability engineers, software engineers, and quality engineers to gather, define, and analyze non-functional/technical requirements

    • 5+ years of experience in designing and building resilient, large-scale, low latency, cloud and on-prem infrastructure including Compute, Storage, and Network
    • 3+ years of experience with deploying/managing Kubernetes using Helm
    • Experience with Shell Scripting, Python, or Ansible
    • Experience in monitoring using Splunk, Grafana, Prometheus, Alertmanager
    • Deep understanding of networking protocols: DNS, TCP, HTTP/HTTPS
    • Experience in setting up and managing CI/CD pipelines
    • Bachelor's or Master's in Computer Science or equivalent experience
    • Excellent problem solving, critical thinking, and interpersonal skills
    • Good communication skills to collaborate with distributed teams
    • Experience with Cassandra, MongoDB, Couchbase databases, AWS S3 or similar storage technologies
    • Experience in deploying, monitoring and supporting Java applications
    • Experience with ArgoCD and GitOps model
    • Experience in defining, monitoring and achieving key operational metrics like MTTR and SLO
    • Experience with GenAI tools in workflow automation for infrastructure management
    • Ability to learn new technologies in a short time.

    In this role you will design, build and deliver highly scalable, reliable, secure cloud infrastructure which powers the applications and services used by Apple's customers every day.

    New

    Austin, TX

    Must have 6 years of experience in each of the following:

  • Working with observability platforms, including NewRelic, Cloudwatch, Grafana, or DataDog;

  • Working in AWS and CI/CD tooling and capabilities;

  • Using Infrastructure as Code with Terraform or AWS CloudFormation;

  • Working in one of the following programming languages: C#, Java, or Python; and.

    Cox Automotive employees get to work on iconic consumer brands like Autotrader and Kelley Blue Book and industry-leading dealer-facing companies like vAuto and Manheim, all while enjoying the people-centered atmosphere that is central to our life at Cox.

  • Austin, TX

    Broad technical experience across infrastructure and distributed systems, with the ability to design effective solutions, apply appropriate patterns, and anticipate scaling, reliability, and operational challenges.

    We are seeking a capable, motivated generalist who thrives in a change-controlled, compliant environment and enjoys working across hybrid cloud and on-premises systems.

    Austin, TX
    • $140,000–$170,000 / year

    The ideal candidate will be a systems problem solver with a passion for crafting products that deliver incredible customer experiences, have deep experience with infrastructure, operational automation, data driven metrics collection, modern platform management, and a true desire to automate it rather than do it repeatedly. Role Description: As a Site Reliability Engineer, you will work with Agile engineering teams to provide production insight into running and operating software at-scale in a globally distributed and highly available cloud based system.

    Austin, TX

    Building, using, and automating monitoring systems such as NewRelic, DataDog, SignalFX, Kibana,

  • Hands-on experience deploying, operating, and monitoring production-grade AI/ML microservices (e.g., RAG pipelines, agentic systems) on cloud platforms like AWS Fargate/ECS.
  • Hands-on experience building and operating distributed systems in a public cloud environment (preferably AWS), using CI/CD to deploy, manage and operate production systems, focusing on tooling and automation using tools such as maven and Jenkins.

  • Apple Inc logo

    Austin, TX

    • You can explain a complex system to a room of engineers who didn't build it
    • Experience building internal automation or self-service tooling (Slack bots, CLI tools, workflow orchestration) that reduced manual operational work
    • BS in Computer Science, Engineering, or equivalent practical experience, with 7+ years of experience in distributed systems
    • Experience with event-driven architectures (Kafka, RabbitMQ, or similar messaging systems)
    • Experience with service mesh or API gateway patterns (Istio, Envoy, Kong, or similar)
    • Familiarity with Django/Python web applications and their operational characteristics (Celery, Gunicorn, PostgreSQL)
    • Experience with observability tooling beyond basic monitoring: distributed tracing, SLO frameworks, structured logging
    • Background working with sensitive data (health data, PII) and associated compliance requirements
    • Experience leading incident response and building on-call culture
    • Contributions to internal or open-source infrastructure tooling.

    Our team operates 50+ services across Kubernetes and AWS, handles sensitive health and research data, and is ramping up many architectural shifts: new service-to-service auth patterns, event-driven pipelines, and a move from on-prem to cloud-native infrastructure.

    Texas, TX
    Remote
    • $104,900–$174,700 / year

    Required Qualifications:
    • 5+ years of hands-on experience in SRE, DevOps, or Infrastructure Engineering roles
    • Strong production experience in AWS
    • Required: Significant hands-on experience with Terraform in real-world environments
    • Experience operating monitoring and uptime platforms such as Grafana, Pingdom, and Uptrends
    • Strong Linux systems, networking, and troubleshooting skills
    • Experience supporting production systems through incident response and on-call rotations
    • Proficiency with GitHub and modern Git workflows
    • Experience building or maintaining CI/CD pipelines with Azure DevOps
    • Familiarity with ITSM and incident workflows using ServiceNow
    • Strong written communication skills with experience documenting systems and processes in Confluence
    • Ability to work independently in a remote or hybrid environment.

    Preferred Qualifications:
    • Experience defining and operating against SLOs and error budgets
    • Infrastructure-as-Code best practices beyond Terraform (modules, testing, CI integration)
    • Experience with containers and orchestration (Docker, Kubernetes)
    • Experience supporting large-scale, high-availability production systems
    • Prior experience mentoring engineers or serving as a technical lead.

    Austin, TX

    This role provides technical leadership for inputs, data structure, and reports & analysis within the Corrective Maintenance Management System (CMMS) and other systems used to calculate OEE, Downtime, Line & Machine Utilization metrics.

    Reporting to Manager, as part of the site engineering team, the Equipment Reliability Engineer is responsible for Equipment Reliability Engineering across a diverse manufacturing operation.

    Visa Inc logo

    Austin, TX
    Remote
    • $152,200–$243,700 / year

    Masters, MBA, JD, MD) or 3 or more years of experience with a PhD
    • Familiarity and experience with DevOps ways of working
    • Familiarity and experience building, managing, and operating CICD platforms
    • Familiarity and experience with GitOps technologies
    • Familiarity and experience with Azure cloud services
    • Experience leading teams of technical professionals.

    Visa is a world leader in payments technology, facilitating transactions between consumers, merchants, financial institutions and government entities across more than 200 countries and territories, dedicated to uplifting everyone, everywhere by being the best way to pay and be paid.

    Austin, TX

    THE PERSON: Demonstrated success working in product Quality and Reliability, with strong familiarity of Semiconductor IC Product Development, Silicon Fab process, Package Reliability, high volume manufacturing, external supplier and customer interactions, and project management.

    THE ROLE: Join a global product reliability team that drives silicon and package qualifications to develop the next generation artificial intelligence (AI) capabilities within a single package, leveraging advanced silicon and packaging technologies.

    Austin, TX
    • Full time

    About ShipperHQ:

    ShipperHQ is a trusted leader in the e-commerce shipping space, with over 15 years of experience helping merchants deliver better checkout experiences.

    Qualifications:

    • 5+ years of experience in Software Engineering, Site Reliability Engineering, DevOps, or similar roles.

    Apple Inc logo

    Austin, TX

    Practical fluency in applying Generative AI tools within SRE and software engineering workflows - from accelerating observability query construction and alert design to building AI-assisted debugging and triage capabilities that encode institutional knowledge into repeatable, context-aware workflows - with the engineering rigour to validate, own, and iterate on AI-assisted outputs in production-adjacent contexts. Proven ability to automate repetitive tasks and complex workflows using Python or Go. Experience configuring and managing modern monitoring suites (e.g., Prometheus, Grafana, ClickHouse) with a focus on creating actionable, high-signal quality alerting.

    Apple Inc logo

    Austin, TX

    The cross-functional team collaborates to ensure we apply a consistent incident management process across all data platform services and provide user journey based SLOs derived from exhaustive observability metrics, high availability architecture, and automation for deployments. Our Data Platform Site Reliability Engineering team manages the infrastructure and applications on bare-metal and cloud computing platforms to deliver data processing, governance, and storage for many of Apple's global products and organizations.
