Site Reliability Engineer (SRE)

System One

McLean, VA (remote)

JOB DETAILS
SKILLS
Adobe Creative Suite, Agile Programming Methodologies, Amazon Web Services (AWS), Analysis Skills, Ansible, Automation, Bash Scripting, Best Practices, Business Loans, Cloud Applications, Cloud Computing, Code Reviews, Communication Skills, CompTIA - Computing Technology Industry Association, Continuous Deployment/Delivery, Continuous Integration, Cross-Functional, DevOps, Government Contracts, Identify Issues, Incident Management, Incident Response, Kanban, Linux Operating System, Mentoring, Microsoft Windows Operating System, Operational Improvement, Operations Processes, Order Picking/Packing, Outsourcing, Problem Solving Skills, Python Programming/Scripting Language, Red Hat Linux Operating System, Reliability Engineering, Risk Analysis, Root Cause Analysis, Scripting (Scripting Languages), Scrum Project Management and Software Development, Service Delivery, Software Engineering, Source Code/Configuration Management (SCM), Team Player, Technical Writing, Windows PowerShell
LOCATION
McLean, VA (remote)
POSTED
1 day ago

Remote
No sponsorship available. Must be able to obtain a Public Trust clearance.

What You Will Do

We are seeking a Site Reliability Engineer (SRE) to support the SBA Disaster Lending Platform modernization effort in a remote capacity. This role will help establish and mature SRE practices across AWS cloud environments, with a focus on reliability, automation, scalability, observability, incident response, and operational excellence.

In this role, you will work closely with engineering, DevOps, cloud, security, and product teams to improve system resilience, reduce downtime, strengthen deployment practices, and support reliable cloud-based application delivery in an Agile environment.

Responsibilities include:

• Help establish and mature SRE practices within an Agile Scrum delivery environment.
• Support system design reviews to identify reliability risks, failure points, scalability concerns, and opportunities for automation.
• Improve operational readiness by contributing to code reviews, deployment reviews, monitoring practices, and reliability-focused engineering standards.
• Support incident management activities, including troubleshooting, root-cause analysis, mitigation planning, and post-incident improvements.
• Build and maintain automation to improve reliability, reduce manual effort, and support self-healing cloud infrastructure.
• Support AWS cloud platform operations across monitoring, logging, security, scalability, and availability.
• Work with CI/CD and Infrastructure as Code tools to support repeatable, secure, and reliable deployments.
• Create and maintain clear technical documentation for systems, processes, runbooks, and operational procedures.
• Collaborate with cross-functional teams and stakeholders to promote DevOps, automation, and reliability best practices.

What You Will Need

• Minimum of four years of experience supporting the reliability, scalability, security, and operational excellence of AWS cloud platforms.
• Bachelor’s degree required, or four additional years of relevant experience in lieu of a degree.
• Hands-on experience with CI/CD and Infrastructure as Code tools such as Terraform, Ansible Automation Platform, GitLab, Artifactory, and Packer.
• Strong scripting and automation experience using Python, PowerShell, and Bash; Python experience is preferred.
• Experience supporting Windows and Linux environments.
• Strong understanding of networking concepts, cloud troubleshooting, monitoring, logging, and incident response.
• Experience designing, deploying, or supporting cloud-based systems with a focus on reliability, scalability, security, and performance.
• Knowledge of source control best practices.
• Experience working in Agile delivery environments, including Scrum, Kanban, SAFe, or similar methodologies.
• Strong analytical, troubleshooting, and problem-solving skills, including the ability to resolve complex technical issues in high-pressure situations.
• Strong communication skills and the ability to collaborate effectively with technical teams, stakeholders, and cross-functional partners.
• Must be authorized to work in the United States without sponsorship and able to obtain a Public Trust clearance.

Nice to Have

• Current or prior government contracting experience.
• Red Hat, CompTIA, AWS, or related technical certifications.
• Experience mentoring technical teams or helping promote DevOps/SRE practices across engineering groups.

System One and its subsidiaries, including Joulé and Mountain Ltd., are leaders in delivering outsourced services and workforce solutions across North America. We help clients get work done more efficiently and economically without compromising quality. In addition to serving as a valued partner for our clients, System One offers eligible employees health and welfare benefits coverage options, including medical, dental, vision, spending accounts, life insurance, and voluntary plans, as well as participation in a 401(k) plan.

System One is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, age, national origin, disability, family care or medical leave status, genetic information, veteran status, marital status, or any other characteristic protected by applicable federal, state, or local law.

Ref: #851-Rockville-S1


About the Company

System One

Every day, System One focuses on services and solutions that require a high degree of specialization, in-demand technical skills, and large-scale operational expertise. We are essential partners to those on the front lines of our nation’s most critical infrastructure, technology, and life sciences initiatives. 

Founded more than 40 years ago as a staffing partner to the engineering industry, today System One is a diversified organization operating in over 50 locations and putting more than 9,000 people to work in the United States, Canada, and the United Kingdom.

COMPANY SIZE
2,500 to 4,999 employees
INDUSTRY
Staffing/Employment Agencies
WEBSITE
https://systemone.com