Senior Cloud Systems Engineer

Lunar Outpost Inc.

Golden, CO

Are you passionate about shaping the future of humanity's presence in space? Lunar Outpost, an industry leader in space robotics and planetary vehicles, invites you to join our team! Lunar Outpost is dedicated to creating a permanent presence in space, while also driving positive impacts here on Earth. Currently, we are seeking a Senior Cloud Systems Engineer to contribute to our mission in a dynamic startup environment.

The main responsibilities of this role include managing Stargate deployments in production, ensuring high availability and uptime, executing reliable releases, and driving operational excellence through comprehensive monitoring, metrics, and infrastructure management. Stargate is a next-generation Command and Control (C2) platform: the ground software that enables and empowers all Lunar Outpost missions, including the Lunar Terrain Vehicle (LTV) program. As mission-agnostic software used by all operators in mission control, Stargate's reliability and uptime are critical to mission success.

Take the #NextLeap with Lunar Outpost and work on the Pegasus LTV, which will carry NASA astronauts farther than they've ever been before on the lunar surface!

Key Responsibilities:

Own and manage Stargate production releases and deployment pipelines using GitOps practices

Drive operational excellence initiatives including metrics collection, log aggregation, uptime monitoring, KPI tracking, and SIM Integration

Achieve and maintain uptime SLAs of 99.99% (four nines) to 99.999% (five nines)

Design, develop, and maintain Helm charts for Stargate and related infrastructure components

Implement and manage progressive deployment strategies including canary deployments and blue-green deployments

Oversee critical Kubernetes infrastructure including volume management, DNS configuration, load balancer provisioning, and secrets monitoring and management

Manage and optimize Kubernetes deployments and related AWS services

Implement and maintain an observability stack using OpenTelemetry for comprehensive monitoring and alerting

Collaborate with engineering teams to establish and enforce operational best practices and reliability standards

Required Qualifications:

5+ years of production DevOps/SRE experience with a demonstrable track record of maintaining high-availability systems

Kubernetes administration experience with elevated cluster access in production environments

Strong proficiency in writing and maintaining Helm charts for complex, multi-component applications

Hands-on experience implementing canary deployments, blue-green deployments, and other progressive delivery patterns

Deep knowledge of Kubernetes infrastructure management: persistent volumes, DNS/networking, load balancers, and secrets management

Production experience with GitOps workflows and Flux CD

Proven track record of maintaining 99.99%+ uptime in production environments

Excellent judgment and decision-making skills when working with production systems

Preferred Qualifications:

Experience with AWS cloud services, particularly EKS (Elastic Kubernetes Service), Secrets Manager, VPC networking, IAM, and AWS Load Balancers

Experience with Karpenter for Kubernetes node autoscaling and cluster optimization

Experience with OpenTelemetry instrumentation and observability platforms

Kubernetes certifications (CKA, CKAD, or CKS)

Experience building and maintaining CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI, etc.)

Knowledge of infrastructure-as-code tools (Terraform, CDK)

Experience implementing SRE practices including SLIs, SLOs, and error budgets

Compensation & Benefits: Compensation level and base salary are competitively structured and thoughtfully determined based on factors such as relevant skills, experience, education, and the scope of the role.

Comprehensive health coverage: Medical, dental, and vision benefits, with 70% of premiums covered by the employer

Paid time off: Three (3) weeks per year of vacation

Retirement plan: Up to 4% employer match on 401(k) contributions

Paid holidays: 11 company-recognized holidays

Parental leave

Educational reimbursement opportunities to support company objectives, continued learning, and career development

Lunar Outpost Inc. is an equal opportunity employer. Lunar Outpost Inc. does not discriminate on the basis of race, color, religion, sex (including pregnancy, sexual orientation, and gender identity), national origin, ethnicity, age, disability, veteran status, genetic information, or any other characteristic protected by applicable law. All employees, including executives and human resources personnel, are expected to conduct themselves with professionalism and treat others with dignity and respect in accordance with this policy.
