Cloud Systems Engineer

CubeSmart

Malvern, Pennsylvania

Posted: 5 days ago
Overview:

This is a hybrid role - 2 days remote and 3 days in the Malvern, PA office.

 

We are seeking a highly skilled Site Reliability & Cloud Systems Engineer to design, build, and operate scalable, secure, and highly automated cloud platforms in AWS. This role combines hands-on reliability engineering with cloud architecture and automation expertise, with a strong emphasis on building immutable infrastructure and improving system resilience.

You will play a key role in evolving our AWS ecosystem into a “push-button” platform: reducing manual operations, embedding security into every layer, and ensuring production systems are observable, performant, and self-healing. This role is well-suited for a proactive engineer who excels at the intersection of infrastructure, automation, and system reliability, blending responsibilities across SRE, DevOps, and Cloud Engineering.

 

Who we are:

At CubeSmart, we’re intentional about culture. You can experience it everywhere, from our mission statement of “genuine care” to our “It’s What’s Inside That Counts” tagline to calling each other “teammates” rather than employees. This spirit fosters a fun and collaborative environment that has driven our rapid growth and earned us recognition among the top companies in our industry.

 

CubeSmart’s award-winning team is made up of people who genuinely care. Teammates care about our customers and the life events and/or business needs they are facing. Teammates are passionate, responsible and understanding. The CubeSmart team is made up of people who have a can-do attitude, are committed to their own success and the success of the company, and lead by example. 

If this sounds like a team and culture that matches your personal values and motivations, we want to hear from you.

Responsibilities:

Reliability, Performance & Operations

  • Ensure uptime, reliability, and performance of AWS-hosted, Linux-based (Ubuntu) production systems and associated lower environments
  • Build and optimize observability using tools like Datadog, CloudWatch, Prometheus/Grafana, and PagerDuty
  • Work closely with development teams to diagnose site issues, mitigate impact, and restore system reliability while communicating clearly with stakeholders
  • Lead incident response, root cause analysis, and post-incident reviews
  • Participate in on-call rotations and support 24/7 production environments

Cloud Architecture & Automation

  • Architect and implement fully automated, ephemeral, and immutable AWS production and lower environments
  • Design scalable, resilient distributed systems using AWS best practices
  • Eliminate manual processes through Infrastructure as Code (Terraform, Ansible, Packer)
  • Build and maintain CI/CD and GitOps workflows (Jenkins, GitHub Actions, GitLab CI, ArgoCD/Flux)
  • Develop automation and tooling using Python and Bash to reduce operational toil

Infrastructure & Platform Engineering

  • Deploy and manage AWS services including EKS, ECS, Fargate, Lambda, RDS (Aurora PostgreSQL), OpenSearch, Redis, and ElastiCache
  • Design and manage networking components such as Transit Gateways, load balancers, and service meshes
  • Implement caching, microservices, and distributed system design patterns

Security & Governance

  • Architect and implement zero-trust security models using IAM, SCPs, and OIDC
  • Embed security into CI/CD pipelines using SAST/DAST tools (e.g., Snyk) 
  • Ensure compliance through automated auditing, backup strategies, and governance controls

Collaboration, Leadership & Strategy

  • Partner with development, security, and operations teams to build reliable, observable platforms
  • Document systems, runbooks, and operational procedures
  • Drive FinOps initiatives for cost optimization and forecasting
  • Integrate infrastructure changes into ITIL-compliant workflows (e.g., Freshservice)
  • Influence architectural decisions and promote engineering best practices across teams

Qualifications:

  • 6–10+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering roles
  • Deep hands-on expertise with AWS services and cloud architecture
  • Strong Linux systems engineering experience (Ubuntu preferred)
  • Proven experience with Infrastructure as Code (Terraform, Ansible, etc.)
  • Experience building and maintaining CI/CD pipelines
  • Proficiency in scripting/programming (Python, Bash)
  • Hands-on experience with monitoring and observability platforms
  • Solid understanding of cloud security principles (IAM, KMS, Secrets Management, Ansible Vault, HashiCorp Vault)
  • Bachelor’s degree or equivalent practical experience

Preferred Qualifications

  • Experience with containerization and orchestration (Docker, Kubernetes, EKS/ECS)
  • Familiarity with GitOps tools such as ArgoCD or Flux
  • Experience with SAST/DAST tools and secure SDLC practices
  • Knowledge of distributed systems, caching, and microservices architectures
  • Experience with FinOps and cost optimization strategies
  • Exposure to ITIL processes and service management platforms
