Senior Site Reliability Engineer
The Senior Site Reliability Engineer (SRE) will have primary responsibility for the availability and reliability of the client's digital products, ensuring they meet the requirements of our internal and external users, and will also participate as a Software Engineer on important innovation prototypes.
The role will act as a trusted advisor, liaising closely with customers, developers, architects, and product owners to help automate system management requirements and quickly launch new digital products and prototypes. The Senior SRE will be responsible for streamlining, scaling and optimizing digital product infrastructure while improving reliability, performance, and cost.
ESSENTIAL JOB FUNCTIONS
• Lead digital product development initiatives by collaborating with our business and innovation teams to continuously refine our K8s deployment practices for improved reliability, repeatability and security in our fast-paced, global business environment.
• Collaborate and coordinate directly with product developers, architects, and business teams to help continuously improve LivaNova digital products and customer experiences.
• Write code and scripts to automate provisioning of K8s services and configure services using tools such as Git, Spinnaker, Terraform, Quay, Vault, Atlantis, Helm and Istio.
• Design effective monitoring, alerting and log aggregation to identify and respond to issues proactively using tools such as Datadog and Sentry.
• Configure build pipelines to support continuous integration and delivery by automating testing, deployment and business continuity using tools such as Spinnaker. Pipelines will be configured for specific digital products that will help optimize for performance, scalability and supportability.
• Clearly document and diagram architecture, security controls and K8s environments to ensure compliance with best practices and regulatory standards.
• Participate in end-to-end digital product delivery, from system design consulting through platform management, infrastructure support, capacity planning and performance tuning.
• Monitor digital product platform health for uptime, security, performance and end-user experience.
• Engage with key vendors in assessing technology fit with LivaNova's future technology architecture and provide recommendations.
• Troubleshoot issues in K8s environments by applying debugging and problem-solving techniques and partnering with product development teams, full-stack community and vendors.
• Recommend improvements to deployment patterns and practices based on deployment learnings and production issues.
KNOWLEDGE, SKILLS AND ABILITIES REQUIRED
• Significant software engineering experience required with proven understanding of full Software Development Life Cycle (Waterfall and Agile)
• Demonstrable involvement with continuous improvement and automation initiatives
• Experience with distributed systems, including their maintenance and debugging
• Familiarity with infrastructure migration, virtualization, performance analysis, log storage systems and new functionality enablement
• Exposure to the design and implementation of infrastructure, including configuration, build, installation and operation
• Experience working with one or more cloud providers (AWS, GCP, Azure), exposure to multi-cloud is a bonus
• Proven experience with key container orchestration technologies and containerization principles
• Experience defining and implementing CI/CD process and tools
• Knowledge of production databases such as MySQL, PostgreSQL, MongoDB, Neo4j and Aerospike, and the impacts these have on application design
• Familiarity with CI/CD orchestration tools such as Kubernetes Operators, Helm, Spinnaker, Quay and Terraform
• Strong understanding of K8s administration
• Networking expertise, including VPCs, SDNs, VLANs, routers and firewalls
• Familiarity with at least one IAC/CM tool such as Terraform
• Familiarity with at least one code build/deploy tool such as Jenkins
• Experience with K8s on a public cloud provider such as Azure, GCP or AWS
• Bachelor’s degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience
• 7+ years of work experience in distributed systems design, maintenance, and troubleshooting
• 4+ years of experience working in a DevOps environment
• 3+ years of software engineering in fast-moving technical environments, preferably start-ups or scale-ups
Apex Systems is an equal opportunity employer. We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law. Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law. If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation in using our website for a search or application, please contact our Employee Services Department at 844-463-6178