Job Requirements of Lead Site Reliability Engineer:
- Employment Type: Full-Time
- Manage Others: Unknown
- Location: Marlborough, MA (Onsite)
Lead Site Reliability Engineer
Join our team of more than 34,000 team members, supporting our members and communities in our Club Support Center, 235+ clubs and eight distribution centers. BJ’s Wholesale Club offers a collaborative and inclusive environment where all team members can learn, grow and be their authentic selves. Together, we’re committed to providing outstanding service and convenience to our members, helping them save on the products and services they need for their families and homes.
The Benefits of working at BJ’s
• BJ’s pays weekly
• Eligible for free BJ's Inner Circle and Supplemental membership(s)*
• Generous time off programs to support busy lifestyles*
o Vacation, Personal, Holiday, Sick, Bereavement Leave, Jury Duty
• Benefit plans for your changing needs*
o Three medical plans**, Health Savings Account (HSA), two dental plans, vision plan, flexible spending
• 401(k) plan with company match (must be at least 18 years old)
*eligibility requirements vary by position
**medical plans vary by location
As a Lead Site Reliability Engineer (SRE) for BJ’s Wholesale Club’s digital team, you will play a critical role in ensuring the reliability, scalability, and performance of our digital platforms, including bjs.com, mobile apps, and in-club digital capabilities. Supporting the full lifecycle of digital experiences, from order placement to fulfillment, you will lead efforts to enhance system reliability, implement SRE best practices, and foster collaboration across teams.
Key Responsibilities:
- Ensure Reliability and Scalability: Lead the design, deployment, and management of highly reliable and scalable systems for BJ's digital properties, including e-commerce platforms and mobile applications.
- Full-Stack Expertise: Analyze and fix issues in Java-based microservices, React applications, and their backend services.
- Proactive Monitoring and Incident Response: Build and maintain monitoring systems using tools like New Relic, dataset, or similar to identify and resolve issues before they impact members.
- Collaboration and Leadership: Drive cross-functional collaboration with development and operations teams to align on goals for system performance and reliability.
- SRE Methodologies: Implement SRE principles, including service level objectives (SLOs) and service level indicators (SLIs), to ensure a seamless customer experience.
- Automation and Optimization: Automate repetitive tasks and optimize workflows to improve operational efficiency and reduce toil.
- Incident Management: Lead root cause analyses for critical production incidents, generating detailed RCA reports and driving actions to prevent recurrence.
- Change Management: Ensure reliable deployments by following structured change management processes and leveraging version control systems.
- Security and Compliance: Collaborate with security teams to ensure systems and applications meet stringent security standards.
Qualifications:
- Educational Background: Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
- Experience: At least 6 years in SRE or related roles, with experience in large-scale e-commerce or digital platforms.
- Technical Skills:
  - Strong hands-on knowledge of Java-based microservices and APIs.
  - Proficiency in observability tools (e.g., New Relic, Splunk).
  - Hands-on experience with automation scripts (Bash, Python).
  - Experience with change management tools such as Terraform.
  - Knowledge of CI/CD, containerization (e.g., Docker, Kubernetes), and cloud technologies.
- Soft Skills:
  - Exceptional communication skills to work effectively with cross-functional teams and leadership.
  - Strong analytical and problem-solving abilities.
  - Demonstrated ability to learn and adapt to new tools and technologies quickly.
Job Conditions:
- Collaborate within a dynamic and innovative team environment.
- Participate in and initiate cross-training and knowledge-sharing efforts.
- Join an escalation on-call rotation that provides 24/7 support for critical systems, and lead the team during critical issues.
Job ID: AC02-17467_R147855-2