Site Reliability Engineer

Boston, MA Contractor
$110,500.00 - $155,000.00 / year

Hi

Hope you are doing well

Immediate need: Site Reliability Engineer in Boston, MA, once Covid clears

Duration: Long term, in Boston once Covid clears



We are looking for a Senior Site Reliability Engineer with a proven track record of delivering software infrastructure while working closely with software engineering teams.

You will develop, maintain, and scale our infrastructure for deploying and monitoring our software using the latest tools and methodologies, including agile, CI/CD, and infrastructure as code. More specifically, you will help build from the ground up and configure the Azure cloud infrastructure, common core services, and observability tools that will be used by our engineers and other SREs.


Key Responsibilities:

• Contribute to the advancement of both software development and cloud infrastructure efforts

• Partner with developers to apply best practices that ensure fully working test and production environments, using logging/monitoring tools (e.g., Prometheus, Grafana, home-grown), alerting/notification tools (e.g., Opsgenie), any other tools that help reduce time-to-detect/time-to-mitigate, and tools for disaster recovery, high availability, and business continuity

• Design, build and maintain CI/CD, testing, and operations infrastructure for our systems

• Create documentation, runbooks, and operational standards with a focus on automation

• Support Production systems and respond to operational incidents


Requirements


• 5-7 years of relevant experience

• Excellent communication skills, both verbal and written

• Proven hands-on experience with Azure Cloud Computing Resources

• Strong knowledge of scripting languages such as PowerShell, Bash, and Python

• Experience using infrastructure automation tools

• Strong Linux background

• Previous experience being embedded within a software engineering team

• Ability to work on a team and independently

• Participate in on-call rotation

• Open to mentoring other team members

• Time management skills – ability to adjust to shifting priorities

• Experience in the financial industry would be a plus


General Comments

Additional notes from the manager:


• Prefer to see candidates who have some longevity with their previous firms, i.e., more than 6 months to 1 year at a company.

• Must have hands-on Azure experience

• Prefer local candidates only or those who are seriously committed to relocating.

• Flexible with start/end times, on-call rotations, and some weekend work (a couple of hours, at least once a month) for releases, production problems, or maintenance

• We’re not yet in the cloud but are taking quick strides to get there. There is some on-premises work involved, too.



Recommended Skills

Azure
Site Reliability Engineer