Senior Site Reliability Engineer
Auckland, Auckland, New Zealand
Reports to: Project Lead
Experience: 5+ years
Start date: 1st August 2022
Responsibilities
Reduce toil, implement identified improvement opportunities, and handle minor enhancements and non-ticketed activity
Define and monitor service level metrics, including reliability metrics such as MTTD, MTTR, MTBF, MTTF, unavailability rate, and incident count
Create rules that optimize incident response through metrics, streamlined alert flows, and better collaboration and communication across squads
Proactively identify issues that might disrupt the service in production
Route incoming service requests to the appropriate support groups or Jira tool
Create and maintain alerts
Handle change validation and change planning-related requests
Assist business stakeholders in determining SLOs or adjusting threshold limits
Manage demand and capacity, and correct SLI/SLO threshold limits
Gather and analyze metrics from both infrastructure and applications to assist in bug fixing
Engage in capacity planning and performance tuning exercises
Partner with development teams to improve services through rigorous testing and release procedures
Participate in system design consulting, platform management, and capacity planning
Create sustainable systems and services through automation and uplifts
Balance feature development speed and reliability using well-defined service level objectives and indicators (SLOs, SLIs)
Debug production issues across services and levels of the stack.
Required skills and qualifications
Bachelor’s degree in computer science or another highly technical, scientific discipline
Experience in AEM and web services/APIs
Experience working with public clouds (minimum 3 years required)
Experience with Git or other source control systems
Experience using tools to create and manage CI (continuous integration) and CD (continuous delivery) pipelines
Working knowledge of service level definitions and identifying KPIs
Working knowledge of the TCP/IP stack, internet routing, and load balancing
Experience with distributed storage technologies like NFS, HDFS, Ceph
Experience in observability strategy
Delivery Model: Onsite
Job Type: Full Time
Job Location: Auckland