Principal Site Reliability Engineer
Company: VirtualVocations
Location: Toms River
Posted on: November 15, 2024
Job Description:
A company is looking for a Principal Site Reliability Engineer
to lead the implementation of observability and automated incident
management solutions.
Key Responsibilities
Design and implement alert correlation, auto-triage, and auto-remediation frameworks for a microservices-based SaaS architecture
Define and monitor Service Level Objectives (SLOs) in collaboration with product and engineering teams
Mentor engineers and promote best practices in reliability engineering across the organization
Required Qualifications
15+ years of professional experience with 5+ years in enterprise SaaS environments
Proven experience in architecting and implementing SRE solutions at scale
Deep knowledge of incident management, alert correlation, and self-healing strategies
Proficiency in programming languages such as Python, Go, or Java
Expertise in cloud platforms and container orchestration, with experience in infrastructure-as-code