- Company Name
- Broad Reach Partners
- Job Title
- Senior SRE / DevOps Engineer (Atlanta)
- Job Description
**Job Title**
Senior Site Reliability Engineer / DevOps Engineer
**Role Summary**
Senior SRE/DevOps Engineer responsible for ensuring high availability, performance, and reliability of production systems. Works collaboratively across development, operations, and security to improve observability, automate deployments, and maintain scalable cloud infrastructure. Focuses on Kubernetes on AWS EKS, monitoring, capacity planning, and incident response.
**Expectations**
- Deliver continuous improvements to system reliability and operational excellence.
- Own end‑to‑end monitoring, alerting, and incident management.
- Lead automation efforts to reduce manual processes and improve release cadence.
- Communicate effectively with cross‑functional teams and maintain clear documentation.
**Key Responsibilities**
- Maintain and enhance monitoring via New Relic and Graylog for service health and performance.
- Implement high‑availability designs, capacity planning, performance tuning, and fault tolerance.
- Define, track, and report Service Level Indicators (SLIs), Objectives (SLOs), and Agreements (SLAs).
- Deploy and manage Kubernetes workloads on AWS EKS using Helm and ArgoCD.
- Automate operational workflows to reduce manual interventions.
- Participate in on‑call rotation; troubleshoot production incidents and coordinate permanent fixes.
- Collaborate with DevOps to refine CI/CD pipelines and embed resilience into application development.
- Document runbooks, escalation procedures, and production playbooks.
**Required Skills**
- 8+ years in Site Reliability Engineering or an equivalent role.
- Deep expertise in Kubernetes (pods, nodes, networking, scaling, logs, service communication).
- Experience with AWS (preferred) or Azure; 3+ years in a public cloud environment.
- Proficiency with monitoring and logging tools (New Relic, Graylog).
- Scripting: Bash, Groovy, Python.
- Strong debugging, troubleshooting, and prioritization abilities.
- Familiarity with CI/CD, Helm, ArgoCD, and container orchestration best practices.
**Required Education & Certifications**
- Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent experience).
- AWS Cloud certification (preferred but not mandatory).
---
Alpharetta, United States
Hybrid
Senior
28-01-2026