- Company Name
- ELLKAY
- Job Title
- Application SRE (DevOps)
- Job Description
-
Job title: Application Site Reliability Engineer (SRE) – DevOps
Role Summary: Own and enhance the reliability, performance, and scalability of production and non‑production application environments. Lead CI/CD, infrastructure automation, incident response, and cross‑team reliability initiatives.
Expectations: Deliver highly available, resilient software systems with minimal operational toil, backed by clearly defined SLIs/SLOs, effective monitoring, and rapid incident resolution.
Key Responsibilities:
- Own application reliability, availability, performance, and scalability across all environments.
- Design, build, and maintain CI/CD pipelines for application deployments.
- Automate infrastructure provisioning and configuration using IaC tools.
- Monitor application health via metrics, logs, and traces; define SLIs, SLOs, and error budgets.
- Lead incident response, root‑cause analysis, and communication of corrective actions for Sev1/Sev2 events.
- Improve system resilience through capacity planning, performance tuning, and fault‑tolerant design.
- Partner with development teams to meet reliability, performance, and scalability objectives.
- Reduce manual operational effort through automation and self‑healing solutions.
Required Skills:
- 7+ years of experience as an SRE, DevOps Engineer, or Production Support Engineer.
- Strong fundamentals in Linux/Unix and Windows systems and in networking.
- Hands‑on experience with AWS, Azure, or GCP.
- Proficiency with containers and container orchestration (Docker, Kubernetes).
- Expertise in CI/CD tools (Jenkins, GitHub Actions, etc.) and IaC (Terraform, CloudFormation/ARM).
- Scripting in Python or Bash.
- Experience with monitoring/observability tools (Prometheus, Grafana, ELK, Datadog).
- Understanding of SLAs, SLOs, error budgets, and incident management.
- Excellent problem‑solving and communication skills; calm under pressure.
Required Education & Certifications:
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
- Certifications such as AWS Certified DevOps Engineer, Certified Kubernetes Administrator, or equivalent are preferred.
Elmwood Park, United States
On site
Senior
02-02-2026