- Company Name
- Insight International (UK) Ltd.
- Job Title
- Site Reliability Engineer (AWS)
- Job Description
**Job Title:** Site Reliability Engineer (AWS)
**Role Summary:**
Senior SRE responsible for ensuring the reliability, performance, and scalability of an AWS‑based engineering platform. Works across teams to design, automate, and optimize infrastructure, monitor systems, and lead incident response, while mentoring junior staff and driving continuous platform improvements.
**Expectations:**
- Deliver a stable, highly available platform that meets defined SLAs.
- Lead critical incident management and post‑incident reviews.
- Automate routine tasks and enhance CI/CD pipelines.
- Collaborate with developers and platform teams to embed observability and reliability.
- Mentor and share best practices with junior engineers.
**Key Responsibilities:**
- Maintain and improve monitoring, logging, and alerting frameworks (Dynatrace, Prometheus, Grafana).
- Identify and eliminate performance bottlenecks in production.
- Create and manage runbooks, playbooks, and incident SOPs.
- Build IaC solutions using Pulumi or Terraform to provision AWS resources.
- Contribute to CI/CD pipeline development (GitLab, Jenkins, etc.).
- Install and manage container platforms and orchestration (Docker, Kubernetes).
- Collaborate with cross‑functional teams (SRE, CI/CD, Developer Experience, Templates) to elevate platform reliability.
- Mentor junior engineers on SRE principles and operational excellence.
**Required Skills:**
- 5+ years in SRE, software engineering, or a related field.
- 3+ years managing AWS environments.
- Strong programming skills in Python, Java, Node.js, or TypeScript.
- Experience with Kubernetes, Docker.
- Proficiency with CI/CD tools (GitLab, Jenkins, etc.).
- Hands‑on monitoring/logging/alerting experience; Dynatrace familiarity a plus.
- IaC expertise with Pulumi or Terraform.
- Excellent troubleshooting, automation, and communication skills.
**Required Education & Certifications:**
- Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
- AWS certifications (e.g., Solutions Architect, SysOps Administrator, Cloud Practitioner) are desirable but not mandatory.
Birmingham, United Kingdom
Hybrid
Mid level
20-11-2025