- Company Name
- Top Tier Candidates
- Job Title
- Site Reliability Engineer
- Job Description
-
Job Title: Site Reliability Engineer
Role Summary:
The Site Reliability Engineer (SRE) will design, build, and maintain highly reliable, scalable, and high‑performance infrastructure for a web and streaming platform that serves millions of users globally. The role focuses on automation, CI/CD, monitoring, and incident response in a distributed environment.
Expectations:
* Deliver continuous reliability and performance improvements.
* Reduce manual operational effort through tooling and automation.
* Collaborate closely with development, security, and product teams to support rapid deployment and high availability.
Key Responsibilities:
* Maintain and optimize CI/CD pipelines and build infrastructure.
* Automate provisioning, configuration, and deployment using configuration management and IaC tools (Ansible, Terraform).
* Monitor system health, analyze metrics, and troubleshoot performance or availability incidents.
* Implement and manage container orchestration (Docker, Kubernetes) across production workloads.
* Apply updates, patches, and security hardening across Linux systems (Debian/Red Hat).
* Design and implement caching, CDN, and edge delivery solutions.
* Conduct capacity planning, load testing, and scalability assessments.
* Participate in on‑call rotations, responding to outages and coordinating service restoration.
* Provide technical guidance to developers and customers on best practices.
Required Skills:
* 3–5 years of DevOps/SRE experience in a Git‑centric workflow.
* Strong Linux administration (Debian/Red Hat).
* Proficiency with containerization and orchestration (Docker, Kubernetes).
* Experience with configuration management and IaC (Ansible, Terraform).
* Solid understanding of web protocols (HTTP, TCP/IP), DNS, and networking fundamentals.
* Scripting expertise in Bash, Python, or Go.
* Familiarity with CI/CD tools (GitHub Actions, GitLab CI, Jenkins).
* Exposure to cloud platforms (AWS, Azure, or GCP).
* Knowledge of caching, CDN, or edge delivery technologies.
* Awareness of web security, performance tuning, and traffic management.
Required Education & Certifications:
* Bachelor’s degree in Computer Science, Information Technology, or a related technical field.
* Preferred but not required: relevant certifications such as AWS Certified DevOps Engineer - Professional, Google Cloud Professional Cloud DevOps Engineer, or Certified Kubernetes Administrator (CKA).