- Company Name
- Curve
- Job Title
- Senior Site Reliability Engineer
- Job Description
**Job Title**
Senior Site Reliability Engineer
**Role Summary**
The Senior Site Reliability Engineer (SRE) ensures the reliability, scalability, and security of Curve’s cloud‑based services. The role works closely with engineering, product, and platform teams to design, implement, and maintain infrastructure, observability, and automation solutions, driving continuous improvement and fast, reliable delivery of features.
**Expectations**
- Own end‑to‑end reliability and performance of production systems.
- Collaborate cross‑functionally, communicating findings and solutions clearly.
- Demonstrate ownership from concept through deployment and post‑mortem.
- Adapt rapidly to a high‑velocity startup environment.
**Key Responsibilities**
- Design, deploy, and manage scalable cloud infrastructure (AWS, GCP) using Terraform and Atlantis.
- Operate Kubernetes (EKS) clusters with Istio, Helm, and Flux CI/CD pipelines.
- Implement and maintain the observability stack (Prometheus, Coralogix, Grafana) for monitoring, alerting, and performance insights.
- Build and maintain databases (PostgreSQL, MongoDB) and secrets management (Vault).
- Conduct post‑mortem analyses to identify root causes and implement preventive measures.
- Develop proof‑of‑concepts for emerging technologies and evaluate fit for Curve.
- Automate manual workflows, reduce toil for engineering teams, and accelerate feature delivery.
- Document best practices, share knowledge across the platform, and mentor junior staff.
- Ensure compliance with security standards (PCI, data protection).
**Required Skills**
- 2+ years of Kubernetes production deployment experience.
- 2+ years of cloud provider experience (AWS preferred).
- Strong Terraform proficiency (modules, code review, IaC best practices).
- Programming in Go or Python for tooling and automation.
- Deep understanding of system diagnostics across client, network, server, database, and OS layers.
- Experience with CI/CD tools (GitLab, Flux, Helm).
- Familiarity with observability tools (Prometheus, Grafana, Coralogix).
- Knowledge of security and compliance (PCI, Vault).
- Agile sprint experience and ability to deliver in fast‑paced settings.
- Excellent documentation and communication skills.
- Eagerness to learn new technologies and stay current with cloud-native trends.
**Required Education & Certifications**
- Bachelor’s degree in Computer Science, Engineering, or related technical field.
- Preferred: AWS Certified Solutions Architect, Terraform Associate, or equivalent cloud/infra certifications.