Van Kaizen

Sr. Site Reliability Engineer (SRE)

Hybrid

New York, United States

Senior

Full Time

01-12-2025

Skills

CI/CD, DevOps, Docker, Kubernetes, Monitoring, Ansible, AWS, CI/CD Pipelines, Terraform, Prometheus, Grafana, Infrastructure as Code

Job Specifications

Overview:

An industry-leading geolocation and compliance solutions company is looking for a Sr. SRE (Site Reliability Engineer) to join its growing North American team!

Note: this role is hybrid, with a few office visits throughout the year, and candidates need to work in the East Coast time zone. At this time, we are looking for US Citizens or Permanent Residents (Green Card holders) who will not require future sponsorship.

Responsibilities:

Partner with SRE and DevOps teammates to enhance automation, monitoring, alerting, and self-healing capabilities, while mentoring junior engineers.
Work with development teams to design systems that prioritize stability, scalability, and performance.
Build and maintain infrastructure using Infrastructure as Code tools.
Maintain monitoring platforms and track system health, capacity, and performance through automated processes.
Implement and improve multi-region architectures and auto-scaling in AWS to support fluctuating traffic and maintain cost efficiency.
Identify potential reliability risks early and resolve them before they affect users.
Troubleshoot incidents, outages, and performance degradation, perform root cause analysis, and apply long-term fixes.
Build and optimize CI/CD pipelines and automate deployment processes.
Apply security standards across infrastructure and address vulnerabilities as they arise.
Keep documentation current across infrastructure, pipelines, and operational processes.

Qualifications:

Bachelor’s or Master’s degree in a technical field is preferred.
8+ years of experience with AWS, designing and maintaining cloud infrastructure through IaC tools such as Terraform, CloudFormation, or Ansible.
Strong experience with Docker and Kubernetes.
Strong experience with observability tools, such as Prometheus, Grafana, CloudWatch, or DataDog.
Solid understanding of SRE concepts including SLOs, SLIs, and error budgets.
Demonstrated ability to work within engineering teams and support code quality and collaboration.
Preferred: background in online gaming or sports betting, startup experience, AWS certifications.

About the Company

Global, multicultural recruiters for iGaming and technology. Our diverse team lives and works across six continents and speaks over 20 languages. Nobody is more motivated than our elite headhunters, who earn the industry’s highest revenue share.