- Company Name
- Q2
- Job Title
- Site Reliability Engineer
- Job Description
Job Title: Site Reliability Engineer
Role Summary:
Senior SRE responsible for managing, optimizing, and securing Snowflake-based data platforms. Works closely with Data Engineering, DevOps, and Security teams to ensure high availability, performance, and cost efficiency of data workloads.
Expectations:
- Deliver reliable Snowflake operations at scale.
- Participate in on‑call rotation and incident response.
- Drive automation, cost optimization, and best‑practice compliance.
Key Responsibilities:
- Administer Snowflake environments: warehouses, databases, schemas, roles, and RBAC.
- Monitor performance, query efficiency, and warehouse utilization; create dashboards and alerts.
- Optimize compute and storage costs through tuning and proactive analysis.
- Maintain Snowflake tasks, streams, and Snowpipe ingestion pipelines (pipes).
- Automate deployments with infrastructure as code (Terraform) and dbt; build and maintain CI/CD pipelines (Kubernetes, ArgoCD, Helm).
- Script operational tasks in Python, SQL, and Bash.
- Support data pipelines (Airflow, Kafka) for reliable ingestion and processing.
- Manage AWS integration: IAM, access policies, security controls.
- Perform root‑cause analysis, write runbooks, and document SOPs.
Required Skills:
- 5+ years of experience as an SRE, DevOps Engineer, or Data Platform Engineer.
- Hands‑on Snowflake administration, performance tuning, and security.
- SQL proficiency and database fundamentals.
- Scripting in Python (plus Bash/SQL).
- Experience with Kubernetes, Helm, and GitOps (ArgoCD).
- IaC with Terraform; CI/CD pipeline design.
- Familiarity with Airflow, Kafka, or similar orchestration and streaming tools.
- Monitoring/logging tools: Datadog, Splunk, CloudWatch, etc.
- English fluency (written & spoken).
Required Education & Certifications:
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
- Optional: SnowPro Core or Advanced certification; dbt experience; experience with Snowflake cost optimization.