**Company Name:** Reddit, Inc.
**Job Title:** Software Engineer, Ads ML Features Platform
**Role Summary:**
Design, develop, and maintain scalable data infrastructure and feature management systems that support large‑scale machine‑learning (ML) pipelines for advertising. Collaborate with ML engineers to ensure reliable, high‑quality feature computation, storage, and delivery for model training and serving.
**Expectations:**
- 3+ years of experience in infrastructure or platform engineering, with a focus on distributed systems.
- 2+ years of hands‑on experience with SQL‑based cloud data warehouses (e.g., BigQuery, Snowflake, Redshift, Databricks).
- Proven ability to build and optimize batch and event‑driven feature pipelines using Spark (via PySpark or Scala).
- Strong understanding of scaling, partitioning, fault tolerance, and caching in distributed environments.
- Familiarity with ML production workflows and MLOps concepts; experience with ML feature platforms is a plus.
**Key Responsibilities:**
- Design and implement data pipelines for large‑scale feature computation, transformation, and storage.
- Develop frameworks for batch and event‑driven feature generation emphasizing reliability, scalability, and usability.
- Implement data quality controls: validation, anomaly detection, drift monitoring, and lineage tracking.
- Partner with ML engineers to integrate feature engineering workflows into production ML systems.
- Build and maintain training‑set generation pipelines with reproducibility and versioning support.
- Contribute to the roadmap for streaming feature management and other next‑generation platform capabilities.
**Required Skills:**
- Distributed systems design (scaling, partitioning, fault tolerance, caching).
- Proficiency with SQL‑based cloud data warehouses (BigQuery, Snowflake, Redshift, Databricks).
- Experience with large‑scale data processing using Spark (PySpark or Scala).
- Strong programming skills in Python and/or Scala.
- Knowledge of MLOps pipelines, feature stores, and ML production environments.
- Ability to work collaboratively across cross‑functional teams and communicate technical concepts clearly.
**Required Education & Certifications:**
- Bachelor’s degree in Computer Science, Software Engineering, Electrical Engineering, or a related technical field, **or** equivalent practical experience.
- Relevant certifications (e.g., Google Cloud Professional Data Engineer, AWS Certified Big Data – Specialty) are optional but beneficial.