- Company Name: ThoughtBot
- Job Title: Senior Data Engineer – Enterprise Data Warehouse & Lakehouse Solutions
- Job Description:
Job Title: Senior Data Engineer – Enterprise Data Warehouse & Lakehouse Solutions
Role Summary: Lead the design, development, and maintenance of enterprise data warehouse (EDW) and lakehouse solutions. Build scalable, reliable data pipelines that transform structured and unstructured data, integrate diverse sources, and optimize performance for analytics workloads.
Expectations: Proficiency in English; 13+ years of IT experience, including at least 5 years each in data warehouse architecture, code‑based data transformation (dbt, Spark), SQL/ETL tooling, and Python with orchestration (Airflow, Dagster). Demonstrated expertise in modern data platforms (Databricks, Snowflake, Fabric, Talend), containerization (Docker, Podman, Kubernetes), and data modeling/OLAP techniques.
Key Responsibilities:
- Develop, test, and maintain scalable data pipelines for EDW and lakehouse environments.
- Design and implement data transformation logic using dbt, Spark, and Python.
- Integrate data from relational databases, APIs, streaming services, and cloud platforms.
- Optimize queries and workflows for performance and cost-efficiency.
- Write modular, production‑grade code and develop comprehensive unit and integration tests.
- Monitor and enforce data quality, validation, and governance across pipelines.
- Document architecture, processes, and troubleshooting procedures.
- Assist with deployment, configuration, and containerization of data platforms.
Required Skills:
- Strong relational database design and data warehouse architecture.
- Advanced SQL and experience with ETL/ELT tooling.
- Proficiency in dbt, Spark, and Python for data transformation.
- Experience with orchestration (Airflow, Dagster) and scheduling.
- Knowledge of data modeling, OLAP, and data mining concepts.
- Hands‑on experience with Databricks, Snowflake, Fabric, Talend, and similar platforms.
- Containerization and orchestration skills (Docker, Podman, Kubernetes).
- Excellent English communication and documentation skills.
- Ability to troubleshoot and scale data solutions in a production environment.
Required Education & Certifications:
- University degree in Information Technology, Computer Science, or related field.
- Minimum of 13 years of professional IT experience; no specific certifications are required, but relevant platform or data engineering certifications (e.g., Snowflake, Databricks, Spark) are advantageous.