- Company Name: Mastech Digital
- Job Title: Data Engineer
- Job Description:
**Job Title**
Data Engineer
**Role Summary**
Build, maintain, and optimize scalable data pipelines that ingest enterprise data streams into AWS S3, transform them into curated, query‑ready formats, and load them into Amazon Redshift. Collaborate with data architects, analytics teams, and business stakeholders to ensure data quality, accessibility, and performance for analytics and reporting.
**Expectations**
- Deliver reliable, automated data ingestion and transformation pipelines on schedule.
- Ensure high data quality, integrity, and consistency across raw and curated layers.
- Optimize loading and query performance in Amazon Redshift.
- Maintain clear documentation and version control for all data workflows.
- Proactively identify and resolve data-related issues and collaborate on process improvements.
**Key Responsibilities**
- Ingest data from enterprise sources (SAP, Salesforce, Google Analytics, OKTA, and others) into the AWS S3 Raw layer.
- Curate and transform raw data into standardized datasets within the curated layer.
- Design and implement data mapping and transformation logic using code‑driven automation.
- Load curated datasets into Amazon Redshift tables, views, and schemas optimized for analytical queries.
- Work closely with data architects, analysts, and business stakeholders to gather requirements and refine data assets.
- Monitor pipeline performance, troubleshoot failures, and apply performance tuning techniques.
- Maintain comprehensive documentation of data flows, data models, and ETL processes.
**Required Skills**
- Strong experience with AWS services: S3, Glue, Kinesis, Lambda, Redshift, and IAM.
- Proven ability to build ETL/ELT pipelines using Python, PySpark, or SQL.
- Expert knowledge of data modeling and transformation principles.
- Hands‑on experience ingesting data from SAP, Salesforce, Google Analytics, and OKTA.
- Proficiency in SQL (Redshift, PostgreSQL) and query optimization.
- Familiarity with data quality and pipeline monitoring tools (AWS Data Pipeline, Amazon CloudWatch).
- Experience integrating CI/CD into data pipelines (Git, Jenkins, Terraform, AWS CloudFormation).
- Excellent communication skills for cross‑functional collaboration.
**Required Education & Certifications**
- Bachelor’s degree in Computer Science, Information Systems, or a related field (or equivalent practical experience).
- AWS Certified Data Analytics – Specialty (preferred).
- Additional certifications in Amazon Redshift, SAP, Salesforce, or related technologies are a plus.