Job Specifications
Role Overview:
We are seeking a Data Engineer / Developer to design, build, and optimize Azure-based data pipelines and lakehouse solutions supporting Global Asset Management.
This role focuses on delivering secure, reliable, and well-governed datasets using Databricks and Azure-native services in a regulated financial environment.
You will work closely with data consumers, platform teams, and stakeholders to enable scalable analytics and reporting while contributing to engineering standards, automation, and Agile delivery practices.
Key Responsibilities:
Design, build, and maintain end-to-end data pipelines using Azure and Databricks
Develop scalable data transformations using Python, Spark, and Pandas
Implement Delta Lake and the Medallion architecture (bronze/silver/gold) in production (see the sketch after this list)
Ingest data from diverse sources using Azure Data Factory
Optimize SQL and Spark workloads with performance and cost awareness
Apply data modeling techniques (ETL/ELT, dimensional/star schemas)
Implement CI/CD for data engineering workflows
Ensure data quality through validation, testing, and monitoring
Document pipelines, datasets, and data contracts clearly
Collaborate in Agile/Scrum teams and support junior team members
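For illustration only, a typical bronze-to-silver step in this kind of pipeline might resemble the minimal PySpark/Delta Lake sketch below. The storage paths, the hypothetical trades dataset, and the column names are assumptions for the sketch, not part of any existing codebase.

```python
# Minimal bronze -> silver sketch using PySpark and Delta Lake.
# Paths, the "trades" dataset, and column names are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` already exists

BRONZE_PATH = "abfss://lake@<storage-account>.dfs.core.windows.net/bronze/trades"
SILVER_PATH = "abfss://lake@<storage-account>.dfs.core.windows.net/silver/trades"

# Bronze: raw ingested records, schema as delivered by the source system.
bronze_df = spark.read.format("delta").load(BRONZE_PATH)

# Silver: cleaned, deduplicated, typed records ready for downstream modeling.
silver_df = (
    bronze_df
    .dropDuplicates(["trade_id"])                             # drop replayed records
    .filter(F.col("trade_id").isNotNull())                    # basic validity rule
    .withColumn("trade_date", F.to_date("trade_timestamp"))   # derive partition column
    .withColumn("ingested_at", F.current_timestamp())         # lineage metadata
)

(
    silver_df.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("trade_date")
    .save(SILVER_PATH)
)
```

On Databricks the `spark` session is pre-created; the explicit builder call is only there so the sketch is self-contained.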
Required Qualifications:
4-6 years of hands-on experience in data engineering
Strong experience with Databricks using Python, Spark, and Pandas (notebooks and modular code)
Experience building ingestion pipelines with Azure Data Factory (pipelines, integration runtimes)
Production experience with Delta Lake and Medallion architecture
Strong knowledge of Azure Data Lake Storage (ADLS) and SQL
Experience with ETL/ELT patterns and data modeling (dimensional/star schema)
CI/CD experience for data projects using Azure DevOps or GitHub Enterprise
Familiarity with Microsoft Entra ID (formerly Azure AD) for secure access, SSO, and RBAC
Experience implementing data validation, testing, code reviews, and basic monitoring/alerting (see the validation sketch after this list)
Experience working in Agile/Scrum teams using Jira and Confluence
Strong communication skills and ability to collaborate with technical and business stakeholders
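As an illustration of the validation and monitoring expectations above, a basic data-quality gate might resemble the sketch below. The table name, columns, and rules are hypothetical and would be replaced by the team's actual data contracts.

```python
# Minimal data-validation sketch for a Delta table, runnable as a notebook cell
# or a job step in CI. Table name and rules are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

SILVER_TABLE = "silver.trades"  # hypothetical table registered in the metastore

df = spark.read.table(SILVER_TABLE)

# Rule 1: the primary key must be unique and non-null.
dup_count = df.groupBy("trade_id").count().filter(F.col("count") > 1).count()
null_count = df.filter(F.col("trade_id").isNull()).count()

# Rule 2: notional amounts must be non-negative.
negative_count = df.filter(F.col("notional_amount") < 0).count()

failures = {
    "duplicate_trade_ids": dup_count,
    "null_trade_ids": null_count,
    "negative_notional": negative_count,
}
bad = {rule: n for rule, n in failures.items() if n > 0}

# Fail fast so the orchestrator (e.g. ADF or a Databricks job) surfaces the error.
assert not bad, f"Data quality checks failed: {bad}"
```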
Generative AI Expectations:
Practical use of GitHub Copilot and ChatGPT for:
Scaffolding notebooks and jobs
Generating tests and documentation
Optimizing SQL and Spark code
Ability to validate AI-generated output before production use
Understanding of responsible GenAI usage in a regulated environment, including:
Avoiding sensitive or PII data in prompts
Following governance and security guardrails
High-level awareness of model limitations
Nice-to-Have (Trainable within 60-90 Days):
Unity Catalog migration and governance (Hive metastore to Unity Catalog)
Databricks DevOps (cluster configuration, secrets, workspace automation)
Azure Functions (C# or Python) for orchestration or integrations
Synapse dedicated SQL pools, dbt, or Delta Live Tables
Data governance tools (Purview, Collibra)
Infrastructure as Code (Terraform or Bicep)
Experience in regulated or locked-down environments
Financial services domain experience
About the Company
In our relentless pursuit of greatness, we are dedicated to developing individuals, creating exceptional teams, and cultivating a unique culture of unity and care. As providers of digital talent solutions, we aim to positively impact businesses and communities globally. We would be honored to be your trusted and uncommon partner on this journey.