REQUIRED SKILLS & EXPERIENCE
Core Data Engineering
▸ 5+ years of professional experience in data engineering
▸ Strong understanding of data platform architecture: Lakehouse, Data Warehouse, Data Lake
patterns
▸ Hands-on experience with ETL/ELT design patterns including batch processing and stream
processing
▸ Familiarity with ingestion patterns: full load, incremental, CDC, event-driven
Databricks
▸ Experience building data pipelines on Databricks (Delta Live Tables, Jobs, Notebooks)
▸ Proficiency with PySpark or Spark SQL for large-scale data processing
▸ Familiarity with Delta Lake concepts: ACID transactions, time travel, schema evolution
Orchestration & Ingestion
▸ Proficiency with Apache Airflow: authoring, scheduling, and monitoring DAGs
▸ Experience with Airbyte for managing source-to-destination data connectors
SQL & dbt
▸ Strong SQL skills: query optimization, window functions, CTEs, and complex joins
▸ Experience with dbt (data build tool) for transformation, testing, and documentation
– Model layering: staging → intermediate → marts
– Writing schema tests, source freshness checks, and macros
– dbt tests
Cloud & Infrastructure
▸ Practical experience with AWS services such as S3, Lambda, IAM, and CloudWatch
NICE TO HAVE
▸ Experience with Docker & Kubernetes (EKS) for deploying and scaling data services
– Experience running Airflow & Airbyte on Kubernetes
▸ Experience with data quality frameworks (Great Expectations, Soda)
▸ Infrastructure as Code experience (Terraform)
▸ Exposure to data governance tools or data cataloging (Databricks Catalog)
▸ Familiarity with CI/CD pipelines for data engineering (GitHub Actions)
▸ Experience with Python for pipeline scripting and automation
ABOUT THE ROLE
We are looking for a skilled and passionate Data Engineer to join our growing Data Platform team.
In this role, you will design, build, and maintain robust data pipelines on Databricks and AWS
infrastructure that power analytics and reporting capabilities across the organization.
KEY RESPONSIBILITIES
▸ Design and implement scalable ETL/ELT pipelines using both batch and streaming patterns
– Build and maintain ingestion workflows from diverse sources (databases, APIs, event streams)
– Implement Change Data Capture (CDC), full-load, and incremental ingestion strategies
▸ Develop and manage data workflows using Apache Airflow for orchestration
▸ Configure and manage data ingestion connectors using Airbyte
▸ Work with Databricks to build and optimize data engineering workloads on the Lakehouse
platform
▸ Write and optimize complex SQL queries
▸ Apply solid hands-on experience with dbt on Databricks to build modular, testable dbt models
for data transformation
▸ Develop and maintain data models in staging, intermediate, and mart layers following data
warehousing best practices
▸ Work with AWS services such as S3, Lambda, EC2, and IAM
▸ Containerize data services and applications using Docker & EKS
▸ Ensure data quality, observability, and reliability across the data platform
▸ Document pipelines, models, and data dictionaries to maintain platform knowledge