- Design data solutions on Databricks, including Delta Lake, data warehouses, data marts, and other data solutions, to support the analytics needs of the organization.
- Apply best practices in data modeling (logical, physical) and ETL pipeline design (streaming and batch) using cloud-based services, especially Python and PySpark.
- Design, develop, and manage data pipelines (collection, storage, access), data engineering (data quality, ETL, data modeling), and data understanding (documentation, exploration).
- Interact with stakeholders to understand the data landscape, conduct discovery exercises, develop proofs of concept, and demonstrate them to stakeholders.
- Experience working with Collibra for data quality (DQ) and data governance is a plus.
- Knowledge of dbt for modeling and building out layers in Databricks is a plus.
- AWS experience with S3, Redshift, Glue, etc. is also required.
- Databricks – PySpark, Python, SQL
- Unity Catalog
- Collibra DQ
- AWS services, e.g. Glue, S3, Redshift, Lambda