As a Principal Data Engineer, you will design, build, and maintain scalable data pipelines and infrastructure to support analytics, reporting, and data science initiatives. You will work closely with cross-functional teams to ensure data is accessible, reliable, and secure across the organization.
MidAmerican Energy Company, a Midwest utility, provides regulated electric and natural gas service to more than 1.6 million customers in Illinois, Iowa, Nebraska and South Dakota. The company owns and operates a portfolio of power-generating assets, approximately 61% of which is wind generation.
Primary Job Duties and Responsibilities (Essential Job Functions)
Design and Develop Scalable Data Pipelines
Design and implement scalable data ingestion and transformation frameworks using Azure services, enabling structured, semi-structured, and unstructured data to be efficiently processed and integrated into enterprise data platforms.
Build and maintain robust ETL/ELT pipelines using Azure Data Factory and Azure Databricks.
Integrate data from diverse sources including on-premises systems, cloud storage, APIs, and streaming platforms.
Databricks Development and Optimization
Develop and optimize notebooks and workflows in Azure Databricks using PySpark and SQL.
Implement Delta Lake for efficient data storage, versioning, and ACID transactions.
Leverage Databricks features such as Unity Catalog and job orchestration.
Data Modeling and Architecture
Design and implement data models (star/snowflake schemas) for analytics and reporting.
Collaborate with architects to define data lakehouse architecture and best practices.
Implement and optimize data solutions using the Medallion Architecture (Bronze, Silver, Gold layers) for scalable and structured data processing.
Data Quality and Governance
Implement data validation, profiling, and cleansing routines.
Ensure compliance with data governance policies, including data lineage and metadata management.
Performance Tuning and Monitoring
Monitor and optimize performance of Spark jobs and data pipelines.
Troubleshoot and resolve issues related to data latency, job failures, and resource utilization.
Collaboration and Stakeholder Engagement
Work closely with data scientists, analysts, and business units to understand data requirements.
Translate business needs into technical solutions that are scalable and maintainable.
Security and Compliance
Implement role-based access control (RBAC), encryption, and secure data handling practices.
Ensure compliance with applicable industry regulations (e.g., NERC CIP, GDPR, HIPAA).
Documentation and Best Practices
Maintain clear documentation of data flows, architecture, and operational procedures.
Promote best practices in code versioning, testing, and CI/CD for data engineering.