Amazon Web Services (AWS), Automation, Computer Programming, Continuous Deployment/Delivery, Continuous Integration, Data Analysis, Data Management, Data Processing, Data Quality, Data Warehousing, Database Extract Transform and Load (ETL), Enterprise Architecture, Enterprise Protection, Google Cloud Platform (GCP), Java, Microsoft Azure, Performance Tuning/Optimization, Python Programming/Scripting Language, Quality Metrics, Regulatory Compliance, Reliability Engineering, Root Cause Analysis, Scala Programming Language, Security Architecture, Snowflake Schema, Structured Data, Technical/Engineering Design, Test Automation
Job Details:
Mandatory Skills:
• Advanced expertise in ETL / ELT pipeline design
• Handling:
o Batch data processing
o Near-real-time / streaming data
• Experience with structured and semi-structured data
• Strong knowledge of:
o Incremental loading
o CDC (Change Data Capture)
• Pipeline orchestration and dependency management
• Strong programming skills in Python, Scala, or Java (nice to have)
• Performance optimization for large-scale data processing
• Solid understanding of:
o Dimensional modeling (Star / Snowflake)
o Normalized and denormalized models
• Strong experience with Azure, AWS, or GCP
• Hands-on experience with data warehouses (Snowflake, Synapse, BigQuery, Redshift)
Data Architecture & Solution Design
• Design end-to-end data engineering architectures
• Define scalable solutions for:
o Data lakes / lakehouse
o Data warehouses
o Streaming and real-time systems
• Ensure alignment with enterprise architecture, security, and compliance standards
• Review and approve technical designs
Data Pipeline Development & Management
• Lead development of ETL / ELT pipelines
• Handle:
o Batch and real-time ingestion
o Structured and semi-structured data
• Optimize pipelines for performance, reliability, and cost
• Manage schema evolution and data dependencies
Data Quality, Reliability & Operations
• Establish data quality standards and validation rules
• Implement monitoring, alerting, and observability
• Perform root cause analysis for data incidents
• Drive operational excellence and stability
DevOps / DataOps Enablement
• Build CI/CD pipelines for data workloads
• Automate testing, deployment, and rollback
• Improve reliability through automation