Assoc IT Architect Dir
IQVIA • Plymouth Meeting, PA
Posted 1 month ago
Join us on our exciting journey! IQVIA™ is The Human Data Science Company™, focused on using data and science to help healthcare clients find better solutions for their patients. Formed through the merger of IMS Health and Quintiles, IQVIA offers a broad range of solutions that harness advances in healthcare information, technology, analytics and human ingenuity to drive healthcare forward.
IQVIA is developing our next-generation Global Data Lake and Analytics platform to support analytics and insights against hundreds of terabytes to petabytes of healthcare data, in near real time.
As an Assoc IT Architect Dir, you will provide data engineering on the Data Lake/Hadoop platform and be well versed in the technologies below, as well as containerization and machine learning.
- Optimize existing data pipelines by examining the code (Spark, Impala) and its efficiency on the compute plane and on HDFS (the Hadoop Distributed File System)
- Advise teams on common design patterns for optimizing these pipelines at terabyte and petabyte scale (e.g., how to avoid small files; partitioning and bucketing practices)
- Build data pipelines to load and manipulate data in the Data Lake.
- Code data pipelines using procedural, object-oriented, and functional languages (SQL/Impala, Java, Scala/Spark) to load, transform, and build derivative data structures on Hadoop (our platform for the Data Lake)
- Design and orchestrate quality control tests to ensure the integrity (process and data) of the data pipeline
- Ensure the pipelines are resilient to failure and high data loads.
- Optimize data architecture for consumption, utilization, and analytics with data on Hadoop, including for data science, machine learning, and statistical use cases
- Understand the access patterns for the data in the Data Lake, and design partitioning and bucketing strategies accordingly
- Design data structures (e.g., denormalized, Change Data Capture, nested structures) that are optimized against process and data skew at terabyte and petabyte scale, and highly supportive of concurrency
- Suggest different training models and implement validation methodologies for machine learning; be fluent in the CDSW CUDA environment for building pipelines and knowledgeable in TensorFlow and Keras for running the models.
- Lead the charge on a data lake storage strategy, ensuring rapid delivery while taking responsibility for applying standards, principles, theories, and concepts
- Help design our data storage strategy across our dozens of tenants on the Data Lake.
- Document and evangelize this to other teams across the company
- Design and deliver data models that power BI initiatives, dashboards, syndicated reporting, and ad-hoc exploratory data canvases for IQVIA solutions
- Design a data storage/data architecture strategy conducive to a 1-3 second SLA, optimized for fast scans or entity lookups
- Optimize Hadoop preemption, resource pool, and admission control policies to ensure workloads get the resources needed to execute immediately and meet SLAs
- Work with data architects on logical data models and physical database designs optimized for performance, availability, and reliability
- Work with data architects on which normalized and denormalized models make sense for their business case, and how to architect them on Hadoop/the Data Lake.
- Create the physical design for these models, including how to partition and/or bucket the data to ensure maximal parallelism with minimal data shuffle.
- Design and develop ETL and master data management processes.
- Integrate with our reference systems, which requires the Data Lake to interface with Kafka for streaming/micro-batching of topics
- Build data pipelines that read from reference systems and load the data into the Data Lake in Third Normal Form and bi-temporal dimensional form, in support of clients who need reference data in the Data Lake
- Provide scripting and automation to support development, QA, and production database environments and deployments to production
- Build Airflow pipelines that orchestrate all the actions/steps our data pipelines follow.
- Build Docker containers to run different services, improving architecture stability and reducing dependence on environment setup.
- Nice to have: Kubernetes experience with microservices.
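Several of the responsibilities above center on partitioning and bucketing data at scale to avoid the HDFS small-files problem. As a rough, hypothetical illustration (plain Python for clarity, not the Spark/Impala APIs this role actually uses), bucketing caps the number of output files per partition by hashing each row key into a fixed number of buckets:

```python
import hashlib

def bucket_for(key: str, num_buckets: int) -> int:
    """Assign a row key to one of num_buckets buckets.

    Illustrative only: Spark/Hive bucketing uses Murmur3 hashing rather
    than MD5, but the principle is the same -- a stable hash of the key,
    modulo the bucket count.
    """
    digest = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return digest % num_buckets

# However many distinct keys arrive, each partition is written as at most
# `num_buckets` files -- the lever that avoids small-file sprawl and lets
# joins on the bucketed key proceed with little data shuffle.
keys = [f"patient_{i}" for i in range(1_000)]
buckets = {bucket_for(k, num_buckets=4) for k in keys}
print(sorted(buckets))  # a subset of [0, 1, 2, 3]
```

Because the hash is deterministic, two bucketed tables that share the same key and bucket count place matching keys in matching buckets, which is what enables shuffle-free joins at terabyte scale.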
We know that meaningful results require not only the right approach but also the right people. Regardless of your role, we invite you to reimagine healthcare with us. You will have the opportunity to play an important part in helping our clients drive healthcare forward and ultimately improve human health outcomes. Whatever your career goals, we are here to ensure you get there! We invite you to join IQVIA™.
IQVIA is an EEO Employer - Minorities/Females/Protected Veterans/Disabled
IQVIA, Inc. provides reasonable accommodations for applicants with disabilities. Applicants who require reasonable accommodation to submit an application for employment or otherwise participate in the application process should contact IQVIA’s Talent Acquisition team at email@example.com to arrange for such an accommodation.
Job ID: R1077772