We are seeking a data engineer, ideally with 5-7 years of IT experience, to actively improve and manage the critical data the team needs to run the business. Partnering with the Enterprise IT team, this role will deliver data sets for Residential Equipment/Supply to enable self-service analytics. The engineer will partner with the architect to prioritize data to bring into the shared workspace and will support a data mart so analysts can easily access the data they need. They will also work with, and offer second-tier support to, the data analyst who provides user support to the business. The resulting data will be consumed through Alteryx workflows, Python jobs, Tableau workbooks, and more.
- Develop a strong understanding of the Residential/Supply business and proactively advise enterprise IT and the business on the data.
- Understand source systems and the Residential/Supply-specific attributes required by the business for meaningful analysis.
- Learn the tables currently in the Shared Workspace for the Residential/Supply business and be able to explain structure and origins to other technical users.
- Evaluate, parse, clean and fix data sets according to business requirements.
- Recommend a technical approach to data pipeline development and identify the correct schemas to future-proof data pipelines.
- Follow enterprise guidelines and best practices to ensure consistent approach across the organization.
- Join data sets as needed to create new data assets.
- Collaborate with the architect to identify ad-hoc tables that need to be promoted to standard work in Res or Enterprise.
- Research and identify new data sources required by the business.
- Provide direction to consultants to augment capacity for data engineering work as required.
This is a direct hire / remote role for a Charlotte, NC-based client: $130-155K plus a 10-15% bonus. Candidates within a 200-mile radius of Charlotte, NC are preferred for occasional onsite requirements, but candidates in the EST and CST time zones will be considered.
Main Duties/Required Skills:
- BS/BA degree in computer science, MIS, analytics, engineering, or similar
- Minimum of 5 years of experience creating data pipelines using Python, PySpark, and SQL
- Knowledge of distributed computing, HDFS, or relational data systems such as Oracle, Impala, Hive, or similar
- Proficient with Cloud environments, preferably GCP and AWS
- Knowledge of Git or other version control tools and a commitment to collaboration and code sharing
- Familiarity with Alteryx preferred, Tableau nice to have
- Experience supporting and building data warehouse architecture
- Proficient in ETL design for efficient data movement
- Experience in operationalizing data pipelines in a production environment.
- Strong communication skills: able to convey complex concepts to technical and business teams in a simple and understandable way.
- Technical excellence: comfortable working with very large data sets and using the appropriate tools for each task.
- Cross-functional collaboration: works within analytics teams, the business, and other IT teams.
- Innate curiosity: a constant drive to learn more, ask questions, and seek better outcomes through better use of data.
- Problem solver: a self-starter who enjoys digging into problems and finding creative solutions.
- Love of data: enjoys digging into big data sets, understands the details of the data, and is willing and able to share findings with other technical users.
Nice-to-Have Skills:
- Advanced degree in mathematics or data science
- Experience with Snowflake, GCP BigQuery, dbt, or advanced analytics/data science
- 7-10+ years of relevant experience
- Agile experience
- Experience with Cloudera
- Experience with modern cloud-based data pipelines, data modeling, data management and governance, and data architecture
Bachelor’s Degree Requirement: Yes