We're looking for candidates who have:
• 6+ years of experience as a Data Platform Engineer, Data Engineer, or Data Infrastructure Engineer, with hands-on expertise in building and maintaining cloud-based data platforms at large scale
• Passion for leading and contributing to the technical vision of the team/org, strong ownership of mission-critical systems, and dedication to honing their craft while mentoring others
• Experience building and maintaining both batch and streaming data pipelines using DBT/Databricks/Apache Spark/Apache Flink, as well as a deep understanding of data architecture and data modeling best practices
• Expertise with AWS services, cloud architecture, and fault-tolerant distributed data systems, and proficiency with Terraform for provisioning and managing cloud infrastructure
• Deep understanding of modern data lakehouse architectures and their ecosystem, such as Kafka, Flink, Spark, Databricks, Snowflake, DBT, Airflow, Debezium, Delta/Iceberg, StarRocks, and ClickHouse, and proficiency with Python/Java and SQL
• Experience building tools and frameworks to accelerate the development of data pipelines, and familiarity with data governance, data quality, and observability best practices
In this role, you'll:
• Help define the technical vision of the team/org and articulate how our data platform and architecture could evolve
• Design, implement, and optimize our high-performance, scalable data serving platform that enables data querying and consumption across the organization and in external-facing data products
• Design, implement, and optimize our high-performance, scalable data storage and transformation platform that enables both batch and stream processing, with 100+ million updates per day on datasets of more than 100 billion rows
• Build seamless integrations between Data Cloud and various relational and NoSQL OLTP databases
• Build batch and streaming data pipelines for core blockchain datasets widely used across the company
• Deploy cloud infrastructure at scale with enterprise-grade reliability, implement and maintain infrastructure automation and self-service, and create robust CI/CD pipelines
• Establish and maintain observability, security, and data governance solutions to ensure high quality, efficiency, and reliability of data pipelines