You should have experience working with or building platforms for at-scale data processing and collaborating with the other teams that use those platforms. Experience building applications on one of the major cloud providers is a bonus, but not required. We primarily write in Java, Scala, Python, and SQL, and use technologies like Hadoop, Kafka, Airflow, and Avro/Thrift, along with GCP equivalents such as Dataproc, Dataflow, and BigQuery.
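For a flavor of the tooling, the sketch below shows roughly what a scheduled batch job on this stack might look like, assuming Apache Airflow 2.x; the DAG id, schedule, and task body are illustrative placeholders rather than one of our actual pipelines.

```python
# Minimal sketch of a scheduled ETL job, assuming Apache Airflow 2.x.
# All names here are hypothetical, not a real pipeline.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_transform_load():
    # Placeholder: a real task might read Avro records from GCS or Kafka,
    # apply transformations, and load the result into BigQuery.
    pass


with DAG(
    dag_id="example_daily_etl",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    PythonOperator(
        task_id="extract_transform_load",
        python_callable=extract_transform_load,
    )
```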
The client believes that small, empowered, self-motivated teams can do big things. We believe in measuring everything, taking advantage of our continuous deployment system to ship code early and often, and maintaining a blameless culture based on trust and a commitment to learning.
This role is located in Brooklyn, NY.
- Build high-performing systems that are maintainable and easy to understand by selecting and integrating the best current technologies.
- Develop and monitor our batch and streaming environments, improving and fixing them over time.
- Write ETL code and advise other teams on how to improve theirs.
- Build a lot of APIs and libraries in Java, Scala, or Python.
- Own the quality and consistent availability of our core business data.
- Understand that being an effective software engineer is about communicating with people as much as it is about writing code.
- You are willing to work with and improve code you did not originally write.
- You are generous with your time and experience, and can mentor other engineers.
- You can take on unconstrained problems and know when to seek help.
- You have used or maintained batch data processing environments like Hadoop or Dataproc, and stream processing systems like Kafka Streams, Spark, or Dataflow.
- Experience writing and scheduling ETL pipelines
- Experience writing SQL queries for exploration and analysis (see the sketch after this list)
- Experience integrating data from multiple sources
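As a rough illustration of the exploratory SQL work described above, here is a minimal sketch assuming the google-cloud-bigquery Python client; the project, dataset, table, and query are hypothetical.

```python
# Minimal exploratory query sketch using the google-cloud-bigquery client.
# The project, dataset, and table names below are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

query = """
    SELECT order_date, COUNT(*) AS orders
    FROM `my-project.analytics.orders`
    GROUP BY order_date
    ORDER BY order_date DESC
    LIMIT 30
"""

# Run the query and print one row per day of recent order volume.
for row in client.query(query).result():
    print(row.order_date, row.orders)
```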