PySpark Data Engineer with Databricks

Capgemini SE

NY

JOB DETAILS
SALARY
$90,000–$110,000 Per Year
SKILLS
Apache Spark, Artificial Intelligence (AI), Business Transformation, Cloud Computing, Communication Skills, Compensation and Benefits, Continuous Deployment/Delivery, Continuous Integration, Data Management, Data Modeling, Data Quality, Data Science, Data Warehousing, Database Extract Transform and Load (ETL), Distributed Computing, Ecosystems, Employee Assistance Plan, Equal Employment Opportunity (EEO), Healthcare, Identify Issues, International Business, Legal, Machine Learning, Operations Planning, Performance Tuning/Optimization, Production Systems, Python Programming/Scripting Language, Reconciliation, Scalable System Development, Snowflake Schema, System Architecture, Team Player, Technical/Engineering Design, Use Cases
LOCATION
NY
POSTED
30+ days ago

Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you'd like, where you'll be supported and inspired by a collaborative community of colleagues around the world, and where you'll be able to reimagine what's possible. Join us and help the world's leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

Job Location: New York, NY

Job Description

We are looking for a hands-on, mid-to-senior-level PySpark Data Engineer with Databricks experience who can design, build, and own production-grade data pipelines and platform components. This role requires strong expertise in Python/PySpark, Databricks, and Snowflake, with a focus on building scalable, cost‑efficient, and reliable data systems that support both analytics and machine learning use cases.

Key Responsibilities

  • Design, develop, and maintain end‑to‑end ETL/ELT pipelines using Python and PySpark on Databricks.
  • Optimize Spark jobs for performance, scalability, and cost-efficiency in production environments.
  • Implement data quality frameworks including validation, reconciliation, and anomaly detection.
  • Build and manage orchestration workflows (Airflow / Databricks Workflows / equivalent).
  • Implement pipeline monitoring, logging, alerting, and observability for reliable operations.
  • Develop and operationalize ML workflows using MLflow (experiment tracking, model registry, packaging, deployment).
  • Build scalable data ingestion and data modeling solutions for analytics and ML use cases.
  • Collaborate with data scientists, platform teams, engineering stakeholders, and business partners.

Required Skills & Experience

  • 8+ years of experience in data engineering with strong hands‑on work in PySpark and Python.
  • Deep experience with Databricks, Spark optimization, cluster tuning, and performance troubleshooting.
  • Strong experience working with Snowflake or similar cloud data warehouses.
  • Practical knowledge of workflow orchestration tools and dependency management.
  • Solid understanding of data modeling, ingestion frameworks, and distributed systems architecture.
  • Hands‑on experience implementing CI/CD for data and ML pipelines.
  • Strong experience with MLflow for managing the ML lifecycle.
  • Excellent communication skills with the ability to work across engineering and business teams.

The base compensation range for this role in the posted location is $90,000–$110,000 per year.

Capgemini provides compensation range information in accordance with applicable national, state, provincial, and local pay transparency laws. The base compensation range listed for this position reflects the minimum and maximum target compensation Capgemini, in good faith, believes it may pay for the role at the time of this posting. This range may be subject to change as permitted by law.

The actual compensation offered to any candidate may fall outside of the posted range and will be determined based on multiple factors legally permitted in the applicable jurisdiction. These may include, but are not limited to:

  • Geographic location
  • Education and qualifications
  • Certifications and licenses
  • Relevant experience and skills
  • Seniority and performance
  • Market and business considerations
  • Internal pay equity

It is not typical for candidates to be hired at or near the top of the posted compensation range.

In addition to base salary, this role may be eligible for additional compensation such as variable incentives, bonuses, or commissions, depending on the position and applicable laws.

Capgemini offers a comprehensive, non-negotiable benefits package to all regular, full-time employees. In the U.S. and Canada, available benefits are determined by local policy and eligibility and may include:

  • Paid time off based on employee grade (A-F), as defined by policy: vacation of 12-25 days, depending on grade
  • Company-paid holidays
  • Personal days
  • Sick leave
  • Medical, dental, and vision coverage (or provincial healthcare coordination in Canada)
  • Retirement savings plans (e.g., 401(k) in the U.S., RRSP in Canada)
  • Life and disability insurance
  • Employee assistance programs
  • Other benefits as provided by local policy and eligibility

Important Notice: Compensation (including bonuses, commissions, or other forms of incentive pay) is not considered earned, vested, or payable until it becomes due under the terms of applicable plans or agreements and is subject to Capgemini's discretion, consistent with applicable laws. The Company reserves the right to amend or withdraw compensation programs at any time, within the limits of applicable legislation.

Disclaimers

Capgemini is an Equal Opportunity Employer encouraging inclusion in the workplace. Capgemini also participates in the Partnership Accreditation in Indigenous Relations (PAIR) program, which supports meaningful engagement with Indigenous communities across Canada by promoting fairness, accessibility, inclusion, and respect. We value the rich cultural heritage and contributions of Indigenous Peoples and actively work to create a welcoming and respectful environment. All qualified applicants will receive consideration for employment without regard to race, national origin, gender identity/expression, age, religion, disability, sexual orientation, genetics, veteran status, marital status, or any other characteristic protected by law.

This is a general description of the Duties, Responsibilities and Qualifications required for this position. Physical, mental, sensory or environmental demands may be referenced in an attempt to communicate the manner in which this position traditionally is performed. Whenever necessary to provide individuals with disabilities an equal employment opportunity, Capgemini will consider reasonable accommodations that might involve varying job requirements and/or changing the way this job is performed, provided that such accommodation does not pose an undue hardship. Capgemini is committed to providing reasonable accommodation during our recruitment process. If you need assistance or accommodation, please reach out to your recruiting contact.

Please be aware that Capgemini may capture your image (video or screenshot) during the interview process and that image may be used for verification, including during the hiring and onboarding process.

Click the following link for more information on your rights as an Applicant in the United States: http://www.capgemini.com/resources/equal-employment-opportunity-is-the-law

Capgemini is a global business and technology transformation partner, helping organizations accelerate their dual transition to a digital and sustainable world while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With a strong heritage of more than 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions, leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, generative AI, cloud, and data, combined with its deep industry expertise and partner ecosystem.

About the Company

Capgemini SE

About Capgemini

A global leader in consulting, technology services, and digital transformation, Capgemini is at the forefront of innovation to address the entire breadth of clients’ opportunities in the evolving world of cloud, digital, and platforms. Building on its strong 50-year heritage and deep industry-specific expertise, Capgemini enables organizations to realize their business ambitions through an array of services from strategy to operations. Capgemini is driven by the conviction that the business value of technology comes from and through people. It is a multicultural company of over 200,000 team members in more than 40 countries. The Group reported 2018 global revenues of EUR 13.2 billion (about $15.6 billion USD at 2018 average rate). Visit us at www.capgemini.com. People matter, results count.
COMPANY SIZE
10,000 employees or more
INDUSTRY
Computer/IT Services
FOUNDED
1967
WEBSITE
https://www.capgemini.com