Robotics Infrastructure Engineer

DRH Search

Watertown, Massachusetts

JOB DETAILS
SKILLS
Artificial Intelligence (AI), Artificial Intelligence (AI) Agents, Automation, Autoscaling, Calibration, Cloud Computing, Configuration Management, Continuous Deployment/Delivery, Continuous Integration, Cost Control, Debugging Skills, Equipment Maintenance/Repair, Hardware Debugging, Identity Data Management, Input/Output, Kernel Programming, Knowledge Management Systems, Linux Operating System, Machine Tool, Maintain Compliance, Memory Management, Metrics, Network Administration/Management, Python Programming/Scripting Language, Refactoring, Robot (Python Framework), Robotics, Security Auditing, Security Infrastructure, Simulation, Software Debugging, Startup, System Migration, System Validation, Telemetry, Test Plan/Schedule, User Interface/Experience (UI/UX), VPN (Virtual Private Network), Vehicle Fleets
POSTED
30+ days ago
We're assisting a well-funded robotics startup in the industrial automation space with their search for Robotics Infrastructure Engineers. Their platform lets clients use robotics as a service, increasing efficiency while keeping costs manageable. This is an onsite role, 5 days a week, in their Watertown, MA office.
 
What you'll do:
  • Own robot-side software (Python): Maintain the on-robot codebase that orchestrates arms, cameras, sensors, and I/O. Debug production hardware/software failures and ship fixes fast

  • Build and maintain infrastructure as code: Manage cloud infrastructure — identity and access management, CI/CD credentials, secrets, container registries, cluster autoscaling — using declarative configuration and reproducible builds

  • Drive build system and packaging migrations: Own the transition of robot software packaging to reproducible, hermetic build systems. Maintain machine images, dev environments, and deployment pipelines

  • Build simulation and testing infrastructure: Develop end-to-end simulation systems that validate robot behavior without physical hardware — camera projection, kinematics, placement validation, fleet-wide calibration

  • Develop and operate AI-powered engineering automation: Build autonomous agents that run nightly CI triage, security audits, infrastructure compliance checks, and code quality sweeps. Design the interfaces and instructions that make agents effective at real engineering work

  • Improve observability and health monitoring: Instrument robot software with metrics and structured telemetry. Build alerting that catches problems before humans notice them

  • Work across the stack: Touch frontend, backend, protobuf definitions, deployment tooling, and cloud services as needed. No part of the system is someone else's problem

What you'll bring:
  • 3+ years of Python in a systems context — not web/ML Python, but the kind where you deal with processes, hardware I/O, async, and real-time constraints

  • Strong Linux systems knowledge: Memory management, device management, systemd, containers, networking, kernel tuning

  • Infrastructure as code experience: Declarative infrastructure and configuration management tools. You've managed IAM, CI runners, secrets, and machine images programmatically

  • Experience with real hardware: Robot arms, depth cameras, grippers, force/torque sensors, pneumatics, or similar

  • CI/CD ownership: You've not just used CI — you've owned it. Runner infrastructure, flaky test triage, build caching, GPU-enabled pipelines

  • Comfort with AI coding agents: You've used tools like Claude Code, Cursor, Copilot Workspace, or similar to do real engineering work — not just autocomplete, but directing agents through multi-step debugging, refactoring, and infrastructure tasks. You understand their failure modes and know when to trust vs. verify

  • Strong debugging instincts: You can go from a vague production symptom to root cause across hardware, OS, network, and application layers

  • Bias toward shipping over perfecting: You fix, monitor, iterate. Your commit history has more fix: than feat: and you're proud of that

Nice to have:

  • NixOS or reproducible build system experience

  • Experience building or operating autonomous engineering agents/bots

  • Robotics simulation (kinematics, camera models, physics)

  • gRPC / Protocol Buffers

  • Managed network infrastructure, VPNs, overlay networks

  • Time-series databases and observability stacks

About the Company

DRH Search