Senior Data Engineer - Agentic AI Engineering
Boomi
Atlanta, GA
As a Data Engineer at Boomi, you will architect the "paved road" for our AI initiatives. You will transition our "Customer Zero" prototypes from experimental code into robust, secure, and observable production systems. You will be the bridge between cutting-edge AI research and enterprise-grade reliability, enabling the entire engineering organization to ship intelligent features faster and more safely.
We are seeking a highly skilled Data Engineer to join our Agentic AI Engineering Team. This role focuses on building and operating scalable, high-quality data infrastructure that powers agentic AI systems, including LLM-based agents, multi-agent workflows, and tool-using AI systems. You will play a critical role in enabling data-driven training, evaluation, memory, observability, and continuous improvement of intelligent agents.
What You Will Do
• Architect and build scalable, secure, and observable data infrastructure to power LLM-based agents, multi-agent systems, and tool-using AI workflows.
• Design and operate robust batch and real-time data pipelines supporting embeddings, RAG systems, and agent memory frameworks.
• Develop and manage vector database solutions to enable low-latency retrieval and contextual intelligence for AI applications.
• Build data frameworks for training, evaluation, benchmarking, and continuous improvement of agentic AI systems.
• Implement strong data governance, quality controls, lineage tracking, and PII/security compliance across AI data platforms.
• Collaborate with AI/ML, platform, and DevOps teams to productionize experimental AI prototypes into enterprise-grade solutions.
• Optimize data systems for performance, scalability, reliability, and cost efficiency across cloud environments (AWS, Azure, or GCP).
The Experience You Bring
• 5+ years of experience building and operating large-scale data platforms as a Data Engineer.
• Strong programming expertise in Python and SQL for developing scalable and efficient data solutions.
• Hands-on experience designing batch and real-time data pipelines, including streaming systems like Kafka or Kinesis.
• Experience with modern data platforms and cloud environments (AWS, Azure, or GCP), including tools like Snowflake.
• Strong understanding of LLM/AI data workflows, including embeddings, RAG pipelines, evaluation datasets, and vector databases (e.g., Pinecone, Milvus).
• Experience with DataOps/MLOps tools such as Airflow, dbt, and MLflow for orchestration and lifecycle management.
• Strong knowledge of data quality, governance, and security, including PII handling, access controls, lineage tracking, and data reliability.
Bonus Points If You Have
• Experience supporting agentic AI systems, multi-agent architectures, or tool-using AI agents.
• Familiarity with agent memory models, conversation histories, and reasoning trace storage.
• Experience with synthetic data generation and simulation frameworks.
• Exposure to prompt engineering workflows and prompt/version management.
• Experience optimizing data systems for low-latency AI workloads.