Job Description:
- Deploy and manage large language models (LLMs) in production environments.
- Perform benchmarking to evaluate model performance and optimize deployment strategies.
- Research and implement new frameworks and technologies for hosting and serving models.
- Collaborate with cross-functional teams to integrate machine learning models into existing systems.
- Develop and maintain APIs for model inference and other machine learning services.
- Monitor and troubleshoot model performance and infrastructure issues.
- Stay up to date with the latest advancements in machine learning, AI, and related technologies.
Required Skills:
- Proven experience in deploying and managing machine learning models in production.
- Strong programming skills in Python.
- Experience with API development and integration.
- In-depth knowledge of generative AI model architectures and GPU architecture.
- Familiarity with cloud platforms and containerization technologies (e.g., Docker, Kubernetes).
- Excellent problem-solving skills and the ability to work independently and as part of a team.
- Strong communication skills and the ability to collaborate effectively with stakeholders.