San Francisco, CA
Posted 30+ days ago
Preferred Qualifications:
• PhD in Computer Science, Machine Learning, Computer Engineering, Applied Mathematics, Electrical Engineering, or a related field
• PhD with a focus on NLP and LLMs, or a Master's degree with 10 years of industry NLP research experience
• Core contributor to a team that has trained a large language model from scratch (10B+ parameters, 500B+ tokens) or through continued pre-training, or that has worked on post-training pipelines for alignment and reasoning, LLM optimizations, or complex reasoning with multi-agent LLMs
• Numerous publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR on topics related to the pre-training of large language models (e.g., technical reports on pre-trained LLMs, SSL techniques, model pre-training optimization)
• Has worked on an LLM (open source or commercial) that is currently available for use
• Demonstrated ability to guide the technical direction of a large-scale model training team
• Experience with common training optimization frameworks (DeepSpeed, NeMo)
• Experience contributing to a team that has trained a large language model from scratch (10B+ parameters, 500B+ tokens) or through continued pre-training, or that has worked on post-training pipelines for alignment and reasoning, LLM optimizations, or complex reasoning with multi-agent LLMs