Research Scientist in Large Multimodal Models Applications - San Jose

Beijing ByteDance Technology Co Ltd

San Jose, CA

JOB DETAILS

SKILLS

Academic Research, Algorithms, Computer Vision, Conferences, Data Processing, Multimedia, Natural Language Processing (NLP), Research & Development (R&D), Research Skills, Scientific Research, Video Compression, Video Processing, Video Streaming

LOCATION

San Jose, CA

POSTED

2 days ago

Team Introduction

Multimedia Lab's mission is to promote cutting-edge research in multimedia (including, but not limited to, image/video data processing, compression, and transmission) and to transfer technologies into our products to better serve our hundreds of millions of users. We are looking for exceptional individuals from all areas of multimedia processing, compression, and transmission who have a track record of research excellence, a passion to shape the future of multimedia processing, and the potential to become an outstanding leader in the field.

Responsibilities

Contribute to the research and development of multimedia algorithms based on large multimodal models, including but not limited to video understanding, quality assessment, video processing and enhancement, and video compression.
Optimize and accelerate the performance of algorithms related to large multimodal models.
Explore the implementation of large multimodal models in multimedia applications such as short video streaming, video transcoding, and live streaming.
Conduct advanced academic research on large multimodal models and publish findings in international conferences and journals.

Minimum Qualification

Proficiency in Diffusion, LLM, and other advanced large multimodal models; experience with model training, tuning, and application.
Familiarity with computer vision (CV) algorithms, including GAN, VAE, and Diffusion for AIGC.

Preferred Qualification

Experience with NLP and RL algorithms, and knowledge of models such as Transformer, BERT, and GPT is preferred.
A history of leading impactful projects in large multimodal models or publishing in conferences (NeurIPS, ICLR, ICML, etc.) is advantageous.
