Job Description

Huawei Canada has an immediate 12-month contract opening for a Reinforcement Learning Researcher.

About the team:

Founded in 2012, the Noah’s Ark Lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab’s mission is to advance artificial intelligence and related fields to benefit the company and society. Driven by impactful, long-term projects, the lab aims to advance the state of the art while integrating its innovations into the company's products and services. Research areas include LLMs, RL, NLP, computer vision, AI theory, and autonomous driving.

About the job:

The role focuses on enabling Large Language Models (LLMs) to learn from experience, interaction, and environment feedback, moving beyond static fine-tuning toward continual, agentic self-improvement. Research areas include:
LLM post-training paradigms (e.g., RLHF, GRPO, reward-free methods);
Agentic reinforcement learning for tool-using and browsing-based LLMs trained in interactive environments;
Agentic evaluation and benchmarking, including the design of multi-turn, verifiable reasoning tasks.
Your work will involve implementing and evaluating new training and evaluation pipelines for reasoning-enhanced LLMs and tool-using agents, scaling experiments on large GPU clusters, and contributing to scientific insights and publications in this emerging area.

Requirements

About the ideal candidate:

A PhD in Computer Science or a related field, or a master's degree with comparable experience.
Strong foundation in deep learning, including architectures such as Transformers and optimization techniques for large models.
Practical or research experience in reinforcement learning, self-supervised learning, or language model fine-tuning.
A proven AI research record, with at least one first-author paper at a top-tier venue such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, or ICRA.
Solid proficiency in Python and experience with PyTorch and distributed training frameworks such as DeepSpeed and Megatron.
Familiarity with LLM post-training pipelines (RLHF, GRPO/PPO, SFT, LoRA, MoE, etc.) is a strong asset.
Experience with multi-agent RL or with tool-use, browser, or coding agents is a strong asset.
Strong communication and writing skills; enthusiasm for open research and collaborative problem-solving.

About Huawei

Huawei is a leading global provider of information and communications technology (ICT) infrastructure and smart devices. With integrated solutions across four key domains – telecom networks, IT, smart devices, and cloud services – we are committed to bringing digital to every person, home and organization for a fully connected, intelligent world.

Huawei's end-to-end portfolio of products, solutions and services is both competitive and secure. Through open collaboration with ecosystem partners, we create lasting value for our customers, working to empower people, enrich home life, and inspire innovation in organizations of all shapes and sizes.

At Huawei, innovation focuses on customer needs. We invest heavily in basic research, concentrating on technological breakthroughs that drive the world forward. We have more than 207,000 employees, and we operate in more than 170 countries and regions. Founded in 1987, Huawei is a private company fully owned by its employees.

Industry

Telecommunications

Company Size

10,000+ employees

Headquarters

Shenzhen, CN

Website

huawei.com
