Xiaomi Technology

Large Language Model Web Crawler Engineer

Xiaomi Technology  •  Onsite  •  3 months ago

Job Description

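Large Language Model Web Crawler Engineer  •  Beijing  •  Campus recruitment  •  Full-time  •  Software R&D  •  2026 Spring Campus Recruitment Program

Responsibilities
1. Collect publicly available web pages and public datasets in compliance with the robots protocol.
2. Design and develop distributed web crawlers; independently solve the problems that come up in development (scheduling optimization, concurrency, coverage, etc.) and improve crawl effectiveness and performance.
3. Research and develop algorithms for web page information extraction to improve the efficiency and quality of data collection.
4. Handle deduplication and storage of crawled data, along with monitoring and anomaly alerting for the crawler system.

Join us and you will gain:
1. Work on cutting-edge crawling challenges: design highly available, intelligent crawling strategies for the complex and ever-changing site structures, dynamic rendering, and anti-crawling mechanisms found across the web.
2. Influence on the next generation of large AI models: your work directly determines the quality of the models' data, shaping their comprehension, breadth of knowledge, and reasoning ability.
3. A fast-growing technical environment: exposure to core technologies such as large-scale distributed crawling, intelligent anti-anti-crawling, and automated data cleaning.
4. Broad room for career development: go deep into crawler architecture or AI data engineering, or move into large-model data strategy.

Requirements
1. Bachelor's degree or above in computer science or a related field; familiar with at least two of Python, Java, Go, and C++.
2. Proficient in one or more crawling libraries (e.g., Requests, BeautifulSoup, Scrapy).
3. Solid coding skills; proficient in network communication, with a deep understanding of HTTPS and TCP.

Bonus points:
1. Familiarity with mainstream crawler frameworks and tools such as Playwright and Puppeteer.
2. Proficiency in web information extraction techniques such as regular expressions, XPath, and CSS selectors.
3. Knowledge of basic NLP techniques; candidates with hands-on experience using algorithms and models such as fastText, N-gram, BERT, and GPT are preferred.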

About Xiaomi Technology

Xiaomi Corporation was founded in April 2010 and listed on the Main Board of the Hong Kong Stock Exchange on July 9, 2018 (1810.HK). Xiaomi is a consumer electronics and smart manufacturing company with smartphones and smart hardware connected by an IoT platform at its core.

Embracing our vision of “Make friends with users and be the coolest company in the users’ hearts”, Xiaomi continuously pursues innovations, high-quality user experience and operational efficiency. The company relentlessly builds amazing products with honest prices to let everyone in the world enjoy a better life through innovative technology.

Xiaomi is one of the world's leading smartphone companies. The company has also established the world’s leading consumer AIoT (AI+IoT) platform, with 558 million smart devices connected to its platform (excluding smartphones, laptops and tablets) as of September 30, 2022. Xiaomi products are present in more than 100 countries and regions around the world. In August 2022, Xiaomi was included in the Fortune Global 500 list for the fourth year in a row, ranking 266th. The company was the fastest-rising Chinese technology conglomerate over that four-year period.

Xiaomi is a constituent of the Hang Seng Index, Hang Seng China Enterprises Index, Hang Seng TECH Index and Hang Seng China 50 Index.

Industry
IT & Software
Company Size
10,000+ employees
Headquarters
Beijing, CN
Year Founded
2010