Hyphen Connect

Synthetic Data Engineer (AI Data/Training)

Hyphen Connect  •  Boston, MA (Onsite)  •  1 month ago

Job Description

We are seeking a Synthetic Data Engineer to design and implement domain-specific synthetic data generation pipelines and to maintain high-quality data for our training loops. Your work will directly shape data processing and model training across the organization.

Responsibilities:

  • Design domain-specific synthetic data generation (SDG) pipelines via self-instruct and constitutional prompting.
  • Implement automated quality scoring and de-duplication systems.
  • Manage data pipelines that feed directly into supervised fine-tuning (SFT) and direct preference optimization (DPO) training loops.

Qualifications:

  • Proven experience building large-scale data pipelines (Airflow, Spark, Ray).
  • Deep knowledge of prompt engineering for data generation.
  • Familiarity with dataset distillation and bias mitigation.

About Hyphen Connect

Hyphen Connect: The Nexus of Web3 Talents

As your premier Web3 talent acquisition partner, Hyphen Connect is dedicated to driving innovation by connecting passionate talent with forward-thinking enterprises. We equip both with the essential knowledge and tools needed to excel in the rapidly evolving, decentralized landscape.

We serve as the link to top Web3 opportunities across infrastructure, DeFi, NFTs, gaming, and more, providing unparalleled insights, data-driven research, and comprehensive resources.

Join us and become an integral part of our thriving Web3 community. Let's connect!

Industry: HR & Recruiting
Company Size: 1-10 employees
Headquarters: Unknown
Year Founded: 2024