Telnyx

Senior Machine Learning Engineer (Speech Synthesis)

Telnyx  •  Amsterdam, NL / Dublin, IE / Kraków, PL (Remote)  •  1 month ago

Job Description

About Telnyx

Telnyx is an industry leader that's not just imagining the future of global connectivity—we're building it. From architecting and amplifying the reach of a private, global, multi-cloud IP network, to bringing hyperlocal edge technology right to your fingertips through intuitive APIs, we're shaping a new era of seamless interconnection between people, devices, and applications.

We're driven by a desire to transform and modernize what's antiquated, automate the manual, and solve real-world problems through innovative connectivity solutions. As a testament to our success, we're proud to stand as a financially stable and profitable company. Our robust profitability allows us not only to invest in pioneering technologies but also to foster an environment of continuous learning and growth for our team.

Our collective vision is a world where borderless connectivity fuels limitless innovation. By joining us, you can be part of laying the foundations for this interconnected future. We're currently seeking passionate individuals who are excited about the opportunity to contribute to an industry-shaping company while growing their own skills and careers.

The Impact You'll Drive

As a Senior ML Engineer (Speech Synthesis), you’ll be a founding member of the team building Telnyx’s next-generation speech synthesis systems. This is a greenfield opportunity — you’ll define the stack, architecture, and best practices for training and deploying state-of-the-art multilingual text-to-speech (TTS) models that power our voice AI agents.

You’ll build everything from distributed training pipelines to inference services that generate ultra-low-latency, lifelike voices across dozens of languages. Your work will bridge research and production — shaping how millions of people experience real-time conversational AI.

What You’ll Work On

  • Own the stack from day one: Design and implement the ML training and inference pipelines for multilingual speech synthesis.
  • Low-latency TTS: Engineer systems optimized for real-time, streaming speech generation with sub-100 ms response targets.
  • Train cutting-edge models: Build and fine-tune multilingual TTS systems using modern architectures, including LLM-based, diffusion, and flow-matching approaches.
  • Massive-scale data processing: Develop pipelines for ingesting, aligning, and normalizing text, audio, and phonetic data across dozens of languages.
  • Experimentation at scale: Run distributed training across multi-node GPU clusters, tracking results and iterating quickly.
  • Cross-functional collaboration: Work with infrastructure and voice platform teams to deploy models that scale globally.
  • Research meets production: Evaluate emerging techniques (LLM-guided synthesis, zero/few-shot voice cloning, full-duplex modeling) and bring them to life in production-grade systems.

What You’ll Work With

  • Infrastructure: Docker, Kubernetes, Ray, Kubeflow, MLflow, Weights & Biases
  • Data Systems: Kafka, Redis, PostgreSQL, Parquet
  • You’ll define it: You’ll help select and implement the stack that supports distributed training, data processing, and inference for global deployment.

What We’re Looking For

  • 6+ years of experience in machine learning or speech systems engineering
  • Hands-on expertise with neural TTS, speech synthesis, or adjacent areas (ASR, voice cloning, multilingual modeling)
  • You’ve obsessed over one or two hard problems, whether it’s building multilingual TTS from noisy data, teaching LLMs to speak, designing self-supervised audio encoders, or making diffusion models run in real time.
  • Experience with LLM-based approaches to speech synthesis or prosody control
  • Strong proficiency in Python and PyTorch
  • Ability to deploy models efficiently (ONNX, TensorRT)
  • Experience leading small teams and defining technical direction or team deliverables
  • Production mindset: You build systems that run fast, stay stable, and are easy to maintain

Why Telnyx

You’ll be joining a company where voice, infrastructure, and AI converge. Telnyx is building the foundation for real-time, intelligent global communications — and your work on multilingual TTS will be at the core of that vision.

About Telnyx

Telnyx is the full-stack platform for real-time conversational AI—designed for teams that want to build with power, flexibility, and speed.

We combine global telephony, dedicated AI infrastructure, and full customizability under one roof so you can design and deploy AI-powered agents that feel like part of your team. From low-latency voice streaming to scalable, multi-language support, Telnyx gives you everything you need to build real-time, intelligent voice experiences.

Whether you’re enhancing customer support, automating outbound calls, or embedding voice AI into your product, Telnyx makes it easy to launch and scale with confidence. And you never have to worry about piecing together multiple providers or sacrificing performance.

Build agile. Launch faster. Scale globally. All on one platform built for real-time engagement.

Industry: IT & Software
Company Size: 201-500 employees
Headquarters: Austin, Texas
Year Founded: 2009