EXL

Lead Assistant Manager

EXL  •  Noida, IN (Onsite)  •  4 months ago

Job Description

Senior ASR/TTS Specialist - AI Agent Integration Expert

Company: EXL Service
Type: Full-time
Experience: 3+ years

Position Summary

We seek an exceptional Senior ASR/TTS Specialist to lead speech AI initiatives and integrate advanced speech technologies with AI agent frameworks. This role focuses on fine-tuning ASR/TTS models, implementing MLOps best practices, and building production-ready speech AI systems powering next-generation conversational AI agents.

Key Responsibilities

Speech AI Model Development & Integration

- Model Fine-tuning: Customize state-of-the-art ASR/TTS models for domain-specific applications with <300ms latency
- Speech-to-Speech Systems: Build end-to-end S2S pipelines using Amazon Nova Sonic v1.0, Azure OpenAI Realtime (GPT-4o), and Gemini 2.5 Flash Native Audio
- Multi-modal Integration: Develop speech models integrating with vision and text modalities in AI agents
- Agent Framework Integration: Implement speech capabilities with LangChain/LangGraph, CrewAI, AutoGen, LlamaIndex, and OpenAI Assistants API

MLOps & Production Engineering

- Model Lifecycle: Implement comprehensive MLOps pipelines using MLflow, Weights & Biases, and automated CI/CD
- Multi-cloud Deployment: Deploy speech models across AWS Bedrock, Google Cloud AI, and Azure Cognitive Services
- Real-time Processing: Build WebSocket-based streaming audio systems handling 1000+ concurrent connections
- Production Monitoring: Implement WER tracking, latency monitoring, and multi-provider failover mechanisms

Research & Development

- Cutting-edge Research: Stay current with latest speech AI breakthroughs and implement novel architectures
- Performance Optimization: Optimize models for real-time inference using TensorRT, ONNX, and edge deployment
- Data Pipeline Engineering: Build scalable audio ingestion, preprocessing, and augmentation systems

Required Qualifications

Core Technical Skills (Must-Have)

Speech AI Models (3+ years experience):
- ASR Systems: Amazon Nova Sonic v1.0, Google Speech-to-Text, Azure Speech Services, Whisper, Wav2Vec2, Riva
- TTS Systems: Google TTS, Azure Cognitive Services TTS, ElevenLabs (REST/WebSocket), Tortoise, VITS, FastSpeech2
- Speech-to-Speech: Direct S2S without intermediate text, multimodal audio processing
- Cloud Services: AWS Bedrock Runtime, Google Cloud AI (Gemini API), Azure OpenAI Services

Programming & Frameworks:
- Languages: Expert Python, proficient C++/Rust for optimization
- ML Frameworks: Advanced PyTorch, TensorFlow 2.x, JAX/Flax
- Audio Processing: librosa, torchaudio, soundfile, WebRTC, µ-law/PCM conversion
- Agent Frameworks: Hands-on experience with 3+ of: LangChain, CrewAI, AutoGen, LlamaIndex, OpenAI Assistants

MLOps & Infrastructure (Essential)

MLOps Tools (2+ years):
- Experiment Management: MLflow, Weights & Biases
- Model Serving: TorchServe, TensorFlow Serving, NVIDIA Triton
- Workflow Orchestration: Apache Airflow, Kubeflow, Prefect
- Containerization: Docker, Kubernetes for speech model deployment

Cloud & Production:
- Multi-cloud Experience: AWS (Bedrock, Nova Sonic), Google Cloud (Gemini, Speech APIs), Azure (OpenAI Services)
- Real-time Systems: Sub-300ms latency, WebSocket architecture, telecom integration (Genesys AudioConnector)
- Monitoring: Audio quality metrics, model drift detection, production reliability (99.9% uptime)

Preferred Qualifications

Advanced Specializations

- Multi-lingual Processing: Cross-lingual transfer learning, zero-shot adaptation
- Domain Expertise: Healthcare, legal, technical domain speech AI
- Edge AI: TensorRT, Core ML, ONNX optimization for mobile/edge deployment
- Research Background: Publications in ICASSP, INTERSPEECH, ICML, NeurIPS

Leadership & Education

- Team Leadership: Experience leading speech AI teams and technical initiatives
- Education: MS/PhD in Computer Science, Electrical Engineering, or related field
- Open Source: Contributions to speech AI libraries and frameworks

Technical Environment

Production Technology Stack

Core Technologies:
- Languages: Python, C++, Rust, TypeScript
- Frameworks: PyTorch, TensorFlow, JAX, LangChain, CrewAI, AutoGen
- Cloud Services: AWS Bedrock, Google Cloud AI, Azure OpenAI Services
- Audio Tools: librosa, torchaudio, WebRTC, FFmpeg
- MLOps: MLflow, Kubeflow, Docker, Kubernetes, NVIDIA Triton
- Databases: Vector DBs (Pinecone, Weaviate), PostgreSQL, Redis

Production Models:
- Amazon Nova Sonic v1.0 (Speech-to-Speech streaming)
- Gemini 2.5 Flash Native Audio Dialog (Multimodal processing)
- Azure OpenAI GPT-4o (Realtime voice conversations)
- ElevenLabs (Voice cloning and synthesis)

Infrastructure

- GPU Clusters: NVIDIA A100/H100 for model training
- Edge Deployment: NVIDIA Jetson, ARM-based targets
- Real-time Requirements: <300ms latency, 1000+ concurrent streams
- Enterprise Integration: Genesys AudioConnector, SIP protocol, telephony systems

Key Projects & Success Metrics

Primary Focus Areas

- Next-gen S2S Systems: Amazon Nova Sonic, Azure OpenAI Realtime, Gemini Native Audio
- Multi-cloud Integration: Unified APIs across AWS, Google Cloud, Azure
- Conversational AI Agents: Low-latency speech-enabled customer service bots
- Telecom Integration: Enterprise telephony and AudioConnector systems
- Domain-specific Models: Medical, legal, technical vocabulary fine-tuning

Success Metrics

- Performance: <5% WER for domain-specific tasks
- Latency: <300ms end-to-end processing
- Reliability: 99.9% uptime for production services
- Scale: 1000+ concurrent speech streams

About EXL

Choosing a digital partner is about more than capabilities — it’s about collaboration and character.

Unrealistic overhauls and off-the-shelf products ignore what matters most — your unique needs, culture, goals, and your legacy data and technology environments.

At EXL, our collaboration is built on ongoing listening and learning to adapt our methodologies. We’re your business evolution partner—tailoring solutions that make the most of data to make better business decisions and drive more intelligence into your increasingly digital operations.

Whether your goals are scaling the use of AI and digital, redesigning operating models, or driving better and faster decisions, we’re here to partner with you to help you gain, and maintain, competitive advantage with efficient, sustainable models at scale.

Our expertise in transformation, data science, and change management helps make your business more efficient and effective, improve customer relationships and enhance revenue growth. Instead of focusing on multi-year, resource- and time-intensive platform designs or migrations, we look deeper at your entire value chain to integrate strategies with impact.

We use our specialization in analytics, digital interventions, and operations management, alongside deep industry expertise, to deliver solutions that help you outperform the competition.

At EXL, it’s all about outcomes—your outcomes—and delivering success on your terms. Share your goals with us and together, we’ll optimize how you leverage data to drive your business forward.

For more information, visit www.exlservice.com.

Industry
Consulting & Advisory
Company Size
10,000+ employees
Headquarters
New York, NY
Year Founded
Unknown