Job Description
Join the Innovative BytePlus Team!
As part of BytePlus, you'll help both enterprises and AI developers build what's next for their business. Leveraging ByteDance's cutting-edge technologies in AI models, agent solutions, and cloud infrastructure, we are devoted to developing innovative products and solutions that shape the future. We help our clients focus on what truly matters, and you can help us achieve our mission.
Key Responsibilities:
- Bridge the gap between our cutting-edge generative AI model research and real-world industrial applications. You aren't just shipping code; you are solving the "last mile" problem of AI: ensuring models are performant, grounded, and integrated into complex agent solutions that deliver results directly.
- AI agent solution design & prototyping: Rapidly design and iterate on AI agents based on clients' requirements, prototype workflows and system integrations, and LoRA fine-tune models tailored to specific industry needs (e.g., automated credit risk summaries for Fintech, or image/video content generation in a specific design style).
- Production-ready implementation and delivery: Write production-quality code to integrate AI model APIs into various agent frameworks (ADK, LangChain, etc.) with the necessary tools (RAG, vector databases, memory base, cache management, skills, etc.).
- Performance Optimization: Tackle challenges around latency, throughput, and cost. You’ll optimize prompt chains and implement caching strategies to ensure AI features scale for millions of users.
- Evaluation & Guardrails: Develop rigorous evaluation frameworks (LLM-as-a-judge, human-in-the-loop) to ensure outputs are safe, accurate, and hallucination-free, which is critical for compliance-heavy industries like Finance. You will also evaluate various models on the same tasks to identify their pros and cons for go-to-market (GTM).
- Cross-functional Collaboration: Work with product and algorithm research teams to feed "field signals" back into the core product roadmap.
Basic Technical Skills Required:
- AI/ML: Deep understanding of Transformer architectures, RAG, prompt engineering, and vector databases (e.g., Pinecone, Milvus, Weaviate).
- Engineering: Proficiency in Python (FastAPI, PyTorch/JAX) and experience with orchestration frameworks like LangChain or LlamaIndex.
- Data Systems: Comfort with SQL/NoSQL and data processing at scale (Spark, Flink).
- Cloud/DevOps: Experience with Docker, Kubernetes, and deploying models on cloud providers.