Cantina Labs

Applied ML Engineer, Real‑Time Video Generation

Cantina Labs  •  €190k - €225k/yr  •  Onsite  •  4 months ago

Job Description

About Cantina:

Cantina Labs is a social AI company, developing a suite of advanced real-time models that push the boundaries of expression, personality, and realism. We bring characters to life, transforming how people tell stories, connect, and create. We build and power ecosystems. Cantina, our flagship social AI platform, is just the beginning.

If you're excited about the potential AI has to shape human creativity and social interactions, join us in building the future!

About the Role:
We’re looking for an Applied ML Engineer who can take video generation models from research to real‑time production. You’ll work across model engineering (fine‑tuning, distillation, optimization) and low‑latency inference and streaming (including WebRTC prototypes).

Typical time split (roughly):

  • 50–60% model engineering (distillation, optimization, fine‑tuning)

  • 30–40% serving / streaming / inference infrastructure

  • 10–20% prototyping + product integration

What You’ll Do:

  • Productionize video generation models: turn research checkpoints into robust, scalable inference APIs.

  • Make models fast and affordable: distillation and performance optimization (latency/cost/memory tradeoffs).

  • Build real‑time inference systems: low‑latency serving, streaming outputs, reliability/observability.

  • Prototype fast: ship demos (often WebRTC) and harden them into production features.

  • Multi‑GPU work: run and optimize large model components across GPUs when needed.

  • Collaborate with research: translate model constraints into deployable systems and performance improvements.

What You’ll Bring:

  • 2+ years in ML engineering (or equivalent), with real ownership of shipped systems.

  • Strong PyTorch + Python, comfortable with both training and inference code.

  • Hands-on experience with generative models (diffusion/transformers/VAEs), especially for image/video.

  • Proven ability to improve latency/cost in practice (profiling, memory optimization, runtime improvements).

  • Production mindset: debugging under load, monitoring, deployment hygiene.

  • WebRTC / real-time media delivery experience.

  • Comfortable in cloud environments: Docker and basic Kubernetes.

Bonus Points For:

  • Distillation experience end-to-end (teacher/student, eval design).

  • Familiarity with acceleration toolchains (e.g., compilation / TensorRT / Triton / ONNX).

Technical Stack You’ll Work With:

  • Cloud/Infra: AWS (S3, DynamoDB), Kubernetes, Docker

  • ML: PyTorch

  • Models: video generation (diffusion/VAEs/transformers)

  • Optimization: distillation, real‑time inference, multi‑GPU strategies

  • Streaming: WebRTC prototypes + low‑latency delivery patterns

Location:

This role can be performed remotely from Europe, in time zones within two hours of GMT.

Compensation:

The anticipated annual base salary range for this role is €190,000 to €225,000, plus bonus. Compensation is determined by a number of factors, including skills, experience, job scope, location, and competitive market data.

Benefits for U.S.-based roles:

  • Competitive salary and generous company equity

  • Medical, dental, and vision insurance – 99.99% of premiums covered by Cantina

  • 42 days of paid time off, including:

    • 15 PTO days

    • 10 sick days

    • 15 company holidays

    • 2 floating holidays

  • Generous parental leave & fertility support

  • 401(k) retirement savings plan

  • Lifestyle spending account – $500/month to use however you’d like

  • Complimentary lunch and snacks for in-office employees

  • One Medical membership, and more!

About Cantina Labs

Industry: IT & Software
Company Size: 201-500 employees
Headquarters: San Francisco, California
Year Founded: 2023