99x

AI/MLOps Engineer

99x  •  São Paulo, BR (Remote)  •  6 days ago

Job Description

We are seeking a skilled AI/MLOps Engineer to join the innovative team at 99x Brazil. In this role, you will be responsible for designing, deploying, and maintaining scalable machine learning infrastructure and pipelines that enable rapid development and reliable deployment of AI models. You will work closely with data scientists, engineers, and product managers to ensure seamless integration of AI capabilities into production systems.

You will play a crucial part in automating ML workflows, monitoring model performance, and optimizing resource utilization in cloud environments. Join us to help drive the future of AI-powered solutions in a fast-paced, collaborative environment.

Responsibilities

  • Design and maintain monitoring and observability solutions for AI applications and ML pipelines
  • Track logs, metrics, and traces using tools such as CloudWatch, Datadog, or similar platforms
  • Develop evaluation and testing frameworks for prompts, models, and AI workflows
  • Perform regression testing and quality validation for LLM-based systems
  • Manage prompt experimentation, versioning, and A/B testing processes
  • Debug AI workflows, including model outputs, orchestration pipelines, and infrastructure failures
  • Support deployment, scaling, and maintenance of AI/ML infrastructure in production environments
  • Collaborate with engineering and product teams to improve system reliability and performance
  • Analyze production data and user feedback to drive continuous improvement of AI systems
  • Contribute to operational best practices, documentation, and incident response processes

Requirements

  • Experience with DevOps, SRE, MLOps, or AI infrastructure engineering
  • Strong understanding of monitoring and observability concepts
  • Hands-on experience with tools such as Datadog, CloudWatch, Grafana, Prometheus, or similar
  • Experience supporting AI/ML or LLM-based applications in production
  • Familiarity with prompt engineering, model evaluation, and experimentation workflows
  • Knowledge of cloud platforms such as AWS, Azure, or Google Cloud
  • Experience troubleshooting distributed systems and production pipelines
  • Proficiency in Python, scripting, or automation tooling
  • Strong analytical and problem-solving skills
  • Excellent communication and collaboration abilities

Nice to Have

  • Experience with LLM orchestration frameworks
  • Familiarity with vector databases and RAG architectures
  • Experience with CI/CD pipelines for ML systems
  • Knowledge of Kubernetes, Docker, and infrastructure-as-code tools
  • Experience with AI governance, security, or compliance practices

Benefits

  • Your choice of employment model: CLT, PJ, or Cooperativa
  • Resources to grow and learn on the job, including online courses, mentoring, and latest-generation laptops
  • A fully remote work environment with flexible working hours
  • A bonus for every referral we hire

About 99x

We empower our customers to create exceptional product experiences, to rapidly scale their own teams with top-tier local and nearshore engineering talent, and to build appealing web solutions and business-critical systems. Our offices are based in Norway, Sri Lanka, Malaysia, Brazil, and Portugal, with over 600 employees worldwide. The 99x Group serves clients in Norway, Sweden, Germany, and elsewhere across Europe and North America. What sets us apart is our combination of local presence, global access to talent, and a proven capability for delivering value with multinational teams.

Industry: IT & Software
Company Size: 501-1,000 employees
Headquarters: Oslo, NO
Year Founded: Unknown
Website: 99x.io