Job Description

Archer is a leading provider of integrated risk management (IRM) solutions that enable customers to improve strategic decision-making and operational resilience with a modern technology platform that supports qualitative and quantitative analysis driven by both business and IT impacts. As true pioneers in GRC software, Archer remains solely dedicated to helping customers manage risk and compliance domains, from traditional operational risk to emerging issues such as ESG. With over 20 years in the risk management industry, the Archer customer base represents one of the largest pure risk management communities globally, with more than 1,200 customers including more than 50% of the Fortune 500. Learn more at www.ArcherIRM.com

Data Scientist – LLM & Data Pipeline Engineering (LegalTech / RegTech AI)

We are seeking an experienced Data Scientist with a strong background in AI model integration, data pipeline development, and knowledge base (KB) engineering to support our next-generation LegalTech / RegTech AI platform.

This role blends applied machine learning, data engineering, and software development, focusing on building scalable pipelines that connect large language models (LLMs) to structured and unstructured data through retrieval-augmented generation (RAG) and vector database architectures.

The ideal candidate is passionate about operationalizing AI — from training and fine-tuning models to deploying intelligent retrieval systems in AWS cloud environments.

Key Responsibilities

1. AI Model Integration & Development

Design, train, and evaluate LLM-based pipelines for document understanding, obligation extraction, and regulatory reasoning.
Implement and optimize RAG architectures, combining LLMs with vector databases for semantic retrieval.
Develop and maintain model fine-tuning workflows, embedding generation, and knowledge distillation.
Collaborate with MLOps teams to integrate AI models into production-ready APIs and services on AWS.
Measure and improve model precision, recall, latency, and interpretability.

1.5 Agentic and MCP Knowledge Integration

Design and maintain agentic multi-component processes (MCPs) that enable context-aware reasoning across multiple data sources and agents.
Implement AI agents capable of dynamic tool use, autonomous task decomposition, and multi-context knowledge retrieval.
Develop pipelines that support agent memory, self-reflection, and knowledge synthesis across distributed systems and knowledge bases.
Collaborate with engineering teams to integrate MCP-driven agents with retrieval, analytics, and workflow orchestration layers, ensuring compliance with regulatory reasoning frameworks.

2. Data Pipeline Engineering

Build and manage end-to-end data pipelines for ingestion, transformation, embedding, and indexing of legal and compliance data.
Orchestrate data workflows leveraging AWS services (e.g., S3, Lambda, Glue, SageMaker, Step Functions, RDS).
Develop scalable ETL/ELT processes to feed both relational (e.g., PostgreSQL) and vector databases (e.g., Pinecone, FAISS, Weaviate, Elastic Vector Search).
Ensure data lineage, reproducibility, and version control across AI and analytics pipelines.
Automate retraining and evaluation pipelines for continuous learning from user feedback.

3. Knowledge Base & Information Retrieval

Architect and maintain intelligent Knowledge Bases (KBs) to support AI-driven search, summarization, and compliance reasoning.
Implement advanced retrieval techniques using Elasticsearch / Elastic Vector Search and embedding-based retrieval.
Align KB structures with business ontologies and regulatory taxonomies to support explainable AI outputs.
Collaborate with domain experts and PMs to enrich KB metadata and enhance model context relevance.

4. AWS & Deployment

Deploy and scale AI pipelines using AWS services such as SageMaker, Lambda, ECS/EKS, API Gateway, and CloudFormation/Terraform.
Implement model and data monitoring solutions for drift detection, latency management, and cost optimization.
Collaborate with DevOps to maintain secure, reliable, and compliant cloud environments.

5. Cross-Functional Collaboration

Partner with engineering, product, and compliance teams to align AI models with regulatory and data governance requirements.
Work closely with QA and Professional Services teams to validate AI outputs and improve client-facing performance.
Document architectures, experiment results, and data flows to ensure transparency and reproducibility.

Preferred Experience

Experience building AI products for LegalTech, RegTech, or compliance automation.
Familiarity with agentic AI frameworks (e.g., OpenAI MCP, CrewAI, LangGraph, or AutoGen).
Background in document intelligence systems, multi-agent orchestration, or knowledge graph integration.
Experience with LangChain, LlamaIndex, or similar frameworks for RAG orchestration.
Hands-on knowledge of MLOps tools and data versioning (DVC, MLflow, Weights & Biases).
Understanding of governance, interpretability, and ethical AI.

Qualifications

5+ years of experience in data science, ML engineering, or AI-driven software development.
Strong programming skills in Python (NumPy, Pandas, PyTorch/TensorFlow, LangChain, or equivalent).
Experience with vector databases and retrieval systems (Pinecone, FAISS, Weaviate, Qdrant, or Elastic Vector Search).
Hands-on experience with RAG pipelines, embedding models, and LLM orchestration (OpenAI, Bedrock, Hugging Face, etc.).
Solid understanding of data pipelines, ETL frameworks, and cloud-native deployment on AWS.
Familiarity with Elasticsearch, PostgreSQL, and API integration patterns.
Knowledge of ML lifecycle management, including model training, evaluation, and monitoring.

Soft Skills

Strong problem-solving and system design capabilities.
Excellent communication skills for cross-disciplinary collaboration.
Passion for structured documentation, reproducibility, and experimentation.
Adaptable mindset with focus on performance, scalability, and reliability.

Success Indicators

Scalable and well-documented RAG pipelines supporting production AI workloads.
High model accuracy, strong retrieval quality, and low latency.
Reliable data flow from ingestion to inference with minimal manual intervention.
Increased explainability and compliance assurance across AI outputs.

Additional Information:

About Archer’s Culture and Work Environment:
Our people, team collaboration, and dynamic leadership are the centerpiece of our great culture and the reason for Archer’s 25 years of success. Over the years, many companies and global organizations have faced tough decisions: layoffs, reorganizations, acquisitions, and mergers. Yet throughout these challenging times, Archer has exemplified strong innovation and growth and a commitment to our employees. Why is this possible? Collaboration is the key to our success. It inspires great innovation and new ideas. It is why Archer is a household name in the GRC space. Companies from the Fortune 500 to the Fortune 1000 come to Archer first for our thought leadership and for our ability to meet customers where they are. As we continue to grow and evolve, our focus will remain the same: continue innovating, support our customers and employees, and continue driving the risk management industry to new levels.

Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee for this job. Duties, responsibilities and activities may change at any time with or without notice at management discretion based on business need.

Archer is committed to the principle of equal employment opportunity for all employees and applicants for employment and to providing employees with a work environment free of discrimination and harassment. All employment decisions at Archer are based on business needs, job requirements and individual qualifications, without regard to race, color, religion, national origin, sex (including pregnancy), age, disability, sexual orientation, gender identity and/or expression, marital, civil union or domestic partnership status, protected veteran status, genetic information, or any other characteristic protected by federal, state or local laws. Archer will not tolerate discrimination or harassment based on any of these characteristics. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training. All Archer employees are expected to support this policy and contribute to an environment of equal opportunity.

If you need a reasonable accommodation during the application process, please contact talent-acquisition@archerirm.com. All employees must be legally authorized to work in the country in which they are applying. Archer and its approved consultants will never ask you for a fee to process or consider your application for a career with Archer. Archer reserves the right to amend or withdraw any job posting at any time, including prior to the advertised closing date.

Pay Transparency Notice: We’re committed to fair and transparent pay practices. In line with state pay transparency laws, the salary range for this role is available upon request. Please contact our Talent Acquisition team at Talent-Acquisition@archerirm.com for the range and related compensation details. Actual pay may vary based on location, experience, skills, and internal equity.

About Archer Integrated Risk Management

For more than 20 years, Archer has pioneered holistic integrated risk management solutions that empower enterprise organizations to more effectively manage risk, ensure compliance, and address emerging challenges. Leveraging advanced technology like artificial intelligence (AI) and risk quantification, Archer’s broad range of solutions and services provides our clients with a clear understanding of risk that drives strategic decision-making and operational resilience. Visit www.ArcherIRM.com.

Industry

IT & Software

Company Size

501-1,000 employees

Headquarters

Overland Park, Kansas

Year Founded

Unknown

Website

archerirm.com
