IntegriChain

AI Data Engineer

IntegriChain  •  Philadelphia, PA (Onsite)  •  12 hours ago

Job Description

IntegriChain is the data and application backbone for market access departments of Life Sciences manufacturers. We deliver the data, the applications, and the business process infrastructure for patient access and therapy commercialization. More than 250 manufacturers rely on our ICyte Platform to orchestrate their commercial and government payer contracting, patient services, and distribution channels. ICyte is the first and only platform that unites the financial, operational, and commercial data sets required to support therapy access in the era of specialty and precision medicine. With ICyte, Life Sciences innovators are digitalizing labor-intensive market access processes, freeing up their best talent to focus on data-driven decision support: identifying and resolving coverage and availability hurdles and managing pricing and forecasting complexity.

We are headquartered in Philadelphia, PA (USA), with offices in Ambler, PA (USA); Pune, India; and Medellín, Colombia. For more information, visit www.integrichain.com, or follow us on Twitter (@IntegriChain) and LinkedIn.

This role offers flexibility, but candidates must reside in Pennsylvania, New Jersey, or New York and be within a reasonable travel distance of our Philadelphia office, as regular in-person collaboration is required.

Mission

Join the Data Science team as an AI Data Engineer responsible for building the data foundations that make enterprise AI products accurate, explainable, and scalable. This role will design and implement Snowflake and dbt pipelines from raw source data to curated gold-layer datasets, create semantic models that LLM tools can use reliably, and partner with data science, product, and engineering teams to convert data dictionaries and business definitions into AI-ready data products. The ideal candidate is a strong data engineer with deep Snowflake/dbt experience and a practical understanding of how semantic layers, ER relationships, denormalized models, and metadata quality influence LLM and agent performance.

  • Snowflake and dbt engineering: Design, build, optimize, and operate Snowflake pipelines and dbt models across raw, curated, and gold-layer datasets.
  • AI-ready semantic modeling: Create semantic models, relationships, metrics, dimensions, and curated views that allow LLM tools and agents to answer questions accurately.
  • Data dictionary-driven delivery: Translate team-defined data dictionaries, business definitions, and source mappings into tested, governed, and reusable data products.
  • Agent consumption focus: Design datasets for AI agents, natural-language analytics, Snowflake Cortex Analyst, and other LLM-powered tools.
  • Enterprise data modeling: Balance normalized source models, ER relationships, dimensional models, denormalized consumption layers, and semantic-layer needs.

Key Responsibilities

Snowflake, dbt, and Data Pipeline Development

  • Build reliable data pipelines from raw source data through curated silver layers and business-ready gold layers using Snowflake and dbt.
  • Develop modular dbt models, tests, documentation, exposures, and lineage-friendly transformation patterns.
  • Implement incremental processing, snapshots, audit columns, reconciliation, data quality checks, and restartable pipeline patterns.
  • Optimize Snowflake SQL and dbt workloads for performance, scalability, cost, and maintainability.
  • Work with orchestration and DevOps/SRE teams to support CI/CD, environment promotion, pipeline monitoring, and operational runbooks.

Semantic Models and AI-Ready Data Products

  • Create Snowflake semantic models and curated views that support accurate natural-language querying through Snowflake Cortex Analyst and related LLM tools.
  • Translate approved data dictionaries into semantic model dimensions, facts, metrics, synonyms, descriptions, relationships, and business rules.
  • Design ER relationships and join paths that are explicit, accurate, and easy for semantic-layer tools and AI agents to use.
  • Create denormalized or consumption-optimized models where appropriate to reduce ambiguity and improve LLM answer quality.
  • Partner with AI developers to understand tool schema needs, agent workflows, and how data model design affects LLM tool performance.

Data Modeling, Integration, and Consolidation

  • Design logical and physical models that support enterprise data consolidation, analytical reporting, AI workflows, and business operations.
  • Work across source systems, files, APIs, cloud storage, operational systems, and analytical platforms to integrate data into Snowflake.
  • Create reusable patterns for source-to-target mapping, schema evolution, master/reference data alignment, and data product publishing.
  • Collaborate with business and technical stakeholders to validate data definitions, grain, relationships, hierarchies, and measures.
  • Support data consolidation across IntegriChain by rationalizing overlapping datasets and aligning enterprise definitions.

Snowflake Cortex and AI Platform Enablement

  • Understand Snowflake Cortex capabilities, including Cortex Analyst, Cortex Complete, semantic views/models, and metadata-driven AI workflows.
  • Prepare data models and semantic layers for accurate LLM usage, including clear naming, descriptions, relationships, metrics, and governance metadata.
  • Support AI Explorer and similar applications by ensuring curated datasets are reliable, performant, explainable, and governed.
  • Partner with AI and application teams to troubleshoot semantic model issues, poor AI answers, ambiguous joins, missing metadata, or incorrect measures.
  • Contribute to standards for AI-ready data design, semantic model review, data dictionary alignment, and LLM-friendly data modeling.

Qualifications

  • 6+ years of experience in data engineering, analytics engineering, database engineering, or data platform development in production environments.
  • Strong hands-on experience with Snowflake, including SQL development, performance tuning, security-aware design, cost optimization, and large-volume processing.
  • Strong hands-on experience with dbt or comparable ELT tooling, including models, tests, documentation, lineage, and environment promotion.
  • Experience building raw-to-curated-to-gold data pipelines and business-ready datasets.
  • Strong SQL and Snowflake development skills, including complex transformations, views, stored procedures/Snowflake Scripting, and query optimization.
  • Experience creating semantic layers, semantic models, metrics, dimensions, relationships, and curated analytical views.
  • Good understanding of ER modeling, dimensional modeling, denormalized consumption models, and data grain management.
  • Experience translating data dictionaries and business definitions into physical models, dbt models, and semantic-layer definitions.
  • Understanding of Snowflake Cortex capabilities such as Cortex Analyst, Cortex Complete, and semantic-model-driven natural-language querying.
  • Ability to partner with data science, product, engineering, and business teams to deliver AI-ready data products.

Preferred Experience

  • Experience in life sciences, healthcare, pharma commercialization, MDM, patient data, channel data, or commercial data platforms.
  • Experience with Snowflake semantic views, Cortex Analyst, Cortex Search, or other AI/LLM data platform capabilities.
  • Experience with data quality frameworks, metadata management, data observability, and lineage tooling.
  • Experience with orchestration tools such as dbt Cloud jobs, Airflow, Dagster, cloud-native schedulers, or similar platforms.
  • Experience with Python for data automation, metadata processing, testing, or API integrations.
  • Experience designing governed data products for BI, AI/ML, natural-language analytics, or agentic applications.
  • Snowflake SnowPro, dbt certification, or equivalent data engineering credentials.

Additional Information

What does IntegriChain have to offer?

  • Mission driven: Work with the purpose of helping to improve patients' lives!
  • Excellent and affordable medical benefits + non-medical perks including Student Loan Reimbursement, Flexible Paid Time Off and Paid Parental Leave
  • 401(k) Plan with a Company Match to prepare for your future
  • Robust Learning & Development opportunities, including more than 700 development courses free to all employees

#LI-ZG1

IntegriChain is committed to equal treatment and opportunity in all aspects of recruitment, selection, and employment without regard to race, color, religion, national origin, ethnicity, age, sex, marital status, physical or mental disability, gender identity, sexual orientation, veteran or military status, or any other category protected under the law. IntegriChain is an equal opportunity employer, committed to creating a community of inclusion and an environment free from discrimination, harassment, and retaliation.

Our policy on visa sponsorship for US based positions: Applicants for employment in the US must have valid work authorization that does not now and/or will not in the future require sponsorship of a visa for employment authorization in the US by IntegriChain.

About IntegriChain

IntegriChain is the leading provider of revenue optimization technology and insights for the Pharma industry. The company’s ICyte data-driven commercialization platform enables manufacturers to develop, implement, and operate sustainable growth strategies for life-changing science. Through its unique focus on data, SaaS and BPaaS technology, consulting, and outsourcing, IntegriChain helps manufacturers connect the commercial, financial, and operational dimensions of drug access, all the way from demand through to net revenue optimization. IntegriChain is backed by Nordic Capital, a leading sector-specialized private equity investor with a broad portfolio in Healthcare and Technology. IntegriChain’s umbrella of companies includes Blue Fin Group and Federal Compliance Solutions, and the company is headquartered in Philadelphia, PA, with offices in Ambler, PA, and Pune, India.

Industry: IT & Software
Company Size: 501-1,000 employees
Headquarters: Philadelphia, PA
Year Founded: 2005