Location:
Work from home (Pennsylvania)
Shift:
Days (United States of America)
Scheduled Weekly Hours:
40
Worker Type:
Regular
Exemption Status:
Yes
The Senior Platform Data Engineer owns the roadmap, priorities, platform standards, and architecture reviews, and provides formal input on performance reviews. This position makes clinical data ready for AI at scale by owning the shared data products, retrieval infrastructure, and platform administration that the entire AI portfolio depends on. Areas of ownership: real-time data feeds; reusable clinical data models and feature pipelines; RAG retrieval infrastructure (ingestion, chunking, embeddings, vector DB, retrieval pipelines); and Databricks platform administration.
Job Duties:
Streams data from Epic SDE, ADT feeds, lab results, and other clinical sources into Databricks for downstream model consumption.
Curates shared clinical feature tables (patient demographics, labs, vitals, diagnoses, utilization history, imaging metadata) in Databricks/Unity Catalog that multiple AI programs consume for model training, validation, and monitoring.
Owns RAG Infrastructure, the shared retrieval-augmented generation platform that agentic and generative AI programs use to ground LLM outputs in organizational knowledge.
Designs and operates document ingestion pipelines: normalizing clinical documents, policies, guidelines, and unstructured data sources into formats ready for embedding and retrieval.
Implements and optimizes chunking strategies tailored to healthcare content (e.g., preserving clinical note structure, section-aware chunking for guidelines and protocols).
Manages the embedding pipeline: selecting, tuning, and versioning embedding models (domain-specific clinical models where they outperform general-purpose).
Administers the vector database: schema design, indexing, metadata management, access controls, and performance tuning.
Builds and maintains retrieval pipelines: hybrid search (vector + keyword/BM25), reranking, and relevance filtering to maximize retrieval precision for downstream agents and LLM applications.
Establishes data quality gates for RAG: automated profiling, completeness checks, and accuracy scoring before content enters the vector store.
Monitors retrieval quality metrics (Precision@K, Recall@K, MRR) and continuously optimizes retrieval performance.
Administers Databricks workspace configuration and Unity Catalog governance.
Manages cluster policies, compute, and cost monitoring.
Manages users, groups, and access control.
Serves as administrator for the Feature Store.
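For candidates less familiar with the retrieval quality metrics named in the duties above (Precision@K, Recall@K, MRR), the following is a minimal Python sketch based on their standard definitions. The function names and sample document IDs are hypothetical, chosen only for illustration, and do not describe Geisinger's actual tooling. The sketch assumes each ranked list contains unique document IDs.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0  # convention chosen here: no relevant docs means recall 0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    queries: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """Average reciprocal rank over (ranked results, relevant set) pairs."""
    scores = [reciprocal_rank(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: one query whose relevant documents are d1 and d2.
ranked_results = ["d3", "d1", "d7", "d2"]
relevant_docs = {"d1", "d2"}
p_at_2 = precision_at_k(ranked_results, relevant_docs, k=2)
r_at_4 = recall_at_k(ranked_results, relevant_docs, k=4)
mrr = mean_reciprocal_rank([(ranked_results, relevant_docs)])
```

In practice these scores would be computed over a labeled evaluation set of queries and tracked over time, so that changes to chunking, embeddings, or reranking can be compared against a baseline.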
Work is typically performed in an office environment. Accountable for satisfying all job specific obligations and complying with all organization policies and procedures. The specific statements in this profile are not intended to be all-inclusive. They represent typical elements considered necessary to successfully perform the job.
*Relevant experience may be a combination of related work experience and degree obtained (Master's Degree = 2 years).
Position Details:
Key Technologies:
Required Skills & Qualifications:
Education:
Bachelor's Degree-Related Field of Study (Required), Master's Degree-Related Field of Study (Preferred)
Experience:
Minimum of 5 years-Relevant experience* (Required)
Certification(s) and License(s):
Skills:
OUR PURPOSE & VALUES: Everything we do is about caring for our patients, our members, our students, our Geisinger family and our communities.
We offer healthcare benefits for full time and part time positions from day one, including vision, dental and domestic partners. Perhaps just as important, we encourage an atmosphere of collaboration, cooperation and collegiality.
We know that a diverse workforce with unique experiences and backgrounds makes our team stronger. Our patients, members and community come from a wide variety of backgrounds, and it takes a diverse workforce to make better health easier for all. We are proud to be an affirmative action, equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or status as a protected veteran.

Geisinger is among the nation’s leading providers of value-based care, serving 1.2 million people in urban and rural communities across Pennsylvania. Founded in 1915 by philanthropist Abigail Geisinger, the nonprofit system generates $10 billion in annual revenues across 126 care sites — including 10 hospital campuses — and Geisinger Health Plan, with more than half a million members in commercial and government plans. Geisinger College of Health Sciences educates more than 5,000 medical professionals annually and conducts more than 1,400 clinical research studies.
With 26,000 employees, including 1,700 employed physicians, Geisinger is among Pennsylvania’s largest employers with an estimated economic impact of $15 billion to the state’s economy. On March 31, 2024, Geisinger became the first member of Risant Health, a new nonprofit charitable organization created to expand and accelerate value-based care across the country.
For more information, visit geisinger.org/careers or connect with us on Facebook, Instagram, LinkedIn and Twitter.