Job Description
Software Engineer, Data Privacy & De-Identification
About the Company - Series A
A rapidly growing healthcare AI infrastructure company is building a large-scale platform that
enables the responsible development and deployment of clinical artificial intelligence solutions.
Founded by leaders from healthcare technology, research, clinical informatics, and AI, the
company is focused on helping organizations safely leverage healthcare data while maintaining
the highest standards of privacy, security, and data quality.
The organization partners with major healthcare providers across the United States to make
de-identified clinical data available for AI development and validation. Its platform supports a
broad range of healthcare applications, including machine learning, medical devices, clinical
research, and healthcare innovation.
About the Data Platform
The company's data ecosystem includes longitudinal clinical information from millions of
patients and consists of diverse healthcare data modalities, including:
- Structured clinical and administrative data
- Electronic medical records and claims-related datasets
- Clinical notes and other unstructured text
- Medical imaging data
- Pathology data
- Video and waveform data
- Continuous patient monitoring and streaming datasets
The Opportunity
As a Software Engineer focused on data privacy and de-identification, you will play a key role in
expanding and improving large-scale data processing systems that protect patient privacy while
enabling AI innovation.
Key responsibilities include:
- Designing and building scalable software systems that process and de-identify large healthcare datasets.
- Developing and executing quality assurance frameworks to validate privacy-preserving workflows.
- Deploying and optimizing data processing pipelines within cloud environments to improve reliability, efficiency, and cost-effectiveness.
- Collaborating with privacy, compliance, and clinical domain experts to define and implement de-identification requirements.
- Continuously improving operational workflows, automation, and processing performance.
Required Qualifications
Technical Skills
- 3+ years of professional software development experience using Python or a similar programming language.
- Experience across the full software development lifecycle, including design, development, testing, deployment, and maintenance.
- Familiarity with SQL and command-line scripting tools such as Bash.
- Experience working with data processing and analytics workflows.
Professional Skills
- Strong analytical thinking and problem-solving capabilities.
- Ability to design, optimize, and document operational processes.
- Experience managing quality assurance and workflow improvements.
- Strong organizational and prioritization skills.
- Collaborative mindset with the ability to work cross-functionally.
- Passion for data privacy, security, and responsible use of sensitive information.
Preferred Qualifications
- Experience working with Pandas or similar data processing frameworks.
- Cloud platform experience (AWS, Azure, or equivalent).
- Familiarity with containerization and virtualization technologies such as Docker.
- Exposure to healthcare, life sciences, or regulated data environments.
- Ability to communicate technical concepts effectively to non-technical stakeholders.
Technology Stack
The team primarily works with:
- Python
- AWS cloud infrastructure
- SQL-based data warehouses such as Snowflake and Redshift
- Large-scale data storage solutions
- Pandas and data processing frameworks
- Containerized deployment environments