Job Description
The Senior Data Engineer – Platform Foundation is a hands-on, senior-level contributor embedded in the Foundations squad. You will design, build, and evolve the shared ingestion platform that underpins data delivery across the company. The platform is the product: your job is to make it reliable, extensible, and easy for other teams to adopt.
The Foundations squad operates across three pillars: simplifying the overall data platform landscape by reducing complexity and consolidating redundant patterns; enabling structured and unstructured data ingestion at scale; and supporting the exposure of data products to consumers across the organization. You contribute to all three by making architectural decisions, writing production code, and enabling other teams through documentation and hands-on support.
Key Responsibilities:
Platform Foundation Development
- Design and implement reusable ingestion components using dlt and dbt-core that cover both structured and unstructured data sources and handle high-volume, append-heavy, and schema-drifting patterns
- Own the Airflow platform end-to-end: extend and maintain DAGs and shared operators, handle deployments and version upgrades, and provide hands-on support to consuming teams
- Ensure incremental loading strategies, data quality checks, and lineage metadata are first-class outputs of every pipeline
Platform Simplification & Architecture
- Identify and eliminate redundant ingestion patterns across consuming teams, and drive standardization onto shared Platform Foundation components
- Collaborate with Solution Architects to evolve the platform architecture in response to new data sources and shifting business requirements
- Support data product exposure: define and implement governed interfaces that make data reliably accessible to internal consumers
- Contribute to Terraform-managed infrastructure; participate in multi-cloud (AWS / Azure) deployment patterns
AI Tooling & Developer Productivity
- Actively use and evaluate AI-assisted development tools (GitHub Copilot, Claude Code, etc.) to accelerate Platform Foundation delivery
- Champion AI tooling adoption within the squad; share best practices and guardrails around AI-generated code review
- Explore AI-powered capabilities (RAG pipelines, LLM-assisted data cataloguing) for internal platform documentation and self-service enablement
DevOps & Reliability
- Maintain and improve CI/CD pipelines (TeamCity, GitHub Actions) for Platform Foundation components
- Define and enforce observability standards: DAG- and task-level alerting, SLA tracking
- Participate in on-call rotation for critical ingestion pipelines; drive post-incident improvements
Team Enablement & Stakeholder Management
- Produce Platform Foundation documentation, runbooks, and enablement materials for consuming squads
- Translate ambiguous or moving business requirements into concrete technical designs, and challenge scope when needed
- Mentor mid-level engineers; participate in hiring and technical assessments
Qualifications
Basic Qualifications:
- Bachelor's degree in Business, Information Systems, Data/Analytics, Computer Science, or related field
- Minimum 5 years in data engineering roles, with at least 2 years in a senior / platform-level position
- Proven track record building production ingestion and transformation pipelines at scale
- Experience contributing to a shared platform or internal developer tooling consumed by multiple teams
Core Technical Skills:
- Python: idiomatic, testable, production-grade code — not just scripting
- dbt-core: advanced modelling (custom materializations), testing, documentation, packages
- Apache Airflow: DAG design patterns, custom operators, dynamic task mapping, SLA management
- Cloud data platforms: comfortable with one or more major cloud warehouses (Snowflake, BigQuery, Databricks, Microsoft Fabric)
- SQL: complex analytical queries, window functions, query profiling
- Git, CI/CD: trunk-based development, automated testing gates, pipeline-as-code
AI & Modern Tooling:
- Daily user of AI coding assistants (Copilot, Claude Code or equivalent)
- Understands the limits of AI-generated code — applies rigorous review, not blind trust
- Interest in LLM-powered data tooling (RAG pipelines, Cortex, semantic layers) is a plus