Curotec

Senior Data Engineer

Curotec  •  Remote  •  3 months ago

Job Description


This is a remote position.


We are seeking a Senior Data Engineer to support the ingestion, processing, and synchronization of data across our analytics platform. This role focuses on using Python Notebooks to ingest data via APIs into Microsoft Fabric's Data Lake and Data Warehouse, with some data being synced to a Synapse Analytics database for broader reporting needs.
The ideal candidate will have hands-on experience with API-based data ingestion and modern data architectures, including implementing a Medallion architecture (Bronze, Silver, Gold) for data organization and quality management. Exposure to marketing APIs such as Google Ads, Google Business Profile, and Google Analytics 4 is a plus.

We welcome applicants globally, but we prefer LATAM-based candidates to ensure smoother collaboration with our existing team.

Key Responsibilities

  • Build and maintain Python Notebooks to ingest data from third-party APIs

  • Design and implement Medallion layer architecture (Bronze, Silver, Gold) for structured data organization and progressive data refinement

  • Store and manage data within Microsoft Fabric's Data Lake and Data Warehouse as Delta Parquet files

  • Set up data pipelines and sync key datasets to Azure Synapse Analytics

  • Develop PySpark-based data transformation processes across Bronze, Silver, and Gold layers

  • Collaborate with developers, analysts, and stakeholders to ensure data availability and accuracy

  • Monitor, test, and optimize data flows for reliability and performance

  • Document processes and contribute to best practices for data ingestion and transformation

Tech Stack You'll Use

Ingestion & Processing:

  • Python (Notebooks)

  • PySpark

Storage & Warehousing:

  • Microsoft Fabric Data Lake & Data Warehouse

  • Delta Parquet files

Sync & Reporting:

  • Azure Synapse Analytics

Cloud & Tooling:

  • Azure Data Factory, Azure DevOps


Requirements


  • Strong experience with Python for data ingestion and transformation;

  • Proficiency with PySpark for large-scale data processing;

  • Proficiency in working with RESTful APIs and handling large datasets;

  • Experience with Microsoft Fabric or similar modern data platforms;

  • Understanding of Medallion architecture (Bronze, Silver, Gold layers) and data lakehouse concepts;

  • Experience working with Delta Lake and the Parquet file format;

  • Understanding of data warehousing concepts and performance tuning;

  • Familiarity with cloud-based workflows, especially within the Azure ecosystem.


Nice to Have


  • Experience with marketing APIs such as Google Ads or Google Analytics 4;

  • Familiarity with Azure Synapse and Data Factory pipeline design;

  • Understanding of data modeling for analytics and reporting use cases;

  • Experience with AI coding tools;

  • Experience with Fivetran, Airbyte, or Rivery.

About Curotec

Curotec was founded in 2010 just outside of Philadelphia in an area known as the Philadelphia Main Line. We have a global presence with clients ranging from funded startups to Fortune 100 enterprises spanning a number of vertical industries.

Our work has won numerous awards, and global organizations have recognized us for consistently delivering exceptional business value to our clients. When it comes to digital business solutions, no challenge is too complicated for us to tackle.

Industry: IT & Software
Company Size: 51-200 employees
Headquarters: Newtown Square, Pennsylvania
Year Founded: 2010