Job Description

Data Architect — Databricks

Data Engineering & Pipelines | Mid-Level | Full-Time

Experience

5 – 8 Years

Level

Mid-Level

Employment Type

Full-Time

Location

Pune - Hybrid

Primary Stack

Databricks, Apache Spark, Delta Lake, SQL

Domain

Data Engineering & Pipelines

About the Role

We are looking for a hands-on Data Architect with deep expertise in Databricks to design, build, and optimise enterprise-scale data platforms. You will own the end-to-end data engineering lifecycle, from ingestion and transformation to serving, while ensuring reliability, scalability, and governance across our lakehouse architecture.

You will collaborate closely with data engineers, analytics engineers, and product teams to translate business requirements into robust, reusable data solutions on the Databricks Lakehouse Platform.

Key Responsibilities

Data Architecture & Design

• Design and maintain the organisation's lakehouse architecture using Databricks and Delta Lake.

• Define data modelling standards (dimensional, Data Vault 2.0, or medallion architecture) across Bronze, Silver, and Gold layers.

• Architect scalable ingestion frameworks for structured and unstructured data sources (Kafka, JDBC, REST APIs, cloud storage).

• Own the schema evolution strategy and ensure backward compatibility across data assets.

Pipeline Development & Optimisation

• Build and maintain production-grade ETL/ELT pipelines using PySpark, Spark SQL, and Databricks Workflows.

• Implement Delta Live Tables (DLT) for declarative, auto-scaling pipeline development.

• Optimise Spark jobs for performance through partitioning, Z-ordering, caching, and cluster right-sizing.

• Establish CI/CD practices for data pipelines using tools such as GitHub Actions, Azure DevOps, or Databricks Asset Bundles.

Data Governance & Quality

• Implement Unity Catalog for data discovery, lineage tracking, fine-grained access control, and compliance.

• Define and enforce data quality rules using Great Expectations, DLT expectations, or equivalent frameworks.

• Work with data governance teams to document metadata, the business glossary, and data contracts.

Platform & Infrastructure

• Manage Databricks workspace configuration: clusters, pools, secrets, and access policies.

• Collaborate with cloud and DevOps teams on infrastructure-as-code (Terraform) for Databricks on Azure / AWS / GCP.

• Monitor platform health, SLAs, and cost using Databricks system tables and cloud-native monitoring tools.

Collaboration & Mentorship

• Partner with data consumers (analysts, data scientists, ML engineers) to define SLAs and publish clean, well-documented data products.

• Review code and provide architectural guidance to junior engineers.

• Contribute to and champion internal data engineering best practices, runbooks, and documentation.

Required Skills & Experience

Core Databricks & Spark

• 4+ years of hands-on experience with Databricks (Unified Data Analytics Platform).

• Strong proficiency in PySpark and Spark SQL for large-scale data transformation.

• Deep knowledge of Delta Lake: ACID transactions, time travel, OPTIMIZE, VACUUM.

• Experience with Databricks Workflows, Jobs, and Delta Live Tables (DLT).

• Familiarity with Unity Catalog and Databricks governance features.

Data Engineering Fundamentals

• Solid understanding of data modelling paradigms: dimensional modelling, Data Vault, or medallion architecture.

• Experience designing and operating streaming pipelines (Structured Streaming, Kafka, Event Hubs, or Kinesis).

• Proficiency in SQL; experience with dbt is a strong plus.

• Hands-on experience with cloud platforms: Azure (ADLS, ADF), AWS (S3, Glue), or GCP (BigQuery, GCS).

Software Engineering Practices

• Version control with Git; experience with branching strategies and code review workflows.

• Ability to write testable, modular pipeline code with unit and integration tests.

• Familiarity with CI/CD pipelines and infrastructure-as-code (Terraform preferred).

Nice to Have

• Databricks Certified Data Engineer Associate or Professional certification.

• Experience with data mesh or data product frameworks.

• Exposure to ML pipelines, MLflow, or Feature Store on Databricks.

• Knowledge of data cataloguing tools (Alation, Collibra, or Databricks Unity Catalog).

• Experience with Apache Iceberg or Apache Hudi as alternative table formats.

• Familiarity with real-time analytics or OLAP systems (Druid, ClickHouse, Redshift).

What We Offer

• Competitive salary with performance-linked bonus.

• Flexible / hybrid working arrangements.

• Access to Databricks training and certification budget.

• Collaborative, engineering-first data culture with modern tooling.

• Clear career progression path to Senior Data Architect or Data Platform Lead.

• Comprehensive health, wellness, and retirement benefits.

About nCircle Tech

Since 2012, nCircle Tech has empowered passionate innovators in the AEC and Manufacturing industry to create impactful 3D engineering & construction solutions. Leveraging our domain expertise in CAD-BIM, we provide disruptive solutions that reduce time to market and meet business goals. Our team of dedicated engineers, partner ecosystem and industry veterans are on a mission to redefine how you design, collaborate and visualize.

nCircle is a one-stop 2D/3D product development studio for AEC and Manufacturing.

With 13+ years of experience, we have worked with 100+ software companies in the construction and manufacturing sectors across the globe, successfully delivering 300+ solutions, with a strong presence in the US, UK, Japan and India.

We offer:

- CAD-BIM plugin development & software customisation: design automation, workflow creation, and integration into your business systems with CAD & BIM design packages.

- Building 3D applications (cloud, mobile, AR/VR, point cloud viewer, digital twin, 4D scheduling) for the construction and engineering industry using Forge, ThreeJS, HOOPS Communicator, ODA, etc.

- Integration with BIM360, Procore, Egnyte, BOX, Forge, Oracle Aconex, Primavera P6, Dropbox, and other construction management systems.

- ML-based solutions: ML-powered Scan to BIM, ML-powered cost estimation and quantity takeoff, ML-powered OCR, face recognition apps, etc.

Industry

IT & Software

Company Size

201-500 employees

Headquarters

Pune, IN

Year Founded

2012

Website

ncircletech.com
