Job Description

We are tech transformation specialists, uniting human expertise with AI to create scalable tech solutions. With over 8,000 CI&Ters around the world, we’ve built partnerships with more than 1,000 clients during our 30 years of history. Artificial Intelligence is our reality.

CI&T is expanding its data development capabilities to support a greenfield platform initiative for a leading client in the agribusiness industry. This new product is being built from the ground up to deliver AI-powered agronomic analysis with georeferenced map visualizations, and the quality of its data foundation will determine everything that follows.

This role sits at the core of that foundation. As a Senior Data Developer, you will work alongside the client's technical leadership to architect and build the data ecosystem that will power intelligent agronomic insights. Your work will directly enable AI applications and geospatial visualizations to function on reliable, well-structured data — making this position both technically demanding and strategically critical. If you thrive in ambiguous, high-ownership environments where you shape the data architecture rather than inherit it, this is your role.

Responsibilities

Design and build end-to-end data pipelines across the RAW, Silver, and Gold layers of the Medallion Architecture, ensuring reliability, performance, and maintainability at each stage
Architect data ingestion, transformation, standardization, and serving processes, structuring data flows from diverse and heterogeneous sources into a coherent analytical foundation
Model data for analytical consumption following Data Warehouse best practices, including Star Schema design and dimensional modeling suited for business intelligence and AI-readiness
Identify, evaluate, and consolidate new data sources relevant to agronomic business objectives, proactively engaging stakeholders to understand, obtain, and validate data availability and quality
Interact with business stakeholders and client leadership to translate domain requirements into data architecture decisions, challenging assumptions and proposing solutions grounded in technical evidence
Manipulate, optimize, and serve data in multiple formats — including Parquet, CSV, and geospatial datasets — tailored to the consumption needs of downstream AI applications and map-based visualizations
Manage and configure cloud infrastructure end-to-end, including storage, compute, access control, serverless functions, data cataloging, and event-driven processing on AWS
Own deployment and CI/CD practices for data pipelines — including repository management, branching strategy, test gates, and automated deploy workflows via GitLab
Support the creation of the data layer that will feed AI/ML applications, ensuring data quality, structure, and availability meet the requirements of machine learning workflows — without directly developing the models themselves
Operate as a proactive technical partner in a greenfield environment: question, propose, experiment, and iterate with the team rather than execute in isolation

Requirements

English proficiency at B2 level or above — ability to explain technical flows, engage in discussions, ask clarifying questions, and collaborate effectively with international stakeholders (accent is not a barrier; communication clarity is)
Solid hands-on experience with AWS, covering the full infrastructure spectrum: S3, IAM (permissions and security configuration), Redshift, Lambda (serverless use cases), and Glue (including Glue Catalog for metadata management); ability to evaluate trade-offs between services for different pipeline scenarios
Experience with Terraform or equivalent Infrastructure-as-Code (IaC) tooling, applied regularly in real data engineering projects, not just theoretical knowledge
Proficiency with GitLab for source control, CI/CD pipeline configuration, deployment workflows, and test gate management — specifically GitLab, not just generic Git experience
Strong proficiency in SQL, including complex query writing, analytical transformations, and performance tuning for data warehouse environments
Strong proficiency in PySpark, applied to large-scale distributed data processing — including partitioning strategies (e.g., by day/month/year), volume handling (tens to hundreds of GB), and performance optimization
Experience with Databricks, used in the context of data engineering pipelines and lakehouse architectures, including migration and deployment scenarios
Analytical data modeling expertise, with solid knowledge of Star Schema and dimensional modeling applied to data warehousing and business intelligence environments
Hands-on experience with the Medallion Architecture (RAW / Silver / Gold layers), including manipulation and optimization of Parquet and CSV files
Experience integrating and consolidating data from multiple heterogeneous sources, ensuring consistency, traceability, and analytical readiness
Mindset suited for greenfield projects: proactive, solution-oriented, comfortable with ambiguity, and able to contribute to architectural decisions rather than just execute predefined tasks

Nice to Have

Familiarity with SnapLogic or equivalent low-code/no-code ETL orchestration platforms (e.g., Pentaho, Airflow, Alteryx) — SnapLogic is the current standard at the client, with migration underway; hands-on experience with block/flow-based ETL logic is a differentiator
Experience with geospatial data processing and analytical environments focused on map-based and geographic visualization
Knowledge of DuckDB for in-process analytical queries
Background in data projects applied to agribusiness or precision agriculture
Exposure to predictive modeling workflows (e.g., gradient boosting, ensemble methods, or similar) — as a data provider to ML pipelines, not as a model developer

Our benefits:
- Health and dental insurance
- Meal and food allowance
- Childcare assistance
- Extended paternity leave
- Partnership with gyms and health and wellness professionals via Wellhub (Gympass) and TotalPass
- Profit Sharing and Results Participation (PLR)
- Life insurance
- Continuous learning platform (CI&T University)
- Discount club
- Free online platform dedicated to physical, mental, and overall well-being
- Pregnancy and responsible parenting course
- Partnerships with online learning platforms
- Language learning platform
And many more!
More details about our benefits here: https://ciandt.com/br/pt-br/carreiras
At CI&T, inclusion starts at the first contact. If you are a person with a disability, it is important to present your assessment during the selection process. See which data needs to be included in the report by clicking here. This way, we can ensure the support and accommodations that you deserve. If you do not yet have the assessment, don't worry: we can support you in obtaining it.
We have a dedicated Health and Well-being team, inclusion specialists, and affinity groups who will be with you at every stage. Count on us to make this journey side by side.

About CI&T

We are your global partner in tech-integrated business solutions, bringing deep business understanding together with technology and AI to help leaders navigate change with clarity and measurable impact. With teams around the world and decades of transformation experience, we work side by side with clients to solve complexity and create meaningful, lasting impact.

Industry

IT & Software

Company Size

5,001-10,000 employees

Headquarters

New York, NY

Year Founded

1995

Website

ciandt.com

[Job - 29349] Senior Data Developer (AWS), Brazil