Data Engineer-(CREQ257234)
Data Engineer and Generative AI/ML Specialist
Skill Cluster/Practice: Data Engineering / Emerging Technologies (GenAI)
We are seeking an innovative and results-driven Data Engineer and GenAI/ML Specialist with expertise in large-scale data engineering, LLM-based automation, and the implementation of intelligent systems. The ideal candidate will leverage proven experience to develop robust, compliant, and scalable data solutions, contributing to both data infrastructure and advanced AI applications.
Data Pipeline Architecture: Design, build, and maintain large-scale, high-volume real-time data ingestion and streaming pipelines using technologies like Apache Spark/PySpark, Python, and Kafka on cloud platforms like Databricks, GCP, and AWS.
Generative AI and Machine Learning: Lead the development and deployment of LLM-based automation, Retrieval-Augmented Generation (RAG) pipelines, and AI agent systems (using tools like OpenAI/HuggingFace).
ML/DL Model Productionization: Apply and productionize Machine Learning and Deep Learning models, including building classification models, predictive models, time-series forecasting, and sentiment analysis solutions using frameworks like PyTorch, TensorFlow, and scikit-learn.
Data Modeling and Warehousing: Migrate and re-architect data models to build and maintain data lakes and data warehouses (e.g., Snowflake), streamlining ETL/ELT workflows using tools like dbt and Apache Airflow.
Data Quality and Compliance: Ensure data ingestion and processing workflows adhere to data quality standards, compliance, and data privacy requirements.
Performance Optimization: Optimize SQL and Spark workloads to improve query performance and reduce compute costs.
Required Technical Skills and Experience
Hands-on experience in Data Engineering, Machine Learning, or AI Solution Engineering.
Core Proficiency: Deep proficiency in Python and SQL for data processing, scripting, and optimization.
Big Data & Cloud: Expertise in distributed computing (e.g., Apache Spark/PySpark, Databricks), real-time streaming technologies (e.g., Kafka), and cloud platforms (AWS, GCP, Snowflake).
CA-ON-Toronto
Full Time
Individual Contributor
Experienced
No
18/05/2026, 5:06:33 PM

Virtusa is a global product and platform engineering services company that makes experiences better with technology. We help organizations grow faster, more profitably, and more sustainably by reimagining enterprises through domain-driven solutions. We combine strategy, design, and engineering, backed by unmatched expertise at the intersection of industry, business, and technology, to generate real-world business impact for clients.
Headquartered in Massachusetts with global delivery centers, Virtusa provides a broad range of services, solutions, and assets, including strategy and design, AI advisory and services, digital engineering, data and analytics, digital assurance, cloud and security, CX transformation, and managed services, across industries such as financial services, healthcare, communications, media, entertainment, travel, manufacturing, and technology.