
Job Summary
We are seeking a hardcore, hands-on AI Data Engineer to
build the high-performance data infrastructure required to power autonomous
AI agents. You won't just be moving data from A to B; you will be
architecting Dynamic Context Windows, managing Real-time Semantic Indexes,
and building Self-Cleaning Data Pipelines that feed our "Super
Employee" agents.
Job Responsibilities
· Vector & Graph ETL: Design and maintain pipelines that transform unstructured data (PDFs, emails, logs, chats) into optimized embeddings for Vector Databases (Pinecone, Weaviate, Milvus).
· Semantic Data Modeling: Engineer data structures that optimize for Retrieval-Augmented Generation (RAG), ensuring agents find the "needle in the haystack" in milliseconds.
· Knowledge Graph Construction: Build and scale Knowledge Graphs (Neo4j) to represent complex relationships in our trading and support data that standard vector search misses.
· Automated Data Labeling & Synthetic Data: Implement pipelines using LLMs to auto-label datasets or generate synthetic edge cases for agent training and evaluation.
· Stream Processing for Agents: Build real-time data "listeners" (Kafka/Flink) that feed live context to agents, allowing them to react to market or support events as they happen.
· Data Reliability & "Drift" Detection: Build monitoring for "Embedding Drift", identifying when the statistical distribution of your data changes and the agent's "knowledge" becomes stale.
Essential Skills
· Vector Database Mastery: Expert-level configuration of HNSW indexes, scalar quantization, and metadata filtering strategies within Pinecone, Milvus, or Qdrant.
· Advanced Python & Rust: Proficiency in Python for AI logic and Rust (or C++) for high-performance data processing and custom embedding functions.
· Big Data Ecosystem: Hands-on experience with Apache Spark, Flink, and Kafka in a high-throughput environment (Trading/FinTech preferred).
· LLM Data Tooling: Deep experience with Unstructured.io, LlamaIndex, or LangChain for document parsing and chunking strategy optimization.
· MLOps & DataOps: Mastery of DVC (Data Version Control) and Airflow/Prefect for managing complex, non-linear AI data workflows.
· Embedding Models: Understanding of how to fine-tune embedding models (e.g., BGE, Cohere, or OpenAI) to better represent domain-specific (Trading) terminology.
Additional qualifications:
· Chunking Strategy Architect: You don't just "split text." You implement Semantic Chunking and Parent-Child retrieval strategies to maximize LLM context relevance.
· Cold/Warm/Hot Storage Strategy: Managing cost and latency by tiering data between Vector DBs (Hot), SQL/NoSQL (Warm), and S3/Data Lakes (Cold).
· Privacy & Redaction Pipelines: Building automated PII (Personally Identifiable Information) redaction into the ingestion layer to ensure agents never "see" or "leak" sensitive user data.
Background Check required
No criminal record
Others
Work mode: Hybrid (3 days per week in the office)
Office location: Rai Durg, Hyderabad
Interview rounds: 3-4

Predictable outcome through innovations, based upon mutual trust!!
With the emergence of new technologies, many organizations face significant challenges such as increased stakeholder expectations, static or reduced budgets, and the need to do more with less. This has led many of them to turn to IT to enable their future strategies, and this is where we come in.
We support your business in getting value from its investment in information technology. As a cloud service provider, we bring a wealth of experience from a team of professionals who understand technology, particularly Microsoft cloud platforms, and the positive impact it can have on business outcomes, helping businesses meet their objectives, succeed, and grow.
Every day, our consultants work with customers across different time zones, countries, cultures, and languages to deliver innovative solutions for their businesses and to build a trustworthy relationship as their managed service provider.
We operate from KOLKATA, INDIA & LONDON, UNITED KINGDOM…
Innovatively Yours!!