Develop and support AI-driven applications focused on knowledge base ingestion, including extracting, processing, and storing structured and unstructured data (e.g., Excel files, PDFs, images, and database records).
Build and maintain data pipelines for machine learning models, ensuring data quality, consistency, and efficient processing.
Implement document parsing, text preprocessing, and chunking strategies to prepare data for retrieval-augmented generation (RAG) systems.
Design and support workflows for retrieving relevant data and injecting context into prompts for AI applications.
Assist in developing and integrating backend services using Python frameworks such as Django.
Collaborate with cross-functional teams to test, deploy, and improve AI models and pipelines.
Support containerized deployments and environments using Docker and Kubernetes.
Contribute to continuous improvement of AI systems, including reducing hallucinations and improving response accuracy.
Requirements
Qualifications:
Bachelor’s degree in Computer Science, Information Technology, Data Science, or a related field.
Up to 2 years (Junior) or at least 3 years (Mid-level) of experience in AI/ML, data engineering, or backend development, with exposure to production-level AI systems.
Strong understanding of AI/ML concepts, including embeddings, vector databases, and RAG architecture design.
Proficiency in Python and relevant libraries such as pandas, NumPy, openpyxl, requests, and PyPDF.
Hands-on experience building and optimizing data pipelines for structured and unstructured data.
Strong experience with data preprocessing, document parsing, chunking strategies, and normalization techniques.
Experience working with vector databases and similarity search mechanisms.
Familiarity with Docker and Kubernetes for containerized deployment and scaling.
Experience with the Dify platform or similar AI workflow/orchestration tools is an advantage.
Strong understanding of prompt engineering, context injection, and techniques to minimize hallucinations in AI systems.
Experience with API development and system integration using frameworks such as Django or FastAPI.
Familiarity with CI/CD pipelines, version control (Git), and Agile methodologies.
Strong problem-solving, analytical, and communication skills, with the ability to work with cross-functional teams.