Tookitaki

Site Reliability Engineer

Tookitaki  •  Bengaluru, IN (Onsite)  •  27 days ago

Job Description

Job Title: Site Reliability Engineer (SRE)
Department: Technology
Location: Bangalore
Reporting To: Head of Infra

Tookitaki is looking for a Site Reliability Engineer (SRE) with 3–6 years of experience to help maintain and scale the infrastructure that powers our flagship products—FinCense and the AFC Ecosystem. As an SRE, you will work at the intersection of software engineering and infrastructure, ensuring high availability, performance, and scalability of our platforms.

You will collaborate with engineering, DevOps, and client success teams to operationalize deployments across on-premise, VPC, and Compliance as a Service (CaaS) environments while improving monitoring, automation, and incident response.

Position Purpose

The SRE role is responsible for ensuring the reliability and efficiency of Tookitaki’s production systems and environments. This includes building monitoring systems, improving deployment pipelines, automating routine operations, and responding to production incidents. You’ll help build a resilient infrastructure that supports our mission to provide AI-driven solutions that prevent financial crime.

Key Responsibilities

  • System Monitoring & Incident Management

    • Build and maintain monitoring, alerting, and logging systems using tools like Prometheus, Grafana, and ELK.

    • Respond to incidents and outages, conduct post-mortems, and implement corrective actions.

  • Infrastructure & Deployment Automation

    • Automate infrastructure provisioning and application deployment using Terraform, Ansible, or Helm.

    • Contribute to CI/CD pipelines (GitLab CI, Jenkins, etc.) to improve the reliability and speed of software delivery.

  • Container & Orchestration Management

    • Manage and troubleshoot Docker containers and Kubernetes clusters, ensuring workload scaling, resource management, and health.

    • Support application updates, rollbacks, and blue-green or canary deployments.

  • Cloud & Platform Operations

    • Operate within AWS (preferred; EC2, S3, VPC, IAM) or GCP environments.

    • Monitor system availability and resource usage across environments.

  • Security & Reliability Enhancements

    • Implement and monitor TLS/SSL, RBAC, SSO, and secure API practices.

    • Support compliance and security audit activities by maintaining logs, access controls, and operational hygiene.

  • Collaboration & Documentation

    • Work closely with developers, infra engineers, and support teams to ensure production readiness.

    • Maintain playbooks, runbooks, and system documentation for reliability engineering activities.

Qualifications and Skills

    Education

    • Bachelor’s degree in Computer Science, Engineering, or related technical field.

    Experience

    • 3–6 years in Site Reliability Engineering, DevOps, Platform Engineering, or a related role.

    • Experience with production environments and live system debugging.

    Technical Skills

    • Kubernetes, Docker, Helm – experience deploying and scaling services.

    • Linux administration and command-line debugging.

    • Hands-on experience with AWS (preferred) or GCP cloud platforms.

    • Scripting in Bash and Python for automation and monitoring tasks.

    • Experience with monitoring and alerting tools like Prometheus, Grafana, ELK, or Datadog.

    • Familiarity with databases (e.g., MariaDB, ScyllaDB) and SQL/CQL querying.

    Soft Skills

    • Strong problem-solving and debugging skills.

    • Ability to work in on-call rotations and high-pressure production environments.

    • Excellent communication and documentation abilities.

Key Competencies

    • Operational Reliability: Ensures system uptime and performance through proactive monitoring and maintenance.

    • Automation Mindset: Reduces manual effort through scripting and tooling.

    • Incident Response: Identifies and resolves issues quickly to minimize downtime.

    • Cross-Functional Collaboration: Works effectively with engineering, support, and infra teams.

    • Security Awareness: Applies best practices in infrastructure and platform security.

Success Metrics

    • Maintain 99.9%+ uptime across production environments.

    • Reduce mean time to detect (MTTD) and mean time to resolve (MTTR) for critical incidents.

    • Increase automation coverage and reduce manual deployment steps.

    • High internal satisfaction from developers on CI/CD and platform reliability.

    • Compliance readiness and security log availability for audits.

Benefits

    • Competitive compensation

    • Work on a globally recognized RegTech platform transforming financial crime prevention.

    • Exposure to cutting-edge AI and big data infrastructure (Spark, Kafka, ScyllaDB, Flink).

About Tookitaki

    Tookitaki is transforming financial services by building a robust trust layer that focuses on two crucial pillars: preventing fraud to build consumer trust and combating money laundering to secure institutional trust. Our trust layer leverages collaborative intelligence and a federated AI approach, delivering powerful, AI-driven solutions for real-time fraud detection and AML (Anti-Money Laundering) compliance.

    How We Build Trust: Our Unique Value Propositions

    AFC Ecosystem – Community-Driven Financial Crime Protection

    The Anti-Financial Crime (AFC) Ecosystem is a community-driven platform that continuously updates financial crime patterns with real-time intelligence from industry experts. This enables our clients to stay ahead of the latest money laundering and fraud tactics.

    FinCense – End-to-End Compliance Platform

    Our FinCense platform is a comprehensive compliance solution that covers all aspects of AML and fraud prevention—from name screening and customer due diligence (CDD) to transaction monitoring and fraud detection. This ensures financial institutions not only meet regulatory requirements but also mitigate risks of non-compliance, providing the peace of mind they need as they scale.

    Industry Recognition and Global Impact

    Tookitaki’s innovative approach has been recognized by some of the leading financial entities in Asia, including Tencent, GXS, Maya, Aeon, UOB, and Fubon. We have also earned accolades from key industry bodies such as FATF and received prestigious awards like the World Economic Forum Technology Pioneer, Forbes Asia 100 to Watch, and Chartis RiskTech100.

    Serving some of the world’s most prominent banks and fintech companies, Tookitaki is continuously redefining the standards of financial crime detection and prevention, creating a safer and more trustworthy financial ecosystem for everyone.

    Industry: IT & Software
    Company Size: 51-200 employees
    Headquarters: Singapore, SG
    Year Founded: 2015