ThetaRay

Data Engineer

ThetaRay  •  Madrid, ES (Onsite)  •  5 months ago

Job Description

About ThetaRay:

ThetaRay is a trailblazer in AI-powered Anti-Money Laundering (AML) solutions, offering cutting-edge technology to fintechs, banks, and regulatory bodies worldwide. Our mission is to enhance trust in financial transactions, ensuring compliant and innovative business growth.

Our technology empowers customers to expand into new markets and introduce groundbreaking products.

Why Join ThetaRay?

At ThetaRay, you'll be part of a dynamic global team committed to redefining the financial services sector through technological innovation. You will contribute to creating safer financial environments and have the opportunity to work with some of the brightest minds in AI, ML, and financial technology. We offer a collaborative, inclusive, and forward-thinking work environment where your ideas and contributions are valued and encouraged.

Join us in our mission to revolutionize the financial world, making it safer and more trustworthy for millions worldwide. Explore exciting career opportunities at ThetaRay – where innovation meets purpose.

We are looking for a Data Engineer to join our growing team of data experts. As a Data Engineer, you will be responsible for designing, implementing, and optimizing data pipeline flows within the ThetaRay system. You will support our data scientists by implementing the data flows their feature designs call for, and you will construct complex rules to detect money laundering activity.

The ideal candidate has experience in building data pipelines and data transformations and enjoys optimizing data flows and building them from the ground up. They must be self-directed and comfortable supporting multiple production implementations for various use cases.

Responsibilities

  • Implement and maintain data pipeline flows in production within the ThetaRay system based on the data scientists' designs
  • Design and implement solution-based data flows for specific use cases, enabling the applicability of implementations within the ThetaRay product
  • Build a machine learning data pipeline
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader
  • Work with product, R&D, data, and analytics experts to strive for greater functionality in our systems
  • Train customer data scientists and engineers to maintain and amend data pipelines within the product
  • Travel to customer locations both domestically and abroad
  • Build and manage technical relationships with customers and partners

Requirements

  • 2+ years of hands-on experience working with Apache Spark (a must)
  • Hands-on experience with SQL
  • Hands-on experience with version-control tools such as Git
  • Hands-on experience with the Apache Hadoop ecosystem, including Hive, Impala, Hue, HDFS, Sqoop, etc.
  • Experience with Python (Pandas)
  • Experience with PySpark/Scala/Java/R
  • Hands-on experience with data transformation, validations, cleansing, and ML feature engineering
  • BSc degree or higher in Computer Science, Statistics, Informatics, Information Systems, Engineering, or another quantitative field
  • Experience working with and optimizing big data pipelines, architectures, and data sets - an advantage
  • Strong analytic skills related to working with structured and semi-structured datasets
  • Experience building processes that support data transformation, data structures, metadata, dependency, and workload management
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
  • Business-oriented and able to work with external customers and cross-functional teams
  • Fluent in English and Spanish, both written and spoken

Nice to have

  • Experience with Linux
  • Experience building machine learning pipelines
  • Experience with Elasticsearch
  • Experience with Zeppelin/Jupyter
  • Experience with workflow automation platforms such as Jenkins or Apache Airflow
  • Experience with microservices architecture components, including Docker and Kubernetes

About ThetaRay

ThetaRay is transforming the way financial institutions detect and combat financial crime through the power of Cognitive AI.

In a world where traditional, rule-based systems have failed to distinguish between good and bad actors, and where financial crime is increasingly pervasive, our solutions provide the much-needed remedy. We empower financial institutions to confidently grow their businesses, build trust with customers and regulators, and minimize their risk of financial crime.

The financial industry has been blinded by outdated, rule-based systems that created bottlenecks and increased compliance costs. These systems incorrectly flagged legitimate customers while missing the majority of financial crime, stifling growth and delivering poor customer service.

Our Cognitive AI solutions, driven by dynamic data analysis and machine learning, focus not only on identifying criminals but also on accurately recognizing the good customers and businesses that you rely on to thrive.

By leveraging the latest advancements in AI, we can detect the undetectable, uncover hidden connections, and make smart, data-driven decisions. This enables you to onboard customers confidently, reduce operational costs, improve customer experiences, and unlock opportunities worldwide.

ThetaRay is redefining success in the financial industry. Our Cognitive AI empowers you to thrive while ensuring financial crime is detected and blocked with unmatched accuracy. We believe the key to global financial prosperity and trust lies in identifying legitimate players while making it impossible for criminals to hide.

ThetaRay is the future of financial crime compliance, creating a safer, more efficient financial world.

Industry: IT & Software
Company Size: 201-500 employees
Headquarters: New York
Year Founded: 2013