Job Description

Must-Have

Strong proficiency in Python programming.
Hands-on experience with PySpark and Apache Spark.
Knowledge of Big Data technologies (Hadoop, Hive, Kafka, etc.).
Experience with SQL and relational/non-relational databases.
Familiarity with distributed computing and parallel processing.
Understanding of data engineering best practices.
Experience with REST APIs, JSON/XML, and data serialization.
Exposure to cloud computing environments.

5+ years of experience in Python and PySpark development.
Experience with data warehousing and data lakes.
Knowledge of machine learning libraries (e.g., MLlib) is a plus.
Strong problem-solving and debugging skills.
Excellent communication and collaboration abilities.

Requirements

Develop and maintain scalable data pipelines using Python and PySpark.
Design and implement ETL (Extract, Transform, Load) processes.
Optimize and troubleshoot existing PySpark applications for performance.
Collaborate with cross-functional teams to understand data requirements.
Write clean, efficient, and well-documented code.
Conduct code reviews and participate in design discussions.
Ensure data integrity and quality across the data lifecycle.
Integrate with cloud platforms such as AWS, Azure, or GCP.

Implement data storage solutions and manage large-scale datasets.

About GSB Solutions

We offer a partnership between human capital and technology, acting as an extension of your business that can work both inside and outside it, with the aim of delivering quality and efficiency in line with the most recognized standards in the market.

We currently have a presence throughout America, providing services focused on technology and telecommunications, always combined with the human vision and professionalism that characterize us.

Industry

Unknown

Company Size

201-500 employees

Headquarters

México, MX

Year Founded

2009

Website

gsb.lat
