Nationality: Any Nationality
Gender: Not Mentioned
Vacancy: 1 Vacancy
Job Description
Roles & Responsibilities
What you will be doing
- Design and develop pipelines using Python, PySpark, and SQL
- Use GitLab as the version control system
- Use S3 buckets to store large volumes of raw and processed data
- Implement and manage complex data workflows using Apache Airflow (MWAA) to orchestrate tasks
- Use Apache Iceberg (or similar) to manage and organize data in the data lake
- Create and maintain data catalogs using AWS Glue Catalog to organize metadata
- Use AWS Athena for interactive querying
- Apply data modeling techniques to support analytics and reporting requirements, with knowledge of the data journey stages within a data lake (Medallion Architecture)
What we are looking for
- Ideally, a degree in Information Technology, Computer Science, or a related field
- Ideally, 5+ years of experience in the Data Engineering landscape
- Strong expertise in Python, PySpark, SQL, and the overall AWS data ecosystem
- Strong problem-solving and analytical skills
- Ability to explain technical concepts to non-technical users
- Proficiency with GitHub
- Experience with Terraform and CI/CD pipelines is a nice-to-have
The following skills will be valued:
- Experience modeling databases for analytical consumption (Star Schema, Snowflake Schema, Data Vault)
- Experience working in a Scrum/Agile environment
Company Industry
- Internet
- E-commerce
- Dotcom
Department / Functional Area
- IT Software
Keywords
- Data Engineer