Data Engineer
Responsible for ensuring high-quality access to and management of data sources, guaranteeing data quality through cataloging, normalization, and qualification to support analytics teams. Collaborates on defining data governance policies and structuring the data lifecycle in compliance with regulations. Oversees the integration and monitoring of data from various application systems and Big Data/IoT platforms, ensuring data quality in the Data Lake through validation and duplicate removal.
Key responsibilities
- Ensure high-quality access to data sources
- Maintain and guarantee data quality through cataloging, normalization, and qualification to facilitate use by analytics teams
- Capture, integrate, and consolidate structured and unstructured data from diverse systems into the Data Lake
- Structure, map, clean (duplicate removal), and validate data sets
- Collaborate on establishing data governance policies and managing the data lifecycle in compliance with regulations
- Oversee and monitor data pipelines on Big Data and IoT platforms
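The cleaning and validation responsibilities above can be sketched in Python. The record schema (`source_id`, `value`) and the validation rule are assumptions chosen for illustration, not part of the role definition:

```python
from dataclasses import dataclass


# Hypothetical record schema, assumed for illustration only.
@dataclass(frozen=True)
class Record:
    source_id: str  # key used for duplicate removal
    value: float


def clean(records):
    """Deduplicate by source_id and drop records failing a simple validation rule."""
    seen = set()
    out = []
    for r in records:
        if r.source_id in seen:
            continue  # duplicate removal: keep first occurrence only
        if r.value < 0:
            continue  # validation: reject negative values (assumed rule)
        seen.add(r.source_id)
        out.append(r)
    return out
```

In practice this logic would typically run at scale inside the Data Lake ingestion pipeline (e.g. as a Spark job) rather than in plain Python, but the per-record steps are the same.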
Skills and competencies
- Agile methodology
- Multi-tasking
- Analytical thinking
- Organizational skills
Qualifications
- At least 5 years of experience with Agile methodology
- At least 5 years of experience in Big Data architecture and technologies
- At least 5 years of experience with Python
- At least 5 years of experience with SQL
- At least 5 years of experience developing and consuming APIs
- At least 3 years of experience working on multiple topics simultaneously
- At least 3 years of experience performing analysis and synthesis work
- At least 3 years of experience demonstrating organizational rigor
- At least 3 years of experience with Spark
- At least 3 years of experience with AWS
- At least 3 years of experience with Databricks
- At least 3 years of experience with Shell scripting