What You Will Do
The Data Engineering Lead/Specialist will design, build, and maintain robust data infrastructure, primarily on the Databricks Lakehouse Platform, to support Data Science and MarTech operations. This role is responsible for ensuring that data is reliable and high-quality and that it flows efficiently from source systems (CRM, MAP, CDP, web logs) into the Lakehouse architecture for analysis, modeling, and real-time activation.
Key Responsibilities
- Lakehouse Architecture: Design, implement, and optimize the data architecture within Databricks, leveraging Delta Lake for reliability and performance across the data ingestion and consumption layers.
- Data Pipeline Development: Design and implement scalable ETL/ELT data pipelines (batch and streaming) using tools like Databricks notebooks (PySpark/Scala) and related cloud orchestration services (e.g., Azure Data Factory, AWS Glue/Step Functions) to ingest, transform, and load high-volume marketing data (a minimal pipeline sketch follows this list).
- Databricks Optimization: Tune and optimize Databricks clusters and jobs for cost-efficiency and performance, specifically supporting high-volume MarTech data processing and large-scale ML model training (in collaboration with Data Science).
- MarTech Integration: Own the technical integration and data flow between core MarTech systems (CDP, CRM, Marketing Automation) and the central Lakehouse platform, ensuring data is ready for activation.
- Data Quality & Governance: Establish and enforce data quality checks, monitoring, and lineage practices within the Databricks environment to ensure data accuracy, consistency, and compliance.
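For illustration only, here is a minimal sketch of the kind of batch pipeline this role would own: reading raw CRM exports, applying a simple quality gate, and writing a Delta table for downstream consumption. The landing path, column names, and table name are hypothetical and not taken from this posting.

```python
# Minimal batch ETL sketch (PySpark on Databricks): raw CRM exports -> cleaned Delta table.
# All paths, columns, and table names below are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # already provided in a Databricks notebook

raw = spark.read.format("json").load("/mnt/raw/crm/contacts/")  # hypothetical landing zone

cleaned = (
    raw.withColumn("email", F.lower(F.trim(F.col("email"))))
       .dropDuplicates(["contact_id"])                  # hypothetical business key
       .filter(F.col("email").isNotNull())              # basic quality gate: reject rows without an email
       .withColumn("ingested_at", F.current_timestamp())
)

(
    cleaned.write.format("delta")
    .mode("overwrite")
    .saveAsTable("martech.silver_contacts")             # hypothetical Lakehouse table
)
```

The same pattern extends to streaming ingestion (Structured Streaming with Delta sinks) and to stricter quality enforcement, for example via expectations in Delta Live Tables.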
Required Skills & Qualifications
- Extensive experience in data engineering, software engineering, or a related field.
- Bachelor's degree in Systems Engineering, Computer Engineering, or Mathematics.
- Expert proficiency in SQL and in PySpark or Scala for building complex data pipelines.
- Mandatory hands-on expertise with the Databricks Lakehouse Platform, including Delta Lake.
- Strong experience with cloud data platforms (AWS, Azure, or GCP) and their associated services for data storage and orchestration.
- Proven experience with modern data pipeline tools (e.g., Airflow, dbt) and data modeling techniques (see the orchestration sketch after this list).
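As a rough illustration of the orchestration skills listed above, the sketch below shows a daily Airflow DAG that submits a Databricks notebook run. It assumes Airflow 2.4+ with the Databricks provider installed; the connection name, cluster settings, and notebook path are hypothetical.

```python
# Hypothetical daily orchestration of a Databricks notebook run from Airflow.
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator

with DAG(
    dag_id="martech_daily_ingest",       # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_crm = DatabricksSubmitRunOperator(
        task_id="ingest_crm",
        databricks_conn_id="databricks_default",   # assumes a configured Databricks connection
        json={
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",  # Azure example; adjust per cloud
                "num_workers": 2,
            },
            "notebook_task": {"notebook_path": "/pipelines/ingest_crm"},  # hypothetical notebook
        },
    )
```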
If you need any reasonable adjustment to continue with your process, let your recruiter know.
Remember to attach your CV when applying to this vacancy.