What You Will Do
The Data Engineering Lead/Specialist will design, build, and maintain robust data infrastructure, primarily on the Databricks Lakehouse Platform, to support Data Science and MarTech operations. This role is responsible for ensuring that data is reliable and high-quality and that it flows efficiently from source systems (CRM, MAP, CDP, web logs) into the Lakehouse architecture for analysis, modeling, and real-time activation.
Key Responsibilities
- Lakehouse Architecture: Design, implement, and optimize the data architecture within Databricks, leveraging Delta Lake for reliability and performance across the data ingestion and consumption layers.
- Data Pipeline Development: Design and implement scalable ETL/ELT data pipelines (batch and streaming) using tools like Databricks notebooks (PySpark/Scala) and related cloud orchestration services (e.g., Azure Data Factory, AWS Glue/Step Functions) to ingest, transform, and load high-volume marketing data (a minimal pipeline sketch follows this list).
- Databricks Optimization: Tune and optimize Databricks clusters and jobs for cost-efficiency and performance, specifically supporting high-volume MarTech data processing and large-scale ML model training (in collaboration with Data Science).
- MarTech Integration: Own the technical integration and data flow between core MarTech systems (CDP, CRM, Marketing Automation) and the central Lakehouse platform, ensuring data is ready for activation.
- Data Quality & Governance: Establish and enforce data quality checks, monitoring, and lineage practices within the Databricks environment to ensure data accuracy, consistency, and compliance.
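For illustration only, here is a minimal sketch of the kind of batch pipeline this role would own: reading raw CRM exports, applying a simple quality gate, and writing a Delta table for downstream consumption. The landing path, column names, and table name are hypothetical and not taken from this posting.

```python
# Minimal batch ETL sketch (PySpark on Databricks): raw CRM exports -> cleaned Delta table.
# All paths, columns, and table names below are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # already provided in a Databricks notebook

raw = spark.read.format("json").load("/mnt/raw/crm/contacts/")  # hypothetical landing zone

cleaned = (
    raw.withColumn("email", F.lower(F.trim(F.col("email"))))
       .dropDuplicates(["contact_id"])                  # hypothetical business key
       .filter(F.col("email").isNotNull())              # basic quality gate: reject rows without an email
       .withColumn("ingested_at", F.current_timestamp())
)

(
    cleaned.write.format("delta")
    .mode("overwrite")
    .saveAsTable("martech.silver_contacts")             # hypothetical Lakehouse table
)
```

The same pattern extends to streaming ingestion (Structured Streaming with Delta sinks) and to stricter quality enforcement, for example via expectations in Delta Live Tables.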
Required Skills & Qualifications
- Extensive experience in data engineering, software engineering, or a related field.
- Bachelor's degree in Systems Engineering, Computer Engineering, or Mathematics.
- Expert proficiency in SQL and in PySpark or Scala for building complex data pipelines.
- Mandatory hands-on expertise with the Databricks Lakehouse Platform, including Delta Lake.
- Strong experience with cloud data platforms (AWS, Azure, or GCP) and their associated services for data storage and orchestration.
- Proven experience with modern data pipeline tools (e.g., Airflow, dbt) and data modeling techniques (see the orchestration sketch after this list).
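As a rough illustration of the orchestration skills listed above, the sketch below shows a daily Airflow DAG that submits a Databricks notebook run. It assumes Airflow 2.4+ with the Databricks provider installed; the connection name, cluster settings, and notebook path are hypothetical.

```python
# Hypothetical daily orchestration of a Databricks notebook run from Airflow.
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator

with DAG(
    dag_id="martech_daily_ingest",       # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_crm = DatabricksSubmitRunOperator(
        task_id="ingest_crm",
        databricks_conn_id="databricks_default",   # assumes a configured Databricks connection
        json={
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",  # Azure example; adjust per cloud
                "num_workers": 2,
            },
            "notebook_task": {"notebook_path": "/pipelines/ingest_crm"},  # hypothetical notebook
        },
    )
```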
If you need any reasonable adjustment to continue with your process, let your recruiter know.
Remember to attach your CV when applying to this vacancy.