Data Engineering

Emre Durukan

Senior Data Engineer

Senior Data Engineer with 7+ years of experience building and optimizing scalable data platforms on Google Cloud Platform (GCP), with a strong focus on performance, reliability, and cost efficiency.

7+ Years: Data Engineering Experience
~30%: Cloud Cost Reduction
GCP: BigQuery, Dataflow, Dataproc

Work History

Experience

Senior Data Engineer @ Hopi

Istanbul | Jan 2023 - Present

  • Optimized large-scale BigQuery datasets, improving performance and reducing cloud costs by ~30%.
  • Designed and maintained batch and streaming data pipelines on Google Cloud Platform.
  • Implemented IAM, encryption, and access controls to ensure data security and compliance.
  • Improved data reliability through monitoring, validation, and pipeline optimization.

Data Engineer @ Hopi

Istanbul | Aug 2021 - Jan 2023

  • Built and maintained real-time data processing pipelines using Apache Beam and Cloud Dataflow.
  • Developed and orchestrated ETL pipelines using Apache Airflow, Luigi, Apache Spark, and HiveQL.
  • Managed cloud services including BigQuery, Bigtable, Dataproc, Dataflow, Pub/Sub, Cloud SQL, and Cloud Run.
  • Improved pipeline performance and stability through optimization and proactive issue resolution.

Data Engineer @ Alta Data

Istanbul | May 2019 - Jul 2021

  • Built ETL pipelines and data curation processes to support analytics and business intelligence use cases.
  • Integrated data quality checks and validation rules into pipelines to ensure accuracy and consistency.
  • Developed Tableau dashboards to provide actionable insights for stakeholders.

Software Developer @ Publins

Denizli | Nov 2017 - Nov 2018

  • Developed modern web applications using Ruby on Rails, JavaScript, and PostgreSQL.
  • Built data processing systems with Ruby and implemented basic machine learning algorithms.
  • Maintained AWS infrastructure including Elastic Beanstalk, EC2, S3, and RDS.

Writing

Blog

Practical notes on data engineering, platform reliability, and cost-aware cloud systems.

Cutting BigQuery Costs Without Slowing Analytics

A field guide to partitioning, clustering, query review, and storage habits that lower spend while keeping analysts productive.

Designing Airflow DAGs That Recover Cleanly

Notes on retries, idempotency, backfills, alerting, and dependency boundaries for pipelines that fail in predictable ways.

Lessons From Streaming Pipelines on GCP

Patterns for Pub/Sub, Dataflow, and BigQuery pipelines where latency, correctness, and operational clarity all matter.