I'm a computer scientist from Barcelona. I'm passionate about FP (mostly Scala), data, algorithms and mountains.

Resume

I’m Miguel, a Computer Scientist from Barcelona with a strong background in large-scale data platforms and distributed systems. I’ve grown from hands-on data engineering into engineering leadership, building and scaling teams responsible for end-to-end data platforms operating at petabyte scale.

I’m deeply interested in functional programming (mainly Scala), system design, and pragmatic engineering. I’m demanding about the quality and impact of what I build, with a strong focus on delivering real business value, while continuously experimenting with new languages, technologies, and ideas.

Experience

Sabbatical
Apr 2025 to Present
Verve
Jul 2019 to Mar 2025
Senior Engineering Manager, Data Platform
Sep 2023 to Mar 2025

Verve Group is a global ad platform connecting brands and publishers to people in real time.

I started as a data engineer and transitioned into an engineering manager role, scaling the data team from 2 to 9 engineers. The team owned the end-to-end data platform, spanning infrastructure, ingestion, processing, analytics, and database management, operating at petabyte scale.

Focused on delivering business value through modern data technologies while keeping infrastructure growth efficient and cost-aware. Built and led a highly motivated team by promoting technical ownership, continuous learning, and exposure to complex, high-impact problems, ensuring engineers remained challenged while staying aligned with real business needs.

Non-exhaustive tech stack:

  • Scala/Python
  • Airflow
  • Apache Spark and Delta
  • Kubernetes
  • Kafka
  • Apache Druid
  • GCP ecosystem (GCS, BigQuery)
  • AWS ecosystem (S3, Athena)
  • Trino
  • Prometheus/Grafana

Engineering Manager, Data Platform
Nov 2022 to Aug 2023
Lead Data Engineer
Feb 2020 to Oct 2022
Data Engineer
Jul 2019 to Jan 2020
Sabbatical
Jan 2018 to Jun 2019
Trovit
Jul 2014 to Dec 2017
Data Engineer
Jul 2015 to Dec 2017

I worked on the Data team. Using distributed computing, we helped the company leverage the data generated by users and other departments. We also managed a self-hosted YARN cluster of about 60 hosts, running both Hadoop and Spark jobs.

I led the keyword management pipeline and other related batch data pipelines. A keyword is simply a set of tokens related to content. The goal of the pipeline was to manage all the keywords and, thus, the visibility of all the content to search engines. The total number of keywords ran into the hundreds of millions, and the pipeline consisted of several phases:

  1. Check if new keywords could be generated
  2. Simulate the number of results of each keyword (the ones without a minimum quantity of content are useless)
  3. Categorize and contextualize the tokens (what does the keyword really mean?)
  4. Relate keywords with each other to generate linking (by hierarchy, clustering...)
  5. Check which keywords are worth indexing and generate a Solr index with them

The pipeline was implemented as a hybrid Hadoop/Spark batch job. It was challenging in many ways: performance issues, lack of context (the same token could mean many different things), dealing with search engine performance, supporting different languages, etc.
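The phases above can be sketched on toy in-memory data roughly as follows. The real system was a hybrid Hadoop/Spark batch pipeline over hundreds of millions of keywords; every name, category rule, and threshold here is purely illustrative:

```python
# Illustrative sketch of the keyword pipeline phases on toy data.
# MIN_RESULTS, categorize(), and the token-matching rule are all
# hypothetical stand-ins for the real (much richer) logic.

MIN_RESULTS = 2  # hypothetical threshold: keywords matching fewer ads are dropped

def simulate_results(keyword, ads):
    """Phase 2: count how many ads a keyword would match."""
    tokens = set(keyword.split())
    return sum(1 for ad in ads if tokens <= set(ad.split()))

def categorize(keyword):
    """Phase 3: naive categorization by a trigger token (illustrative only)."""
    if "rent" in keyword.split():
        return "rentals"
    return "general"

def build_index(keywords, ads):
    """Phases 2, 3, and 5 collapsed: filter by simulated results,
    categorize, and emit the entries worth indexing (the Solr step
    is represented by the returned list of entries)."""
    index = []
    for kw in keywords:
        n = simulate_results(kw, ads)
        if n >= MIN_RESULTS:  # phase 2 filter: too few results, not worth indexing
            index.append({"keyword": kw,
                          "category": categorize(kw),
                          "results": n})
    return index

ads = ["flat rent barcelona", "flat rent madrid", "house sale bilbao"]
keywords = ["flat rent", "house sale", "castle rent"]
print(build_index(keywords, ads))
```

The contextualization and linking phases (3 and 4) were by far the hardest in practice, since they require disambiguating tokens and relating keywords at scale; the sketch only hints at them with a single trigger-token rule.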

Other projects I worked on:

  • Ads categorization, deduplication, sorting and automatic expiration. Kafka was used to enqueue the downloaded ads, and a Hadoop ecosystem (YARN, HDFS and MapReduce) consumed, processed and analysed them. Finally, Solr indices were built from the processed information and deployed to production.
  • Stats processing. We used Kafka to enqueue impressions, clicks, e-mail openings and conversions from the site. Different Hadoop ETL pipelines then processed the queues and extracted useful information for the company. Finally, the data was persisted to Hive, Impala or MySQL so it could be consumed more easily.
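The stats flow can be illustrated with a minimal sketch: events arrive on a queue and an ETL step aggregates them per ad. In production this was Kafka feeding Hadoop ETL jobs with results landing in Hive, Impala or MySQL; here a plain list stands in for the queue, a dict for the sink, and the event schema is assumed:

```python
# Toy stand-in for the Kafka -> ETL -> store stats flow.
# The event schema ({"ad_id": ..., "type": ...}) is illustrative.
from collections import defaultdict

def process_events(events):
    """Aggregate raw events (impressions, clicks, e-mail openings,
    conversions) into per-ad counters."""
    stats = defaultdict(lambda: defaultdict(int))
    for event in events:
        stats[event["ad_id"]][event["type"]] += 1
    # Convert to plain dicts, as if persisting to the analytics store.
    return {ad: dict(counts) for ad, counts in stats.items()}

queue = [
    {"ad_id": "a1", "type": "impression"},
    {"ad_id": "a1", "type": "impression"},
    {"ad_id": "a1", "type": "click"},
    {"ad_id": "a2", "type": "impression"},
]
print(process_events(queue))
```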
Full Stack Web Developer
Jul 2014 to Jul 2015

I developed several experimental web projects expected to become an important part of the company. The most important one was the "Publish Your Ad" project, where users could post their own ads directly on Trovit (which had been a pure aggregator until then).

Some of the technologies I used were:

  • PHP (Composer)
  • JavaScript (jQuery, Backbone, RequireJS, Zepto)
  • MySQL
  • Amazon S3
Polytechnic University of Catalonia (UPC)
Mar 2012 to Aug 2013
Internship as Web Developer and Systems Administrator
Mar 2012 to Aug 2013

I worked in the TSC (Signal Theory and Communications) department. I started by helping to manage the department's data center (servers and network). Later on, I developed both front-end and back-end web tools used to improve the department's administration.

Some of the technologies I used were:

  • Apache 2
  • PHP (Symfony 2 framework)
  • SQL (MySQL)
  • JavaScript (jQuery)
  • LDAP
  • Bash

Education

Sep 2009 to Apr 2014
Bachelor’s degree in Informatics Engineering
Barcelona School of Informatics (FIB), Universitat Politècnica de Catalunya (UPC).

Major in Computing. I was trained to assess the difficulty of computing problems, to identify the most suitable machines, languages and programming paradigms, and to design and implement the most appropriate IT solution.

I completed several advanced computing modules, including Theory of Computation, Machine Learning, Advanced Algorithms and Distributed Intelligent Systems.

Certifications

Aug 2015
Verified Certificate for Scalable Machine Learning
edX
I learned how machine learning algorithms can be adapted to run on large clusters of commodity machines. In particular, I used Apache Spark to solve various machine learning problems.

Languages

           Reading        Listening      Writing        Speaking
Spanish    Native         Native         Native         Native
Catalan    Native         Native         Native         Native
English    C1 (Advanced)  C1 (Advanced)  C1 (Advanced)  C1 (Advanced)

Levels according to the Common European Framework of Reference for Languages (CEFR).

Publications

Oct 2015
Using Multi-Agent Systems to mediate in an assistive social network for elder population
Co-authors: Cristian Barrué, Ulises Cortés, Atia Cortés and Jonatan Moreno.