Your business generates data
I make sure it generates value


Get in touch or scroll down to discover more!

Featured Projects

💧
AI & ENVIRONMENT

AIgua

Intelligent agent designed to analyze critical water quality parameters such as pH, turbidity and chlorine, providing accurate diagnostics on potential risks, treatment suggestions and educational tips.

  • Automated analysis of water quality parameters
  • Diagnostics with risks, treatments and education
  • Collaborative map with anonymous data worldwide
  • Raising awareness about access to clean water
🚗
BIG DATA & MACHINE LEARNING

Vehicle Valuation Platform

Innovative platform for automating used vehicle valuations, recognized as a finalist in the PBS UE STEAM School Awards (12th Edition).

  • LightGBM predictive model with €764 error margin
  • Web scraping for detailed data collection
  • Intuitive web interface for quick valuations
  • Significant reduction in operational time and costs
🔧
COMING SOON

Project in Development

Currently working on new innovative tech solutions. More details coming soon.


Capabilities

Data Analysis / BI

  • Big data systems implementation
  • Data Lake architecture and creation
  • Data cleaning, preparation and transformation
  • KPI analysis and descriptive statistics
  • Predictive and prescriptive model design
  • Interactive dashboard development

Data Engineering

  • Database design and optimization
  • Cloud infrastructure management
  • Scalable ETL/ELT pipelines
  • Data Warehouse and Big Data integration
  • API-driven automation
  • Data quality and governance

Data Science / AI

  • Dataset preparation
  • Exploratory Data Analysis (EDA)
  • Feature Engineering
  • Natural Language Processing
  • Computer Vision
  • Model deployment in production

Professional Experience

SDG Group

Madrid Area · Hybrid

Data Engineer

Full-time · Jul 2025 - Present

I am part of the Advanced Analytics team at a leading Spanish insurance group that specializes in motor and health insurance and offers a wide product portfolio (car, motorcycle, home, life, health, savings & investment). In this role I contribute to the development of a large-scale project.

Key responsibilities include:

  • Migration and adaptation of code from Databricks to Palantir using Code Repositories
  • Building pipelines using Palantir Pipeline Builder
  • Optimization, refactoring and parameterization of code to improve performance, scalability and maintainability
  • Identifying and resolving issues arising from code adaptation to the new platform
  • Continuous follow-up and direct 1:1 communication with the client, ensuring project progress
Palantir · Databricks · PySpark · Python

Data Engineer (Intern)

Internship · Feb 2025 - Jul 2025

As a Data Engineering intern at SDG Group, I worked on a project for a well-known multinational client in the insurance sector, focusing on data ingestion, transformation and modeling. My work helped evolve the existing data model by adding multiple calculated fields with their corresponding business logic.

My responsibilities were:

  • Data ingestion processes in Databricks, using a proprietary framework to load CSV and Excel files
  • Developing ETL pipelines to integrate and transform data from multiple sources
  • Technical analysis and functional design of data engineering solutions using Scala + Spark, orchestrated with Azure Data Factory pipelines and deployed via Azure DevOps
  • Using Big Data and cloud technologies focused on scalability and automation
  • Problem solving with a data-driven and technically efficient approach
  • Familiarity with and application of the Medallion Architecture
  • Code development using Azure DevOps, including version control, development branches and pull requests
Databricks · Scala · Spark · Azure Data Factory · Azure DevOps

Awards & Recognition

Notable achievements throughout my personal and professional journey

SDG Star 2025 — Teamwork

SDG Group · 2025

Selected as SDG Star 2025 in the Teamwork category, a recognition for collaborative work, strong team involvement and adding value day-to-day within a large-scale project.

🏅

Third Place — Talent Race 2024

Universia · 2024

Third place in Talent Race 2024, a competition organized by Universia with over 1,100 participants. We passed a rigorous selection process that assessed problem-solving and teamwork, presenting innovative solutions in front of leading companies.

🏆

Winners — Enseña 2.0 Challenge

Oracle Spain & Nuwe · 2024

First place in the Enseña 2.0 Challenge by Oracle Spain and Nuwe among over 220 participants and 40 universities. The challenge "Formula 1 & Data Science with ORACLE" involved exploring data with Oracle Database and Oracle Cloud, and training an AutoML model to predict the most exciting race.

Tech Stack

These are some of the technologies I've mainly worked with, but I might be forgetting a few

Azure
Python
Scala
Databricks
Palantir
PySpark
R
SQL
Power BI
HTML
Git
JavaScript

About Me

Junjie Wu

Hi, I'm Junjie. 👋

I built this website with a lot of enthusiasm ✨ so you can get to know me a little better, beyond what I do professionally.

Most likely, the first things people discover about me are my skills and experience, but I think it's equally important to share who I am on a personal level. I consider myself a fortunate person 🍀, not by chance, but because of the people who have been with me along the way, the active life I lived at university, and all the experiences that have brought me here.

All of this has taught me the value of hard work, consistency and teamwork 🤝, principles I try to apply in every project I'm part of. This way of understanding learning and work is what guides how I approach new challenges and keep growing.

In the Media

Highlights on social media and press

Contact

Have a project in mind or want to collaborate?