Nialish – Microsoft Azure, Python, SQL
Nialish is a Senior Data Engineer with over six years of experience in data engineering, big data analytics, and the development of scalable, distributed data pipelines. He possesses hands-on expertise in Python, Apache Airflow, Snowflake, SQL, and ETL/Data Warehouse architecture, among other related technologies. Throughout his career, Nialish has worked across various industries, delivering reliable, secure, and high-performance data solutions in fast-paced environments. He excels at designing and debugging complex data pipelines while effectively balancing technical best practices with business requirements to provide actionable insights. He demonstrates strong problem-solving, critical thinking, and business acumen, with the ability to communicate and collaborate across technical and non-technical teams. Adaptable, eager to learn, and experienced in leading teams, he thrives in agile environments and embraces new challenges.
Experience Highlights
Team Lead & Principal Data Engineer
An intelligent customer support chatbot that delivers instant, accurate solutions by leveraging thousands of previously solved cases. The system consolidates knowledge scattered across diverse sources, including internal documentation, defect logs, help pages, data tables, and training materials, into a unified, intelligent platform. By doing so, it significantly reduces problem-resolution time, which previously could take weeks when handled manually by consultants.
- Served as the sole Principal Data Engineer in a 19-member team, mentoring colleagues and driving project development, delivery, and maintenance;
- Designed and built scalable Spark pipelines to process and integrate data from Azure Databricks Unity Catalog, transforming raw data into vector embeddings for semantic search and AI model relevance;
- Leveraged Azure AI Search as a vector store to persist embeddings, and engineered Spark workflows for efficient storage and retrieval, forming the foundation for LLM-powered applications (see the first sketch after this list);
- Developed end-to-end FastAPI services to enable real-time contextual data retrieval and dynamic response generation via LLMs (see the second sketch after this list);
- Migrated and implemented CI/CD pipelines with self-hosted Azure DevOps agents, ensuring compliance with internal security and regulatory requirements;
- Unified diverse knowledge sources (databases, logs, documentation, media) into vectorized representations, enabling instant retrieval of the most relevant context and integration with Azure OpenAI for precise solutions;
- Achieved dramatic improvements in time-to-resolution (from weeks to minutes), customer satisfaction, and scalability, with a solution designed to continuously learn from new cases.
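For illustration only, a minimal sketch of what such an embedding pipeline could look like: a Databricks job reads a curated Unity Catalog table, calls an Azure OpenAI embedding deployment, and upserts the vectors into an Azure AI Search index. The table name (support.curated.knowledge_base), index name (kb-index), deployment names, and credentials below are hypothetical placeholders, not the project's actual code.

```python
# Hypothetical sketch: Unity Catalog table -> embeddings -> Azure AI Search vector index.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Curated knowledge-base text from a Unity Catalog table (assumed schema: id, content).
docs = spark.table("support.curated.knowledge_base").select("id", "content")

def embed_and_upload(rows):
    """Embed one partition of documents and upsert them into the Azure AI Search index."""
    # Clients are created inside the function so nothing non-serializable is shipped to executors.
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient
    from openai import AzureOpenAI

    openai_client = AzureOpenAI(
        azure_endpoint="https://<aoai-resource>.openai.azure.com",
        api_key="<aoai-key>",
        api_version="2024-02-01",
    )
    search_client = SearchClient(
        endpoint="https://<search-resource>.search.windows.net",
        index_name="kb-index",
        credential=AzureKeyCredential("<search-admin-key>"),
    )
    batch = []
    for row in rows:
        embedding = openai_client.embeddings.create(
            model="text-embedding-ada-002",  # embedding deployment name (assumed)
            input=row["content"],
        ).data[0].embedding
        batch.append({"id": row["id"], "content": row["content"], "contentVector": embedding})
    if batch:
        search_client.merge_or_upload_documents(documents=batch)

docs.foreachPartition(embed_and_upload)
```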
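A companion sketch of the FastAPI retrieval service, again with placeholder index and deployment names: it embeds the incoming question, pulls the closest knowledge-base chunks from Azure AI Search, and asks an Azure OpenAI chat deployment to answer from that context. Authentication, error handling, and response streaming are omitted.

```python
# Hypothetical sketch: retrieval-augmented answer endpoint; all names and endpoints are placeholders.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery
from fastapi import FastAPI
from openai import AzureOpenAI
from pydantic import BaseModel

app = FastAPI()
openai_client = AzureOpenAI(
    azure_endpoint="https://<aoai-resource>.openai.azure.com",
    api_key="<aoai-key>",
    api_version="2024-02-01",
)
search_client = SearchClient(
    endpoint="https://<search-resource>.search.windows.net",
    index_name="kb-index",
    credential=AzureKeyCredential("<search-query-key>"),
)

class Question(BaseModel):
    text: str

@app.post("/answer")
def answer(question: Question) -> dict:
    # 1. Embed the incoming question.
    query_vector = openai_client.embeddings.create(
        model="text-embedding-ada-002", input=question.text
    ).data[0].embedding

    # 2. Retrieve the most relevant knowledge-base chunks from the vector index.
    hits = search_client.search(
        search_text=None,
        vector_queries=[
            VectorizedQuery(vector=query_vector, k_nearest_neighbors=5, fields="contentVector")
        ],
    )
    context = "\n\n".join(hit["content"] for hit in hits)

    # 3. Ask the chat model to answer strictly from the retrieved context.
    completion = openai_client.chat.completions.create(
        model="gpt-4o",  # chat deployment name (assumed)
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question.text}"},
        ],
    )
    return {"answer": completion.choices[0].message.content}
```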
Team Lead & Senior Data Engineer
A data-driven solution for a large care services marketplace (15M+ clients, 10K+ providers), built to overcome disconnected data silos and improve efficiency in matching clients with care providers. The platform introduced predictive intelligence and self-service dashboards for real-time monitoring and KPIs, reducing manual data processing by 60%, improving scalability, and generating over €2M in additional revenue.
- Successfully built and deployed a Snowflake-based data warehouse from scratch, centralizing 10K+ partner company records and 15M+ client interactions through Azure Data Factory ELT pipelines;
- Implemented real-time data streaming with Snowpipe and Apache Kafka, enabling live synchronization of transactional and customer data (see the first sketch after this list);
- Designed and developed a core ML engine on Databricks (Spark + PySpark) for data cleaning, feature engineering, predictive analytics, and automated appointment scheduling, generating over €2M in additional annual revenue (see the second sketch after this list);
- Created self-service BI dashboards with real-time KPIs, reducing dependency on IT/data teams, accelerating decision-making cycles, and improving business scalability;
- Established CI/CD pipelines for Databricks and Azure Data Factory, improving deployment reliability and shortening release cycles;
- Led and mentored a team of 3 data engineers within a larger group of 23, ensuring agile delivery, knowledge sharing, and alignment with project goals.
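A simplified sketch of the streaming ingestion path. The production setup relied on Snowpipe and Kafka; the snippet below only illustrates the shape of the flow, micro-batching Kafka events into a raw Snowflake table. Broker, topic, table, and credential names are placeholders.

```python
# Illustrative sketch only: micro-batching Kafka events into a raw Snowflake table.
import json

import snowflake.connector
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "<broker>:9092",
    "group.id": "care-events-loader",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["client-appointments"])

conn = snowflake.connector.connect(
    account="<account>", user="<user>", password="<password>",
    warehouse="LOAD_WH", database="CARE_DWH", schema="RAW",
)
cursor = conn.cursor()

def flush(batch):
    """Insert raw event JSON as text; downstream models are built with ELT on top of RAW."""
    cursor.executemany(
        "INSERT INTO RAW.APPOINTMENT_EVENTS (EVENT_ID, PAYLOAD) VALUES (%s, %s)",
        batch,
    )

batch = []
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())          # assumes the payload carries an "event_id" field
    batch.append((event["event_id"], json.dumps(event)))
    if len(batch) >= 500:
        flush(batch)
        batch.clear()
```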
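And a hypothetical sketch of the kind of Spark ML pipeline such a scheduling engine could be built on: feature engineering over historical appointments plus a gradient-boosted model that scores how likely a provider is to accept a request. Table names, columns, and the 0/1 "accepted" label are illustrative assumptions.

```python
# Hypothetical sketch: feature engineering plus an acceptance model in Spark ML.
from pyspark.ml import Pipeline
from pyspark.ml.classification import GBTClassifier
from pyspark.ml.feature import StringIndexer, VectorAssembler
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Historical appointment requests with a 0/1 "accepted" outcome (assumed schema).
history = spark.table("care_dwh.marts.appointment_history").na.drop(
    subset=["service_type", "distance_km", "provider_rating", "lead_time_hours", "accepted"]
)

pipeline = Pipeline(stages=[
    StringIndexer(inputCol="service_type", outputCol="service_type_idx", handleInvalid="keep"),
    VectorAssembler(
        inputCols=["service_type_idx", "distance_km", "provider_rating", "lead_time_hours"],
        outputCol="features",
    ),
    GBTClassifier(labelCol="accepted", featuresCol="features"),
])

train, test = history.randomSplit([0.8, 0.2], seed=42)
model = pipeline.fit(train)

# Score open requests so the scheduler can propose the providers most likely to accept first.
open_requests = spark.table("care_dwh.marts.open_requests")  # same feature columns assumed
scored = model.transform(open_requests).select("request_id", "provider_id", "probability")
scored.show(truncate=False)
```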
Senior Data Engineer
Developed a data-driven recruitment platform designed to optimize candidate-job matching at scale. The main challenge was that the existing matching process required up to 14 hours per day, causing delays and a poor user experience. By introducing advanced automation and optimized data pipelines, the solution significantly reduced processing time, improved accuracy in recommendations, and enhanced overall platform efficiency for both candidates and employers.
- Built a Big Data and Machine Learning platform from scratch using Apache Spark, enabling scalable candidate-job matching for 200K+ candidates and 1K+ daily logins;
- Designed and implemented ETL and ML pipelines for automated candidate ranking and matching, significantly reducing manual intervention and errors (see the first sketch after this list);
- Orchestrated workflows with Apache Airflow and integrated AWS SQS + SNS for reliable cross-server communication and recruiter notifications (see the second sketch after this list);
- Reduced candidate-job matching processing time from 14 hours per day to just 6 seconds, enabling near real-time recommendations and faster placements;
- Increased candidate shortlisting efficiency by 60%, improving recruiter productivity and decision-making speed;
- Contributed to 25% YoY growth in customer onboarding by optimizing recruitment workflows and improving overall user experience;
- Scaled the platform to support growing activity levels without delays, ensuring long-term performance and reliability.
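A minimal sketch of how large-scale candidate-job matching can be made fast in Spark, using hashed skill features and locality-sensitive hashing instead of exhaustive pairwise comparison. Table names, columns, and the similarity threshold are assumptions, not the project's actual implementation.

```python
# Hypothetical sketch: approximate candidate/job matching with Spark ML LSH.
from pyspark.ml.feature import HashingTF, MinHashLSH, Tokenizer
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

candidates = spark.table("recruiting.candidates").select("candidate_id", "skills_text")
jobs = spark.table("recruiting.jobs").select("job_id", "requirements_text")

def featurize(df, text_col):
    """Tokenize free-text skills/requirements and hash them into sparse feature vectors."""
    tokens = Tokenizer(inputCol=text_col, outputCol="tokens").transform(df)
    return HashingTF(inputCol="tokens", outputCol="features", numFeatures=1 << 14).transform(tokens)

cand_feats = featurize(candidates, "skills_text")
job_feats = featurize(jobs, "requirements_text")

# Locality-sensitive hashing lets us join 200K+ candidates against open jobs
# without computing every pairwise similarity.
lsh = MinHashLSH(inputCol="features", outputCol="hashes", numHashTables=5).fit(cand_feats)
matches = (
    lsh.approxSimilarityJoin(cand_feats, job_feats, threshold=0.6, distCol="jaccard_distance")
    .select(
        F.col("datasetA.candidate_id"),
        F.col("datasetB.job_id"),
        (1 - F.col("jaccard_distance")).alias("match_score"),
    )
    .orderBy(F.desc("match_score"))
)
matches.show(truncate=False)
```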
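A sketch of the orchestration layer, assuming an Airflow 2.4+ deployment with the Amazon provider installed: the DAG waits on an SQS queue, triggers the matching run, and notifies recruiters via SNS. Queue URLs, ARNs, and the callable are placeholders.

```python
# Hypothetical sketch: Airflow DAG coordinating the matching run with SQS and SNS.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.operators.sns import SnsPublishOperator
from airflow.providers.amazon.aws.sensors.sqs import SqsSensor

def run_matching_job(**_):
    # In the real pipeline this step would submit the Spark matching job.
    print("submitting candidate-job matching job")

with DAG(
    dag_id="candidate_job_matching",
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
) as dag:
    wait_for_new_profiles = SqsSensor(
        task_id="wait_for_new_profiles",
        sqs_queue="https://sqs.eu-west-1.amazonaws.com/<account-id>/new-profiles",
        max_messages=10,
    )
    match = PythonOperator(task_id="run_matching", python_callable=run_matching_job)
    notify_recruiters = SnsPublishOperator(
        task_id="notify_recruiters",
        target_arn="arn:aws:sns:eu-west-1:<account-id>:recruiter-alerts",
        subject="New candidate matches ready",
        message="Fresh candidate-job matches have been published to the dashboard.",
    )

    wait_for_new_profiles >> match >> notify_recruiters
```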
Senior Data Engineer
A predictive maintenance solution for industrial equipment that streams IoT sensor data (temperature, vibration, pressure) from 5K+ machines worldwide and applies machine learning to forecast failures before they occur. The platform cut unexpected machine failures by 40%, saved an estimated $5M annually, and gave engineers real-time visibility into equipment health through live dashboards and automated alerts.
- Achieved a 40% reduction in unexpected machine failures across 5K+ machines worldwide, saving an estimated $5M annually in maintenance and operational costs;
- Built a streaming data ingestion pipeline using Apache Kafka and Spark to process IoT sensor data (temperature, vibration, pressure) in real time (see the sketch after this list);
- Stored and processed data on Azure Databricks, applying feature engineering for anomaly detection;
- Developed predictive ML models using PySpark and MLlib to forecast machine failures, enabling maintenance teams to act proactively;
- Deployed live dashboards in Power BI to provide engineers with immediate insights into equipment health and enable automated alerts and proactive maintenance scheduling;
- Improved engineer productivity and reduced downtime by integrating predictive insights directly into operational workflows.
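A condensed sketch of the streaming scoring path, assuming a Spark Structured Streaming job on Databricks that parses Kafka telemetry, applies a failure-risk model trained offline, and lands high-risk events in a Delta table behind the Power BI dashboards. Topic, schema, model path, and storage paths are placeholders.

```python
# Hypothetical sketch: scoring streaming sensor readings with a pre-trained Spark ML model.
from pyspark.ml import PipelineModel
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import DoubleType, StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("machine_id", StringType()),
    StructField("event_time", TimestampType()),
    StructField("temperature", DoubleType()),
    StructField("vibration", DoubleType()),
    StructField("pressure", DoubleType()),
])

readings = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "<broker>:9092")
    .option("subscribe", "machine-telemetry")
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("r"))
    .select("r.*")
)

# Failure-risk model trained offline on the same sensor features and loaded here for scoring.
model = PipelineModel.load("/mnt/models/failure_risk")
scored = model.transform(readings)

# Persist high-risk events to a Delta table that feeds the Power BI dashboards and alerting.
alerts = scored.filter(F.col("prediction") == 1.0)
query = (
    alerts.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/failure_alerts")
    .outputMode("append")
    .start("/mnt/delta/failure_alerts")
)
query.awaitTermination()
```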