
Nialish

From Germany (UTC+2)

Data Engineer | Senior

Nialish – Microsoft Azure, Python, SQL

Nialish is a Senior Data Engineer with over six years of experience in data engineering, big data analytics, and the development of scalable, distributed data pipelines. He possesses hands-on expertise in Python, Apache Airflow, Snowflake, SQL, and ETL/Data Warehouse architecture, among other related technologies. Throughout his career, Nialish has worked across various industries, delivering reliable, secure, and high-performance data solutions in fast-paced environments. He excels at designing and debugging complex data pipelines while effectively balancing technical best practices with business requirements to provide actionable insights. He demonstrates strong problem-solving, critical thinking, and business acumen, with the ability to communicate and collaborate across technical and non-technical teams. Adaptable, eager to learn, and experienced in leading teams, he thrives in agile environments and embraces new challenges.

9 years of commercial experience in
AI
Data analytics
Fintech
Healthcare
Machine learning
Manufacturing
Recruiting
AI software
Chatbots
Main technologies
Microsoft Azure
7 years
Python
7 years
SQL
7 years
AWS
7 years
Additional skills
LangChain
CI/CD
LLM
AI
Apache Airflow
Docker
Kubernetes
PySpark
Databricks
Machine learning
Apache Kafka
Data Modeling
Snowflake
Redis
Deep Learning
Big Data
MongoDB
Azure SQL
Apache Spark
Direct hire
Possible

Experience Highlights

Team Lead & Principal Data Engineer
Dec 2024 - Ongoing (8 months)
Project Overview

An intelligent customer support chatbot that delivers instant, accurate solutions by leveraging thousands of previously solved cases. The system consolidates knowledge scattered across diverse sources, including internal documentation, defect logs, help pages, data tables, and training materials, into a unified, intelligent platform. By doing so, it significantly reduces problem-resolution time, which previously could take weeks when handled manually by consultants.

Responsibilities:
  • Served as the sole Principal Data Engineer in a 19-member team, mentoring colleagues and driving project development, delivery, and maintenance;
  • Designed and built scalable Spark pipelines to process and integrate data from Azure Databricks Unity Catalog, transforming raw data into vector embeddings for semantic search and AI model relevance;
  • Leveraged Azure AI Search as a vector store to persist embeddings, and engineered Spark workflows for efficient storage and retrieval, forming the foundation for LLM-powered applications;
  • Developed end-to-end FastAPI services to enable real-time contextual data retrieval and dynamic response generation via LLMs;
  • Migrated and implemented CI/CD pipelines with self-hosted Azure DevOps agents, ensuring compliance with internal security and regulatory requirements;
  • Unified diverse knowledge sources (databases, logs, documentation, media) into vectorized representations, enabling instant retrieval of the most relevant context and integration with Azure OpenAI for precise solutions;
  • Achieved dramatic improvements in time-to-resolution (from weeks to minutes), customer satisfaction, and scalability, with a solution designed to continuously learn from new cases.
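
The embedding-and-retrieval flow outlined above can be sketched in miniature. The snippet below is purely illustrative: a hashed bag-of-words stub stands in for the Azure OpenAI embedding call, an in-memory list stands in for the Azure AI Search vector index, and all document texts and names are hypothetical.

```python
import math

def embed(text, dim=64):
    # Stub embedder: hashed bag-of-words, L2-normalized.
    # (A real pipeline would call an embedding model here.)
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are unit-length, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

class VectorStore:
    """In-memory stand-in for the Azure AI Search vector index."""

    def __init__(self):
        self.docs = []  # (text, vector) pairs

    def add(self, text):
        self.docs.append((text, embed(text)))

    def top_k(self, query, k=3):
        # Rank all stored documents by similarity to the query embedding.
        qv = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = VectorStore()
store.add("Reset the license server when activation fails")
store.add("Defect 4211: export crashes on empty dataset")
store.add("How to configure SSO for the admin portal")

# Retrieve the most relevant context to feed into the LLM prompt.
context = store.top_k("activation failure after license reset", k=1)
```

In the real system, the retrieved context would be injected into an Azure OpenAI prompt; here the retrieval step alone is shown.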
Project Tech stack:
Databricks
PySpark
CI/CD
Unit testing
Kubernetes
Docker
Microsoft Azure
LLM
AI
Apache Airflow
Vector Databases
LangChain
Team Lead & Senior Data Engineer
Dec 2022 - Nov 2024 (1 year 10 months)
Project Overview

A data-driven solution for a large care services marketplace (15M+ clients, 10K+ providers) to overcome disconnected data silos and improve efficiency in matching clients with care providers. The platform introduced predictive intelligence and self-service dashboards for real-time monitoring and KPIs, reducing manual data processing by 60%, improving scalability, and generating over €2M in additional revenue.

Responsibilities:
  • Successfully built and deployed a Snowflake-based data warehouse from scratch, centralizing 10K+ partner company records and 15M+ client interactions through Azure Data Factory ELT pipelines;
  • Implemented real-time data streaming with Snowpipe and Apache Kafka, enabling live synchronization of transactional and customer data;
  • Designed and developed a core ML engine on Databricks (Spark + PySpark) for data cleaning, feature engineering, predictive analytics, and automated appointment scheduling, generating over €2M in additional annual revenue;
  • Created self-service BI dashboards with real-time KPIs, reducing dependency on IT/data teams, accelerating decision-making cycles, and improving business scalability;
  • Established CI/CD pipelines for Databricks and Azure Data Factory, improving deployment reliability and shortening release cycles;
  • Led and mentored a team of 3 data engineers within a larger group of 23, ensuring agile delivery, knowledge sharing, and alignment with project goals.
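
The centralizing ELT load described above rests on merge/upsert semantics, which is what a warehouse `MERGE` statement provides: update rows that already exist, insert the rest. A toy pure-Python version with hypothetical sample records (the real implementation ran as Azure Data Factory pipelines into Snowflake):

```python
def merge_upsert(warehouse, staged, key="id"):
    # Toy MERGE: matched rows are updated in place,
    # unmatched staged rows are inserted.
    index = {row[key]: i for i, row in enumerate(warehouse)}
    for row in staged:
        if row[key] in index:
            warehouse[index[row[key]]].update(row)  # WHEN MATCHED -> UPDATE
        else:
            warehouse.append(dict(row))             # WHEN NOT MATCHED -> INSERT
    return warehouse

warehouse = [{"id": 1, "provider": "CareCo", "city": "Berlin"}]
staged = [
    {"id": 1, "city": "Munich"},                          # changed record
    {"id": 2, "provider": "HomeAid", "city": "Hamburg"},  # new record
]
merge_upsert(warehouse, staged)
```

Note that the update only overwrites the columns present in the staged row, so untouched fields (here, `provider` for id 1) survive the load.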
Project Tech stack:
Snowflake
Data Modeling
Machine learning
CI/CD
Azure DevOps
Databricks
Microsoft Power BI
PySpark
Apache Kafka
Apache Airflow
Senior Data Engineer
Jan 2021 - Nov 2022 (1 year 9 months)
Project Overview

Developed a data-driven recruitment platform designed to optimize candidate-job matching at scale. The main challenge was that the existing matching process required up to 14 hours per day, causing delays and a poor user experience. By introducing advanced automation and optimized data pipelines, the solution significantly reduced processing time, improved accuracy in recommendations, and enhanced overall platform efficiency for both candidates and employers.

Responsibilities:
  • Built a Big Data and Machine Learning platform from scratch using Apache Spark, enabling scalable candidate-job matching for 200K+ candidates and 1K+ daily logins;
  • Designed and implemented ETL and ML pipelines for automated candidate ranking and matching, significantly reducing manual intervention and errors;
  • Orchestrated workflows with Apache Airflow and integrated AWS SQS + SNS for reliable cross-server communication and recruiter notifications;
  • Reduced candidate-job matching processing time from 14 hours per day to just 6 seconds, enabling near real-time recommendations and faster placements;
  • Increased candidate shortlisting efficiency by 60%, improving recruiter productivity and decision-making speed;
  • Contributed to 25% YoY growth in customer onboarding by optimizing recruitment workflows and improving overall user experience;
  • Scaled the platform to support growing activity levels without delays, ensuring long-term performance and reliability.
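
The ranking-and-shortlisting idea behind the matching engine can be illustrated with a minimal skill-coverage score. The real platform used Spark ML pipelines at far larger scale, so this pure-Python sketch with made-up candidates is only a conceptual stand-in.

```python
def match_score(candidate_skills, job_requirements):
    # Fraction of required skills the candidate covers (case-insensitive).
    required = {s.lower() for s in job_requirements}
    covered = required & {s.lower() for s in candidate_skills}
    return len(covered) / len(required) if required else 0.0

def shortlist(candidates, job_requirements, k=2):
    # Rank candidates by coverage and return the top k names.
    ranked = sorted(
        candidates.items(),
        key=lambda item: match_score(item[1], job_requirements),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

candidates = {
    "ana":  ["python", "spark", "sql"],
    "ben":  ["java", "kafka"],
    "cleo": ["python", "sql", "airflow", "spark"],
}
top = shortlist(candidates, ["Python", "SQL", "Spark"], k=2)
```

A production matcher would weight skills, recency, and availability rather than a flat coverage ratio, but the rank-then-cut structure is the same.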
Project Tech stack:
PySpark
Big Data
Amazon SQS
Amazon SNS
Redis
MongoDB
Deep Learning
Senior Data Engineer
Jun 2020 - Dec 2021 (1 year 6 months)
Project Overview

A predictive maintenance platform for industrial equipment that processes real-time IoT sensor data (temperature, vibration, pressure) from 5K+ machines worldwide to forecast failures before they occur. Reduced unexpected machine failures by 40%, saving an estimated $5M annually in maintenance and operational costs.

Responsibilities:
  • Achieved a 40% reduction in unexpected machine failures across 5K+ machines worldwide, saving an estimated $5M annually in maintenance and operational costs;
  • Built a streaming data ingestion pipeline using Apache Kafka and Spark to process IoT sensor data (temperature, vibration, pressure) in real time;
  • Stored and processed data on Azure Databricks, applying feature engineering for anomaly detection;
  • Developed predictive ML models using PySpark and MLlib to forecast machine failures, enabling maintenance teams to act proactively;
  • Deployed live dashboards in Power BI to provide engineers with immediate insights into equipment health and enable automated alerts and proactive maintenance scheduling;
  • Improved engineer productivity and reduced downtime by integrating predictive insights directly into operational workflows.
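
A rolling z-score is one simple way to flag anomalous sensor readings of the kind this pipeline processed. The sketch below is a dependency-free illustration with a synthetic vibration signal; the production path ran on Kafka and Spark with MLlib models, not this toy detector.

```python
import math
from collections import deque

def detect_anomalies(readings, window=5, threshold=3.0):
    # Flag readings more than `threshold` standard deviations away
    # from the rolling mean of the previous `window` values.
    history = deque(maxlen=window)
    flagged = []
    for i, value in enumerate(readings):
        if len(history) == window:
            mean = sum(history) / window
            var = sum((v - mean) ** 2 for v in history) / window
            std = math.sqrt(var)
            if std > 0 and abs(value - mean) / std > threshold:
                flagged.append(i)
        history.append(value)
    return flagged

# Synthetic vibration-like signal with one spike at index 7.
signal = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 9.0, 1.0, 1.1]
anomalies = detect_anomalies(signal)
```

Because the spike itself enters the rolling window, the readings just after it are judged against an inflated mean and deviation; streaming systems often address this by excluding flagged values from the window.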
Project Tech stack:
Apache Kafka
Databricks
Apache Spark
Azure SQL

Languages

English
Advanced
