Catherine – Python, GCP, AWS, experts in Lemon.io

Catherine

From United Kingdom (GMT+1)

Senior Data Scientist
9 years of commercial experience
Automotive
Business intelligence
Cloud computing
Data analytics
Machine learning
Real estate
AI software
Dev tools

Catherine – Python, GCP, AWS

Catherine is a Senior Data Scientist with over 6 years of experience in the field. She excels at simplifying complex concepts and communicating her findings to both technical and non-technical audiences. She manages projects independently and is committed to continuous learning, bringing both expertise and adaptability to any team.

Main technologies
Python: 10 years
GCP: 5 years
AWS: 5 years
Additional skills
React
Golang
Flutter
MongoDB
SQL
Linux
Docker
Terraform
Kubernetes
Nginx
MySQL
DynamoDB
Ready to start: ASAP
Direct hire: Potentially possible

Experience Highlights

CTO
Dec 2020 - Oct 2023 (2 years 10 months)
Project Overview

Developed a patent-pending content moderation algorithm capable of identifying various categories of abusive content in speech, as well as in images, videos, and audio, using fine-tuned Transformers. Additionally, built the entire cloud infrastructure and Python REST APIs from the ground up.

Responsibilities:
  • Created the patent-pending Emotion AI algorithm using Transformers for the NLP part and Python;
  • Created various computer vision models to detect nudity, weapons, and racist symbols, using ResNet (for nudity detection) and YOLO via Roboflow (for object detection);
  • Migrated the legacy PoC from MS Azure to GCP;
  • Created a NoSQL database using MongoDB and connected it securely to the cloud infrastructure;
  • Built a containerized FastAPI back-end to serve the models and deployed it on Kubernetes using Docker;
  • Hired Front-end and infrastructure Engineers during the scaling phase.
Project Tech stack:
Python
React
Terraform
Docker
Kubernetes
Senior Data Engineer
Dec 2020 - Mar 2022 (1 year 2 months)
Project Overview

Catherine automated a previously labor-intensive process involving the weekly generation of device crash reports for the CTO of SKY Technologies in Google Data Studio. This project involved handling KPI data from AWS S3 buckets and BigQuery. She adeptly managed the data, ingested it into BigQuery, partitioned tables, and crafted views for diverse stakeholders. The final result was a fully automated end-to-end report in Data Studio.

Responsibilities:
  • Was the first GCP-certified Data Engineer on this team and started the data pipeline on GCP from scratch, including GCP networking (VPCs, security groups, database access, etc.);
  • Built a comprehensive report in Data Studio that automatically rendered and visualized real-time KPI data weekly;
  • Avoided spending a lot of money on tools by coding the functionality herself, and wrote an article about this project on Medium.
Project Tech stack:
GCP
Python
AWS
Cloud Engineer
Dec 2020 - Jun 2021 (5 months)
Project Overview

Moved an on-premises system to RedHat OpenShift Kubernetes with Terraform, driven by a desire to deepen understanding of networking and security. This effort succeeded in broadening knowledge in these areas and revealed a true passion for constructing solutions. A Golang tool was developed to collaborate with Terratest for deployment on IBM Cloud. Penetration testing was conducted using Gobuster during the infrastructure testing phase.

Responsibilities:
  • Migrated the on-prem payments processing system of a large retail bank to IBM Cloud / RedHat OpenShift using Terraform;
  • Built a Flask REST API service that detects severe weather warnings from RSS news feeds (data cleaning, wrangling, ML classification, and deployment of the Flask service using Docker and Kubernetes);
  • Participated successfully in Capture the Flag events for Cyber Security Awareness Month;
  • Mentored junior engineers pursuing Master's degrees in AI-related fields, helping them with their assignments;
  • Developed a penetration testing tool in Golang that integrates with Terratest and supports IBM Cloud;
  • Was part of a team that built corporate Chatbot applications using IBM Watson.
Project Tech stack:
Terraform
Python
Golang
React
Lead Data Engineer/Data Scientist
Dec 2019 - Dec 2020 (1 year)
Project Overview

Built various web scrapers for real estate websites and built AI models to predict prices based on the features of new properties that come onto the market.

Responsibilities:
  • Built a bespoke web crawling tool that scraped real estate prices and features from various well-known websites;
  • Cleaned and wrangled the data and ingested it into AWS DynamoDB (document database);
  • Performed exploratory data analysis;
  • Trained a scikit-learn RandomForest model to predict prices of new properties based on this data;
  • Automated the data ingestion and periodic retraining of the model;
  • Monitored model performance.
Project Tech stack:
AWS
DynamoDB
Python
Senior Data Scientist
Dec 2016 - Dec 2019 (3 years)
Project Overview

Designed, implemented, and managed a business intelligence tool for data mining. Built a framework for scraping used car prices that developed into a full-fledged data pipeline, including training scikit-learn prediction models to serve the prices to insurance customers on a Flask backend and a Plotly Dash dashboard for analytics.

Responsibilities:
  • Wrote web scrapers and API consumers in Python that automatically scraped automotive sales websites and ingested the data into a MySQL database;
  • Designed schemas and set up the database on DigitalOcean;
  • Did exploratory data analysis to find the best model to predict used car prices based on the data obtained from the data ingestion pipeline;
  • Wrote a Python module that iteratively tried different ML models and generated charts for each model's accuracy and precision;
  • Did full-stack software/data engineering, including a bespoke scikit-learn NLP classifier to identify vehicles from automotive adverts, precise geocoding using the Google Location API, and interactive maps of vehicle locations built with Plotly Dash, which ran on top of Flask with auth, all using the free version of Plotly;
  • Wrote reports and forecasts (Python code that automatically generated relevant Matplotlib charts and assembled them into PowerPoint presentations for business stakeholders).
Project Tech stack:
Python
Linux
MySQL
Nginx
Docker

Education

2014: BSc, Applied Mathematics / Statistics
2020: Building Transformer-based Natural Language Processing Applications on GPUs
https://courses.nvidia.com/certificates/f0856d4772b2416ea5809bc0aac9683b

Copyright © 2024 lemon.io. All rights reserved.