
Carlos
From United States (UTC-7)
26 years of commercial experience
Carlos – Python, LLM, TypeScript
Carlos is a Senior AI Engineering Leader with over 8 years of Python experience and 15+ years in technical leadership, including 5 years in management roles. He has built and led ML infrastructure teams at top organizations like Facebook, Microsoft, Outreach, and the University of Washington. Carlos brings deep expertise in scalable AI systems, distributed training, and bridging the gap between research and engineering.
Direct hire
Possible
Experience Highlights
Principal Machine Learning Engineer
Robust tooling for distributed PyTorch training of deep learning (Transformer) models on a GCP Kubernetes-based GPU cluster, including custom monitoring and diagnostics using Weights & Biases.
- Developed internal tools and processes enabling researchers to run distributed jobs on a cluster without needing to understand Docker or Kubernetes;
- Simplified infrastructure complexity, significantly improving accessibility and productivity for non-engineering team members;
- Leveraged Kubernetes and Git to streamline the transition from local model development to large-scale cluster execution;
- Integrated Git and Weights & Biases, allowing easy experiment tracking, results sharing, and full reproducibility.
Senior Principal Research Software Engineer
A Python seismology package designed for fast and straightforward computation of ambient noise cross-correlation functions. It provides additional functionality for noise monitoring and surface wave dispersion analysis.
- Revamped a broken codebase by introducing proper engineering practices, unit tests, and a continuous integration (CI) pipeline within 6 months;
- Improved code quality and reliability, making the software functional and maintainable;
- Optimized performance and added support for multiple data formats and parallel processing across both HPC and cloud (AWS) environments;
- Enabled large-scale data processing, successfully handling over 1 TB of data in AWS;
- Mentored the research team in software development best practices, enabling them to maintain and extend the project independently.
Principal Engineer, Applied AI Engineering Director
An AI conversation intelligence product that joins and records live meetings, captures real-time transcription, and delivers on-demand confidence. It runs NLP models in real time to provide timely insights during meetings.
- Built and led Kaia’s first ML team from scratch, hiring two NLP applied scientists and establishing core practices;
- Designed and implemented machine learning infrastructure from the ground up within 7 months;
- Set up secure virtual networks in Azure to ensure data compliance and security;
- Deployed and configured Azure Machine Learning and Databricks instances, using Terraform for infrastructure-as-code and maintainability;
- Revamped the Action Item detection system by training a new NLP model, improving precision from under 70% to 90% while maintaining recall;
- Shipped the improved model to production, integrating with Azure DevOps (CI/CD), deploying to Kubernetes, and setting up production monitoring.
Machine Learning Engineer
The project involved a flagship instant messaging app available on web, iOS, Android, Windows, and macOS, serving over 1 billion monthly users. It enables one-on-one and group chats, as well as voice and video calls, media sharing, and interactive features like games, stickers, and disappearing messages.
- Joined the team when the automatic (contextual) sticker suggestions feature worked only in English and, as an ML expert fluent in Spanish, was uniquely positioned to deliver Spanish support;
- Took ownership of the data and training pipelines and shipped the feature within 5 months;
- Further generalized the pipeline, and shortly after, the team shipped support for Portuguese and Thai.
Principal Research Software Engineering Lead
This was a research platform for the Computer Human Interactive Learning group. The research focused on the intersection of Machine Learning and Human-Computer Interaction for building classification and entity extraction models. The platform was used to experiment with different models, algorithms, and UX/visualizations.
- Proposed a re-architecture to build an SDK and desktop app, focusing on flexibility and simplicity and leveraging existing technology as much as possible (e.g., Git for model versioning);
- Through a combination of technical and strategic discussions, Carlos led the team to consensus and drove the architecture and execution of the project over nine months. The result was a much simpler and more flexible system that runs on a single machine. It enabled the partner engineering teams and researchers to customize it to their needs, allowing Carlos' team to focus on the core ML and HCI research questions.