Profile
I am a developer with 13 years of experience, specializing in Data Science and holding a Master's degree in Machine Learning. I am highly motivated and committed to continuous skill development and professional growth.
Work History
Data Scientist Intern | AMERICANAS S.A
2019 - 2021
- Developed AutoML solutions using the Apache Marvin AI framework, streamlining the automated search for optimal machine learning models.
- Conducted in-depth research on Automated Machine Learning (AutoML) and Neural Architecture Search (NAS), focusing on improving model efficiency and accuracy.
Data Scientist | AMERICANAS S.A
2021 - 2023
- Trained deep learning models on large-scale datasets (over 10 million product titles and images) using MLOps tools such as Docker, Kubernetes, and CI/CD, with training orchestrated by Airflow and Vertex AI Jobs and data processed by Dataflow jobs.
- Worked in multi-cloud environments, using both Google Cloud and AWS.
- Applied a bottom-up clustering approach (HDBSCAN) to hierarchical product classification, identifying and removing outliers from the training set of hierarchical deep learning models.
- Researched hierarchical multi-modal classification techniques (combining text and images) for e-commerce product categorization.
Data Scientist | SOFTPLAN UNJ
2023 - 2025
- Deployed NLP machine learning models to production using MLOps tools such as Docker, Kubernetes, and CI/CD, ensuring scalable, reliable, and automated model serving pipelines on AWS ECS and Kubernetes clusters.
- Implemented ETL processes for SQL Server, PostgreSQL, and Oracle SQL databases from the Brazilian judicial system, handling servers with different versions and distributions located across the country.
- Created and deployed a PDF OCR API using AWS Step Functions and AWS Lambda with Amazon Textract.
- Applied hierarchical clustering techniques for semi-supervised learning to build a taxonomy of legal proceedings for state Public Prosecutor's Offices.
- Led and mentored a team of three junior data scientists, sharing both academic knowledge of machine learning and practical expertise in NLP, OCR, and ETL projects involving data from the Brazilian judicial system.
- Developed and maintained a fork of the open-source Open WebUI project, called Open WebUI Iara, used by the Brazilian Public Prosecutor’s Office (Ministério Público – MP). This version adds integration with Weaviate and support for asynchronous document processing using Celery and RabbitMQ, which improves scalability and efficiency when handling large volumes of data.
Education
Computer Network Technician | Serviço Nacional de Aprendizagem Industrial
2012 - 2014
- Recognized as the top student among all technical courses in the graduating class of the second semester of 2014.
Bachelor of Computer Science | Universidade Federal de Mato Grosso
2015 - 2019
- Participated in the CNPq Scientific Initiation Program, focusing on NLP and hierarchical text classification for industrial patents from the World Intellectual Property Organization (WIPO).
Master of Computer Science (M.Sc.) in Machine Learning | Universidade Federal de São Carlos
2019 - 2021
- Conducted a literature review and published advancements in Neural Architecture Search (NAS) through an ensemble approach to search spaces and model combination, demonstrating that dividing the search space into different network configurations can improve final classification metrics.
PhD Candidate in Computer Science | Universidade Federal de São Carlos
2021 - 2026 (Ongoing)
- Researching hierarchical multi-label classification techniques and developing a framework to advance hierarchical multi-label categorization metrics using deep learning with PyTorch.
- Currently focusing on metric learning to improve hierarchical multi-label classification metrics, in line with the research hypothesis.