
Federico Dominguez Molina
Data Scientist & Engineer building AI agents, LLM systems, and end-to-end data pipelines
About Me
About
I'm a Data Scientist and Engineer passionate about drawing insights from structured and unstructured data—whether to improve outcomes for underrepresented communities or solve hard problems in finance, healthcare, and government.
In more than five years of work, I've built end-to-end systems across sectors: satellite-driven pollution alerts in a World Bank & NASA-backed RCT, LLM-powered classification pipelines, an ML recommender serving 500,000+ borrowers, agentic workflows for automated data extraction, and process automation at scale.
My current focus is Agentic Workflows and LLMs. I work primarily in Python, AWS, and GCP, and my goal is to tackle public policy and business challenges with state-of-the-art ML tools. I hold an M.S. in Computational Analysis and Public Policy from the University of Chicago and a B.A. in Economics (Honors) from ITAM in Mexico City.
Work History
Experience
Data Scientist
Energy & Environment Lab
- →Implemented multi-platform content dissemination pipelines (YouTube, WhatsApp) within a World Bank & NASA-backed randomized controlled trial
- →Architected a global alert system to track and notify on air pollution events using NLP and graph analysis, reaching over 200,000 impressions in three months
- →Designed a social-media scraper extracting ~11 million daily posts, delivering $50,000 in monthly cost savings
- →Raised LLM classification accuracy from 65% to 87% through prompt engineering, building annotated datasets and iterative evaluation loops for systematic prompt adjustment
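The iterative evaluation loop in the last bullet can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: the labeled examples, the two "prompt variants", and `keyword_classifier` are all hypothetical, and a keyword rule stands in where a real setup would call an LLM.

```python
from typing import Callable

# Hypothetical annotated dataset: (text, gold label) pairs.
LABELED = [
    ("Smog levels are dangerous today", "pollution"),
    ("Great match last night", "other"),
    ("Air quality index hit 300", "pollution"),
    ("New cafe opened downtown", "other"),
]

def accuracy(classify: Callable[[str], str], examples) -> float:
    """Share of examples where the predicted label matches the gold label."""
    correct = sum(classify(text) == gold for text, gold in examples)
    return correct / len(examples)

def keyword_classifier(keywords):
    """Stand-in for an LLM call: a 'prompt' here is just a keyword list."""
    def classify(text: str) -> str:
        lowered = text.lower()
        return "pollution" if any(k in lowered for k in keywords) else "other"
    return classify

# Score every hypothetical prompt variant against the same annotated set,
# then keep the best one as the starting point for the next iteration.
variants = {
    "v1": ["smog"],
    "v2": ["smog", "air quality"],
}
scores = {name: accuracy(keyword_classifier(kw), LABELED) for name, kw in variants.items()}
best = max(scores, key=scores.get)
```

Holding the annotated set fixed while only the prompt changes is what makes the accuracy figures comparable from one iteration to the next.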
Data Science Research Assistant
University of Chicago
- →Co-developed a FastAPI web platform for behavioral research games studying children's decision-making
- →Created 10+ interactive visualizations with Plotly to detect bias in 250+ policing research papers
Data Engineering Fellow
Coding it Forward
- →Automated data collection and cleaning for six websites, eliminating 10+ hours of manual work per week
- →Built a Python data platform giving the Long Beach Climate Office access to 2010-2023 data
Senior Data Scientist
deep_dive
- →Led development of an automated alert system that helps 20+ clients improve customer relationships through social media analytics
- →Architected serverless AWS infrastructure to collect and process pricing data for 250,000+ commercial goods, enabling real-time strategic pricing
- →Built and deployed an end-to-end recommender system for 500,000+ borrowers: applied K-Means clustering, engineered custom borrower features, and designed similarity metrics to match borrowers with optimal loan products
Junior Data Scientist
deep_dive
- →Extracted and parsed data from key Mexican governmental sources to support decision-making for 10+ clients
- →Implemented a dynamic Power BI dashboard to monitor government projects and infrastructure, proactively flagging 50+ potential risks
- →Built the company's inflation forecasting capability from scratch: developed XGBoost, Random Forest, and linear regression models plus an ensemble of all three, making the firm a top-ranked inflation forecaster
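As a rough illustration of the ensemble idea in the last bullet, the sketch below combines per-period forecasts from three models by equal-weight averaging. The combination rule is an assumption (the source does not say how the ensemble was weighted), and the forecast numbers are made up for the example.

```python
from statistics import fmean

def ensemble_forecast(forecasts: dict[str, list[float]]) -> list[float]:
    """Average per-period forecasts across models (equal weights, an assumption)."""
    horizons = {len(values) for values in forecasts.values()}
    if len(horizons) != 1:
        raise ValueError("all models must forecast the same number of periods")
    # zip(*...) groups the forecasts by period, one tuple per period.
    return [fmean(period) for period in zip(*forecasts.values())]

# Hypothetical inflation forecasts (percent) for two periods.
model_forecasts = {
    "xgboost": [4.0, 4.2],
    "random_forest": [4.4, 4.0],
    "linear_regression": [4.2, 4.4],
}
combined = ensemble_forecast(model_forecasts)
```

Averaging tends to help when the individual models make partly independent errors; weighted or stacked combinations are common alternatives.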
Selected Work
Highlighted Projects
Global Air Pollution Alert System
Built within a World Bank & NASA-backed RCT: a global alert system targeting air pollution advocates from 30+ cities using X (Twitter), NLP, graph analysis, and real-time satellite pollution data. Reached 200,000+ impressions in three months.
Automated Trading Platform
Designed EC2 architecture and Python code to auto-trade across 10+ portfolios through Interactive Brokers with daily stock price ingestion.
AI Trading Agent
Proof-of-concept AI agent built with Claude and custom tools to analyze investment portfolios against macroeconomic regimes and generate trade recommendations. Demonstrates LLM tool use, function calling, and agent orchestration patterns.
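The tool-use pattern mentioned above can be sketched in a few lines. This is a generic illustration, not the project's code and not any vendor's API schema: the `tool` decorator, the `portfolio_weights` tool, and the JSON call shape are all hypothetical.

```python
import json
from typing import Callable

TOOLS: dict[str, Callable[..., dict]] = {}

def tool(fn):
    """Register a function so the agent loop can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def portfolio_weights(holdings: dict) -> dict:
    """Convert position values into portfolio weights."""
    total = sum(holdings.values())
    return {ticker: value / total for ticker, value in holdings.items()}

def dispatch(tool_call_json: str) -> dict:
    """Run the tool named in a model-produced call and return its result."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        # Return the error as data so the model can see it and recover.
        return {"error": f"unknown tool: {name}"}
    return TOOLS[name](**call["arguments"])

# A hypothetical tool call, shaped as a model might emit it.
request = '{"name": "portfolio_weights", "arguments": {"holdings": {"AAA": 600, "BBB": 400}}}'
result = dispatch(request)
```

In a full agent loop, `result` would be passed back to the model as the tool's output so it can decide on the next step.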
Interactive Laboratory Games Platform
Co-developed a FastAPI web platform for playing laboratory games that inform research on children's decision-making behavior.
ML Loan Recommendation System
Developed a machine learning recommender system to offer customized loans to over 500,000 borrowers, improving conversion rates.
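The Experience section names K-Means clustering and similarity matching as the core of this recommender. A toy sketch of that combination follows, under loud assumptions: the two features, the borrower data, the product names, and the one-product-per-cluster mapping are all hypothetical, and the real system's features and metrics are not described in enough detail to reproduce.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain K-Means; the first k points seed the centroids (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster empties out
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids

def recommend(borrower, centroids, products):
    """Return the product attached to the centroid closest to the borrower."""
    nearest = min(range(len(centroids)), key=lambda i: math.dist(borrower, centroids[i]))
    return products[nearest]

# Hypothetical scaled features: (income, requested amount).
borrowers = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.1), (0.8, 0.9)]
centroids = kmeans(borrowers, k=2)
products = ["small short-term loan", "large long-term loan"]  # one per cluster
choice = recommend((0.15, 0.15), centroids, products)
```

Euclidean distance to a cluster centroid is only one possible similarity metric; it is used here because it is the distance K-Means itself minimizes.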
Social Media Analytics Alert System
Led an automated alert system project that helps 20+ clients improve customer relationships through social media analytics and real-time monitoring.
Long Beach Climate Data Platform
Built a Python data platform allowing the City of Long Beach Climate Office to download data from 2010-2023, improving access and saving hundreds of work hours annually.
Policing Research Bias Detection
Created 10+ interactive visualizations and dashboards with Plotly to detect bias in 250+ policing research papers for academic research.
Technical Stack
Skills & Expertise
Python
Machine Learning
NLP & AI
Cloud & DevOps
LLMs & AI Agents
Background
Education
M.S. in Computational Analysis and Public Policy (MSCAPP)
University of Chicago
June 2024
B.A. in Economics, Honors
Instituto Tecnológico Autónomo de México (ITAM)
June 2020
Research
Publications
In the Press
Featured
Get in Touch
Contact
I'm always interested in discussing data science challenges, social impact projects, or collaboration opportunities. Feel free to reach out.