PhD Student · University of Michigan

Hi, I'm Donna.

I'm a Rackham Scholar and Computer Science & Engineering PhD student at the University of Michigan, advised by Professor Lin Ma. My research sits at the intersection of database management systems and machine learning.

Originally from the Bay Area, California.

Donna Pham
Ann Arbor, MI
About

I've been fortunate to be advised by Professor Lin Ma ever since my undergraduate years at Michigan. Working in his lab is what first drew me to the questions I care about most, and when it came time to choose a doctoral path, pursuing my PhD under his guidance was an easy decision.

Today my work focuses on making database systems more robust and explainable as they become increasingly driven by machine learning — especially in the noisy, unpredictable environments real systems actually run in.

ML-Driven Database Tuning

Learned knob tuning and configuration for self-optimizing data systems.

Robustness & Explainability

Resilient methodology and benchmarks for noisy, real-world conditions.

Large-Scale Data Analytics

Optimizing index structures for LLMs — including hierarchical indexing for faster, more scalable retrieval.

Experience

Where I've worked.

Before my PhD, I built systems and machine-learning software across industry, from deep-learning libraries to autonomy and high-frequency data infrastructure.

2022
Jan – July 2022
Ford
Autonomy Sim Intern
📍 Dearborn, MI

Built autonomous-driving simulations, applying computer vision to model vehicle behavior.

2022
May – Aug 2022
NVIDIA
Deep Learning SW Dev Intern
📍 Santa Clara, CA

Built conversational-AI models and improved CUDA-X AI libraries, boosting performance and reducing reported bugs across projects.

2022–23
Aug 2022 – May 2023
Tesla
Vision Software Engineer
📍 Palo Alto, CA

Developed autonomy algorithms in C++ and Unreal Engine and analyzed self-driving case data with deep-learning techniques.

2023
May – Aug 2023
Citadel
Data Processing SWE Intern
📍 New York, NY

Built and optimized large-scale data pipelines on Spark and Kafka, handling terabytes of market data daily with real-time streaming.

2024–26
2024 – 2026
Google
SWE · Modeling Infra
📍 Mountain View, CA

Built and maintained model pipelines and sampling for large-scale ML modeling infrastructure.

2023 →
Sept 2023 – Present
U-M · Lin Ma Lab
Database Researcher → PhD
📍 Ann Arbor, MI

Researching robustness and explainability of ML-driven database tuning, with benchmarks built for noisy environments.

Research

Databases that learn

Modern data systems increasingly tune themselves with machine learning. My research asks what happens when those systems meet the messiness of the real world, and how we can keep them robust, explainable, and reliable.

01

Robust Knob Tuning

Methodology and benchmarks that test the resilience of ML-based database tuning, improving throughput and cutting variance in noisy environments.

02

Explainable Systems

Identifying weaknesses in existing methods and raising system reliability through better analytics and prioritization strategies.

03

Analytics at Scale

GPU-accelerated ETL pipelines that reduce processing times and streamline evaluation over large-scale datasets.

Get in touch