From Black Box to Blueprint:
Trustworthy AI for Materials Discovery
I am a PhD candidate at Johns Hopkins University dedicated to building the next generation of autonomous reasoning frameworks for materials discovery. By bridging molecular simulation (DFT/MD) with machine learning, I aim to move beyond "black-box" models toward trustworthy AI that supports reliable, physics-informed materials design.
Recent Work
What is Your Force Field Really Learning? Gaining Scientific Intuition with a Dual-Level Explainability Framework
Yi Cao, Peter Mastracco, Jieneng Chen, Alan Yuille, Paulette Clancy*
AAAI Conference Workshop (XAI4Science), Spotlight (2026)
Dual-level explainability framework bridging model reasoning with human understanding in scientific AI.
[PDF (Coming Soon)]
[Code (Coming Soon)]
Migration as a Probe: A Generalizable Benchmark Framework for Specialist vs. Generalist Machine-Learned Force Fields
Yi Cao and Paulette Clancy*
NeurIPS Conference Workshop (AI4Mat), Spotlight (2025)
Comprehensive benchmarking framework for evaluating MLFF generalization in materials science.
[PDF]
Atomic Switch Control via Two-Mode Intercalation for Tunable 2D Materials
Yi Cao, Victor Wu, and Paulette Clancy*
npj 2D Materials and Applications, Under Review (2025)
Investigating two-mode intercalation for tunable electronic properties in 2D materials.
[Project Website]
Low-energy pathways lead to self-healing defects in CsPbBr₃
Kumar Miskin, Yi Cao, Madaline Marland, Farhan Shaikh, David T. Moore, John Marohn, Paulette Clancy*
Phys. Chem. Chem. Phys., 27(29), 15446-15459 (2025)
Computational discovery of self-healing mechanisms with implications for rational material design.
[PDF]
[Post]
2025 Yi Cao. This work is licensed under CC BY-NC-SA 4.0.
Background & Expertise
Technical Portfolio
Machine Learning & AI
- Deep Learning (PyTorch)
- LLMs & Prompt Engineering
- Causal Inference & GNNs
- Explainable AI (XAI)
Scientific Computing
- Molecular Dynamics (LAMMPS)
- DFT (Quantum ESPRESSO)
- HPC (MPI, CUDA)
- Materials Informatics
Programming & Tools
- Python, MATLAB, R
- Git, Docker, Linux
- SQL, Distributed Computing
Education & Research Timeline
Sept 2023 - Present
PhD in ChemBE
Johns Hopkins University
Nov 2023 - Present
Graduate Researcher
Clancy Lab, JHU
Jun 2024 - Jul 2024
CADD Intern
Viva Biotech
2019 - 2023
B.S. Pharmaceutical Sci.
Fudan University
Dec 2022 - Feb 2023
Quality Culture Intern
Boehringer Ingelheim
Feb 2022 - Jun 2023
Undergrad Researcher
ISTBI, Fudan University
Aug 2021 - Dec 2021
Visiting Scholar
UC Berkeley
Work Experience
Computational Drug Design Intern
Viva Biotech | Shanghai, China | Jun - Jul 2024
Conducted computer-aided drug design (CADD) research using co-solvent MD simulations and protein-ligand interaction analysis to support drug discovery.
Scientific Advisor
Fudan iGEM Team | Shanghai, China | Dec 2022 - Nov 2023
Guided experimental design and scientific documentation. Led brainstorming sessions that contributed to a Gold Medal and the Best Environmental Project award.
Quality Culture Intern
Boehringer Ingelheim | Shanghai, China | Dec 2022 - Feb 2023
Led a team in developing a white paper on quality culture through research and interviews, resulting in improved company-wide quality guidelines.
Science Communication & Teaching
Conference Talks, Posters, and More
Selected for the PHM Society Doctoral Symposium 2025, presented at the MRS Fall Meeting 2024, and completed the JHU Teaching Institute certification.
Build a closed-loop system linking AI + simulations + experiments.
My long-term goal is to bridge the gap between computational simulations and experimental materials science, enabling a closed-loop design process.
With prior wet-lab training in biomaterials, I’ve seen how tedious trial-and-error methods are.
My vision is to build systems that minimize experiments by learning from past data and simulations—so materials discovery becomes faster, deeper, and smarter.
Use ML to extract maximum insight from minimal experiments.
I aim to merge simulation data and historical experiments using ML techniques such as active learning and transfer learning to uncover hidden patterns. This makes it possible to optimize material design with fewer experiments while learning more from each one, which accelerates understanding of atomic-level interactions and yields better materials in fewer cycles.
ML is not a black box—it’s a transparent, evolving partner in science.
To me, ML is not magic—it’s a dynamic tool that gains strength when guided by domain knowledge. With strong grounding in materials science, I see ML as a transparent, explainable collaborator.
CPUs and GPUs are extensions of human thought. AI and humans co-evolve, inspiring each other.
As Marie Curie once said, "Nothing in life is to be feared, it is only to be understood." Through interdisciplinary research in ML and materials, I hope to help people understand—and therefore face—the world with greater confidence and curiosity.
Get In Touch
I'm always interested in discussing computational materials science, machine learning applications in materials discovery, and potential collaborations. Feel free to reach out if you'd like to connect!
Address
3400 N. Charles Street
Baltimore, MD 21218
United States
Phone
+1 (443) 278-3766
Email
ycao73@jh.edu