Data Science Transdisciplinary Area of Excellence
Mini-Symposium: Data Science and AI for STEM
Friday, May 9th, 2025
LOCATION CHANGE: Atrium of Old Champlain (Room OH-133) and by Zoom

Schedule

11:30 AM – 12:30 PM Lunch
12:30 PM – 12:40 PM Opening remarks
Session 1
12:40 PM – 1:10 PM Advances in computational cognition: Toward more 'human-like' machine learning Ken Kurtz
Abstract: I will discuss several projects in which design principles inspired by the psychology of human cognition are used as the basis for novel approaches to machine learning tasks. Rather than relying on massive architectures and training sets, insights from computational cognition are used to imbue simple computational modules with increased robustness or flexibility for more human-like learning and generalization performance.
1:10 PM – 1:40 PM Charting ideas in motion through science, patents, and law Sadamori Kojaku
Abstract: Understanding how knowledge is created and evolves requires mapping the dynamic interplay between ideas and the communities that shape them. This talk introduces the "Knowledge Atlas," a computational framework for mapping the emergence, evolution, and diffusion of scientific, technological, and legal concepts. By integrating large-scale, diverse datasets (including text and citations in scientific publications, patents, and legal documents) into an embedding space, we examine how ideas propagate through social systems. We explore how this dynamic mapping can shed light on the processes underlying innovation, distinguishing between incremental advancements (interpolation) and potentially disruptive leaps (extrapolation). Furthermore, we investigate how computational models can help understand historical patterns of discovery. The resulting visualizations and analytical tools offer powerful new ways to track intellectual trajectories, identify hidden linkages between fields, understand barriers to knowledge diffusion, and ultimately foster environments for creativity and impactful discovery.
1:40 PM – 2:10 PM Solving ordinary differential equations using AI (or not) Minghao Rostami
Abstract: Ordinary Differential Equations (ODEs) are often used to model the evolution of a quantity of interest over time. Examples include the predator-prey model used to describe the change in the population of a predator and that of a prey in a biological system where the two interact. I will review conventional and data-driven approaches for solving ODEs. The former requires the equations to be fully known and is based on numerical integration. The latter has the advantage of not requiring complete knowledge of the equations but instead requires abundant data on the trajectory of the quantity of interest through time. As an example, we show that a Deep Neural Network (DNN) can be trained to predict the path traced out by a particle in a fluid flow without knowing the fluid velocity.
2:10 PM – 2:30 PM Break
Session 2
2:30 PM – 3:00 PM Challenges for machine learning in condensed matter physics Michael Lawler
Abstract: Machine learning has made important advances in a variety of scientific enterprises but has had little impact on the study of condensed matter physics. In this talk, I will argue, through several examples, that the lack of interpretability of neural networks hinders progress in science. Some of my examples will use interpretable methods outside of neural networks, and some will involve neural networks where the struggle is more obvious. I will end the talk with an idea that doesn't require interpretability to succeed: a plan for a foundation model to improve the reproducibility of science by beginning with the materials growth problem.
3:00 PM – 3:30 PM Cargo recognition mechanism of the dynein adapter BicD2 Sozanne Solmaz
Abstract: The dynein adapter BicD2 recognizes cargo for transport and facilitates three cellular transport pathways that are essential for brain development. The importance of BicD2 for brain and muscle development is underscored by the fact that human disease mutations cause devastating brain and muscle developmental diseases, including spinal muscular atrophy, which is the most common genetic cause of death in infants. Here, we used AI-based structural prediction methods to establish structural models for three BicD2/cargo complexes that have essential roles in brain development and that are affected by the BicD2 disease mutations.
3:30 PM – 4:00 PM Data-Driven Discoveries from Large-scale and Complex Data Minjie Wang
Abstract: The development of statistical machine learning techniques has facilitated data-driven scientific discoveries from large-scale and complex data. In this talk, I will present several new statistical machine learning techniques we have developed to help make such discoveries, including integrative clustering for mixed multi-view data, graph learning for huge data via minipatch learning, and causal discovery for diverse types of outcomes and unmeasured confounders. I will highlight the utility of the proposed methods on various genomic and neuroscience data.
4:00 PM – 4:10 PM Break
Panel Discussion
4:10 PM – 4:40 PM Problems, Challenges, and Opportunities for AI and Science

This event is part of the 2024-2025 Data Science TAE’s Thematic Program under the official title: Data Science and Artificial Intelligence for Scientific Discoveries.