On Monday, 2 March, the Mathematical Institute will host Dr. Longxiu Huang of Michigan State University, who specializes in computational mathematics, a field widely applied in data analysis and machine learning.
At 14:15 in room 602, Dr. Huang will give a talk titled "Gradient-Informed Metrics for Supervised Manifold Learning".
Everyone interested is welcome.
Abstract:
High-dimensional datasets—common in biology, medicine, and the social sciences—often contain thousands of measured variables, even though only a small subset truly influences the outcome we care about. How can we uncover and visualize the meaningful structure hidden in such data, especially when labeled examples are limited?
In this talk, I present a geometric framework that reshapes how distances are measured between data points so that differences strongly related to the outcome are emphasized, while irrelevant variation is downplayed. The method estimates how the label changes locally across the dataset and uses this information to “stretch” distances along important directions. By computing path-based (geodesic) distances under this new geometry, we obtain low-dimensional embeddings that better reveal the underlying feature–label structure—often uncovering meaningful low-dimensional manifolds that standard visualization methods fail to capture.
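To make the idea of "stretching" distances concrete, here is a minimal sketch in plain Python. It is not the speaker's implementation. The crude difference-quotient gradient estimate, the metric form I + alpha * g g^T, the k-nearest-neighbour graph, and all function and parameter names (`knn`, `estimate_gradients`, `stretched_length`, `geodesic_distances`, `k`, `alpha`) are assumptions made for illustration only.

```python
import heapq
import math


def knn(points, k):
    """Indices of the k nearest neighbours (Euclidean) of every point."""
    out = []
    for i, p in enumerate(points):
        order = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        out.append([j for _, j in order[:k]])
    return out


def estimate_gradients(points, labels, nbrs):
    """Crude local label gradient: mean of difference quotients over neighbours."""
    grads = []
    for i, p in enumerate(points):
        g = [0.0] * len(p)
        for j in nbrs[i]:
            diff = [b - a for a, b in zip(p, points[j])]
            sq = sum(c * c for c in diff)
            if sq == 0.0:
                continue  # skip duplicate points
            scale = (labels[j] - labels[i]) / sq
            g = [gc + scale * c for gc, c in zip(g, diff)]
        grads.append([gc / len(nbrs[i]) for gc in g])
    return grads


def stretched_length(p, q, gp, gq, alpha):
    """Edge length under the assumed metric I + alpha * g g^T (g averaged over the edge)."""
    diff = [b - a for a, b in zip(p, q)]
    g = [(a + b) / 2 for a, b in zip(gp, gq)]
    along = sum(gc * c for gc, c in zip(g, diff))  # displacement along the gradient
    return math.sqrt(sum(c * c for c in diff) + alpha * along * along)


def geodesic_distances(points, labels, k=4, alpha=10.0):
    """All-pairs shortest-path distances on the kNN graph with stretched edges."""
    nbrs = knn(points, k)
    grads = estimate_gradients(points, labels, nbrs)
    n = len(points)
    graph = [dict() for _ in range(n)]
    for i in range(n):
        for j in nbrs[i]:
            w = stretched_length(points[i], points[j], grads[i], grads[j], alpha)
            graph[i][j] = w
            graph[j][i] = w  # symmetrise the neighbour graph
    dist = []
    for s in range(n):  # Dijkstra from every source
        d = [math.inf] * n
        d[s] = 0.0
        heap = [(0.0, s)]
        while heap:
            du, u = heapq.heappop(heap)
            if du > d[u]:
                continue  # stale heap entry
            for v, w in graph[u].items():
                if du + w < d[v]:
                    d[v] = du + w
                    heapq.heappush(heap, (d[v], v))
        dist.append(d)
    return dist


# Toy data: a 5x5 grid whose label depends only on the first coordinate.
points = [(float(i), float(j)) for i in range(5) for j in range(5)]
labels = [p[0] for p in points]
D = geodesic_distances(points, labels, k=4, alpha=10.0)

# Both pairs are 4 units apart in Euclidean terms, but only the second
# pair differs in the label-relevant coordinate.
print("along irrelevant axis:", D[0][4])
print("along relevant axis:  ", D[0][20])
```

In this toy setting the two corner pairs are equally far apart in the Euclidean sense, yet the path that crosses the label-relevant direction comes out longer. A low-dimensional embedding could then be obtained by feeding `D` into a method such as multidimensional scaling, which is omitted here.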
Beyond visualization, the same geometry improves prediction. Incorporating the modified distances into graph-based semi-supervised learning leads to more accurate label estimation when labeled data are scarce or noisy. Experiments on synthetic examples and real biological datasets demonstrate clearer structure and stronger predictive performance compared to classical approaches and neural network baselines.
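As a rough illustration of how a distance matrix can drive graph-based semi-supervised learning, the sketch below runs a simple iterative label propagation with clamped labeled points. The abstract does not specify the algorithm used in the talk, so the Gaussian edge weights, the averaging update, and the names `propagate_labels`, `sigma`, and `n_iter` are all assumptions. The distance matrix here is a toy one built from plain differences on a line; in practice it would be replaced by the modified distances.

```python
import math


def propagate_labels(dist, known, sigma=1.0, n_iter=200):
    """Iterative label propagation on a dense graph built from a distance matrix.

    dist  : square list of lists of pairwise distances
    known : dict {index: label value} for the labeled points (kept fixed)
    """
    n = len(dist)
    # Gaussian similarity weights; no self-loops.
    w = [[math.exp(-(dist[i][j] / sigma) ** 2) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    mean = sum(known.values()) / len(known)
    f = [known.get(i, mean) for i in range(n)]  # unlabeled points start at the mean
    for _ in range(n_iter):
        new_f = list(f)
        for i in range(n):
            if i in known:
                continue  # labeled points stay clamped
            total = sum(w[i])
            if total > 0.0:
                # Replace the estimate by the weighted average of its neighbours.
                new_f[i] = sum(w[i][j] * f[j] for j in range(n)) / total
        f = new_f
    return f


# Toy example: two clusters on a line, one labeled point in each.
positions = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(a - b) for b in positions] for a in positions]
f = propagate_labels(dist, known={0: 0.0, 5: 1.0})
print([round(v, 3) for v in f])
```

With only two labeled points, the unlabeled points in each cluster drift toward the label of the labeled point they are closest to. Distances that better reflect the label structure would change which points count as close.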