Department of Industrial and Systems Engineering
Texas A&M University
On Generalization Bounds for Learning from Batch and Streaming Data
In an era where Big Data has revolutionized information sciences, data analytics offers exciting opportunities to improve human life through applications such as healthcare and robotics. This talk centers on three fundamental challenges in data analytics: the scaling of computational cost with data size (scalability), the ability to predict outcomes on unseen data (generalizability), and online, generalizable learning of a data model in dynamic environments.
In the first part of the talk, I will discuss scalability and generalizability in statistical learning. Despite their success in nonlinear representation of data, kernel methods suffer from a prohibitive computational cost in large-scale machine learning. I present a greedy approximation technique that extracts useful feature maps, improving generalization in supervised learning (with respect to model complexity) compared to the state of the art. The performance bound captures the trade-off between the sparsity of the underlying model and the number of features required to explain it. The results are verified on several practical datasets.
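To make the scalability issue concrete: explicit finite-dimensional feature maps replace the n-by-n kernel matrix with linear-time features. The sketch below uses random Fourier features (Rahimi and Recht) as a generic illustration of this idea; it is not the greedy selection method of the talk, and the kernel bandwidth and feature count are arbitrary choices for the example.

```python
import numpy as np

def random_fourier_features(X, n_features=100, gamma=1.0, seed=None):
    """Approximate the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)
    with an explicit feature map Z so that Z @ Z.T ~ K."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies drawn from the kernel's spectral density N(0, 2*gamma*I)
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Learning a linear model on Z costs O(n * n_features),
# versus O(n^2) (or worse) for working with the exact kernel matrix.
X = np.random.default_rng(0).normal(size=(500, 5))
Z = random_fourier_features(X, n_features=2000, gamma=0.5, seed=0)
K_approx = Z @ Z.T
K_exact = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
print(np.abs(K_approx - K_exact).max())  # approximation error shrinks as n_features grows
```

The trade-off mentioned in the abstract shows up here directly: fewer features mean cheaper computation but a coarser kernel approximation.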
In the second part, I will discuss online learning/tracking of a time-varying data model using streaming data. The problem is viewed as an optimization of a time-varying objective function encapsulating the model sequence. Prior works established performance bounds in terms of either “regularity of the model pattern” or “temporal variability of the objective function”. I present the first algorithm that automatically adapts to the best of both worlds without prior knowledge of the environment. I also present an application of the algorithm in object recognition via tactile sensing.
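For intuition on the tracking setup: at each round the learner predicts, a time-varying loss is revealed, and performance is measured against the moving sequence of minimizers. The toy below runs plain online gradient descent on a drifting quadratic objective; it is a generic illustration of the problem, not the adaptive algorithm presented in the talk, and the drift rate and step size are assumptions of the example.

```python
import numpy as np

T, eta = 200, 0.3
theta_hat = np.zeros(2)   # learner's current model estimate
regret = 0.0

for t in range(T):
    # Slowly drifting true model (the "model sequence" to be tracked)
    theta_t = np.array([np.cos(0.05 * t), np.sin(0.05 * t)])
    # Time-varying objective f_t(x) = 0.5 * ||x - theta_t||^2,
    # revealed only after the learner commits to theta_hat
    loss = 0.5 * np.sum((theta_hat - theta_t) ** 2)
    regret += loss                    # the comparator theta_t attains zero loss
    grad = theta_hat - theta_t
    theta_hat -= eta * grad           # gradient step toward the moving target

print(regret / T)  # average tracking loss stays small when the drift is slow
```

Bounds of the kind cited in the abstract relate this cumulative loss to the path length of the comparator sequence ("regularity of the model pattern") or to how much the losses change over time ("temporal variability of the objective function").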
Shahin Shahrampour is an Assistant Professor in the Department of Industrial and Systems Engineering at Texas A&M University. He was previously a Postdoctoral Fellow in the School of Engineering and Applied Sciences at Harvard University. Prior to that, he received the Ph.D. degree in Electrical and Systems Engineering, the M.A. degree in Statistics (The Wharton School), and the M.S.E. degree in Electrical Engineering, all from the University of Pennsylvania, in 2015, 2014, and 2012, respectively. His research interests include machine learning, optimization, sequential decision-making, and distributed learning, with a focus on developing computationally efficient methods for data analytics.
Friday, 3/22/2019, 11:30 AM, BLOC 113