
EE Seminars

Robust Loss Functions for Classification: An information theoretic perspective


Date:  Thu, July 27, 2023
Time:  10:30am - 11:30am
Location:  Holmes Hall 389; online attendance also available (check your email or contact us)
Speaker:  Lalitha Sankar, Arizona State University

Abstract

Machine learning (ML) has dramatically enhanced the role of automated decision making across a variety of domains. There are three ingredients at the heart of designing sound ML algorithms: data, learning architectures, and loss functions. For the canonical supervised learning problem of classification, learning algorithms would ideally optimize classification accuracy on the training data directly, as quantified by the 0-1 loss, but unfortunately this is an NP-hard problem. Hence, in practice, surrogate loss functions of the 0-1 loss are employed to shape the learning process, guiding the learning algorithm towards predicting the class labels correctly from the training features.
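
To make the surrogate idea concrete, here is a minimal, illustrative Python sketch (not from the talk) contrasting the non-differentiable 0-1 loss with the log-loss surrogate on a toy set of predicted probabilities; all numbers and function names are assumptions for illustration only.

import numpy as np

def zero_one_loss(y_true, p_pred):
    # Fraction of misclassified examples when thresholding at 0.5;
    # this is the quantity we would ideally minimize, but it is
    # piecewise constant and hard to optimize directly.
    return np.mean((p_pred >= 0.5).astype(int) != y_true)

def log_loss(y_true, p_pred):
    # Surrogate loss: negative log of the probability assigned to the
    # true label; smooth, so gradient-based learning can optimize it.
    p_true = np.where(y_true == 1, p_pred, 1.0 - p_pred)
    return np.mean(-np.log(p_true))

y = np.array([1, 0, 1, 1, 0])            # true labels (toy example)
p = np.array([0.9, 0.2, 0.6, 0.4, 0.7])  # predicted P(label = 1)

print(zero_one_loss(y, p))  # 0.4: two of five examples misclassified
print(log_loss(y, p))       # smooth surrogate value on the same predictions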

However, data in modern machine learning is often noisy, due to either the collection or collation process, and, worse yet, is also susceptible to adversarial influences. In short, data can be twisted, i.e., it follows a distribution different from the ground truth. As a result, optimizing a proper loss function on twisted data can perilously lead the learning algorithm towards the twisted posterior rather than towards the desired clean posterior.

To rectify this shortcoming of properness, this talk presents the role of parametrized loss functions in ensuring robust and reliable ML. In particular, we will focus on a hyperparameterized loss function called α-loss, which historically arose from operationally motivated information-theoretic leakage measures. Tuning the hyperparameter α from 0 to ∞ allows continuous interpolation between known and oft-used losses: log-loss (α = 1), exponential loss (α = 1/2), and the soft 0-1 loss (α = ∞).
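
As a rough illustration of this interpolation, the sketch below implements one commonly used parametrization of α-loss, ℓ_α(p) = (α/(α−1))(1 − p^((α−1)/α)) for α ≠ 1, with the log-loss −log p recovered in the limit α → 1; the exact form used by the speaker may differ, so treat this as an assumption.

import numpy as np

def alpha_loss(p, alpha):
    # Loss assigned to predicting probability p for the correct class,
    # under the assumed parametrization described above.
    if np.isclose(alpha, 1.0):
        return -np.log(p)                 # log-loss at alpha = 1
    return (alpha / (alpha - 1.0)) * (1.0 - p ** ((alpha - 1.0) / alpha))

p = 0.8
print(alpha_loss(p, 0.5))   # exponential-type loss: equals 1/p - 1 = 0.25
print(alpha_loss(p, 1.0))   # log-loss: -log(0.8), about 0.223
print(alpha_loss(p, 1e6))   # approaches the soft 0-1 loss: 1 - p = 0.2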

We conclude by highlighting how the core information-theoretic properties of this loss function class allow it to unify a range of generative adversarial network (GAN) models. Here, we will show that a large class of GANs, from the original GAN (often called the vanilla GAN) to f-GANs to Wasserstein and other IPM GANs, is captured by using α-loss to write the GAN value function, thus providing a mechanism for meaningful comparisons of GANs. Throughout the talk, the technical results will be accompanied by results on publicly available large datasets and deep learning models.
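
As a purely illustrative sketch of the idea (and not the speaker's exact construction, which is more general), the vanilla GAN value function E[log D(x)] + E[log(1 − D(G(z)))] can be written through α-loss and is recovered at α = 1; the discriminator outputs below are assumed toy values.

import numpy as np

def alpha_loss(p, alpha):
    # Same assumed alpha-loss parametrization as in the earlier sketch.
    if np.isclose(alpha, 1.0):
        return -np.log(p)
    return (alpha / (alpha - 1.0)) * (1.0 - p ** ((alpha - 1.0) / alpha))

def gan_value(d_real, d_fake, alpha):
    # Discriminator payoff: reward confidence on real samples and
    # penalize confidence on generated samples, via alpha-loss.
    return np.mean(-alpha_loss(d_real, alpha)) + np.mean(-alpha_loss(1.0 - d_fake, alpha))

d_real = np.array([0.9, 0.8])   # assumed discriminator outputs on real samples
d_fake = np.array([0.3, 0.1])   # assumed discriminator outputs on generated samples
print(gan_value(d_real, d_fake, 1.0))   # vanilla GAN value function at alpha = 1
print(gan_value(d_real, d_fake, 0.5))   # another member of the family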

Biography

Lalitha Sankar is a Professor in the School of Electrical, Computer, and Energy Engineering at Arizona State University, where she started as an Assistant Professor in 2012 and was promoted to Associate Professor in 2018. She received her doctorate from Rutgers University, her master's degree from the University of Maryland, and her bachelor's degree from the Indian Institute of Technology, Bombay. Her research is at the intersection of information theory and learning theory, as well as their applications to the electric grid. She received the NSF CAREER award in 2014.

