Dianne O'Leary
Distinguished University Professor Emerita
University of Maryland, College Park
Deep learning models for classification have been criticized for being difficult to interpret, which undermines confidence in their use for important applications. We demonstrate the power of flip points in interpreting and debugging these models.
A flip point is a point on the decision boundary between two output classes, where the model assigns equal output to the two classes. Finding the flip point closest to a given input is a tractable optimization problem.
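To make the optimization concrete, the following is a minimal sketch, not the authors' implementation: for a binary classifier whose class is decided by the sign of a single logit, the closest flip point minimizes the distance to the input subject to the constraint that the logit vanishes. The toy network, its random weights, and the input x0 below are illustrative assumptions.

    # Minimal sketch: closest flip point for a toy binary classifier.
    # The network, weights, and input are illustrative assumptions.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)

    # Toy two-layer network with a single logit output; the sign of
    # the logit decides the class, so the boundary is logit(x) = 0.
    W1, b1 = rng.standard_normal((5, 2)), rng.standard_normal(5)
    w2, b2 = rng.standard_normal(5), 0.1

    def logit(x):
        return w2 @ np.tanh(W1 @ x + b1) + b2

    x0 = np.array([1.0, -0.5])  # input whose closest flip point we seek

    # Minimize ||x - x0||^2 subject to logit(x) = 0, i.e., find the
    # point on the decision boundary nearest to x0.
    res = minimize(
        lambda x: np.sum((x - x0) ** 2),
        x0=x0,
        constraints=[{"type": "eq", "fun": logit}],
        method="SLSQP",
    )
    flip_point = res.x
    print("closest flip point:", flip_point)
    print("distance to boundary:", np.linalg.norm(flip_point - x0))

The resulting distance from x0 to its closest flip point measures how far the input lies from the decision boundary, which is the kind of quantity used below to assess uncertainty in the model's output.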
We demonstrate the use of closest flip points to identify flaws in a neural network model, to generate synthetic training data to correct the flaws, to assess uncertainty in the model's output, and to provide individual and group-level interpretations of the model.
This is joint work with Roozbeh Yousefzadeh.