Making The Black Box Transparent: Lessons in Opacity
Deep learning is all the rage now: it is powerful and it is cheap. To proponents of "explainable" machine learning, however, this is not really good news - deep learning is essentially a black box that one can't look into.
To be sure, there are efforts to peek inside the black box to see what it's learning - saliency maps and various visualization tools are useful for understanding what is going on inside deep neural networks. The question, of course, is whether it's worth it.
In this talk I shall cover the basics of looking into a deep neural network and share a different approach to doing so.
Outline/Structure of the Talk
First I shall go through the current state of looking into neural networks: from the good old-fashioned visualization of matrices (e.g. Andrej Karpathy's stunning work) to saliency maps to attention focus. Along the way I shall explore concepts like information bottlenecks and gradient topologies. These are tools that we currently use to better understand a neural network.
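To give a flavour of what a saliency map measures, here is a minimal sketch in plain Python. It is not taken from the talk: the model, its weights and the four-"pixel" input are all hypothetical, and the gradient is approximated with finite differences rather than the backpropagation a real framework would use. The idea is only that each input's saliency is the magnitude of the model's sensitivity to that input.

```python
import math

# Hypothetical weights for a toy model: score(x) = tanh(w . x).
WEIGHTS = [2.0, 0.0, -0.5, 0.1]

def score(pixels):
    """Scalar output of the toy model for a flat list of pixel values."""
    return math.tanh(sum(w * p for w, p in zip(WEIGHTS, pixels)))

def saliency(pixels, eps=1e-5):
    """Approximate |d score / d pixel_i| with central finite differences."""
    grads = []
    for i in range(len(pixels)):
        up = list(pixels)
        up[i] += eps
        down = list(pixels)
        down[i] -= eps
        grads.append(abs(score(up) - score(down)) / (2 * eps))
    return grads

# Hypothetical four-pixel "image".
image = [0.1, 0.9, 0.3, 0.5]
sal = saliency(image)

# Index of the input the output is most sensitive to.
most_salient = max(range(len(sal)), key=sal.__getitem__)
```

In this toy case the second pixel has zero saliency, because its weight is zero and the output ignores it, while the first pixel dominates.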
Then I shall provide abstract summaries of these states and how they may be thought of as tools for further exploration.
Following this I shall present my method of using neural networks to understand neural networks. At the time of writing, there have been some encouraging signs that this is feasible.
De-opaquing a black box comes with some interesting costs. Hence I shall finish up with some thoughts on the nature of looking into the black box.
I want audience members to walk out of this talk having a deeper understanding of what goes on inside a neural network. It is not enough to just build tools. We should also seek to understand our tools.
This talk is for people who are interested in knowing what goes on in deep neural networks, and for people who are interested in the notions of computation.
Prerequisites for Attendees
Attendees should be comfortable with basic linear algebra - at the very least with basic matrix multiplication and vector transforms.
I'm going to be eliding a lot of the linear algebra in this talk, but one cannot escape the fact that deep learning essentially boils down to computational linear algebra.
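As a rough illustration of that claim, the sketch below writes the forward pass of a tiny two-layer network as nothing more than matrix-vector products, vector additions and an elementwise nonlinearity. It uses plain Python lists rather than a numerical library, and the weights, biases and layer sizes are hypothetical values chosen purely for illustration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def add(u, v):
    """Elementwise sum of two vectors of equal length."""
    return [a + b for a, b in zip(u, v)]

def relu(v):
    """Elementwise rectified linear unit."""
    return [max(0.0, a) for a in v]

# Hypothetical parameters: 3 inputs -> 2 hidden units -> 1 output.
W1 = [[1.0, -1.0, 0.5],
      [0.0, 2.0, -1.0]]
b1 = [0.0, 1.0]
W2 = [[1.0, -2.0]]
b2 = [0.5]

def forward(x):
    """Forward pass: W2 * relu(W1 * x + b1) + b2."""
    hidden = relu(add(matvec(W1, x), b1))
    return add(matvec(W2, hidden), b2)

output = forward([1.0, 2.0, 3.0])  # [-3.0]
```

Anyone comfortable following `matvec` here has the linear algebra background the talk assumes.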