Some Brief Notes: On the Nature of Differentiable Intelligence and the Pitfalls of the Neural-Symbolic Paradigm in Solving the Problem of Black-Box Feature Learning
Since the 1980s, our field has well understood the potential of neural networks with differentiable activations and backpropagation. As we are all aware, this breakthrough, combined with the discovery of automated feature learning in the CNN revolution, laid the foundations of deep learning, which then grew exponentially in the wake of the ReLU, GPU training, and the era of deep RL.
As such, we now know the potential of a paradigm of deeper differentiable learning, and we have seen it borne out across a range of fields: Transformers, deep RL, deep tabular learning, neural ODEs, and more. At the same time, I have heard much talk of the neural-symbolic paradigm as some kind of potential fix for the black-box problem that arises when representation learning is non-explicit; in this piece I intend to pour cold water on that idea.
I think the central problem with using neural-symbolic AI (NeSy) to solve the black-box problem is that we would have to change how feature learning works; otherwise we have essentially just stuck some symbolic code onto a weight matrix and hoped it will do something. It will not. The black-box problem is an unavoidable element of our field's greatest breakthroughs. The data modern AI deals with is far too large for any amount of explicit feature engineering, and the moment we hand feature learning back to human labeling, we immediately lose the greatest power of deep AI: that it learns predictive representations alien to the human eye, and yet, by the same token, immensely accurate. A mistake many are prone to making is assuming that AI's greatness lies in conducting predictive analysis much faster than a human could. In many ways this is hardly the major advantage; the major advantage is that these systems often learn representations from data that are more predictively accurate than the features a human would choose.
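To make the value of learned representations concrete, here is a minimal sketch in plain NumPy. The hidden weights are hand-fixed for readability rather than learned (a trained network's representation is rarely this legible), but the point stands: XOR admits no linear rule on the raw inputs, while a two-unit ReLU representation makes it linearly separable.

```python
import numpy as np

# XOR truth table: not linearly separable in the raw input space.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

def relu(z):
    return np.maximum(z, 0.0)

# Hand-fixed hidden weights standing in for what a trained net might learn.
W1 = np.array([[ 1.0, -1.0],
               [-1.0,  1.0]])

H = relu(X @ W1)                           # hidden representation of each input
preds = (H.sum(axis=1) > 0.5).astype(int)  # a single linear readout now suffices
print(preds)  # [0 1 1 0] — matches y exactly
```

The readout only works because the representation changed; no reweighting of the raw pixels, so to speak, could have done it.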
In this regard, the symbolic paradigm can do little to fix this state of affairs. There is no obvious mechanism by which the hard-coded rules of a symbolic system help solve the black-box issue, because the learned weights of the neural net remain just as vast and just as difficult to gauge. Consider the data space a neural net can traverse: it is, as one knows, rather large. Now consider the space a symbolic system can traverse: it is much smaller. Herein lies the problem: how can the symbolic process keep up with the space the neural process handles? There is no doubt a useful application in the combination, to the degree that one can imagine a symbolic system reasoning over the outputs of a neural net, but this does nothing to solve the black-box problem. The symbolic process may reason over the net's outputs, yet it will never understand why the net produced them.
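To see why, consider a toy sketch (the function names, weights, and threshold are all hypothetical, invented for illustration) in which a symbolic rule layer consumes only a network's output probabilities. The rule can act on those outputs, but it has no access to, and offers no account of, the internal representation that produced them.

```python
import numpy as np

def neural_classifier(x, W, b):
    # Stand-in for a trained network: the weights are opaque learned numbers.
    logits = x @ W + b
    e = np.exp(logits - logits.max())
    return e / e.sum()

def symbolic_filter(probs, labels, threshold=0.9):
    # A hard, human-readable rule over the net's *outputs* only:
    # accept the top prediction when its confidence clears the threshold.
    top = int(np.argmax(probs))
    return labels[top] if probs[top] >= threshold else "abstain"

# Opaque, pre-"trained" weights; the rule never sees or explains them.
W = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])
b = np.zeros(2)

probs = neural_classifier(np.array([3.0, 0.0]), W, b)
decision = symbolic_filter(probs, labels=["cat", "dog"])
print(decision)  # cat
```

The rule layer is perfectly interpretable, and the pipeline is exactly as much of a black box as it was before the rule was added: ask *why* the net said "cat" and the symbolic half has nothing to offer.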
In addition, one must decide whether the goal is to make neural nets themselves logical (differentiable logic gates, for example) or to combine symbolic and neural systems as separate components. The issue in the first case is what we potentially lose in representation learning by pushing the differentiable units toward boolean behavior; in the second, it is the communication problem: how do we get the two systems to "talk" to each other in a sufficient way?
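For the first case, a common relaxation (a sketch of the general idea, not any particular library's API) replaces boolean gates with product t-norm versions that are differentiable on [0, 1]. They agree with the boolean gates at the corners, but hardening activations back to {0, 1} after training throws away exactly the graded, distributed values that representation learning exploits.

```python
# Product t-norm relaxations of boolean gates: differentiable on [0, 1]
# and exactly boolean at the corners {0, 1}.
def soft_and(a, b):
    return a * b

def soft_or(a, b):
    return a + b - a * b

def soft_not(a):
    return 1.0 - a

# At the corners these reproduce the usual truth tables...
assert soft_and(1.0, 0.0) == 0.0 and soft_or(1.0, 0.0) == 1.0

# ...but in between they carry graded values (and hence gradients):
# d(soft_and)/da = b, so a learning signal flows through the gate.
print(soft_and(0.5, 0.5))  # 0.25 — information a hard boolean gate would destroy
```

The design tension is visible in the last line: the gradient exists only because the gate is *not* boolean, so the closer we snap it to crisp logic, the less there is for gradient descent to work with.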
Neural-symbolic AI has the potential to be another tool in the ML engineer's vast toolkit; it does not, however, provide an immediate solution to a problem that may be genuinely impossible to solve unless we are willing to give up the greatest asset this field has attained: the deep learning of representations.