A recursive neural network is a kind of deep neural network created by applying the same set of weights recursively over a structured input, traversing the given structure in topological order, to produce either a structured prediction over variable-size input structures or a scalar prediction on them. Recursive neural networks, sometimes abbreviated as RvNNs, have been successful, for instance, in learning sequence and tree structures in natural language processing, mainly continuous representations of phrases and sentences based on word embeddings. RvNNs were first introduced to learn distributed representations of structure, such as logical terms.[1]
Models and general frameworks have been developed in subsequent work since the 1990s.[2][3]
Architectures
Basic
In the simplest architecture, nodes are combined into parents using a weight matrix that is shared across the whole network and a non-linearity such as tanh. If c1 and c2 are n-dimensional vector representations of nodes, their parent will also be an n-dimensional vector, calculated as
$$p_{1,2} = \tanh\left(W [c_1 ; c_2]\right)$$
where W is a learned n × 2n weight matrix and [c1; c2] denotes the concatenation of the two child vectors.
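The composition described above, a shared weight matrix applied to the two child vectors followed by tanh, can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the children are concatenated before being multiplied by W, and the tree encoding (nested 2-tuples of leaf keys), the function names `compose` and `encode`, and the random toy embeddings are all hypothetical choices made for the example.

```python
import math
import random

def compose(W, c1, c2):
    """Parent vector tanh(W [c1; c2]) for two n-dimensional children."""
    x = c1 + c2  # list concatenation, length 2n
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in W]

def encode(tree, W, embeddings):
    """Encode a binary tree bottom-up, reusing the same W at every node."""
    if isinstance(tree, tuple):  # internal node: (left subtree, right subtree)
        left, right = tree
        return compose(W, encode(left, W, embeddings), encode(right, W, embeddings))
    return embeddings[tree]  # leaf: look up its vector

n = 3
rng = random.Random(0)
# Hypothetical untrained parameters: W is n x 2n, one n-dim vector per word.
W = [[rng.uniform(-0.5, 0.5) for _ in range(2 * n)] for _ in range(n)]
embeddings = {w: [rng.uniform(-1, 1) for _ in range(n)] for w in ["the", "cat", "sat"]}

# Tree for ((the cat) sat): the root vector has the same size as each leaf.
root = encode((("the", "cat"), "sat"), W, embeddings)
```

Because every parent has the same dimension as its children, the same `compose` call works at any depth, which is what allows one weight matrix to serve trees of any size.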
This architecture, with a few improvements, has been used successfully for parsing natural scenes, for syntactic parsing of natural language sentences,[4] and for recursive autoencoding and generative modeling of 3D shape structures in the form of cuboid abstractions.[5]
Recursive cascade correlation (RecCC)
RecCC is a constructive neural network approach for dealing with tree domains,[2] with pioneering applications to chemistry[6] and an extension to directed acyclic graphs.[7]
Unsupervised RNN
A framework for unsupervised recursive neural networks was introduced in 2004.[8][9]
Tensor
Recursive neural tensor networks use a single tensor-based composition function for all nodes in the tree.[10]
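A tensor-based composition can be sketched as follows. The text above does not give the formula, so the form used here is an assumption: each output unit k adds a bilinear term x^T V[k] x, with x the concatenation of the two children, to the linear term of the basic architecture before applying tanh. The names `tensor_compose`, `V`, and `W` and the toy parameter values are hypothetical.

```python
import math

def tensor_compose(V, W, c1, c2):
    """Assumed form: p_k = tanh(x^T V[k] x + W[k] . x), with x = [c1; c2]."""
    x = c1 + c2  # concatenation of the two child vectors, length 2n
    d = len(x)
    parent = []
    for V_k, W_k in zip(V, W):  # one tensor slice and one matrix row per output unit
        bilinear = sum(x[i] * V_k[i][j] * x[j] for i in range(d) for j in range(d))
        linear = sum(w * v for w, v in zip(W_k, x))
        parent.append(math.tanh(bilinear + linear))
    return parent

n = 2
# Hypothetical parameters: every tensor slice is the identity, W is all zeros,
# so each output unit reduces to tanh(|x|^2).
V = [[[1.0 if i == j else 0.0 for j in range(2 * n)] for i in range(2 * n)]
     for _ in range(n)]
W = [[0.0] * (2 * n) for _ in range(n)]
p = tensor_compose(V, W, [1.0, 0.0], [0.0, 0.0])
```

The same `V` and `W` would be reused at every node of the tree, just as the single matrix is in the basic architecture; the bilinear term lets the two children interact multiplicatively rather than only through a sum.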
The universal approximation capability of recursive neural networks over trees has been proved in the literature.[11][12]
Related models
Recurrent neural networks
Recurrent neural networks are recursive artificial neural networks with a certain structure: that of a linear chain. Whereas recursive neural networks operate on any hierarchical structure, combining child representations into parent representations, recurrent neural networks operate on the linear progression of time, combining the previous time step and a hidden representation into the representation for the current time step.
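The linear-chain view can be checked with a small sketch: running one shared combination step over a sequence gives the same result as recursing over a fully left-branching tree built from that sequence. The example makes simplifying assumptions that are not part of the text above: the hidden state and the inputs share one dimension, a single matrix W acts on their concatenation, and all names and values are hypothetical.

```python
import math

def combine(W, left, right):
    """Shared step: tanh(W [left; right])."""
    x = left + right
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in W]

def recursive_encode(W, node):
    """Recursive network: tuples are internal nodes, lists are leaf vectors."""
    if isinstance(node, tuple):
        return combine(W, recursive_encode(W, node[0]), recursive_encode(W, node[1]))
    return node

def recurrent_encode(W, h0, inputs):
    """Recurrent network: fold the same step over the time steps."""
    h = h0
    for x_t in inputs:
        h = combine(W, h, x_t)
    return h

W = [[0.5, -0.3, 0.2, 0.1], [0.4, 0.1, -0.6, 0.3]]  # hypothetical 2 x 4 weights
h0 = [0.0, 0.0]
xs = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]

# Build the left-branching chain (((h0, x1), x2), x3).
chain = h0
for x_t in xs:
    chain = (chain, x_t)

same = recursive_encode(W, chain) == recurrent_encode(W, h0, xs)
```

Here `same` is true because both functions apply `combine` to identical arguments in the same order; the recurrent loop is the recursion unrolled along a chain.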
Tree Echo State Networks
An efficient approach to implementing recursive neural networks is given by the Tree Echo State Network[13] within the reservoir computing paradigm.
Goller, C.; Küchler, A. (1996). "Learning task-dependent distributed representations by backpropagation through structure". Proceedings of International Conference on Neural Networks (ICNN'96). Vol. 1. pp. 347–352. CiteSeerX 10.1.1.52.4759. doi:10.1109/ICNN.1996.548916. ISBN 978-0-7803-3210-2. S2CID 6536466.
Sperduti, A.; Starita, A. (1997-05-01). "Supervised neural networks for the classification of structures". IEEE Transactions on Neural Networks. 8 (3): 714–735. doi:10.1109/72.572108. ISSN 1045-9227. PMID 18255672.
Frasconi, P.; Gori, M.; Sperduti, A. (1998-09-01). "A general framework for adaptive processing of data structures". IEEE Transactions on Neural Networks. 9 (5): 768–786. CiteSeerX 10.1.1.64.2580. doi:10.1109/72.712151. ISSN 1045-9227. PMID 18255765.
Socher, Richard; Lin, Cliff; Ng, Andrew Y.; Manning, Christopher D. "Parsing Natural Scenes and Natural Language with Recursive Neural Networks" (PDF). The 28th International Conference on Machine Learning (ICML 2011).
Hammer, Barbara; Micheli, Alessio; Sperduti, Alessandro; Strickert, Marc (2004-03-01). "A general framework for unsupervised processing of structured data". Neurocomputing. 57: 3–35. CiteSeerX 10.1.1.3.984. doi:10.1016/j.neucom.2004.01.008.
Socher, Richard; Perelygin, Alex; Wu, Jean Y.; Chuang, Jason; Manning, Christopher D.; Ng, Andrew Y.; Potts, Christopher. "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank" (PDF). EMNLP 2013.
Hammer, Barbara (2007-10-03). Learning with Recurrent Neural Networks. Springer. ISBN 9781846285677.