Book chapter · Peer-reviewed

Long Short-Term Memory

2012; Springer Nature; Language: English

10.1007/978-3-642-24797-2_4

ISSN

1860-9503

Authors

Alex Graves

Topic(s)

Neural Networks and Applications

Abstract

As discussed in the previous chapter, an important benefit of recurrent neural networks is their ability to use contextual information when mapping between input and output sequences. Unfortunately, for standard RNN architectures, the range of context that can be accessed in practice is quite limited. The problem is that the influence of a given input on the hidden layer, and therefore on the network output, either decays or blows up exponentially as it cycles around the network's recurrent connections. This effect is often referred to in the literature as the vanishing gradient problem (Hochreiter, 1991; Hochreiter et al., 2001a; Bengio et al., 1994). The vanishing gradient problem is illustrated schematically in Figure 4.1.
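The exponential decay or blow-up described in the abstract can be made concrete with a small numerical experiment. The sketch below is not from the chapter; the network sizes, weight scales, and function names are illustrative assumptions. It accumulates the Jacobian of a later hidden state with respect to the initial one for a simple tanh RNN and tracks its norm: a product of per-step Jacobians whose norm tends to shrink exponentially when the recurrent weights are small and to grow when they are large.

```python
# A minimal sketch (illustrative, not the chapter's code) of the vanishing /
# exploding gradient effect in a simple tanh RNN with no input, h_t = tanh(W h_{t-1}).
import numpy as np

rng = np.random.default_rng(0)
hidden_size = 32          # assumed hidden layer size
steps = 50                # number of recurrent steps to unroll

def jacobian_norms(weight_scale):
    """Spectral norm of d h_t / d h_0 at every step, for a given weight scale."""
    W = rng.normal(0.0, weight_scale / np.sqrt(hidden_size),
                   (hidden_size, hidden_size))
    h = rng.normal(size=hidden_size)
    jac = np.eye(hidden_size)
    norms = []
    for _ in range(steps):
        a = W @ h                          # pre-activation
        h = np.tanh(a)
        # Per-step Jacobian of h_t w.r.t. h_{t-1}: diag(1 - tanh(a)^2) @ W
        jac = (np.diag(1.0 - h ** 2) @ W) @ jac
        norms.append(np.linalg.norm(jac, 2))
    return norms

for scale in (0.5, 1.0, 2.0):              # small, moderate, large recurrent weights
    print(f"weight scale {scale}: |d h_50 / d h_0| = {jacobian_norms(scale)[-1]:.3e}")
# Small recurrent weights drive the norm towards zero (vanishing gradient);
# large weights typically make it grow exponentially (exploding gradient).
```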
