The need for long term memory
I will discuss an example of a problem motivated by optical character recognition. In this problem we study decoding messages written with characters 'X' and 'O', but we can only see the upper half of the characters. For instance, the message 'XOOXXOO' may look like 'VMWM'. This simplifying writing system has the property that it cannot be decoded by looking at a fixed window of characters, but requires remembering something from a distant past. I will also show a hand-crafted neural network which solves the decoding problem for these messages, which is a Recurrent Neural Network (RNN) with long and short term memory, and is closely related to LSTM (Long-Short Term Memory) RNNs known from literature. This example appears to be the only known example where the need for long memory is demonstrated, and yet the neural network solving the problem can be found by hand.