Long short-term memory (LSTM) models are a variant of RNNs, modified to capture both short-term and long-term dependencies. The purpose of the memory cells is to retain information over long time spans.
LSTMs became, for RNNs, the practical analogue of convolutional neural networks. Their memory cell is controlled by three gates; a minimal sketch of a single LSTM step is given after the gate list below.
- Forget gate
- If element ‘j’ of the forget gate f_t is 0, then we clear element ‘j’ of the memory (set it to 0).
- If it is 1, then we keep the old value.
- “Given the input and previous activation, are the elements in memory still relevant?”
- Input gate
- If element ‘j’ of the input gate i_t is 0, then we do not add any new information to element ‘j’ of the memory (no input).
- If it is 1, then we write a new “value” to the memory (where the “value” is itself a function of the input x_t and the previous activation h_{t-1}).
- “Given the input and previous activation, should I write something new to memory?”
- Output gate
- If element ‘j’ of the output gate o_t is 0, then we do not read element ‘j’ from the memory (no output).
- If it is 1, then we load that element from the memory into the activation h_t.
- “Given the input and previous activation, should I read what is in memory?”
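As a minimal sketch of how the three gates interact in a single step, the NumPy snippet below computes the standard LSTM cell update. The specific names (`lstm_step`, `W`, `b`, `h_dim`) and the choice to pack all four pre-activations into one weight matrix are illustrative assumptions, not taken from the notes above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W maps [x_t; h_prev] to the four gate pre-activations.
    Names and packing are illustrative assumptions."""
    h_dim = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b   # all pre-activations at once
    f_t = sigmoid(z[0*h_dim:1*h_dim])           # forget gate: keep (1) or clear (0) each memory element
    i_t = sigmoid(z[1*h_dim:2*h_dim])           # input gate: write new information or not
    o_t = sigmoid(z[2*h_dim:3*h_dim])           # output gate: read memory or not
    g_t = np.tanh(z[3*h_dim:4*h_dim])           # candidate "value" computed from x_t and h_prev
    c_t = f_t * c_prev + i_t * g_t              # memory update: forget old content, add new content
    h_t = o_t * np.tanh(c_t)                    # activation: read the (gated) memory
    return h_t, c_t

# Example usage with small illustrative dimensions.
x_dim, h_dim = 4, 3
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * h_dim, x_dim + h_dim))
b = np.zeros(4 * h_dim)
h, c = np.zeros(h_dim), np.zeros(h_dim)
h, c = lstm_step(rng.normal(size=x_dim), h, c, W, b)
```

Because each gate is a sigmoid, its elements lie between 0 and 1, so in practice the cell interpolates between the extreme "keep/clear" and "write/skip" behaviours described in the list above.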