Long short-term memory (LSTM) models are a variant of RNNs, modified so that the network can capture both short-term and long-term dependencies. The purpose of the memory cells is to retain information over long time spans.

LSTMs are to RNNs what convolutional neural networks are to feed-forward networks: the variant that made them practical.

  • Forget gate
    • If element ‘j’ of the forget gate f_t is 0, then we clear element ‘j’ of the memory c_{t-1} (set it to 0).
    • If it is 1, then we keep the old value.
    • “Given the input and previous activation, are the elements in memory still relevant?”
  • Input gate
    • If element ‘j’ of the input gate i_t is 0, then we do not add any new information to the memory (no input).
    • If it is 1, then we write the candidate value to the memory (where the candidate value is itself a function of the input and the previous activation at time t).
    • “Given the input and previous activation, should I write something new to memory?”
  • Output gate
    • If element ‘j’ of the output gate o_t is 0, then we do not read that element from the memory (no output).
    • If it is 1, then we read the value from the memory.
    • “Given the input and previous activation, should I read what is in memory?”
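Putting the three gates together, here is a minimal sketch of one LSTM cell update in numpy. It assumes the standard LSTM gate equations (sigmoid gates, tanh candidate and output squashing), which are not spelled out in the notes above; the parameter names W_*, U_*, b_* are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    x_t    : input vector at time t
    h_prev : previous activation (hidden state)
    c_prev : previous memory cell state
    params : dict of weight matrices W_*, U_* and bias vectors b_*
             (hypothetical names, for illustration only)
    """
    # Forget gate: are the elements in memory still relevant?
    f_t = sigmoid(params["W_f"] @ x_t + params["U_f"] @ h_prev + params["b_f"])
    # Input gate: should we write something new to memory?
    i_t = sigmoid(params["W_i"] @ x_t + params["U_i"] @ h_prev + params["b_i"])
    # Output gate: should we read what is in memory?
    o_t = sigmoid(params["W_o"] @ x_t + params["U_o"] @ h_prev + params["b_o"])
    # Candidate value, a function of the input and the previous activation
    c_tilde = np.tanh(params["W_c"] @ x_t + params["U_c"] @ h_prev + params["b_c"])

    # Forget gate scales the old memory; input gate scales the new candidate
    c_t = f_t * c_prev + i_t * c_tilde
    # Output gate decides how much of the memory is exposed as the new activation
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Tiny usage example with random weights (illustrative only)
rng = np.random.default_rng(0)
d_in, d_hid = 4, 3
params = {}
for g in "fioc":
    params[f"W_{g}"] = rng.standard_normal((d_hid, d_in))
    params[f"U_{g}"] = rng.standard_normal((d_hid, d_hid))
    params[f"b_{g}"] = np.zeros(d_hid)
h, c = np.zeros(d_hid), np.zeros(d_hid)
h, c = lstm_cell_step(rng.standard_normal(d_in), h, c, params)
```

Note how each gate is an element-wise multiplier between 0 and 1: 0 clears (or blocks) the corresponding memory element, 1 passes it through, matching the three bullet descriptions above.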